WO2022068640A1 - Method and device for broadcasting voice information in multi-user voice call - Google Patents

Publication number
WO2022068640A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
virtual
information
target
users
Application number
PCT/CN2021/119542
Other languages
French (fr)
Chinese (zh)
Inventor
程翰
Original Assignee
上海连尚网络科技有限公司
Application filed by 上海连尚网络科技有限公司
Publication of WO2022068640A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H04M3/568 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H04N7/157 Conference systems defining a virtual conference space and using avatars or agents
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information

Definitions

  • the present application relates to the field of communications, and in particular to a technology for playing voice information in a multi-person voice call.
  • multi-person voice communication means that multiple users communicate by voice in real time through clients on terminal devices such as mobile phones and PCs.
  • a common multi-person voice communication scheme is that each client receives real-time voice information from multiple other clients, mixes the received real-time voice information locally to obtain mixed voice information, and plays it.
  • An object of the present application is to provide a method and device for playing voice information in a multi-person voice.
  • a method for playing voice information in a multi-person voice applied to a network device comprising:
  • a network device for playing voice information in a multi-person voice comprising:
  • a one-one module, configured to, for a target user among the multiple users participating in the multi-person voice, determine the virtual position information of the other users among the multiple users in the virtual sound field corresponding to the target user, and generate, according to the virtual position information, virtual sound field information corresponding to the target user;
  • a one-two module, configured to send the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
  • a device for playing voice information in a multi-person voice wherein the device includes:
  • memory arranged to store computer-executable instructions which, when executed, cause the processor to:
  • a computer-readable medium storing instructions that, when executed, cause a system to:
  • the present application can determine, for each of the multiple users participating in the voice call, the virtual position information of the other users in the virtual sound field corresponding to that user, and then play the voice information of the other users according to their virtual position information in that virtual sound field, so that each user can clearly and accurately distinguish each person's voice in the multi-person voice call and can intuitively and quickly know which other user is currently speaking.
  • FIG. 1 shows a flowchart of a method for playing voice information in a multi-person voice applied to a network device end according to an embodiment of the present application
  • FIG. 2 shows a structural diagram of a network device for playing voice information in a multi-person voice according to an embodiment of the present application
  • FIG. 3 illustrates an exemplary system that may be used to implement various embodiments described in this application.
  • the terminal, the device serving the network, and the trusted party each include one or more processors (for example, a central processing unit (CPU)), an input/output interface, a network interface, and memory.
  • Memory may include non-permanent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory.
  • Memory is an example of a computer-readable medium.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology.
  • Information may be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase-change memory (PCM), programmable random access memory (PRAM), static random-access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cartridges, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device.
  • the equipment referred to in this application includes, but is not limited to, user equipment, network equipment, or equipment formed by integrating user equipment and network equipment through a network.
  • the user equipment includes, but is not limited to, any mobile electronic product that can perform human-computer interaction with the user (for example, through a touchpad), such as a smartphone or a tablet computer, and the mobile electronic product can run any operating system, such as Android or iOS.
  • the network device includes an electronic device that can automatically perform numerical calculation and information processing according to preset or stored instructions, and its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices, etc.
  • the network device includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud formed by multiple servers; here, the cloud is formed by a large number of computers or network servers based on cloud computing, where cloud computing is a kind of distributed computing: a virtual supercomputer composed of a group of loosely coupled computers.
  • the network includes but is not limited to the Internet, a wide area network, a metropolitan area network, a local area network, a VPN network, a wireless ad hoc network (Ad Hoc network), and the like.
  • the device may also be a program running on the user equipment, on the network device, or on a device formed by integrating the user equipment and the network device, or the network device and a touch terminal, through a network.
  • FIG. 1 shows a flowchart of a method, applied to a network device, for playing voice information in a multi-person voice according to an embodiment of the present application; the method includes step S11 and step S12.
  • in step S11, for a target user among the multiple users participating in the multi-person voice, the network device determines the virtual position information of the other users among the multiple users in the virtual sound field corresponding to the target user, and generates, according to the virtual position information, virtual sound field information corresponding to the target user; in step S12, the network device sends the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
  • in step S11, for a target user among the multiple users participating in the multi-person voice, the network device determines the virtual position information of the other users among the multiple users in the virtual sound field corresponding to the target user, and generates, according to the virtual position information, virtual sound field information corresponding to the target user.
  • the target user is each of the multiple users participating in the multi-person speech.
  • the virtual sound field is a relative coordinate system
  • the relative coordinate system may be a two-dimensional plane coordinate system or a three-dimensional space coordinate system
  • each user corresponds to a virtual sound field
  • the virtual position refers to the coordinate point corresponding to another user in the user's virtual sound field
  • the virtual position information is the coordinate value corresponding to the coordinate point
  • the virtual position corresponding to the user in the user's virtual sound field is the coordinate origin.
  • for example, in the virtual sound field corresponding to User1, the virtual position information corresponding to User1 is (0,0)
  • and the virtual position information corresponding to User2 is (0,1)
  • in the virtual sound field corresponding to User2, the virtual position information corresponding to User1 is (0,-1)
  • and the virtual position information corresponding to User2 is (0,0).
  • the coordinate axis unit of the virtual sound field corresponding to a certain user is a predetermined distance interval, for example, 1 cm, 10 cm, 1 meter, etc.
  • the coordinate axis direction is a predetermined direction relative to the user
  • the positive direction of the X-axis is to the right of the user
  • the positive direction of the Y-axis is the front of the user.
  • relative distance information and relative direction information between two users can be obtained from the virtual position information of one user in the virtual sound field of the other user, together with the coordinate axis unit and coordinate axis direction of that virtual sound field.
  • the virtual sound field information corresponding to a user includes, but is not limited to, the coordinate axis direction and coordinate axis unit of the user's virtual sound field, and the virtual position information (that is, the coordinate value of the coordinate point) of each other user in the user's virtual sound field.
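
The relationship between virtual position information, axis unit, and axis direction can be illustrated with a minimal Python sketch. The `VirtualSoundField` class, its field names, and the azimuth convention (0 degrees = front, 90 = right) are assumptions made for illustration; the application itself only specifies that +X is the listener's right and +Y is the listener's front.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class VirtualSoundField:
    """Hypothetical container for one listener's virtual sound field information."""
    owner: str                      # the listener sits at the origin (0, 0)
    unit_m: float = 1.0             # length of one axis unit, in meters
    # Coordinates of the other users; +X = listener's right, +Y = listener's front.
    positions: Dict[str, Tuple[float, float]] = field(default_factory=dict)


def relative_distance_and_direction(sf: VirtualSoundField, other: str) -> Tuple[float, float]:
    """Return (distance in meters, azimuth in degrees) of `other` relative to the listener.

    Azimuth convention (an assumption): 0 = front, 90 = right, 180 = behind, -90 = left.
    """
    x, y = sf.positions[other]
    distance_m = math.hypot(x, y) * sf.unit_m
    azimuth_deg = math.degrees(math.atan2(x, y))
    return distance_m, azimuth_deg


sf = VirtualSoundField(owner="User1", unit_m=1.0, positions={"User2": (0, -2)})
distance, azimuth = relative_distance_and_direction(sf, "User2")
```

With User2 at (0,-2) and a 1 meter axis unit, the sketch yields a distance of 2 meters and an azimuth of 180 degrees, that is, directly behind the listener.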
  • in step S12, the network device sends the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
  • the voice information of the other users may be sent from the user equipment corresponding to the other users to the user equipment corresponding to the user via the network device, or may be sent directly over a p2p connection established between the two user equipments.
  • from the virtual position information of another user in the user's virtual sound field, together with the coordinate axis direction and coordinate axis unit of the virtual sound field, the user equipment can obtain the relative distance information and relative direction information of the other user relative to the user, and play the voice information according to the relative distance information and relative direction information.
  • for example, in the virtual sound field corresponding to User1, the positive direction of the X-axis is to the right of User1, the positive direction of the Y-axis is the front of User1, the unit of the X-axis and the Y-axis is 1 meter, the virtual position information corresponding to User1 is (0,0), and the virtual position information corresponding to User2 is (0,-2); it can be concluded that User2 is 2 meters directly behind User1, and the voice information of User2 is played according to this relative distance information and relative direction information.
  • the voice information may be played according to the relative distance information and the relative direction information by filtering and delaying the voice information through a head-related transfer function (HRTF) and then outputting it to the speaker of the user equipment for playback. In this way, when multiple other users speak at the same time in the multi-person voice, the user can clearly and accurately distinguish each person's voice, and when any other user speaks, the user can intuitively and quickly know which user is currently speaking, which provides great convenience for users in the multi-person voice.
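
As a rough illustration of direction-dependent playback, the sketch below derives per-ear gains and delays from a distance and an azimuth. It is not an HRTF: it substitutes a much simpler model (constant-power panning, inverse-distance attenuation, and the Woodworth approximation of interaural time difference). The function name, the head radius, and the azimuth convention (0 degrees = front, 90 = right) are assumptions made for illustration.

```python
import math

HEAD_RADIUS_M = 0.0875    # assumed average head radius
SPEED_OF_SOUND = 343.0    # meters per second, in air at room temperature


def spatial_params(distance_m: float, azimuth_deg: float):
    """Return (left_gain, right_gain, left_delay_s, right_delay_s).

    Simplified stand-in for HRTF filtering; front/back sources are not distinguished.
    """
    # Lateral component of the direction: -1 = fully left, +1 = fully right.
    lateral = max(-1.0, min(1.0, math.sin(math.radians(azimuth_deg))))

    # Constant-power pan: the two gains always satisfy l^2 + r^2 = atten^2.
    pan = (lateral + 1.0) * math.pi / 4.0
    atten = 1.0 / max(distance_m, 1.0)   # inverse-distance law, clamped within 1 m
    left_gain = math.cos(pan) * atten
    right_gain = math.sin(pan) * atten

    # Woodworth approximation: ITD = (r / c) * (theta + sin(theta)).
    theta = abs(math.asin(lateral))
    itd = HEAD_RADIUS_M / SPEED_OF_SOUND * (theta + math.sin(theta))

    # The ear farther from the source receives the sound later.
    left_delay = itd if lateral > 0 else 0.0
    right_delay = itd if lateral < 0 else 0.0
    return left_gain, right_gain, left_delay, right_delay
```

A client could apply these gains and delays to each remote user's decoded audio before mixing; a production implementation would more likely convolve with measured HRTF impulse responses.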
  • determining the virtual position information of other users among the multiple users in the virtual sound field corresponding to the target user includes step S13 (not shown), step S14 (not shown) and step S15 (not shown).
  • in step S13, the network device determines the virtual scene information corresponding to the multi-person voice; in step S14, the network device determines the virtual position corresponding to each of the multiple users according to the virtual scene information;
  • in step S15, the network device determines the virtual position information of the other users in the virtual sound field corresponding to the target user according to the virtual position corresponding to the target user and the virtual positions corresponding to the other users.
  • the virtual scene may be a virtual two-dimensional scene or a virtual three-dimensional scene, for example, a virtual conference room, a virtual classroom, and the like.
  • the virtual scene information includes, but is not limited to, visualization information of the virtual scene, configuration information of the virtual scene, etc.
  • the visualization information of the virtual scene is used to intuitively present the virtual scene to the user by means of a 2D scene image or a 3D scene model.
  • the virtual scene may be selected by the voice-initiating user from multiple default virtual scenes; alternatively, at least one user among the multiple users may each select at least one target virtual scene from the multiple default virtual scenes, and the target virtual scene selected the most times is then determined as the virtual scene corresponding to the multi-person voice.
  • the virtual position of each user in the virtual scene may be determined for each user by the voice-initiating user, may be determined by each user individually, or may be determined from a plurality of predetermined virtual positions according to the user information corresponding to each user, wherein the tag information (for example, "podium") of each user's virtual position in the virtual scene (for example, a virtual classroom) matches that user's user information (for example, "language teacher").
  • after determining the virtual scene corresponding to the multi-person voice, the network device sends the virtual scene information corresponding to the virtual scene to each user in the multi-person voice; the virtual scene is then intuitively presented to each user by means of a 2D scene image or a 3D scene model according to the visualization information in the virtual scene information, and each user determines his or her own virtual position in the virtual scene. Alternatively, the virtual scene information is sent only to the voice-initiating user, who determines the virtual position of each user in the multi-person voice in the virtual scene.
  • according to the virtual position of a user and the virtual position of each other user in the virtual scene, the relative distance information and relative direction information between the user and each other user can be obtained, and the virtual position information of each other user in the virtual sound field corresponding to the user is determined accordingly.
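
Step S15 can be sketched as a change of origin from scene coordinates to listener-centered coordinates. The sketch assumes that every user faces the scene's +Y direction, so the conversion is a pure translation; the application does not state how user orientation is handled, and the function name is hypothetical.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def sound_field_positions(scene_positions: Dict[str, Point], target: str) -> Dict[str, Point]:
    """Re-express every other user's scene position relative to `target`.

    Assumes all users face the scene's +Y axis, so no rotation is applied.
    """
    tx, ty = scene_positions[target]
    return {
        user: (x - tx, y - ty)
        for user, (x, y) in scene_positions.items()
        if user != target
    }


scene = {"User1": (2, 3), "User2": (2, 4)}
field_for_user1 = sound_field_positions(scene, "User1")   # {"User2": (0, 1)}
field_for_user2 = sound_field_positions(scene, "User2")   # {"User1": (0, -1)}
```

The two results reproduce the (0,1) and (0,-1) coordinates used in the User1/User2 example earlier in the description.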
  • the network device sends the virtual position of each user in the virtual scene to each user for presentation on the corresponding user equipment, so that each user knows the virtual positions of himself or herself and of each other user in the virtual scene.
  • each user equipment intuitively presents to its user, in the 2D scene image or 3D scene model corresponding to the virtual scene information, the virtual positions of that user and of each other user in the virtual scene, so that each user can intuitively and quickly know the relative distance and relative direction of the other users in the virtual scene.
  • the user equipment can present the corresponding user identification information (for example, user name, user ID, etc.) at each virtual position in the 2D scene image or 3D scene model.
  • the step S13 includes: the network device obtains identification information corresponding to the target virtual scene information selected by the voice-initiating user from the plurality of default virtual scene information, and determines the target virtual scene information as the virtual scene information corresponding to the multi-person voice.
  • the voice-initiated user selects a target virtual scene from among multiple default virtual scenes, and sends identification information (eg, scene name, scene ID, etc.) corresponding to the target virtual scene to the network device.
  • multiple default virtual scenes include virtual meeting room 1, virtual meeting room 2, virtual classroom 1, and virtual classroom 2.
  • the voice-initiating user selects virtual meeting room 1 as the target virtual scene from the multiple default virtual scenes, and the corresponding identification information "Virtual Meeting Room 1" is sent to the network device.
  • the step S13 includes: the network device obtains at least one target virtual scene information selected by at least one user among the plurality of users from the plurality of default virtual scene information, and determines, from the at least one target virtual scene information, the virtual scene information corresponding to the multi-person voice, wherein the determined virtual scene information is the one selected the most times.
  • each user can select one or more target virtual scenes from multiple default virtual scenes and send the identification information corresponding to the one or more target virtual scenes to the network device; the network device then determines the target virtual scene selected the most times by the users as the virtual scene corresponding to the multi-person voice.
  • each user can only select one target virtual scene among multiple default virtual scenes.
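
The "selected the most times" rule can be sketched as a simple vote count. The function name and input shape are hypothetical, and the tie-breaking behavior (the scene encountered first wins) is an assumption, since the application does not say how ties are resolved.

```python
from collections import Counter
from typing import Dict, List, Optional


def pick_scene_by_votes(selections: Dict[str, List[str]]) -> Optional[str]:
    """Return the scene ID chosen by the most users, or None if nobody voted.

    `selections` maps each user to the scene IDs that user selected.
    Ties go to the scene that was counted first (an assumption).
    """
    votes = Counter(scene for scenes in selections.values() for scene in scenes)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


chosen = pick_scene_by_votes({
    "User1": ["Virtual Meeting Room 1"],
    "User2": ["Virtual Classroom 1", "Virtual Meeting Room 1"],
    "User3": ["Virtual Classroom 2"],
})
```

Here "Virtual Meeting Room 1" receives two votes and the other scenes one each, so it becomes the virtual scene for the call.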
  • the step S13 includes: the network device determines, according to the voice theme information corresponding to the multi-person voice, target default virtual scene information that matches the voice theme information from a plurality of default virtual scene information, and determines the target default virtual scene information as the virtual scene information corresponding to the multi-person voice.
  • the voice theme information corresponding to the multi-person voice may be input by the voice-initiating user and sent to the network device; alternatively, the voice-initiating user may select the voice theme information from multiple preset default voice theme information and send the corresponding identification information (for example, theme name, theme ID, etc.) to the network device. The voice theme information represents the theme of the multi-person voice and includes, but is not limited to, "Conference", "Class Meeting", "Technology Sharing", etc.
  • according to the voice theme information, the default virtual scene that matches the voice theme information among the multiple default virtual scenes is determined as the virtual scene corresponding to the multi-person voice. For example, the multiple default virtual scenes include a virtual conference room, a virtual classroom, and a virtual coffee shop; according to the voice theme information "meeting" corresponding to the multi-person voice, the default virtual scene "virtual conference room", which matches the voice theme information "meeting", is determined as the virtual scene corresponding to the multi-person voice.
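
The application does not describe how a theme is matched to a scene. One minimal possibility is a lookup table, sketched below; the table contents and the function name are hypothetical, and a real system might use keyword or semantic matching instead.

```python
from typing import Iterable, Optional

# Hypothetical theme-to-scene table; not taken from the application.
THEME_TO_SCENE = {
    "meeting": "virtual conference room",
    "class meeting": "virtual classroom",
    "technology sharing": "virtual conference room",
}


def pick_scene_by_theme(theme: str, default_scenes: Iterable[str]) -> Optional[str]:
    """Return the default scene mapped to `theme`, or None if there is no usable match."""
    scene = THEME_TO_SCENE.get(theme.strip().lower())
    return scene if scene in set(default_scenes) else None


defaults = ["virtual conference room", "virtual classroom", "virtual coffee shop"]
scene_for_meeting = pick_scene_by_theme("meeting", defaults)
```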
  • the step S13 includes a step S16 (not shown).
  • in step S16, the network device determines, according to the user information corresponding to the multiple users, target default virtual scene information that matches the user information from a plurality of default virtual scene information, and determines the target default virtual scene information as the virtual scene information corresponding to the multi-person voice.
  • among the multiple default virtual scenes, the default virtual scene that matches the user information corresponding to each user, or the user information corresponding to the voice-initiating user, is determined as the virtual scene corresponding to the multi-person voice.
  • the determining, according to the user information corresponding to the multiple users, the target default virtual scene information matching the user information from the multiple default virtual scene information includes: the network device determines, according to the user information corresponding to the voice-initiating user among the multiple users, target default virtual scene information that matches that user information from the plurality of default virtual scene information.
  • for example, the multiple default virtual scenes include a virtual conference room, a virtual classroom, and a virtual coffee shop, and the user information corresponding to the voice-initiating user among the multiple users includes "occupation: teacher"; the default virtual scene "virtual classroom", which matches the user information "occupation: teacher", is then determined as the virtual scene corresponding to the multi-person voice.
  • the determining, according to the user information corresponding to the multiple users, the target default virtual scene information matching the user information from the multiple default virtual scene information includes: the network device determines, according to the user information corresponding to each of the plurality of users, at least one default virtual scene information matching the user information corresponding to each of the plurality of users from the plurality of default virtual scene information, and determines target default virtual scene information from the at least one default virtual scene information, wherein the number of users matching the target default virtual scene information is the largest.
  • for each user, a default virtual scene matching the user's user information is determined from the multiple default virtual scenes, and the default virtual scene with the largest number of matched users is determined as the virtual scene corresponding to the multi-person voice.
  • for example, the multiple users in the multi-person voice include User1, User2, and User3, and the multiple default virtual scenes include a virtual conference room, a virtual classroom, and a virtual coffee shop. User1's user information includes "occupation: teacher", so the default virtual scene matching User1 is the virtual classroom; User2's user information includes "occupation: student", so the default virtual scene matching User2 is also the virtual classroom; User3's user information includes "hobby: drinking coffee", so the default virtual scene matching User3 is the virtual coffee shop. Between the virtual classroom and the virtual coffee shop, the virtual scene "virtual classroom", which has the largest number of matched users, is determined as the virtual scene corresponding to the multi-person voice.
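
The "largest number of matched users" rule from this example can be sketched as follows. The matching table `SCENE_RULES` and the function name are hypothetical; the application does not specify how user information is matched against a scene, so a simple attribute lookup stands in for that step.

```python
from collections import Counter
from typing import Dict, Iterable, Optional

# Hypothetical rules: a scene matches a user if they share at least one attribute.
SCENE_RULES = {
    "virtual conference room": {"occupation: manager"},
    "virtual classroom": {"occupation: teacher", "occupation: student"},
    "virtual coffee shop": {"hobby: drinking coffee"},
}


def pick_scene_by_user_info(user_info: Dict[str, Iterable[str]]) -> Optional[str]:
    """Return the scene matched by the most users, or None if no user matches any scene."""
    matched = Counter()
    for attrs in user_info.values():
        attr_set = set(attrs)
        for scene, keys in SCENE_RULES.items():
            if keys & attr_set:
                matched[scene] += 1   # each user counts at most once per scene
    return matched.most_common(1)[0][0] if matched else None


scene_for_call = pick_scene_by_user_info({
    "User1": ["occupation: teacher"],
    "User2": ["occupation: student"],
    "User3": ["hobby: drinking coffee"],
})
```

With the three users from the example, the virtual classroom is matched by two users and the virtual coffee shop by one, so the classroom is selected.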
  • the virtual scene information includes multiple predetermined virtual positions; wherein the step S14 includes: for each of the multiple users, the network device obtains the target predetermined virtual position corresponding to the user among the multiple predetermined virtual positions, and determines the target predetermined virtual position as the virtual position of the user in the virtual scene information.
  • the virtual scene includes a plurality of predetermined virtual positions, each of which is intuitively presented to the user in the 2D scene image or 3D scene model corresponding to the virtual scene information; each user corresponds to one target predetermined virtual position among the plurality of predetermined virtual positions, and the target predetermined virtual position corresponding to each user is determined as the virtual position of that user in the virtual scene.
  • each user corresponds to a different target predetermined virtual position.
  • the virtual position of the user in the virtual scene can only be one of a plurality of predetermined virtual positions, and cannot be any virtual position in the virtual scene.
  • the target predetermined virtual position may be selected for each user by the voice-initiating user from the plurality of predetermined virtual positions, may be selected by each user for himself or herself from the plurality of predetermined virtual positions, or may be determined automatically for each user from the plurality of predetermined virtual positions by matching against that user's user information.
  • the obtaining, for each of the plurality of users, the target predetermined virtual position corresponding to the user in the plurality of predetermined virtual positions includes: for each of the plurality of users, obtaining the target predetermined virtual position designated for the user by the voice-initiating user from the plurality of predetermined virtual positions.
  • the voice-initiating user specifies a target predetermined virtual position for each user from the plurality of predetermined virtual positions, and sends the identification information of the target predetermined virtual position corresponding to each user to the network device; preferably, the voice-initiating user must select a different target predetermined virtual position for each user and cannot select the same target predetermined virtual position for multiple users.
  • the obtaining, for each of the plurality of users, the target predetermined virtual position corresponding to the user in the plurality of predetermined virtual positions includes: for each of the plurality of users, determining, according to the user information corresponding to the user, a target predetermined virtual position among the plurality of predetermined virtual positions, wherein the tag information of the target predetermined virtual position in the virtual scene information matches the user information corresponding to the user.
  • according to each user's user information, a target predetermined virtual position whose tag information in the virtual scene matches the user information is automatically determined from the plurality of predetermined virtual positions.
  • for example, the user information of User1 includes "occupation: Chinese teacher", the virtual scene is a virtual classroom, and the tag information corresponding to the predetermined virtual position L1 in the virtual scene is "podium"; this tag information matches User1's user information "occupation: Chinese teacher", so the predetermined virtual position L1 can be determined as the target predetermined virtual position corresponding to User1 among the plurality of predetermined virtual positions.
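
Tag-based assignment of predetermined virtual positions might look like the sketch below. The `TAG_RULES` table, the "desk" tag, and the function name are hypothetical; the sketch also enforces the preference stated earlier that no two users share a position.

```python
from typing import Dict, Iterable

# Hypothetical rules: which user attributes a position tag matches.
TAG_RULES = {
    "podium": {"occupation: Chinese teacher"},
    "desk": {"occupation: student"},
}


def assign_by_tag(user_info: Dict[str, Iterable[str]],
                  position_tags: Dict[str, str]) -> Dict[str, str]:
    """Give each user the first free position whose tag matches the user's attributes.

    Users with no matching free position are left out of the result.
    """
    taken = set()
    assigned = {}
    for user, attrs in user_info.items():
        attr_set = set(attrs)
        for position, tag in position_tags.items():
            if position not in taken and attr_set & TAG_RULES.get(tag, set()):
                assigned[user] = position
                taken.add(position)
                break
    return assigned


seats = assign_by_tag(
    {"User1": ["occupation: Chinese teacher"],
     "User2": ["occupation: student"],
     "User3": ["occupation: student"]},
    {"L1": "podium", "L2": "desk", "L3": "desk"},
)
```

User1 receives L1 (the podium), matching the example above, while the two students receive the two desk positions in order.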
  • obtaining the target predetermined virtual position corresponding to the user in the plurality of predetermined virtual positions includes step S17 (not shown), step S18 (not shown) and step S19 (not shown).
  • step S17 the network device generates virtual location request information and sends it to each of the plurality of users, wherein the virtual location request information includes the virtual scene information; in step S18, the network device receives the information of the virtual location.
  • step S19 the network device determines, for each user in the plurality of users, according to the feedback information, the target predetermined target corresponding to the user in the plurality of predetermined virtual positions virtual location.
  • virtual location request information including the virtual scene information is sent to each user. After receiving the virtual location request information, each user's equipment presents a 2D scene image or 3D scene model corresponding to the virtual scene information and presents the plurality of predetermined virtual positions in that 2D scene image or 3D scene model. Each user selects a target predetermined virtual position from the plurality of predetermined virtual positions, and feedback information containing the identification information of the selected target predetermined virtual position is sent to the network device.
  • after receiving the feedback information sent by a certain user, the network device can obtain the target predetermined virtual position selected by that user among the plurality of predetermined virtual positions. Preferably, each user can only select a different target predetermined virtual position; multiple users cannot select the same target predetermined virtual position.
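  • The rule that no two users may hold the same target predetermined virtual position can be sketched as a small registry on the network device. The class name `PositionRegistry`, the method `handle_feedback` and the choice to reject a second selection by the same user are assumptions made for illustration; the source does not say how conflicting feedback is handled.

```python
class PositionRegistry:
    """Tracks which predetermined virtual position each user has selected."""

    def __init__(self, position_ids):
        self.free = set(position_ids)  # positions nobody has selected yet
        self.chosen = {}               # user_id -> position_id

    def handle_feedback(self, user_id: str, position_id: str) -> bool:
        """Record a selection; reject it if the position is unknown or taken.

        Assumption: a user who has already selected cannot select again.
        """
        if user_id in self.chosen or position_id not in self.free:
            return False
        self.free.remove(position_id)
        self.chosen[user_id] = position_id
        return True


registry = PositionRegistry(["L1", "L2"])
print(registry.handle_feedback("User1", "L1"))  # True
print(registry.handle_feedback("User2", "L1"))  # False, L1 is already taken
```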
  • the virtual location request information corresponds to a feedback period. After the feedback period is reached, for each user among the plurality of users who has not yet given feedback, the voice-initiated user can select a corresponding target predetermined virtual position from the at least one currently unselected predetermined virtual position, or the network device may automatically assign to each such user a corresponding target predetermined virtual position from the at least one currently unselected predetermined virtual position.
  • the method further includes: after receiving the feedback information sent by a first user among the multiple users, the network device generates first prompt information corresponding to the feedback information, and sends the first prompt information to the other users among the plurality of users who have not yet given feedback, so as to prompt that the first target predetermined virtual position indicated by the feedback information is not selectable.
  • for example, after receiving the feedback information in which the first user selects the first target predetermined virtual position from the multiple predetermined virtual positions, the network device generates prompt information corresponding to the feedback information (for example, "the first user has selected the first target predetermined virtual position") and sends it to each other user among the multiple users who has not yet provided feedback, so as to prompt each of those users that the first target predetermined virtual position cannot be selected.
  • the user equipment corresponding to each of those users can then set the first target predetermined virtual position to an unselectable state in the displayed 2D scene image or 3D scene model corresponding to the virtual scene information.
  • the method further includes: after receiving the feedback information sent by the first user among the multiple users, the network device generates second prompt information corresponding to the feedback information, and sends the second prompt information to the users in the plurality of users other than the first user, to prompt that the first target predetermined virtual position indicated by the feedback information has been selected by the first user.
  • for example, the network device sends prompt information (e.g., "the first user has selected the first target predetermined virtual position") to each of the plurality of users other than the first user, to prompt each of them that the first target predetermined virtual position has been selected by the first user, so that each user can know the virtual positions of the other users in the virtual scene.
  • identification information (e.g., user name, user ID, etc.) of the first user is presented at the first target predetermined virtual position in the 2D scene image or 3D scene model corresponding to the virtual scene information.
  • the method further includes: the network device receives invitation request information sent by a second user of the at least one user, wherein the second user has selected a second target predetermined virtual position among the plurality of predetermined virtual positions, and the invitation request information is used to invite a third user among the multiple users, who has not yet given feedback, to select a predetermined virtual position near the second target predetermined virtual position; the network device sends the invitation request information to the third user to prompt the third user to select an unselected predetermined virtual position near the second target predetermined virtual position as the target predetermined virtual position corresponding to the third user.
  • for example, the second user has selected the second target predetermined virtual position among the plurality of predetermined virtual positions as his or her own virtual position in the virtual scene. In response to an invitation trigger operation performed by the second user for a third user who has not yet given feedback, invitation request information for inviting the third user to select a predetermined virtual position near the second target predetermined virtual position is generated and sent to the network device.
  • it is necessary to first detect whether the third user has already selected a predetermined virtual position of his or her own; the invitation trigger operation can be performed for the third user only if the third user has not yet made a selection.
  • the network device forwards the invitation request information to the third user, so as to prompt the third user to select an unselected predetermined virtual position near the second target predetermined virtual position as the target predetermined virtual position corresponding to the third user.
  • in the 2D scene image or 3D scene model corresponding to the virtual scene information, at least one unselected predetermined virtual position near the second target predetermined virtual position may be set to a special display state (e.g., highlighted) to guide the third user to select one of the at least one predetermined virtual position as the target predetermined virtual position.
  • the method further includes: after a predetermined feedback period corresponding to the virtual location request information is reached, the network device determines, for each user among the plurality of users who has not yet given feedback, a target predetermined virtual position corresponding to the user from among at least one currently unselected predetermined virtual position of the plurality of predetermined virtual positions, and determines the target predetermined virtual position as the virtual position of the user in the virtual scene information.
  • the virtual location request information corresponds to a predetermined feedback period (for example, 5 minutes).
  • after the feedback period is reached, for each user who has not yet given feedback, the voice-initiated user may select a corresponding target predetermined virtual position from at least one currently unselected predetermined virtual position, or, alternatively, the network device may automatically assign each such user a corresponding target predetermined virtual position from at least one currently unselected predetermined virtual position.
  • different predetermined virtual positions correspond to different priorities in the virtual scene according to their respective tag information, and target predetermined virtual positions can be automatically assigned to the users who have not yet given feedback in descending order of priority.
  • for example, the priority of the predetermined virtual positions whose tag information is "first row" is higher than the priority of the predetermined virtual positions whose tag information is "second row", so the currently unselected predetermined virtual positions tagged "first row" are assigned first to the users who have not yet given feedback as their target predetermined virtual positions.
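  • The priority-ordered automatic assignment can be sketched as follows. The function name `assign_by_priority`, the numeric priority table and the position identifiers are illustrative assumptions; the source only states that positions are assigned in descending order of the priority derived from their tag information.

```python
def assign_by_priority(pending_users, free_positions, tag_priority):
    """Assign free positions to pending users, highest-priority tags first.

    free_positions: dict position_id -> tag
    tag_priority:   dict tag -> int (larger means higher priority; assumed scale)
    """
    ordered = sorted(
        free_positions,
        key=lambda position_id: tag_priority.get(free_positions[position_id], 0),
        reverse=True,
    )
    # zip stops at the shorter sequence, so surplus users stay unassigned.
    return dict(zip(pending_users, ordered))


free = {"A": "second row", "B": "first row", "C": "first row"}
priority = {"first row": 2, "second row": 1}
print(assign_by_priority(["User3", "User4"], free, priority))
# {'User3': 'B', 'User4': 'C'}
```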
  • determining the target predetermined virtual position corresponding to the user from among at least one currently unselected predetermined virtual position of the plurality of predetermined virtual positions includes: determining hotspot location area information in the virtual scene information according to the virtual positions, in the virtual scene information, of the at least one user among the multiple users who has already given feedback; and, for each user among the multiple users who has not yet given feedback, determining an unselected predetermined virtual position in the hotspot location area information as the virtual position of the user in the virtual scene information.
  • for example, after the feedback period is reached, a hotspot location area in which user virtual positions are densely distributed is determined according to the current distribution, in the virtual scene, of the virtual positions of the users who have already given feedback. Each user who has not yet given feedback is then preferentially assigned a predetermined virtual position from the one or more unselected predetermined virtual positions in the hotspot location area as his or her target predetermined virtual position. If all the predetermined virtual positions in the hotspot location area have been selected, each such user is automatically assigned a predetermined virtual position from the other unselected predetermined virtual positions in the virtual scene as his or her target predetermined virtual position.
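  • A minimal sketch of the hotspot-first assignment follows. The source does not define how the hotspot location area is computed; here it is assumed, purely for illustration, to be a circle of a hypothetical `radius` around the centroid of the already selected positions. The function name `assign_near_hotspot` and the coordinates are likewise invented for the example.

```python
import math


def assign_near_hotspot(pending_users, selected_coords, free_coords, radius=2.0):
    """Assign free positions inside the assumed hotspot first, then the rest.

    selected_coords: list of (x, y) positions already chosen by users
    free_coords:     dict position_id -> (x, y) of unselected positions
    """
    if not selected_coords:
        # No feedback yet, so there is no hotspot; keep the given order.
        return dict(zip(pending_users, free_coords))
    cx = sum(x for x, _ in selected_coords) / len(selected_coords)
    cy = sum(y for _, y in selected_coords) / len(selected_coords)

    def distance(position_id):
        x, y = free_coords[position_id]
        return math.hypot(x - cx, y - cy)

    inside = sorted((p for p in free_coords if distance(p) <= radius), key=distance)
    outside = [p for p in free_coords if distance(p) > radius]
    return dict(zip(pending_users, inside + outside))


selected = [(0, 0), (0, 1), (1, 0), (1, 1)]
free = {"P1": (5, 5), "P2": (1, 2), "P3": (0, 3)}
print(assign_near_hotspot(["User5", "User6"], selected, free))
# {'User5': 'P2', 'User6': 'P1'}
```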
  • determining the target predetermined virtual position corresponding to the user from among at least one currently unselected predetermined virtual position of the plurality of predetermined virtual positions includes: for each user among the plurality of users who has not yet given feedback, determining a target predetermined virtual position corresponding to the user from among the at least one currently unselected predetermined virtual position, wherein the tag information of the target predetermined virtual position in the virtual scene information matches the user information corresponding to the user.
  • for example, the user information of User1, who has not yet given feedback, includes "occupation: teacher", the virtual scene is a virtual classroom, and the tag information corresponding to the predetermined virtual position L1 among the at least one currently unselected predetermined virtual position in the virtual scene is "Lecture"; the tag information matches User1's user information "occupation: teacher", so the predetermined virtual position L1 can be automatically assigned to User1 as its corresponding target predetermined virtual position.
  • FIG. 2 shows a structure diagram of a network device for playing voice in a multi-person voice according to an embodiment of the present application.
  • the device includes a one-one module 11 and a one-two module 12.
  • the one-one module 11 is configured to, for a target user among the multiple users participating in the multi-person voice, determine the virtual position information of the other users among the multiple users in the virtual sound field corresponding to the target user, and generate, according to the virtual position information, virtual sound field information corresponding to the target user; the one-two module 12 is configured to send the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
  • the one-one module 11 is configured to, for the target user among the multiple users participating in the multi-person voice, determine the virtual position information of the other users among the multiple users in the virtual sound field corresponding to the target user, and generate, according to the virtual position information, virtual sound field information corresponding to the target user.
  • the target user is each of the multiple users participating in the multi-person speech.
  • the virtual sound field is a relative coordinate system, which may be a two-dimensional plane coordinate system or a three-dimensional space coordinate system; each user corresponds to one virtual sound field.
  • the virtual position refers to the coordinate point of another user in the user's virtual sound field, and the virtual position information is the coordinate value corresponding to that coordinate point; the virtual position corresponding to the user in the user's own virtual sound field is the coordinate origin.
  • for example, in the virtual sound field corresponding to User1, the virtual position information corresponding to User1 is (0,0) and the virtual position information corresponding to User2 is (0,1); in the virtual sound field corresponding to User2, the virtual position information corresponding to User1 is (0,-1) and the virtual position information corresponding to User2 is (0,0).
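  • The per-user relative coordinate systems in this example can be reproduced with a simple translation. The sketch assumes that every user's axes point the same way, so that only the origin shifts; the source does not state whether the axes may also be rotated per user. The function name `sound_field_for` and the shared scene coordinates are illustrative.

```python
def sound_field_for(target, scene_positions):
    """Coordinates of every user relative to `target`, who sits at the origin.

    Assumes all users share the same axis orientation (translation only).
    """
    tx, ty = scene_positions[target]
    return {user: (x - tx, y - ty) for user, (x, y) in scene_positions.items()}


scene = {"User1": (0, 0), "User2": (0, 1)}
print(sound_field_for("User1", scene))  # {'User1': (0, 0), 'User2': (0, 1)}
print(sound_field_for("User2", scene))  # {'User1': (0, -1), 'User2': (0, 0)}
```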
  • the coordinate axis unit of the virtual sound field corresponding to a certain user is a predetermined distance interval, for example, 1 cm, 10 cm, 1 meter, etc.
  • the coordinate axis direction is a predetermined direction relative to the user; for example, the positive direction of the X-axis is to the right of the user, and the positive direction of the Y-axis is the front of the user.
  • relative distance information and relative direction information between two users can be obtained from the virtual position information of one user in the virtual sound field of the other user, together with the coordinate axis unit and coordinate axis direction of that virtual sound field.
  • the virtual sound field information corresponding to the user includes, but is not limited to, the coordinate axis direction and coordinate axis unit of the user's virtual sound field and the virtual position information (that is, the coordinate value of the coordinate point) of each other user in the user's virtual sound field.
  • the one-two module 12 is configured to send the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
  • the voice information of the other users may be sent from the user equipment corresponding to the other users to the user equipment corresponding to the user via the network device, or may be sent from the user equipment corresponding to the other users to the user equipment corresponding to the user through a p2p connection established between the two user equipments.
  • from the virtual position information of another user in the user's virtual sound field and the coordinate axis direction and coordinate axis unit of the virtual sound field, the user equipment can obtain relative distance information and relative direction information of the other user relative to the user, and play the voice information according to the relative distance information and relative direction information.
  • for example, in the virtual sound field corresponding to User1, the positive direction of the X-axis is to the right of User1, the positive direction of the Y-axis is the front of User1, the unit of the X-axis and the Y-axis is 1 meter, the virtual position information corresponding to User1 is (0,0), and the virtual position information corresponding to User2 is (0,-2); it can be concluded that User2 is 2 meters directly behind User1, and User2's voice information is played according to this relative distance information and relative direction information.
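  • The conversion from a coordinate value to relative distance and direction can be sketched as follows, using the axis convention of the example (X positive to the right, Y positive to the front). Measuring the direction as an azimuth in degrees clockwise from straight ahead is a convention chosen for this sketch, and the function name is hypothetical.

```python
import math


def relative_distance_and_azimuth(position, unit_m=1.0):
    """Return (distance in meters, azimuth in degrees) of a point in a sound field.

    Azimuth convention (assumed): 0 = front (+Y), 90 = right (+X), 180 = behind.
    """
    x, y = position
    distance_m = math.hypot(x, y) * unit_m
    azimuth_deg = math.degrees(math.atan2(x, y)) % 360.0
    return distance_m, azimuth_deg


# User2 at (0, -2) in User1's sound field, with an axis unit of 1 meter.
print(relative_distance_and_azimuth((0, -2)))  # (2.0, 180.0)
```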
  • the voice information may be played according to the relative distance information and the relative direction information by filtering and delaying it through a head-related transfer function (HRTF) and then outputting it to the speaker of the user equipment for playback. In this way, when multiple other users speak at the same time in the multi-person voice, the user can clearly and accurately distinguish each person's voice, and whenever another user speaks, the user can intuitively and quickly tell which user is speaking, which provides great convenience for users in the multi-person voice.
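  • A real HRTF renderer convolves the signal with measured head-related impulse responses, which is beyond a short example. The sketch below is a much cruder stand-in, not the HRTF processing named in the text: it only derives left and right channel gains from distance (inverse-distance attenuation) and azimuth (constant-power panning). The function name, the 1 meter minimum distance and the azimuth convention (0 = front, 90 = right) are assumptions, and this simplification cannot distinguish front from back.

```python
import math


def stereo_gains(distance_m, azimuth_deg, min_distance_m=1.0):
    """Return (left_gain, right_gain) for a source at the given distance and azimuth.

    Simplified stand-in for HRTF rendering: inverse-distance attenuation
    combined with constant-power left/right panning.
    """
    attenuation = 1.0 / max(distance_m, min_distance_m)
    pan = math.sin(math.radians(azimuth_deg))  # -1 = full left, +1 = full right
    angle = (pan + 1.0) * math.pi / 4.0        # maps pan to the range 0..pi/2
    return attenuation * math.cos(angle), attenuation * math.sin(angle)


left, right = stereo_gains(2.0, 90.0)  # source 2 m away, directly to the right
print(round(left, 3), round(right, 3))  # 0.0 0.5
```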
  • determining the virtual position information of the other users among the multiple users in the virtual sound field corresponding to the target user involves a one-three module 13 (not shown), a one-four module 14 (not shown) and a one-five module 15 (not shown).
  • the one-three module 13 is used to determine the virtual scene information corresponding to the multi-person voice;
  • the one-four module 14 is used to determine the virtual position corresponding to each of the multiple users according to the virtual scene information;
  • the one-five module 15 is configured to determine the virtual position information of the other users in the virtual sound field corresponding to the target user according to the virtual position corresponding to the target user and the virtual positions corresponding to the other users.
  • the specific implementations of the one-three module 13, the one-four module 14 and the one-five module 15 are the same as or similar to the embodiments of steps S13, S14 and S15 in FIG. 1, so they are not repeated here, but are incorporated herein by reference.
  • the one-three module 13 is configured to: obtain identification information corresponding to the target virtual scene information selected from a plurality of default virtual scene information by the voice-initiated user among the plurality of users, and determine the target virtual scene information as the virtual scene information corresponding to the multi-person voice.
  • the one-three module 13 is configured to: obtain at least one target virtual scene information selected by at least one user among the plurality of users from a plurality of default virtual scene information, and determine the virtual scene information corresponding to the multi-person voice from the at least one target virtual scene information, wherein the determined virtual scene information is the one selected the most times.
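  • Choosing the scene that was selected the most times amounts to a simple vote count, sketched below. The function name `most_selected_scene` and the scene identifiers are hypothetical, and the source does not say how ties are resolved; in this sketch a tie goes to whichever scene was counted first.

```python
from collections import Counter


def most_selected_scene(selections):
    """Return the scene chosen by the most users.

    selections: dict user_id -> scene_id selected by that user (must be non-empty).
    """
    counts = Counter(selections.values())
    scene_id, _ = counts.most_common(1)[0]
    return scene_id


votes = {"User1": "classroom", "User2": "meeting room", "User3": "classroom"}
print(most_selected_scene(votes))  # classroom
```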
  • the related operations are the same as or similar to the embodiment shown in FIG. 1 , so they are not repeated here, but are incorporated herein by reference.
  • the one-three module 13 is configured to: determine, according to the voice theme information corresponding to the multi-person voice, target default virtual scene information matching the voice theme information from a plurality of default virtual scene information, and determine the target default virtual scene information as the virtual scene information corresponding to the multi-person voice.
  • the related operations are the same as or similar to the embodiment shown in FIG. 1 , so they are not repeated here, but are incorporated herein by reference.
  • the one-three module 13 includes a one-six module 16 (not shown).
  • the one-six module 16 is configured to determine, according to the user information corresponding to the plurality of users, target default virtual scene information matching the user information from a plurality of default virtual scene information, and determine the target default virtual scene information as the virtual scene information corresponding to the multi-person voice.
  • the specific implementation of the one-six module 16 is the same as or similar to the embodiment of step S16 in FIG. 1, so it is not repeated here, but is incorporated herein by reference.
  • the determining, according to the user information corresponding to the multiple users, of the target default virtual scene information matching the user information from the multiple default virtual scene information includes: the network device determines, from the plurality of default virtual scene information, target default virtual scene information that matches the user information.
  • the determining, according to the user information corresponding to the multiple users, of the target default virtual scene information matching the user information from the multiple default virtual scene information includes: according to the user information corresponding to each of the plurality of users, the network device determines, from the plurality of default virtual scene information, at least one default virtual scene information matching the user information corresponding to each user, and determines the target default virtual scene information from the at least one default virtual scene information, wherein the target default virtual scene information is the one matched by the largest number of users.
  • the related operations are the same as or similar to the embodiment shown in FIG. 1 , so they are not repeated here, but are incorporated herein by reference.
  • the virtual scene information includes a plurality of predetermined virtual positions; wherein the one-four module 14 is configured to obtain, for each user of the plurality of users, the target predetermined virtual position corresponding to the user among the plurality of predetermined virtual positions, and determine the target predetermined virtual position as the virtual position of the user in the virtual scene information.
  • the related operations are the same as or similar to the embodiment shown in FIG. 1 , so they are not repeated here, but are incorporated herein by reference.
  • the obtaining, for each of the plurality of users, of the target predetermined virtual position corresponding to the user among the plurality of predetermined virtual positions includes: for each of the plurality of users, obtaining the target predetermined virtual position that the voice-initiated user among the plurality of users has designated for that user among the plurality of predetermined virtual positions.
  • the related operations are the same as or similar to the embodiment shown in FIG. 1 , so they are not repeated here, but are incorporated herein by reference.
  • the obtaining, for each of the plurality of users, of the target predetermined virtual position corresponding to the user among the plurality of predetermined virtual positions includes: for each of the plurality of users, determining, according to the user information corresponding to the user, a target predetermined virtual position among the plurality of predetermined virtual positions, wherein the tag information of the target predetermined virtual position in the virtual scene information matches the user information corresponding to the user.
  • the related operations are the same as or similar to the embodiment shown in FIG. 1 , so they are not repeated here, but are incorporated herein by reference.
  • obtaining the target predetermined virtual position corresponding to the user among the plurality of predetermined virtual positions involves a one-seven module 17 (not shown), a one-eight module 18 (not shown) and a one-nine module 19 (not shown).
  • the one-seven module 17 is configured to generate virtual location request information and send it to each of the multiple users, wherein the virtual location request information includes the virtual scene information; the one-eight module 18 is configured to receive feedback information about the virtual location request information sent by at least one of the multiple users, wherein the feedback information sent by each of the at least one user indicates the target predetermined virtual position that the user selected from the multiple predetermined virtual positions; the one-nine module 19 is configured to determine, for each user in the plurality of users, according to the feedback information, the target predetermined virtual position corresponding to the user among the plurality of predetermined virtual positions.
  • the specific implementations of the one-seven module 17, the one-eight module 18 and the one-nine module 19 are the same as or similar to the embodiments of steps S17, S18 and S19 in FIG. 1, so they are not repeated here, but are incorporated herein by reference.
  • the device is further configured to: after receiving the feedback information sent by the first user among the multiple users, generate first prompt information corresponding to the feedback information, and send the first prompt information to the other users among the plurality of users who have not yet given feedback, to prompt that the first target predetermined virtual position indicated by the feedback information is not selectable.
  • the related operations are the same as or similar to the embodiment shown in FIG. 1 , so they are not repeated here, but are incorporated herein by reference.
  • the device is further configured to: after receiving the feedback information sent by the first user among the multiple users, generate second prompt information corresponding to the feedback information, and send the second prompt information to the users in the plurality of users other than the first user, so as to prompt that the first target predetermined virtual position indicated by the feedback information has been selected by the first user.
  • the related operations are the same as or similar to the embodiment shown in FIG. 1 , so they are not repeated here, but are incorporated herein by reference.
  • the device is further configured to: receive invitation request information sent by a second user of the at least one user, wherein the second user has selected a second target predetermined virtual position among the plurality of predetermined virtual positions, and the invitation request information is used to invite a third user among the multiple users, who has not yet given feedback, to select a predetermined virtual position near the second target predetermined virtual position; and send the invitation request information to the third user, to prompt the third user to select an unselected predetermined virtual position near the second target predetermined virtual position as the target predetermined virtual position corresponding to the third user.
  • the related operations are the same as or similar to the embodiment shown in FIG. 1 , so they are not repeated here, but are incorporated herein by reference.
  • the device is further configured to: after a predetermined feedback period corresponding to the virtual location request information is reached, determine, for each user among the plurality of users who has not yet given feedback, a target predetermined virtual position corresponding to the user from among at least one currently unselected predetermined virtual position of the plurality of predetermined virtual positions, and determine the target predetermined virtual position as the virtual position of the user in the virtual scene information.
  • determining the target predetermined virtual position corresponding to the user from among at least one currently unselected predetermined virtual position of the plurality of predetermined virtual positions includes: determining hotspot location area information in the virtual scene information according to the virtual positions, in the virtual scene information, of the at least one user among the multiple users who has already given feedback; and, for each user among the multiple users who has not yet given feedback, determining an unselected predetermined virtual position in the hotspot location area information as the virtual position of the user in the virtual scene information.
  • the related operations are the same as or similar to the embodiment shown in FIG. 1 , so they are not repeated here, but are incorporated herein by reference.
  • determining the target predetermined virtual position corresponding to the user from among at least one currently unselected predetermined virtual position of the plurality of predetermined virtual positions includes: for each user among the plurality of users who has not yet given feedback, determining a target predetermined virtual position corresponding to the user from among the at least one currently unselected predetermined virtual position, wherein the tag information of the target predetermined virtual position in the virtual scene information matches the user information corresponding to the user.
  • the related operations are the same as or similar to the embodiment shown in FIG. 1 , so they are not repeated here, but are incorporated herein by reference.
  • FIG. 3 illustrates an exemplary system that may be used to implement various embodiments described in this application.
  • system 300 can function as any of the devices in each of the described embodiments.
  • system 300 may include one or more computer-readable media (e.g., system memory or NVM/storage device 320) having instructions, and one or more processors (e.g., processor(s) 305) coupled to the one or more computer-readable media and configured to execute the instructions to implement modules that perform the actions described herein.
  • system control module 310 may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) 305 and/or to any suitable device or component in communication with the system control module 310.
  • the system control module 310 may include a memory controller module 330 to provide an interface to the system memory 315 .
  • the memory controller module 330 may be a hardware module, a software module, and/or a firmware module.
  • System memory 315 may be used, for example, to load and store data and/or instructions for system 300 .
  • system memory 315 may include any suitable volatile memory, eg, suitable DRAM.
  • system memory 315 may include double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
  • system control module 310 may include one or more input/output (I/O) controllers to provide interfaces to NVM/storage device 320 and communication interface(s) 325 .
  • NVM/storage device 320 may be used to store data and/or instructions.
  • NVM/storage device 320 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more hard disk drives (HDD), one or more compact disc (CD) drives and/or one or more digital versatile disc (DVD) drives).
  • NVM/storage device 320 may include storage resources that are physically part of the device on which system 300 is installed, or it may be accessed by the device without necessarily being part of the device.
  • the NVM/storage device 320 is accessible via the communication interface(s) 325 over a network.
  • Communication interface(s) 325 may provide an interface for system 300 to communicate over one or more networks and/or with any other suitable device.
  • System 300 may wirelessly communicate with one or more components of a wireless network in accordance with any of one or more wireless network standards and/or protocols.
  • At least one of the processor(s) 305 may be packaged with the logic of one or more controllers of the system control module 310 (eg, the memory controller module 330 ). For one embodiment, at least one of the processor(s) 305 may be packaged with logic of one or more controllers of the system control module 310 to form a system-in-package (SiP). For one embodiment, at least one of the processor(s) 305 may be integrated on the same die with the logic of one or more controllers of the system control module 310 . For one embodiment, at least one of the processor(s) 305 may be integrated on the same die with logic of one or more controllers of the system control module 310 to form a system on a chip (SoC).
  • system 300 may be, but is not limited to, a server, workstation, desktop computing device, or mobile computing device (e.g., a laptop computing device, handheld computing device, tablet computer, or netbook). In various embodiments, system 300 may have more or fewer components and/or different architectures. For example, in some embodiments, system 300 includes one or more cameras, keyboards, liquid crystal display (LCD) screens (including touchscreen displays), non-volatile memory ports, multiple antennas, graphics chips, application specific integrated circuits (ASIC), and speakers.
  • the present application also provides a computer-readable storage medium, where the computer-readable storage medium stores computer code, and when the computer code is executed, the method described in any preceding item is executed.
  • the present application also provides a computer program product, when the computer program product is executed by a computer device, the method according to any one of the preceding items is executed.
  • the present application also provides a computer device, the computer device comprising:
  • one or more processors;
  • a memory for storing one or more computer programs;
  • the one or more computer programs when executed by the one or more processors, cause the one or more processors to implement the method of any preceding item.
  • the present application may be implemented in software and/or a combination of software and hardware, eg, an application specific integrated circuit (ASIC), a general purpose computer, or any other similar hardware device.
  • the software program of the present application may be executed by a processor to implement the steps or functions described above.
  • the software programs of the present application (including associated data structures) may be stored on a computer-readable recording medium, such as RAM memory, magnetic or optical drives or floppy disks, and the like.
  • some steps or functions of the present application may be implemented in hardware, for example, as a circuit that cooperates with a processor to perform various steps or functions.
  • a part of the present application can be applied as a computer program product, such as computer program instructions, which when executed by a computer, through the operation of the computer, can invoke or provide methods and/or technical solutions according to the present application.
  • Those skilled in the art should understand that the existing forms of computer program instructions in computer-readable media include but are not limited to source files, executable files, installation package files, etc.
  • the ways in which computer program instructions are executed by a computer include, but are not limited to: the computer directly executes the instructions, or the computer compiles the instructions and then executes the corresponding compiled program, or the computer reads and executes the instructions, or the computer reads and installs the instructions and then executes the corresponding installed program.
  • the computer-readable medium can be any available computer-readable storage medium or communication medium that can be accessed by a computer.
  • Communication media includes media by which communication signals containing, for example, computer readable instructions, data structures, program modules or other data are transmitted from one system to another.
  • Communication media may include conducted transmission media, such as cables and wires (e.g., fiber optic, coaxial), and wireless (unconducted transmission) media capable of propagating energy waves, such as acoustic, electromagnetic, RF, microwave, and infrared.
  • Computer readable instructions, data structures, program modules or other data may be embodied, for example, as a modulated data signal in a wireless medium such as a carrier wave or similar mechanism such as embodied as part of spread spectrum technology.
  • modulated data signal refers to a signal whose one or more characteristics are altered or set in a manner that encodes information in the signal. Modulation can be analog, digital or hybrid modulation techniques.
  • computer-readable storage media may include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data.
  • computer-readable storage media include, but are not limited to, volatile memory, such as random access memory (RAM, DRAM, SRAM); non-volatile memory, such as flash memory, various read-only memories (ROM, PROM, EPROM, EEPROM), and magnetic and ferromagnetic/ferroelectric memories (MRAM, FeRAM); magnetic and optical storage devices (hard disks, tapes, CDs, DVDs); and other media, now known or later developed, capable of storing computer-readable information/data for use by computer systems.
  • an embodiment according to the present application includes an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the apparatus is triggered to operate based on the aforementioned methods and/or technical solutions according to various embodiments of the present application.

Abstract

The aim of the present application is to provide a method and device for broadcasting voice information in a multi-user voice call. The method comprises: for a target user among a plurality of users involved in a multi-user voice call, determining virtual position information of the other users in a virtual sound field corresponding to the target user, and generating, according to the virtual position information, virtual sound field information corresponding to the target user; and sending the virtual sound field information to a user device corresponding to the target user, such that the user device broadcasts the voice information of each of the other users according to that user's virtual position information in the virtual sound field. By means of the present application, each user can clearly and accurately distinguish the voice of every other user in a multi-user voice call and can intuitively and quickly know which user is currently speaking, which is a great convenience for the users involved in the call.

Description

A method and device for playing voice information in a multi-person voice call
This application is based on, and claims priority to, CN application No. 202011049085.4, filed on September 29, 2020, the disclosure of which is hereby incorporated into this application in its entirety.
Technical Field
The present application relates to the field of communications, and in particular to a technology for playing voice information in a multi-person voice call.
Background
With the development of the times, voice communication has become one of the most popular and common means of communication. In the prior art, multi-person voice communication means that multiple users use clients on terminal devices such as mobile phones and PCs to communicate by voice over a network in real time. In a common multi-person voice communication scheme, each client receives real-time voice information from multiple other clients, mixes the received real-time voice information locally, and plays the resulting locally mixed voice information.
Summary of the Invention
An object of the present application is to provide a method and device for playing voice information in a multi-person voice call.
According to one aspect of the present application, a method for playing voice information in a multi-person voice call, applied at a network device, is provided, the method comprising:
for a target user among multiple users participating in a multi-person voice call, determining virtual position information of the other users among the multiple users in a virtual sound field corresponding to the target user, and generating, according to the virtual position information, virtual sound field information corresponding to the target user;
sending the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
According to one aspect of the present application, a network device for playing voice information in a multi-person voice call is provided, the device comprising:
a one-one module, configured to, for a target user among multiple users participating in a multi-person voice call, determine virtual position information of the other users among the multiple users in a virtual sound field corresponding to the target user, and generate, according to the virtual position information, virtual sound field information corresponding to the target user;
a one-two module, configured to send the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
According to one aspect of the present application, a device for playing voice information in a multi-person voice call is provided, the device comprising:
a processor; and
a memory arranged to store computer-executable instructions which, when executed, cause the processor to:
for a target user among multiple users participating in a multi-person voice call, determine virtual position information of the other users among the multiple users in a virtual sound field corresponding to the target user, and generate, according to the virtual position information, virtual sound field information corresponding to the target user;
send the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
According to one aspect of the present application, a computer-readable medium storing instructions is provided, the instructions, when executed, causing a system to:
for a target user among multiple users participating in a multi-person voice call, determine virtual position information of the other users among the multiple users in a virtual sound field corresponding to the target user, and generate, according to the virtual position information, virtual sound field information corresponding to the target user;
send the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
Compared with the prior art, the present application can, for each of the multiple users participating in the voice call, determine the virtual position information of the other users in the virtual sound field corresponding to that user, and then play the other users' voice information according to that virtual position information. Each user can thus clearly and accurately distinguish each person's voice in a multi-person voice call and can intuitively and quickly know which other user is currently speaking, which is a great convenience for users in a multi-person voice call.
Description of the Drawings
Other features, objects and advantages of the present application will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings:
FIG. 1 shows a flowchart of a method, applied at a network device, for playing voice information in a multi-person voice call according to an embodiment of the present application;
FIG. 2 shows a structural diagram of a network device for playing voice information in a multi-person voice call according to an embodiment of the present application;
FIG. 3 shows an exemplary system that may be used to implement the various embodiments described in the present application.
The same or similar reference numerals in the drawings denote the same or similar components.
Detailed Description
The present application is described in further detail below with reference to the accompanying drawings.
In a typical configuration of the present application, the terminal, the device of the service network, and the trusted party each include one or more processors (for example, a central processing unit (CPU)), an input/output interface, a network interface, and memory.
Memory may include non-persistent memory in computer-readable media, in the form of random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory. Memory is an example of a computer-readable medium.
Computer-readable media include persistent and non-persistent, removable and non-removable media, which may implement information storage by any method or technology. Information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PCM), programmable random access memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape and disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
The devices referred to in the present application include, but are not limited to, user equipment, network devices, or devices formed by integrating user equipment and network devices through a network. The user equipment includes, but is not limited to, any mobile electronic product capable of human-computer interaction with a user (for example, via a touchpad), such as a smartphone or tablet computer, and the mobile electronic product may run any operating system, such as Android or iOS. The network device includes an electronic device capable of automatically performing numerical calculation and information processing according to preset or stored instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASIC), programmable logic devices (PLD), field-programmable gate arrays (FPGA), digital signal processors (DSP), embedded devices, and the like. The network device includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud formed by multiple servers; here, the cloud is formed by a large number of computers or network servers based on cloud computing, where cloud computing is a type of distributed computing: a virtual supercomputer consisting of a group of loosely coupled computers. The network includes, but is not limited to, the Internet, wide area networks, metropolitan area networks, local area networks, VPNs, wireless ad hoc networks, and the like. Preferably, the device may also be a program running on the user equipment, the network device, a touch terminal, or a device formed by integrating user equipment and a network device, or a network device and a touch terminal, through a network.
Of course, those skilled in the art should understand that the above devices are only examples; other existing or future devices, if applicable to the present application, should also fall within its scope of protection and are incorporated herein by reference.
In the description of the present application, "a plurality of" means two or more, unless otherwise expressly and specifically defined.
FIG. 1 shows a flowchart of a method, applied at a network device, for playing voice information in a multi-person voice call according to an embodiment of the present application. The method includes step S11 and step S12. In step S11, the network device, for a target user among multiple users participating in a multi-person voice call, determines virtual position information of the other users among the multiple users in a virtual sound field corresponding to the target user, and generates, according to the virtual position information, virtual sound field information corresponding to the target user; in step S12, the network device sends the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
In step S11, the network device, for a target user among multiple users participating in a multi-person voice call, determines virtual position information of the other users among the multiple users in a virtual sound field corresponding to the target user, and generates, according to the virtual position information, virtual sound field information corresponding to the target user. In some embodiments, the target user is each of the multiple users participating in the multi-person voice call. In some embodiments, the virtual sound field is a relative coordinate system, which may be a two-dimensional plane coordinate system or a three-dimensional space coordinate system. Each user has his or her own virtual sound field; a virtual position is the coordinate point corresponding to another user in that user's virtual sound field, and the virtual position information is the coordinate value of that point. The user's own virtual position in his or her own virtual sound field is the coordinate origin. For example, in the virtual sound field of User1, the virtual position information of User1 is (0,0) and that of User2 is (0,1); in the virtual sound field of User2, the virtual position information of User1 is (0,-1) and that of User2 is (0,0). In some embodiments, the axis unit of a user's virtual sound field is a predetermined distance interval, for example, 1 centimeter, 10 centimeters, or 1 meter, and each axis direction is a predetermined direction relative to the user; for example, the positive X direction is to the user's right and the positive Y direction is in front of the user. In some embodiments, the relative distance information and relative direction information between two users can be obtained from the virtual position information of one user in the other user's virtual sound field together with the axis unit and axis directions of that virtual sound field. For example, in the virtual sound field of User1, the positive X direction is to the right of User1, the positive Y direction is in front of User1, the unit of both axes is 1 meter, the virtual position information of User1 is (0,0), and the virtual position information of User2 is (1,0); it follows that User2 is 1 meter directly to the right of User1. In some embodiments, for each user in the multi-person voice call, the virtual sound field information corresponding to that user includes, but is not limited to, the axis directions and axis unit of the user's virtual sound field and the virtual position information (that is, the coordinate values of the coordinate points) of each other user in that virtual sound field.
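The coordinate conventions described for step S11 can be illustrated with a minimal Python sketch. The `VirtualSoundField` class, its field names, and the `relative` method are hypothetical names introduced here for illustration only; the application does not prescribe a data format. The sketch assumes a two-dimensional sound field in which the positive X direction is the owner's right and the positive Y direction is the owner's front.

```python
import math
from dataclasses import dataclass


@dataclass
class VirtualSoundField:
    """Hypothetical container for one user's virtual sound field information."""
    owner: str
    unit_m: float    # length of one axis unit, in meters
    positions: dict  # other user -> (x, y); +X is the owner's right, +Y the owner's front

    def relative(self, other):
        """Return (distance in meters, bearing in degrees clockwise from straight ahead)."""
        x, y = self.positions[other]
        distance = math.hypot(x, y) * self.unit_m
        # atan2(x, y) measures the angle from the +Y (front) axis towards +X (right).
        bearing = math.degrees(math.atan2(x, y)) % 360.0
        return distance, bearing


field = VirtualSoundField(owner="User1", unit_m=1.0,
                          positions={"User2": (0, 1), "User3": (1, 0)})
print(field.relative("User2"))  # a user on the +Y axis is straight ahead
print(field.relative("User3"))  # a user on the +X axis is to the right
```

Under these assumptions a bearing of 0 degrees means straight ahead, 90 degrees means to the right, and 180 degrees means directly behind the owner of the sound field.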
In step S12, the network device sends the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field. In some embodiments, for each user in the multi-person voice call, the voice information of another user may be sent from that other user's user equipment to this user's user equipment via the network device, or it may be sent over a p2p connection established between the two user equipments. In some embodiments, for each user in the multi-person voice call, when voice information sent by another user is received, the relative distance information and relative direction information of that other user with respect to this user can be obtained from the other user's virtual position information in this user's virtual sound field together with the axis directions and axis unit of that virtual sound field, and the voice information is played according to the relative distance information and relative direction information. For example, in the virtual sound field of User1, the positive X direction is to the right of User1, the positive Y direction is in front of User1, the unit of both axes is 1 meter, the virtual position information of User1 is (0,0), and the virtual position information of User2 is (0,-2); it follows that User2 is 2 meters directly behind User1, and the voice information is played according to this relative distance information and relative direction information. In some embodiments, the voice information may be played according to the relative distance information and relative direction information by processing it (filtering, delay, and so on) with a head-related transfer function (HRTF) and then outputting it to the speaker of the user equipment. In a multi-person voice call, this allows a user to clearly and accurately distinguish each person's voice when multiple other users speak at the same time, and to intuitively and quickly know which other user is currently speaking, which is a great convenience for users in a multi-person voice call.
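As a rough illustration of position-dependent playback, the sketch below derives left and right channel gains from a source's coordinates. It is not an HRTF: it substitutes inverse-distance attenuation and a constant-power stereo pan, which is a much simpler stand-in chosen only so the example stays self-contained. The function names and the `min_distance_m` clamp are assumptions made here, not part of the application.

```python
import math


def stereo_gains(x, y, unit_m=1.0, min_distance_m=1.0):
    """Return (left_gain, right_gain) for a source at (x, y) in the listener's field.

    Assumes +X is the listener's right and +Y is the listener's front.
    """
    distance = math.hypot(x, y) * unit_m
    # Inverse-distance attenuation, clamped so nearby sources are not amplified.
    attenuation = 1.0 / max(distance, min_distance_m)
    # Lateral component in [-1, 1]: -1 is fully left, +1 is fully right.
    pan = 0.0 if distance == 0 else (x * unit_m) / distance
    # Constant-power pan law: left^2 + right^2 == 1 before attenuation.
    theta = (pan + 1.0) * math.pi / 4.0
    return attenuation * math.cos(theta), attenuation * math.sin(theta)


def render(samples, x, y):
    """Scale mono samples into a list of (left, right) stereo frames."""
    left, right = stereo_gains(x, y)
    return [(s * left, s * right) for s in samples]


# User2 at (0, -2): centered between the channels and attenuated by distance.
print(stereo_gains(0, -2))
# A source at (1, 0): almost entirely in the right channel.
print(stereo_gains(1, 0))
```

A plain pan cannot distinguish a source in front of the listener from one behind, since both end up centered. Resolving that ambiguity is one reason a real implementation would use the HRTF-based filtering and delay that the description mentions.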
在一些实施例中,所述对于参与多人语音的多个用户中的目标用户,确定所述多个用户中的其他用户在所述目标用户对应的虚拟声场中的虚拟位置信息,包括步骤S13(未示出)、步骤S14(未示出)和步骤S15(未示出)。在步骤S13中,网络设备确定所述多人语音对应的虚拟场景信息;在步骤S14中,网络设备根据所述虚拟场景信息,确定所述多个用户中的每个用户对应的虚拟位置;在步骤S15中,网络设备根据所述目标用户对应的虚拟位置以及所述其他用户对应的虚拟位置,确定所述其他用户在所述目标用户对应的虚拟声场中的虚拟位置信息。在一些实施例中,虚拟场景可以是一个虚拟的二维场景,也可以是一个虚拟的三维场景,例如,虚拟会议室、虚拟教室等。在一些实施例中,虚拟场景信息包括但不限于虚拟场景的可视化信息、虚拟场景的配置信息等,虚拟场景的可视化信息用于通过2D场景图像或3D场景模型的方式直观地向用户呈现虚拟场景,从而可以使得用户可以在虚拟场景中确定自己或别人在该虚拟场景中的虚拟位置,或者,浏览自己或别人在该虚拟场景中的虚拟位置,虚拟场景的配置信息用于根据虚拟场景中两个用户的虚拟位置,得到该两个用户之间的相对距离信息及相对方向信息,并确定该两个用户在各自对应的虚拟声场中的虚拟位置信息。在一些实施例中,虚拟场景可以是语音发起用户在多个默认虚拟场景中选择的,还可以是根据多个用户中的至少一个用户在多个默认虚拟场景中选择的至少一个目标虚拟场景,然后从至少一个目标虚拟场景中将被用户选择次数最多的目标虚拟场景确定为多个语音对应的虚拟场景,或 者,虚拟场景还可以是根据多人语音的语音主题信息,自动在多个默认虚拟场景中确定的与该语音主题信息相匹配的目标默认虚拟场景。在一些实施例中,每个用户在虚拟场景中的虚拟位置可以是由语音发起用户为每个用户确定的,或者,也可以是由每个用户各自确定的,或者,还可以是根据每个用户各自对应的用户信息,在多个预定虚拟位置中确定每个用户在虚拟场景中对应的虚拟位置,其中,每个用户的虚拟位置在虚拟场景(例如,虚拟教室)中的标签信息(例如,“讲台”)与该用户对应的用户信息(例如,“语文老师”)相匹配。在一些实施例中,网络设备在确定完多人语音对应的虚拟场景后,将该虚拟场景对应的虚拟场景信息发送给多个语音中的每个用户,然后根据虚拟场景信息中的可视化信息,通过2D场景图像或3D场景模型的方式直观地向每个用户呈现该虚拟场景,然后每个用户在该2D场景图像或该3D场景模型中通过预定的交互操作(例如,点击)来确定各自在该虚拟场景中的虚拟位置,或者,只将该虚拟场景对应的虚拟场景信息发送给语音发起用户,由语音发起用户来确定多人语音中的每个用户在该虚拟场景中的虚拟位置。在一些实施例中,对于每个用户,根据该用户在虚拟场景中对应的虚拟位置与每个其他用户在虚拟场景中对应的虚拟位置,以及该虚拟场景对应的配置信息,可以得到该用户与每个其他用户之间的相对距离信息及相对方向信息,并确定每个其他用户在该用户对应的虚拟声场中的虚拟位置信息。在一些实施例中,网络设备会将每个用户在虚拟场景中的虚拟位置发送给每个用户并在对应的用户设备上进行呈现,从而使得每个用户可以知悉自己与每个其他用户在虚拟场景中的虚拟位置。在一些实施例中,每个用户设备在虚拟场景信息对应的2D场景图像或3D场景模型中直观地向每个用户呈现自己与每个其他用户在虚拟场景中对应的虚拟位置,从而使得每个用户可以直观快速地知悉虚拟场景中其他用户相对于自己的相对距离及相对方向,例如,用户设备可以在2D场景图像或3D场景模型中的每个虚拟位置处呈现对应的用户的标识信息(例如,用户名称、用户ID等)。In some embodiments, for the target user among the multiple users participating in the multi-person voice, determining the virtual position information of other users among the multiple users in the virtual sound field corresponding to the target user includes step S13 (not shown), step S14 (not shown) and step S15 (not shown). 
In step S13, the network device determines the virtual scene information corresponding to the multi-person voice; in step S14, the network device determines the virtual location corresponding to each of the multiple users according to the virtual scene information; In step S15, the network device determines virtual position information of the other users in the virtual sound field corresponding to the target user according to the virtual position corresponding to the target user and the virtual positions corresponding to the other users. In some embodiments, the virtual scene may be a virtual two-dimensional scene or a virtual three-dimensional scene, for example, a virtual conference room, a virtual classroom, and the like. In some embodiments, the virtual scene information includes, but is not limited to, visualization information of the virtual scene, configuration information of the virtual scene, etc. The visualization information of the virtual scene is used to intuitively present the virtual scene to the user by means of a 2D scene image or a 3D scene model. , so that the user can determine the virtual position of himself or others in the virtual scene, or browse the virtual position of himself or others in the virtual scene, and the configuration information of the virtual scene is used to The virtual positions of the two users are obtained, the relative distance information and relative direction information between the two users are obtained, and the virtual position information of the two users in the corresponding virtual sound fields is determined. 
In some embodiments, the virtual scene may be selected by a voice-initiated user among multiple default virtual scenes, or may be at least one target virtual scene selected from multiple default virtual scenes by at least one user among the multiple users, Then, from at least one target virtual scene, the target virtual scene that has been selected the most times by the user is determined as the virtual scene corresponding to multiple voices. The target default virtual scene determined in the scene that matches the speech topic information. In some embodiments, the virtual position of each user in the virtual scene may be determined for each user by the voice initiating user, or may be determined by each user individually, or may also be determined according to each user The user information corresponding to each user, the virtual position corresponding to each user in the virtual scene is determined in a plurality of predetermined virtual positions, wherein the tag information (for example, the virtual classroom) of the virtual position of each user in the virtual scene (for example, the virtual classroom) , "Podium") matches the user's corresponding user information (eg, "language teacher"). In some embodiments, after determining the virtual scene corresponding to the multiple voices, the network device sends the virtual scene information corresponding to the virtual scene to each user in the multiple voices, and then, according to the visualization information in the virtual scene information, Intuitively present the virtual scene to each user by means of a 2D scene image or a 3D scene model, and then each user determines the location of their respective The virtual position in the virtual scene, or only the virtual scene information corresponding to the virtual scene is sent to the voice initiating user, and the voice initiating user determines the virtual position of each user in the multi-person voice in the virtual scene. 
In some embodiments, for each user, the relative distance information and relative direction information between that user and each other user can be obtained according to the user's virtual position in the virtual scene, each other user's virtual position in the virtual scene, and the configuration information corresponding to the virtual scene, and the virtual position information of each other user in the virtual sound field corresponding to that user is determined accordingly. In some embodiments, the network device sends the virtual position of each user in the virtual scene to every user for presentation on the corresponding user equipment, so that each user knows the virtual positions of himself and of each other user in the virtual scene. In some embodiments, each user equipment presents, in the 2D scene image or 3D scene model corresponding to the virtual scene information, the virtual positions of its own user and of each other user in the virtual scene, so that each user can see quickly and intuitively the relative distance and relative direction of the other users with respect to himself. For example, the user equipment can present the corresponding user's identification information (for example, user name or user ID) at each virtual position in the 2D scene image or 3D scene model.
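The relative distance and relative direction between two users can be illustrated with a minimal sketch, assuming a 2D virtual scene in which each virtual position is an (x, y) pair. The function name `relative_position`, the `listener_facing_deg` parameter, and the `scale` factor (standing in for the scene's configuration information) are hypothetical and do not come from the description.

```python
import math

def relative_position(listener_xy, speaker_xy, listener_facing_deg=0.0, scale=1.0):
    """Return (distance, azimuth in degrees) of a speaker as heard by a listener.

    Assumptions (not from the source): positions are 2D scene coordinates,
    `scale` converts scene units to distance units, and the azimuth is measured
    counter-clockwise from the direction the listener is facing, in [-180, 180).
    """
    dx = (speaker_xy[0] - listener_xy[0]) * scale
    dy = (speaker_xy[1] - listener_xy[1]) * scale
    distance = math.hypot(dx, dy)
    # Absolute bearing of the speaker in the scene's coordinate frame.
    bearing_deg = math.degrees(math.atan2(dy, dx))
    # Rotate into the listener's frame and wrap into [-180, 180).
    azimuth_deg = (bearing_deg - listener_facing_deg + 180.0) % 360.0 - 180.0
    return distance, azimuth_deg
```

Calling the function once per other user yields, for a given target user, one distance and one direction per other user, which is the kind of input a virtual sound field needs.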
In some embodiments, step S13 includes: the network device obtains identification information corresponding to the target virtual scene information that the voice-initiating user among the multiple users selected from multiple pieces of default virtual scene information, and determines the target virtual scene information as the virtual scene information corresponding to the multi-person voice. In some embodiments, the voice-initiating user selects a target virtual scene from multiple default virtual scenes and sends the identification information (for example, scene name or scene ID) corresponding to the target virtual scene to the network device. For example, the multiple default virtual scenes include virtual conference room 1, virtual conference room 2, virtual classroom 1, and virtual classroom 2; the voice-initiating user selects virtual conference room 1 as the target virtual scene and sends the corresponding identification information "virtual conference room 1" to the network device.
In some embodiments, step S13 includes: the network device obtains at least one piece of target virtual scene information selected by at least one of the multiple users from multiple pieces of default virtual scene information, and determines, from the at least one piece of target virtual scene information, the virtual scene information corresponding to the multi-person voice, where the determined virtual scene information is the one selected the most times. In some embodiments, each user can select one or more target virtual scenes from the multiple default virtual scenes and send the corresponding identification information to the network device; the network device then determines the target virtual scene selected by users the most times as the virtual scene corresponding to the multi-person voice. Preferably, each user can select only one target virtual scene from the multiple default virtual scenes.
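The most-selected rule can be sketched as a simple vote count. This is a minimal illustration under stated assumptions: the function name `pick_scene_by_votes` and the input shape are hypothetical, each user's choice of a given scene counts once, and ties (which the description does not address) are broken by the smallest scene identifier.

```python
from collections import Counter

def pick_scene_by_votes(selections):
    """Return the scene ID chosen by the most users, or None if nobody chose.

    `selections` maps a user ID to the list of scene IDs that user selected.
    Tie-breaking by smallest scene ID is an assumption made for determinism.
    """
    votes = Counter()
    for scene_ids in selections.values():
        # A user counts at most once per scene, even if the ID is repeated.
        votes.update(set(scene_ids))
    if not votes:
        return None
    return min(votes, key=lambda scene: (-votes[scene], scene))
```

With the preferred restriction of one scene per user, each list in `selections` simply has a single element.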
In some embodiments, step S13 includes: the network device determines, according to the voice topic information corresponding to the multi-person voice, target default virtual scene information that matches the voice topic information from multiple pieces of default virtual scene information, and determines the target default virtual scene information as the virtual scene information corresponding to the multi-person voice. In some embodiments, the voice topic information may be entered by the voice-initiating user and then sent to the network device, or the voice-initiating user may select it from multiple preset default voice topics and send the corresponding identification information (for example, topic name or topic ID) to the network device. The voice topic information characterizes the topic of the current multi-person voice and includes, but is not limited to, "meeting", "class meeting", and "technology sharing". In some embodiments, according to the voice topic information corresponding to the multi-person voice, the default virtual scene that matches the voice topic information is determined, among the multiple default virtual scenes, as the virtual scene corresponding to the multi-person voice. For example, the multiple default virtual scenes include a virtual conference room, a virtual classroom, and a virtual coffee shop; according to the voice topic information "meeting", the default virtual scene "virtual conference room", which matches that topic, is determined as the virtual scene corresponding to the multi-person voice.
In some embodiments, step S13 includes step S16 (not shown). In step S16, the network device determines, according to the user information corresponding to the multiple users, target default virtual scene information that matches the user information from multiple pieces of default virtual scene information, and determines the target default virtual scene information as the virtual scene information corresponding to the multi-person voice. In some embodiments, according to the user information corresponding to each of the multiple users, or according to the user information corresponding to the voice-initiating user among the multiple users, the default virtual scene that matches that user information is determined, among the multiple default virtual scenes, as the virtual scene corresponding to the multi-person voice.
In some embodiments, determining, according to the user information corresponding to the multiple users, target default virtual scene information that matches the user information from multiple pieces of default virtual scene information includes: the network device determines, according to the user information corresponding to the voice-initiating user among the multiple users, target default virtual scene information that matches that user information from the multiple pieces of default virtual scene information. For example, the multiple default virtual scenes include a virtual conference room, a virtual classroom, and a virtual coffee shop, and the user information corresponding to the voice-initiating user includes "occupation: teacher"; the default virtual scene "virtual classroom", which matches the user information "occupation: teacher", can then be determined as the virtual scene corresponding to the multi-person voice.
In some embodiments, determining, according to the user information corresponding to the multiple users, target default virtual scene information that matches the user information from multiple pieces of default virtual scene information includes: the network device determines, according to the user information corresponding to each of the multiple users, at least one piece of default virtual scene information that matches the user information of each of the multiple users, and determines target default virtual scene information from the at least one piece of default virtual scene information, where the target default virtual scene information is the one matched by the largest number of users. In some embodiments, for each of the multiple users, the default virtual scene that matches the user's user information is determined from the multiple default virtual scenes. In some embodiments, among the at least one default virtual scene matched in this way, the default virtual scene matched by the largest number of users is determined as the virtual scene corresponding to the multi-person voice. For example, the multiple users in the multi-person voice include User1, User2, and User3, and the multiple default virtual scenes include a virtual conference room, a virtual classroom, and a virtual coffee shop. The user information of User1 includes "occupation: teacher", and the default virtual scene matching User1 is the virtual classroom; the user information of User2 includes "occupation: student", and the default virtual scene matching User2 is also the virtual classroom; the user information of User3 includes "hobby: drinking coffee", and the default virtual scene matching User3 is the virtual coffee shop. Of the virtual classroom and the virtual coffee shop, the virtual classroom is matched by the largest number of users and is therefore determined as the virtual scene corresponding to the multi-person voice.
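The User1/User2/User3 example can be reproduced with a short sketch. The description does not say how user information is matched to a scene, so the rule table `SCENE_RULES` and the function names below are hypothetical; ties are broken by scene name, which is also an assumption.

```python
from collections import Counter

# Hypothetical rule table: which (field, value) pairs each default scene matches.
SCENE_RULES = {
    "virtual classroom": {("occupation", "teacher"), ("occupation", "student")},
    "virtual coffee shop": {("hobby", "drinking coffee")},
    "virtual conference room": set(),
}

def scenes_for_user(user_info, rules=SCENE_RULES):
    """Return the default scenes whose rules share at least one pair with user_info."""
    attrs = set(user_info.items())
    return [scene for scene, wanted in rules.items() if attrs & wanted]

def pick_scene_by_user_info(users, rules=SCENE_RULES):
    """Return the scene matched by the most users, or None if no scene matches."""
    counts = Counter()
    for info in users.values():
        counts.update(scenes_for_user(info, rules))
    if not counts:
        return None
    # Highest match count first; scene name breaks ties (an assumption).
    return min(counts, key=lambda scene: (-counts[scene], scene))

users = {
    "User1": {"occupation": "teacher"},
    "User2": {"occupation": "student"},
    "User3": {"hobby": "drinking coffee"},
}
chosen_scene = pick_scene_by_user_info(users)  # "virtual classroom" (2 users vs 1)
```

Passing only the voice-initiating user's entry in `users` gives the single-user variant described earlier.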
In some embodiments, the virtual scene information includes multiple predetermined virtual positions, and step S14 includes: for each of the multiple users, the network device obtains the target predetermined virtual position corresponding to that user among the multiple predetermined virtual positions and determines the target predetermined virtual position as the user's virtual position in the virtual scene information. In some embodiments, the virtual scene includes multiple predetermined virtual positions, each of which is presented intuitively to the user in the 2D scene image or 3D scene model corresponding to the virtual scene information. Each user corresponds to one target predetermined virtual position among the multiple predetermined virtual positions, and each user's target predetermined virtual position is determined as that user's virtual position in the virtual scene. Preferably, each user corresponds to a different target predetermined virtual position. In some embodiments, a user's virtual position in the virtual scene can only be one of the multiple predetermined virtual positions and cannot be an arbitrary position in the virtual scene. In some embodiments, the target predetermined virtual position may be selected for each user by the voice-initiating user among the multiple predetermined virtual positions, may be selected by each user for himself among the multiple predetermined virtual positions, or may be determined automatically for each user, from the multiple predetermined virtual positions, as the predetermined virtual position that matches that user's user information.
In some embodiments, obtaining, for each of the multiple users, the target predetermined virtual position corresponding to that user among the multiple predetermined virtual positions includes: for each of the multiple users, obtaining the target predetermined virtual position that the voice-initiating user among the multiple users designated for that user among the multiple predetermined virtual positions. In some embodiments, the voice-initiating user designates, among the multiple predetermined virtual positions, the target predetermined virtual position corresponding to each user and sends the identification information of each user's target predetermined virtual position to the network device. Preferably, the voice-initiating user must select a different target predetermined virtual position for each user and cannot select the same target predetermined virtual position for multiple users.
In some embodiments, obtaining, for each of the multiple users, the target predetermined virtual position corresponding to that user among the multiple predetermined virtual positions includes: for each of the multiple users, determining a target predetermined virtual position among the multiple predetermined virtual positions according to the user information corresponding to that user, where the tag information of the target predetermined virtual position in the virtual scene information matches the user information corresponding to that user. In some embodiments, for each user, the target predetermined virtual position whose tag information in the virtual scene matches the user's user information is determined automatically among the multiple predetermined virtual positions. For example, the user information of User1 includes "occupation: Chinese teacher", the virtual scene is a virtual classroom, and the tag information of predetermined virtual position L1 in that virtual scene is "podium". This tag information matches User1's user information "occupation: Chinese teacher", so predetermined virtual position L1 can be determined as the target predetermined virtual position corresponding to User1 among the multiple predetermined virtual positions.
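A minimal sketch of tag-based position matching follows. The description does not define when a tag "matches" user information, so the `tag_rules` table, the function name `match_position_by_tag`, and the optional `taken` set (used to skip positions already held by another user) are all hypothetical.

```python
def match_position_by_tag(occupation, position_tags, tag_rules, taken=frozenset()):
    """Return the first free position whose tag matches the occupation, else None.

    position_tags: position ID -> tag in the scene (e.g. "L1" -> "podium").
    tag_rules:     tag -> set of occupations that tag is considered to match
                   (an assumed matching rule, not given in the source).
    taken:         position IDs already assigned to other users.
    """
    for position_id, tag in position_tags.items():
        if position_id in taken:
            continue
        if occupation in tag_rules.get(tag, ()):
            return position_id
    return None

position_tags = {"L1": "podium", "L2": "desk", "L3": "desk"}
tag_rules = {"podium": {"Chinese teacher"}, "desk": {"student"}}
teacher_position = match_position_by_tag("Chinese teacher", position_tags, tag_rules)
```

Here `teacher_position` is `"L1"`, mirroring the podium example; a user with no matching tag gets `None` and would need one of the other assignment methods.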
In some embodiments, obtaining, for each of the multiple users, the target predetermined virtual position corresponding to that user among the multiple predetermined virtual positions includes step S17 (not shown), step S18 (not shown), and step S19 (not shown). In step S17, the network device generates virtual position request information and sends it to each of the multiple users, where the virtual position request information includes the virtual scene information. In step S18, the network device receives feedback information about the virtual position request information sent by at least one of the multiple users, where the feedback information sent by each of the at least one user indicates the target predetermined virtual position that the user selected among the multiple predetermined virtual positions. In step S19, the network device determines, for each of the multiple users and according to the feedback information, the target predetermined virtual position corresponding to that user among the multiple predetermined virtual positions. In some embodiments, the virtual position request information, including the virtual scene information, is sent to each user. After a user receives the virtual position request information, the 2D scene image or 3D scene model corresponding to the virtual scene information is presented, with the multiple predetermined virtual positions shown in it; the user selects one target predetermined virtual position among them and sends feedback information containing the identification information of the selected target predetermined virtual position to the network device. After receiving the feedback information sent by a user, the network device can obtain the target predetermined virtual position that the user selected among the multiple predetermined virtual positions. Preferably, each user must select a different target predetermined virtual position, and multiple users cannot select the same target predetermined virtual position. In some embodiments, the virtual position request information corresponds to a feedback deadline. Once the feedback deadline is reached, for each of the multiple users who has not yet given feedback, either the voice-initiating user selects a corresponding target predetermined virtual position from the at least one predetermined virtual position that is currently unselected, or the network device automatically assigns a corresponding target predetermined virtual position from the at least one predetermined virtual position that is currently unselected.
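The bookkeeping behind steps S18 and S19 can be sketched as a small state object on the network device. The class name `PositionSelection`, its methods, and the choice to reject a second selection from the same user are assumptions made for illustration; the description only requires that two users not end up with the same position.

```python
class PositionSelection:
    """Tracks which predetermined virtual positions users have selected."""

    def __init__(self, users, position_ids):
        self.users = set(users)
        self.free = set(position_ids)   # positions nobody has selected yet
        self.chosen = {}                # user ID -> selected position ID

    def handle_feedback(self, user_id, position_id):
        """Record a user's selection; return False if it cannot be accepted."""
        if user_id not in self.users or user_id in self.chosen:
            return False                # unknown user or repeated feedback (assumed rule)
        if position_id not in self.free:
            return False                # position unknown or already selected
        self.free.remove(position_id)
        self.chosen[user_id] = position_id
        return True

    def pending_users(self):
        """Users who have not yet given feedback, in sorted order."""
        return sorted(self.users - set(self.chosen))
```

When the feedback deadline is reached, `pending_users()` and `free` give exactly the users and positions that the voice-initiating user or the network device still has to pair up.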
In some embodiments, the method further includes: after receiving feedback information sent by a first user among the multiple users, the network device generates first prompt information corresponding to the feedback information and sends the first prompt information to the other users among the multiple users who have not yet given feedback, to indicate that the first target predetermined virtual position indicated by the feedback information is not selectable. In some embodiments, after receiving the feedback information in which the first user selects a first target predetermined virtual position among the multiple predetermined virtual positions, the network device generates prompt information corresponding to the feedback information (for example, "the first user has selected the first target predetermined virtual position") and sends it to each other user among the multiple users who has not yet given feedback, to indicate to each of them that the first target predetermined virtual position cannot be selected. In some embodiments, after receiving the prompt information (for example, "the first user has selected the first target predetermined virtual position"), the user equipment corresponding to each other user can set the first target predetermined virtual position to an unselectable state in the 2D scene image or 3D scene model corresponding to the virtual scene information.
In some embodiments, the method further includes: after receiving feedback information sent by a first user among the multiple users, the network device generates second prompt information corresponding to the feedback information and sends the second prompt information to the users among the multiple users other than the first user, to indicate that the first target predetermined virtual position indicated by the feedback information has been selected by the first user. In some embodiments, the network device sends prompt information (for example, "the first user has selected the first target predetermined virtual position") to each of the multiple users other than the first user, to indicate to each of them that the first target predetermined virtual position has been selected by the first user, so that each user can know the virtual positions of the other users in the virtual scene. In some embodiments, the identification information of the first user (for example, user name or user ID) is presented at the first target predetermined virtual position in the 2D scene image or 3D scene model corresponding to the virtual scene information.
In some embodiments, the method further includes: the network device receives invitation request information sent by a second user among the at least one user, where the second user has selected a second target predetermined virtual position among the multiple predetermined virtual positions, and the invitation request information is used to invite a third user, who is among the multiple users and has not yet given feedback, to select a predetermined virtual position near the second target predetermined virtual position; and the network device sends the invitation request information to the third user, to prompt the third user to select an unselected predetermined virtual position near the second target predetermined virtual position as the target predetermined virtual position corresponding to the third user. In some embodiments, the second user has already selected the second target predetermined virtual position among the multiple predetermined virtual positions as his own virtual position in the virtual scene; in response to an invitation trigger operation performed by the second user on the third user, who has not yet given feedback, invitation request information for inviting the third user to select a predetermined virtual position near the second target predetermined virtual position is generated and sent to the network device. In some embodiments, it is first detected whether the third user has already selected his own predetermined virtual position, and the invitation trigger operation can be performed on the third user only if the third user has not yet selected one. In some embodiments, after receiving the invitation request information for the third user, the network device forwards it to the third user, to prompt the third user to select an unselected predetermined virtual position near the second target predetermined virtual position as the target predetermined virtual position corresponding to the third user. In some embodiments, after the prompt information (for example, "the first user has selected the first target predetermined virtual position") is received, at least one unselected predetermined virtual position near the second target predetermined virtual position can be set to a special display state (for example, highlighted) in the 2D scene image or 3D scene model corresponding to the virtual scene information, to guide the third user to select one of the at least one predetermined virtual position as the target predetermined virtual position.
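Which positions count as "near" the second target predetermined virtual position is not defined in the description. The sketch below assumes 2D coordinates for each predetermined virtual position and a hypothetical `radius` threshold, and returns the unselected candidates that could be highlighted for the third user, closest first.

```python
import math

def nearby_free_positions(target_id, coords, taken, radius):
    """Return unselected position IDs within `radius` of the target, nearest first.

    coords: position ID -> (x, y) in the scene (assumed 2D layout).
    taken:  position IDs that have already been selected.
    radius: hypothetical cut-off defining "near"; not specified by the source.
    """
    tx, ty = coords[target_id]
    candidates = []
    for position_id, (x, y) in coords.items():
        if position_id == target_id or position_id in taken:
            continue
        distance = math.hypot(x - tx, y - ty)
        if distance <= radius:
            candidates.append((distance, position_id))
    # Sort by distance, then by ID so equal distances have a stable order.
    return [position_id for _, position_id in sorted(candidates)]
```

An empty result means no free position lies within the chosen radius, in which case the invitation could not be honoured as stated.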
In some embodiments, the method further includes: after the predetermined feedback deadline corresponding to the virtual position request information is reached, the network device determines, for each of the multiple users who has not yet given feedback, the target predetermined virtual position corresponding to that user among the at least one predetermined virtual position that is currently unselected, and determines the target predetermined virtual position as the user's virtual position in the virtual scene information. In some embodiments, the virtual position request information corresponds to a predetermined feedback deadline (for example, 5 minutes), which may be a default of the network device or may be set by the voice-initiating user. Once the feedback deadline is reached, for each of the multiple users who has not yet given feedback, either the voice-initiating user selects a corresponding target predetermined virtual position from the at least one predetermined virtual position that is currently unselected, or the network device automatically assigns a corresponding target predetermined virtual position from the at least one predetermined virtual position that is currently unselected. In some embodiments, different predetermined virtual positions correspond to different priorities in the virtual scene according to their respective tag information, and target predetermined virtual positions can be assigned automatically to the users who have not yet given feedback in descending order of priority. For example, if the virtual scene is a virtual auditorium, the predetermined virtual positions tagged "first row" have a higher priority than the predetermined virtual positions tagged "second row", so the currently unselected predetermined virtual positions tagged "first row" are assigned first to the users who have not yet given feedback as their target predetermined virtual positions.
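Priority-ordered automatic assignment can be sketched as sorting the free positions by the priority of their tags and pairing them with the pending users. The function name, the numeric `tag_priority` table, and the use of the position ID as a tie-breaker are assumptions for illustration.

```python
def auto_assign_by_priority(pending_users, free_position_tags, tag_priority):
    """Pair each pending user with a free position, highest-priority tag first.

    pending_users:      users who gave no feedback, in the order they are served.
    free_position_tags: unselected position ID -> tag (e.g. "first row").
    tag_priority:       tag -> number, larger means higher priority (assumed scale).
    If there are fewer free positions than pending users, the extra users are
    left out of the result.
    """
    ordered = sorted(
        free_position_tags,
        key=lambda pid: (-tag_priority.get(free_position_tags[pid], 0), pid),
    )
    return dict(zip(pending_users, ordered))

assignment = auto_assign_by_priority(
    ["u3", "u4"],
    {"R2-1": "second row", "R1-2": "first row", "R2-2": "second row"},
    {"first row": 2, "second row": 1},
)
```

In this example the single free "first row" position goes to the first pending user and the next user falls back to the second row.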
In some embodiments, determining, for each of the multiple users who has not yet given feedback, the target predetermined virtual position corresponding to that user among the at least one predetermined virtual position that is currently unselected includes: determining hotspot position area information in the virtual scene information according to the virtual positions, in the virtual scene information, of at least one user who has already given feedback; and, for each of the multiple users who has not yet given feedback, determining an unselected predetermined virtual position in the hotspot position area information as that user's virtual position in the virtual scene information. In some embodiments, once the feedback deadline is reached, a hotspot position area, in which users' virtual positions are densely distributed, is determined according to how the virtual positions of the users who have already given feedback are distributed in the virtual scene. Each user who has not yet given feedback is then preferentially assigned, automatically, one of the unselected predetermined virtual positions in the hotspot position area as his target predetermined virtual position. If all predetermined virtual positions in the hotspot position area have already been selected, each user who has not yet given feedback is automatically assigned one of the other unselected predetermined virtual positions in the virtual scene as his target predetermined virtual position.
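The description does not say how the hotspot position area is computed. One simple possibility, shown here purely as an assumption, is to take the centroid of the positions already selected and treat every position within a hypothetical `radius` of it as the hotspot; free hotspot positions are handed out first, then the remaining free positions.

```python
import math

def assign_near_hotspot(pending_users, coords, chosen, radius):
    """Assign free positions to pending users, preferring the hotspot area.

    coords: position ID -> (x, y); chosen: user ID -> position ID already selected.
    The hotspot is assumed to be a disc of `radius` around the centroid of the
    selected positions; this definition is illustrative, not from the source.
    """
    taken = set(chosen.values())
    free = [pid for pid in coords if pid not in taken]
    if not taken:
        # No feedback at all: no hotspot can be derived, fall back to ID order.
        return dict(zip(pending_users, sorted(free)))
    cx = sum(coords[pid][0] for pid in taken) / len(taken)
    cy = sum(coords[pid][1] for pid in taken) / len(taken)

    def dist(pid):
        return math.hypot(coords[pid][0] - cx, coords[pid][1] - cy)

    hot = sorted((pid for pid in free if dist(pid) <= radius), key=lambda p: (dist(p), p))
    rest = sorted((pid for pid in free if dist(pid) > radius), key=lambda p: (dist(p), p))
    # Hotspot positions are used up before any position outside the hotspot.
    return dict(zip(pending_users, hot + rest))
```

A density-based definition of the hotspot (for example, a grid cell or cluster with the most selected positions) would fit the description equally well; only the "hotspot first, others afterwards" ordering is taken from the text.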
In some embodiments, determining, for each user among the plurality of users who has not yet given feedback, a corresponding target predetermined virtual position among the at least one predetermined virtual position that is currently unselected comprises: for each user among the plurality of users who has not yet given feedback, determining a target predetermined virtual position corresponding to that user among the at least one predetermined virtual position that is currently unselected, wherein the tag information of the target predetermined virtual position in the virtual scene information matches the user information corresponding to that user.
In some embodiments, for each user who has not yet given feedback, a predetermined virtual position whose tag information in the virtual scene matches that user's user information is selected from the currently unselected predetermined virtual positions and used as that user's target predetermined virtual position. For example, the user information of User1, who has not yet given feedback, includes "occupation: teacher", and the virtual scene is a virtual classroom. Among the currently unselected predetermined virtual positions, the tag information of predetermined virtual position L1 in the virtual scene is "podium", which matches User1's user information "occupation: teacher". Predetermined virtual position L1 can therefore be automatically assigned to User1 as the corresponding target predetermined virtual position.
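The tag-matching rule in the classroom example can be sketched as a simple lookup. The description does not define how a tag is judged to "match" user information, so the rule table below (`tag_rules`), the function name and the data shapes are assumptions made for illustration only.

```python
def assign_by_tag(user_info, free_positions, tag_rules):
    """Return the first unselected position whose tag matches the user's information.

    user_info:      dict of user attributes, e.g. {"occupation": "teacher"}
    free_positions: dict position id -> tag, for currently unselected positions
    tag_rules:      dict (field, value) -> tag (hypothetical matching table)
    """
    # Collect every tag that some attribute of this user maps to.
    wanted = {tag_rules[item] for item in user_info.items() if item in tag_rules}
    for position, tag in free_positions.items():
        if tag in wanted:
            return position
    return None  # no matching position; the caller needs another fallback


rules = {("occupation", "teacher"): "podium"}
free = {"L1": "podium", "L2": "desk"}
print(assign_by_tag({"occupation": "teacher"}, free, rules))  # L1
```

Returning `None` when nothing matches leaves the choice of fallback (for instance the hotspot-based allocation) to the caller.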
FIG. 2 shows a structural diagram of a network device for playing voice in a multi-person voice call according to an embodiment of the present application. The device includes a one-one module 11 and a one-two module 12. The one-one module 11 is configured to, for a target user among a plurality of users participating in the multi-person voice call, determine virtual position information of the other users among the plurality of users in a virtual sound field corresponding to the target user, and to generate virtual sound field information corresponding to the target user according to the virtual position information. The one-two module 12 is configured to send the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
The one-one module 11 is configured to, for a target user among a plurality of users participating in the multi-person voice call, determine virtual position information of the other users among the plurality of users in the virtual sound field corresponding to the target user, and to generate virtual sound field information corresponding to the target user according to the virtual position information. In some embodiments, the target user is each of the plurality of users participating in the multi-person voice call. In some embodiments, the virtual sound field is a relative coordinate system, which may be a two-dimensional plane coordinate system or a three-dimensional space coordinate system. Each user has a corresponding virtual sound field; a virtual position is the coordinate point corresponding to another user in that user's virtual sound field, and the virtual position information is the coordinate value of that point. The user's own virtual position in the user's virtual sound field is the coordinate origin. For example, in the virtual sound field of User1, the virtual position information corresponding to User1 is (0,0) and the virtual position information corresponding to User2 is (0,1); in the virtual sound field of User2, the virtual position information corresponding to User1 is (0,-1) and the virtual position information corresponding to User2 is (0,0).
In some embodiments, the coordinate axis unit of the virtual sound field corresponding to a user is a predetermined distance interval, for example 1 centimeter, 10 centimeters or 1 meter, and each coordinate axis direction is a predetermined direction relative to the user; for example, the positive X-axis points to the user's right and the positive Y-axis points to the user's front. In some embodiments, the relative distance information and relative direction information between two users can be obtained from the virtual position information of one user in the other user's virtual sound field, together with the coordinate axis unit and coordinate axis directions of that virtual sound field. For example, in the virtual sound field of User1, the positive X-axis points to User1's right, the positive Y-axis points to User1's front, and the unit of both axes is 1 meter. The virtual position information corresponding to User1 is (0,0) and the virtual position information corresponding to User2 is (1,0), from which it follows that User2 is 1 meter directly to the right of User1. In some embodiments, for each user in the multi-person voice call, the virtual sound field information corresponding to that user includes, but is not limited to, the coordinate axis directions and coordinate axis unit of that user's virtual sound field and the virtual position information (that is, the coordinate value of the coordinate point) of every other user in that user's virtual sound field.
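To make the per-user relative coordinate system concrete, the sketch below derives each listener's virtual sound field from shared scene positions by translating the listener to the origin. It assumes a two-dimensional scene in which every user faces the same direction, so no rotation is needed; that simplification, like the function and variable names, is an assumption and is not stated in the description.

```python
def sound_field(target, scene_positions):
    """Map other users' scene positions into the target user's virtual sound field.

    target:          id of the listening user
    scene_positions: dict user -> (x, y) in the shared virtual scene
    Returns a dict user -> (x, y) relative to the target, who sits at the origin.
    Assumes all users face the same direction (no rotation applied).
    """
    tx, ty = scene_positions[target]
    return {
        user: (x - tx, y - ty)
        for user, (x, y) in scene_positions.items()
        if user != target
    }


scene = {"User1": (2, 3), "User2": (2, 4)}
print(sound_field("User1", scene))  # {'User2': (0, 1)}
print(sound_field("User2", scene))  # {'User1': (0, -1)}
```

The two outputs reproduce the (0,1) and (0,-1) values of the User1 and User2 example: each user sees the other at the mirrored offset.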
The one-two module 12 is configured to send the virtual sound field information to the user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field. In some embodiments, for each user in the multi-person voice call, the voice information of the other users may be sent from the other users' user equipment to that user's user equipment via the network device, or may be sent from the other users' user equipment to that user's user equipment over a peer-to-peer (P2P) connection established between the two user equipments.
In some embodiments, for each user in the multi-person voice call, when voice information sent by another user is received, the relative distance information and relative direction information of that other user with respect to the user can be obtained from the other user's virtual position information in the user's virtual sound field and from the coordinate axis directions and coordinate axis unit of the user's virtual sound field, and the voice information is played according to the relative distance information and relative direction information. For example, in the virtual sound field of User1, the positive X-axis points to User1's right, the positive Y-axis points to User1's front, and the unit of both axes is 1 meter. The virtual position information corresponding to User1 is (0,0) and the virtual position information corresponding to User2 is (0,-2), from which it follows that User2 is 2 meters directly behind User1, and the voice information is played according to this relative distance information and relative direction information. In some embodiments, the voice information may be played according to the relative distance information and relative direction information by processing it, for example by filtering and delaying, with a head-related transfer function (HRTF) and then outputting it to the loudspeaker of the user equipment. In the multi-person voice call, this allows a user to clearly and accurately distinguish each person's voice when several other users speak at the same time, and to know quickly and intuitively which other user is speaking at any moment, which is a considerable convenience for users of the multi-person voice call.
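As a rough illustration of how relative distance and direction follow from a coordinate pair, the sketch below converts a position in a listener's virtual sound field into a distance in meters and an azimuth measured clockwise from the listener's front. The azimuth convention and the function name are assumptions; the sketch stops before any HRTF processing and only produces the two quantities such a renderer would take as input.

```python
import math

def relative_distance_and_direction(position, unit_m=1.0):
    """Convert sound-field coordinates to (distance in meters, azimuth in degrees).

    position: (x, y) with +X to the listener's right and +Y to the listener's front
    unit_m:   length of one coordinate axis unit in meters
    Azimuth convention (assumed): 0 = front, 90 = right, 180 = behind, 270 = left.
    """
    x, y = position
    distance_m = math.hypot(x, y) * unit_m
    # atan2(x, y) measures the angle from the +Y (front) axis toward +X (right).
    azimuth_deg = math.degrees(math.atan2(x, y)) % 360.0
    return distance_m, azimuth_deg


print(relative_distance_and_direction((0, -2)))  # (2.0, 180.0)
```

For the (0,-2) example this gives 2 meters at 180 degrees, that is, directly behind the listener.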
In some embodiments, determining, for a target user among the plurality of users participating in the multi-person voice call, the virtual position information of the other users among the plurality of users in the virtual sound field corresponding to the target user is performed by a one-three module 13 (not shown), a one-four module 14 (not shown) and a one-five module 15 (not shown). The one-three module 13 is configured to determine virtual scene information corresponding to the multi-person voice call. The one-four module 14 is configured to determine, according to the virtual scene information, the virtual position corresponding to each of the plurality of users. The one-five module 15 is configured to determine, according to the virtual position corresponding to the target user and the virtual positions corresponding to the other users, the virtual position information of the other users in the virtual sound field corresponding to the target user. The specific implementations of the one-three module 13, the one-four module 14 and the one-five module 15 are the same as or similar to the embodiments of steps S13, S14 and S15 in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, the one-three module 13 is configured to: obtain identification information corresponding to target virtual scene information selected from a plurality of default virtual scene information by the voice-initiating user among the plurality of users, and determine the target virtual scene information as the virtual scene information corresponding to the multi-person voice call. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, the one-three module 13 is configured to: obtain at least one target virtual scene information selected from a plurality of default virtual scene information by at least one user among the plurality of users, and determine, from the at least one target virtual scene information, the virtual scene information corresponding to the multi-person voice call, wherein the determined virtual scene information is the one selected the most times. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
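The "selected the most times" rule amounts to a plurality vote over the users' scene choices. A minimal sketch follows, with hypothetical names; the description does not say how ties are resolved, so this version simply keeps the scene that reached the top count first.

```python
from collections import Counter

def pick_scene_by_votes(selections):
    """Return the scene chosen by the most users, or None if nobody chose.

    selections: dict user -> id of the default virtual scene that user selected
    Ties go to the scene encountered first (assumed tie-break, not in the source).
    """
    if not selections:
        return None
    counts = Counter(selections.values())
    return counts.most_common(1)[0][0]


votes = {"u1": "classroom", "u2": "cafe", "u3": "classroom"}
print(pick_scene_by_votes(votes))  # classroom
```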
In some embodiments, the one-three module 13 is configured to: determine, according to voice topic information corresponding to the multi-person voice call, target default virtual scene information matching the voice topic information from a plurality of default virtual scene information, and determine the target default virtual scene information as the virtual scene information corresponding to the multi-person voice call. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, the one-three module 13 includes a one-six module 16 (not shown). The one-six module 16 is configured to determine, according to user information corresponding to the plurality of users, target default virtual scene information matching the user information from a plurality of default virtual scene information, and to determine the target default virtual scene information as the virtual scene information corresponding to the multi-person voice call. The specific implementation of the one-six module 16 is the same as or similar to the embodiment of step S16 in FIG. 1, so it is not repeated here and is incorporated herein by reference.
In some embodiments, determining, according to the user information corresponding to the plurality of users, target default virtual scene information matching the user information from a plurality of default virtual scene information comprises: the network device determining, according to the user information corresponding to the voice-initiating user among the plurality of users, target default virtual scene information matching that user information from the plurality of default virtual scene information. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, determining, according to the user information corresponding to the plurality of users, target default virtual scene information matching the user information from a plurality of default virtual scene information comprises: the network device determining, according to the user information corresponding to each of the plurality of users, at least one default virtual scene information that matches the user information of each of the plurality of users, and determining the target default virtual scene information from the at least one default virtual scene information, wherein the target default virtual scene information is the one matched by the largest number of users. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
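One way to read "matched by the largest number of users" is to count, for each default scene, how many users have at least one attribute in common with the scene's tags. The sketch below does that; representing user information and scene information as sets of strings is an assumption for illustration, as are all names.

```python
def pick_scene_by_user_match(users, scenes):
    """Return the default scene that matches the most users, or None if none match.

    users:  dict user -> set of attribute strings (assumed form of user information)
    scenes: dict scene id -> set of tag strings (assumed form of scene information)
    A user matches a scene when the two sets share at least one element.
    """
    best_scene, best_count = None, 0
    for scene, tags in scenes.items():
        count = sum(1 for attributes in users.values() if attributes & tags)
        if count > best_count:
            best_scene, best_count = scene, count
    return best_scene


users = {"u1": {"teacher"}, "u2": {"student"}, "u3": {"gamer"}}
scenes = {"classroom": {"teacher", "student"}, "arcade": {"gamer"}}
print(pick_scene_by_user_match(users, scenes))  # classroom
```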
In some embodiments, the virtual scene information includes a plurality of predetermined virtual positions, and the one-four module 14 is configured to: for each of the plurality of users, obtain the target predetermined virtual position corresponding to that user among the plurality of predetermined virtual positions, and determine the target predetermined virtual position as the virtual position of that user in the virtual scene information. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, obtaining, for each of the plurality of users, the target predetermined virtual position corresponding to that user among the plurality of predetermined virtual positions comprises: for each of the plurality of users, obtaining the target predetermined virtual position designated for that user, among the plurality of predetermined virtual positions, by the voice-initiating user among the plurality of users. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, obtaining, for each of the plurality of users, the target predetermined virtual position corresponding to that user among the plurality of predetermined virtual positions comprises: for each of the plurality of users, determining a target predetermined virtual position among the plurality of predetermined virtual positions according to the user information corresponding to that user, wherein the tag information of the target predetermined virtual position in the virtual scene information matches the user information corresponding to that user. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, obtaining, for each of the plurality of users, the target predetermined virtual position corresponding to that user among the plurality of predetermined virtual positions is performed by a one-seven module 17 (not shown), a one-eight module 18 (not shown) and a one-nine module 19 (not shown). The one-seven module 17 is configured to generate virtual position request information and send it to each of the plurality of users, wherein the virtual position request information includes the virtual scene information. The one-eight module 18 is configured to receive feedback information on the virtual position request information sent by at least one of the plurality of users, wherein the feedback information sent by each of the at least one user indicates the target predetermined virtual position that the user selected among the plurality of predetermined virtual positions. The one-nine module 19 is configured to determine, for each of the plurality of users and according to the feedback information, the target predetermined virtual position corresponding to that user among the plurality of predetermined virtual positions. The specific implementations of the one-seven module 17, the one-eight module 18 and the one-nine module 19 are the same as or similar to the embodiments of steps S17, S18 and S19 in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, the device is further configured to: after receiving feedback information sent by a first user among the plurality of users, generate first prompt information corresponding to the feedback information, and send the first prompt information to the other users among the plurality of users who have not yet given feedback, to indicate that the first target predetermined virtual position indicated by the feedback information can no longer be selected. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
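The feedback-then-prompt flow can be sketched as a small handler that records a user's choice and builds the prompt for everyone who has not yet answered. The message fields, the function name and the decision to reject a position that is already taken are illustrative assumptions; the description only states that the prompt marks the position as no longer selectable.

```python
def handle_feedback(user, position, chosen, all_users):
    """Record a user's selected position and build the first prompt message.

    user:      id of the user whose feedback was received
    position:  predetermined virtual position indicated by the feedback
    chosen:    dict user -> position, updated in place
    all_users: list of every user in the call
    Returns a prompt dict for users who have not yet given feedback, or None
    if the position was already taken (assumed handling, not in the source).
    """
    if position in chosen.values():
        return None
    chosen[user] = position
    recipients = [u for u in all_users if u not in chosen]
    return {"type": "position_unavailable", "position": position, "to": recipients}


chosen = {}
print(handle_feedback("u1", "L1", chosen, ["u1", "u2", "u3"]))
# {'type': 'position_unavailable', 'position': 'L1', 'to': ['u2', 'u3']}
```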
In some embodiments, the device is further configured to: after receiving feedback information sent by a first user among the plurality of users, generate second prompt information corresponding to the feedback information, and send the second prompt information to the users among the plurality of users other than the first user, to indicate that the first target predetermined virtual position indicated by the feedback information has been selected by the first user. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, the device is further configured to: receive invitation request information sent by a second user among the at least one user, wherein the second user has selected a second target predetermined virtual position among the plurality of predetermined virtual positions, and the invitation request information is used to invite a third user among the plurality of users, who has not yet given feedback, to select a predetermined virtual position near the second target predetermined virtual position; and send the invitation request information to the third user, to prompt the third user to select an unselected predetermined virtual position near the second target predetermined virtual position as the target predetermined virtual position corresponding to the third user. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, the device is further configured to: after a predetermined feedback deadline corresponding to the virtual position request information is reached, for each user among the plurality of users who has not yet given feedback, determine a target predetermined virtual position corresponding to that user among the at least one predetermined virtual position that is currently unselected, and determine the target predetermined virtual position as the virtual position of that user in the virtual scene information. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, determining, for each user among the plurality of users who has not yet given feedback, a corresponding target predetermined virtual position among the at least one predetermined virtual position that is currently unselected comprises: determining hotspot location area information in the virtual scene information according to the virtual positions, in the virtual scene information, of at least one user among the plurality of users who has already given feedback; and, for each user among the plurality of users who has not yet given feedback, determining an unselected predetermined virtual position in the hotspot location area information as the virtual position of that user in the virtual scene information. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
In some embodiments, determining, for each user among the plurality of users who has not yet given feedback, a corresponding target predetermined virtual position among the at least one predetermined virtual position that is currently unselected comprises: for each user among the plurality of users who has not yet given feedback, determining a target predetermined virtual position corresponding to that user among the at least one predetermined virtual position that is currently unselected, wherein the tag information of the target predetermined virtual position in the virtual scene information matches the user information corresponding to that user. The related operations are the same as or similar to those of the embodiment shown in FIG. 1, so they are not repeated here and are incorporated herein by reference.
FIG. 3 shows an exemplary system that can be used to implement the various embodiments described in this application.
As shown in FIG. 3, in some embodiments, system 300 can serve as any one of the devices in the described embodiments. In some embodiments, system 300 may include one or more computer-readable media having instructions (e.g., system memory or NVM/storage device 320), and one or more processors (e.g., processor(s) 305) coupled to the one or more computer-readable media and configured to execute the instructions to implement modules that perform the actions described in this application.
For one embodiment, the system control module 310 may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) 305 and/or to any suitable device or component in communication with the system control module 310.
The system control module 310 may include a memory controller module 330 to provide an interface to the system memory 315. The memory controller module 330 may be a hardware module, a software module and/or a firmware module.
System memory 315 may be used, for example, to load and store data and/or instructions for system 300. For one embodiment, system memory 315 may include any suitable volatile memory, for example suitable DRAM. In some embodiments, system memory 315 may include double data rate fourth-generation synchronous dynamic random-access memory (DDR4 SDRAM).
For one embodiment, the system control module 310 may include one or more input/output (I/O) controllers to provide interfaces to the NVM/storage device 320 and the communication interface(s) 325.
For example, NVM/storage device 320 may be used to store data and/or instructions. NVM/storage device 320 may include any suitable non-volatile memory (e.g., flash memory) and/or any suitable non-volatile storage device(s) (e.g., one or more hard disk drives (HDD), one or more compact disc (CD) drives and/or one or more digital versatile disc (DVD) drives).
NVM/storage device 320 may include storage resources that are physically part of the device on which system 300 is installed, or it may be accessible by that device without being part of it. For example, NVM/storage device 320 may be accessed over a network via the communication interface(s) 325.
Communication interface(s) 325 may provide an interface for system 300 to communicate over one or more networks and/or with any other suitable device. System 300 may communicate wirelessly with one or more components of a wireless network in accordance with any of one or more wireless network standards and/or protocols.
For one embodiment, at least one of the processor(s) 305 may be packaged together with the logic of one or more controllers of the system control module 310 (e.g., the memory controller module 330). For one embodiment, at least one of the processor(s) 305 may be packaged together with the logic of one or more controllers of the system control module 310 to form a system-in-package (SiP). For one embodiment, at least one of the processor(s) 305 may be integrated on the same die as the logic of one or more controllers of the system control module 310. For one embodiment, at least one of the processor(s) 305 may be integrated on the same die as the logic of one or more controllers of the system control module 310 to form a system on a chip (SoC).
In various embodiments, the system 300 may be, but is not limited to, a server, a workstation, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet computer, a netbook, etc.). In various embodiments, the system 300 may have more or fewer components and/or a different architecture. For example, in some embodiments, the system 300 includes one or more cameras, a keyboard, a liquid crystal display (LCD) screen (including a touchscreen display), a non-volatile memory port, multiple antennas, a graphics chip, an application-specific integrated circuit (ASIC), and speakers.
The present application also provides a computer-readable storage medium storing computer code, wherein, when the computer code is executed, the method according to any one of the preceding items is performed.
The present application also provides a computer program product, wherein, when the computer program product is executed by a computer device, the method according to any one of the preceding items is performed.
The present application also provides a computer device, the computer device comprising:
one or more processors;
a memory for storing one or more computer programs;
wherein the one or more computer programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of the preceding items.
It should be noted that the present application may be implemented in software and/or in a combination of software and hardware, for example, using an application-specific integrated circuit (ASIC), a general-purpose computer, or any other similar hardware device. In one embodiment, the software program of the present application may be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present application (including associated data structures) may be stored in a computer-readable recording medium, for example, RAM, a magnetic or optical drive, a floppy disk, or a similar device. In addition, some steps or functions of the present application may be implemented in hardware, for example, as a circuit that cooperates with a processor to perform the respective steps or functions.
In addition, a part of the present application may be applied as a computer program product, for example, computer program instructions which, when executed by a computer, can invoke or provide the methods and/or technical solutions according to the present application through the operation of that computer. Those skilled in the art will understand that the forms in which computer program instructions exist in a computer-readable medium include, but are not limited to, source files, executable files, and installation package files. Correspondingly, the ways in which computer program instructions are executed by a computer include, but are not limited to: the computer directly executes the instructions; the computer compiles the instructions and then executes the corresponding compiled program; the computer reads and executes the instructions; or the computer reads and installs the instructions and then executes the corresponding installed program. Here, the computer-readable medium may be any available computer-readable storage medium or communication medium that can be accessed by a computer.
Communication media include media by which communication signals containing, for example, computer-readable instructions, data structures, program modules, or other data are transmitted from one system to another. Communication media may include guided transmission media (such as cables and wires, e.g., optical fiber or coaxial cable) and wireless (unguided transmission) media capable of propagating energy waves, such as acoustic, electromagnetic, RF, microwave, and infrared media. Computer-readable instructions, data structures, program modules, or other data may be embodied, for example, as a modulated data signal in a wireless medium (such as a carrier wave or a similar mechanism, for example one embodied as part of spread-spectrum technology). The term "modulated data signal" refers to a signal that has one or more of its characteristics changed or set in such a manner as to encode information in the signal. The modulation may be an analog, digital, or hybrid modulation technique.
By way of example and not limitation, computer-readable storage media may include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. For example, computer-readable storage media include, but are not limited to: volatile memory, such as random access memory (RAM, DRAM, SRAM); non-volatile memory, such as flash memory, various read-only memories (ROM, PROM, EPROM, EEPROM), and magnetic and ferromagnetic/ferroelectric memories (MRAM, FeRAM); magnetic and optical storage devices (hard disks, magnetic tapes, CDs, DVDs); and other media, now known or later developed, that are capable of storing computer-readable information/data for use by a computer system.
Here, an embodiment according to the present application includes an apparatus comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the apparatus is triggered to run the methods and/or technical solutions based on the foregoing embodiments of the present application.
It will be apparent to those skilled in the art that the present application is not limited to the details of the above exemplary embodiments, and that the present application can be implemented in other specific forms without departing from its spirit or essential characteristics. Accordingly, the embodiments are to be regarded in all respects as exemplary and non-restrictive. The scope of the present application is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalency of the claims are therefore intended to be embraced by the present application. Any reference sign in a claim shall not be construed as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. A plurality of units or devices recited in a device claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.

Claims (20)

  1. A method for playing voice information in a multi-person voice call, applied at a network device, wherein the method comprises:
    for a target user among a plurality of users participating in the multi-person voice call, determining virtual position information of the other users among the plurality of users in a virtual sound field corresponding to the target user, and generating, according to the virtual position information, virtual sound field information corresponding to the target user;
    sending the virtual sound field information to user equipment corresponding to the target user, so that the user equipment plays the voice information of each of the other users according to that user's virtual position information in the virtual sound field.
  2. The method according to claim 1, wherein, for the target user among the plurality of users participating in the multi-person voice call, determining the virtual position information of the other users among the plurality of users in the virtual sound field corresponding to the target user comprises:
    determining virtual scene information corresponding to the multi-person voice call;
    determining, according to the virtual scene information, a virtual position corresponding to each of the plurality of users;
    determining, according to the virtual position corresponding to the target user and the virtual positions corresponding to the other users, the virtual position information of the other users in the virtual sound field corresponding to the target user.
  3. The method according to claim 2, wherein determining the virtual scene information corresponding to the multi-person voice call comprises:
    obtaining identification information corresponding to target virtual scene information that the call-initiating user among the plurality of users has selected from a plurality of items of default virtual scene information, and determining the target virtual scene information as the virtual scene information corresponding to the multi-person voice call.
  4. The method according to claim 2, wherein determining the virtual scene information corresponding to the multi-person voice call comprises:
    obtaining at least one item of target virtual scene information that at least one of the plurality of users has selected from a plurality of items of default virtual scene information, and determining, from the at least one item of target virtual scene information, the virtual scene information corresponding to the multi-person voice call, wherein the determined virtual scene information is the item that was selected the most times.
  5. The method according to claim 2, wherein determining the virtual scene information corresponding to the multi-person voice call comprises:
    determining, according to voice theme information corresponding to the multi-person voice call, target default virtual scene information that matches the voice theme information from a plurality of items of default virtual scene information, and determining the target default virtual scene information as the virtual scene information corresponding to the multi-person voice call.
  6. The method according to claim 2, wherein determining the virtual scene information corresponding to the multi-person voice call comprises:
    determining, according to user information corresponding to the plurality of users, target default virtual scene information that matches the user information from a plurality of items of default virtual scene information, and determining the target default virtual scene information as the virtual scene information corresponding to the multi-person voice call.
  7. The method according to claim 6, wherein determining, according to the user information corresponding to the plurality of users, the target default virtual scene information that matches the user information from the plurality of items of default virtual scene information comprises:
    determining, according to user information corresponding to the call-initiating user among the plurality of users, target default virtual scene information that matches that user information from the plurality of items of default virtual scene information.
  8. The method according to claim 6, wherein determining, according to the user information corresponding to the plurality of users, the target default virtual scene information that matches the user information from the plurality of items of default virtual scene information comprises:
    determining, according to the user information corresponding to each of the plurality of users, at least one item of default virtual scene information that matches the user information corresponding to each of the plurality of users from the plurality of items of default virtual scene information, and determining target default virtual scene information from the at least one item of default virtual scene information, wherein the target default virtual scene information is the item matched by the largest number of users.
  9. The method according to claim 2, wherein the virtual scene information includes a plurality of predetermined virtual positions;
    wherein determining, according to the virtual scene information, the virtual position corresponding to each of the plurality of users comprises:
    for each of the plurality of users, obtaining a target predetermined virtual position corresponding to the user among the plurality of predetermined virtual positions, and determining the target predetermined virtual position as the user's virtual position in the virtual scene information.
  10. The method according to claim 9, wherein, for each of the plurality of users, obtaining the target predetermined virtual position corresponding to the user among the plurality of predetermined virtual positions comprises:
    for each of the plurality of users, obtaining a target predetermined virtual position that the call-initiating user among the plurality of users has designated for the user among the plurality of predetermined virtual positions.
  11. The method according to claim 9, wherein, for each of the plurality of users, obtaining the target predetermined virtual position corresponding to the user among the plurality of predetermined virtual positions comprises:
    for each of the plurality of users, determining, according to user information corresponding to the user, a target predetermined virtual position among the plurality of predetermined virtual positions, wherein tag information of the target predetermined virtual position in the virtual scene information matches the user information corresponding to the user.
  12. The method according to claim 9, wherein, for each of the plurality of users, obtaining the target predetermined virtual position corresponding to the user among the plurality of predetermined virtual positions comprises:
    generating virtual position request information and sending it to each of the plurality of users, wherein the virtual position request information includes the virtual scene information;
    receiving feedback information on the virtual position request information sent by at least one of the plurality of users, wherein the feedback information sent by each of the at least one user indicates the target predetermined virtual position that the user has selected from the plurality of predetermined virtual positions;
    for each of the plurality of users, determining, according to the feedback information, the target predetermined virtual position corresponding to the user among the plurality of predetermined virtual positions.
  13. The method according to claim 12, wherein the method further comprises:
    after receiving feedback information sent by a first user among the plurality of users, generating first prompt information corresponding to the feedback information, and sending the first prompt information to the other users among the plurality of users who have not yet given feedback, to indicate that the first target predetermined virtual position indicated by the feedback information is no longer selectable.
  14. The method according to claim 12, wherein the method further comprises:
    after receiving feedback information sent by a first user among the plurality of users, generating second prompt information corresponding to the feedback information, and sending the second prompt information to the users among the plurality of users other than the first user, to indicate that the first target predetermined virtual position indicated by the feedback information has been selected by the first user.
  15. The method according to claim 12, wherein the method further comprises:
    receiving invitation request information sent by a second user among the at least one user, wherein the second user has selected a second target predetermined virtual position from the plurality of predetermined virtual positions, and the invitation request information is used to invite a third user among the plurality of users, who has not yet given feedback, to select a predetermined virtual position near the second target predetermined virtual position;
    sending the invitation request information to the third user, to prompt the third user to select an unselected predetermined virtual position near the second target predetermined virtual position as the target predetermined virtual position corresponding to the third user.
  16. The method according to claim 12, wherein the method further comprises:
    after a predetermined feedback deadline corresponding to the virtual position request information is reached, for each user among the plurality of users who has not yet given feedback, determining a target predetermined virtual position corresponding to the user from among at least one predetermined virtual position, of the plurality of predetermined virtual positions, that is currently unselected, and determining the target predetermined virtual position as the user's virtual position in the virtual scene information.
  17. The method according to claim 16, wherein, for each user among the plurality of users who has not yet given feedback, determining the target predetermined virtual position corresponding to the user from among the at least one currently unselected predetermined virtual position comprises:
    determining hotspot position area information in the virtual scene information according to the virtual position, in the virtual scene information, of at least one user among the plurality of users who has already given feedback;
    for each user among the plurality of users who has not yet given feedback, determining an unselected predetermined virtual position in the hotspot position area information as the user's virtual position in the virtual scene information.
  18. The method according to claim 16, wherein, for each user among the plurality of users who has not yet given feedback, determining the target predetermined virtual position corresponding to the user from among the at least one currently unselected predetermined virtual position comprises:
    for each user among the plurality of users who has not yet given feedback, determining a target predetermined virtual position corresponding to the user from among the at least one predetermined virtual position, of the plurality of predetermined virtual positions, that is currently unselected, wherein tag information of the target predetermined virtual position in the virtual scene information matches the user information corresponding to the user.
  19. A device for playing voice information in a multi-person voice call, characterized in that the device comprises:
    a processor; and
    a memory arranged to store computer-executable instructions which, when executed, cause the processor to perform the method according to any one of claims 1 to 18.
  20. A computer-readable medium storing instructions which, when executed, cause a system to perform the operations of the method according to any one of claims 1 to 18.
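
The following is an illustrative sketch only and forms no part of the claims. It shows one way the steps of claims 1, 2, 4 and 16 might be arranged at the network device. The claims do not specify a coordinate system, a tie-breaking rule, or any data format, so everything concrete here is an assumption: the 2-D `Seat` coordinates, the (distance, azimuth) representation of the virtual sound field, and the hypothetical names `pick_scene`, `assign_remaining` and `sound_field`.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Seat:
    """A predetermined virtual position (assumed here to be a 2-D point)."""
    x: float
    y: float


def pick_scene(votes):
    """Claim 4 sketch: return the scene that was selected the most times.

    Assumption: ties are resolved in favour of the scene voted for first.
    """
    return Counter(votes).most_common(1)[0][0]


def assign_remaining(seats, chosen, users):
    """Claim 16 sketch: after the feedback deadline, give every user who
    has not chosen a seat the first seat that is still unselected.

    Assumption: there are at least as many seats as users.
    """
    result = dict(chosen)
    taken = set(chosen.values())
    free = [seat for seat in seats if seat not in taken]
    for user in users:
        if user not in result:
            result[user] = free.pop(0)
    return result


def sound_field(target, positions):
    """Claims 1-2 sketch: express each other user's seat relative to the
    target user's seat.

    Returns {user: (distance, azimuth_degrees)}; the azimuth is measured
    counter-clockwise from the scene's +x axis (an assumed convention).
    """
    origin = positions[target]
    field = {}
    for user, seat in positions.items():
        if user == target:
            continue
        dx, dy = seat.x - origin.x, seat.y - origin.y
        field[user] = (math.hypot(dx, dy), math.degrees(math.atan2(dy, dx)))
    return field


if __name__ == "__main__":
    seats = [Seat(0, 0), Seat(3, 4), Seat(0, 2)]
    chosen = {"alice": seats[0], "bob": seats[1]}  # carol gave no feedback
    positions = assign_remaining(seats, chosen, ["alice", "bob", "carol"])
    print(pick_scene(["cafe", "meeting_room", "cafe"]))
    print(sound_field("alice", positions))
```

In this sketch the network device would compute one such mapping per target user and send it to that user's equipment, which could then use the distance and azimuth of each speaker to position that speaker's voice during playback; how the user equipment renders the audio is outside the scope of the sketch.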
PCT/CN2021/119542 2020-09-29 2021-09-22 Method and device for broadcasting voice information in multi-user voice call WO2022068640A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011049085.4 2020-09-29
CN202011049085.4A CN112261337B (en) 2020-09-29 2020-09-29 Method and equipment for playing voice information in multi-person voice

Publications (1)

Publication Number Publication Date
WO2022068640A1 true WO2022068640A1 (en) 2022-04-07

Family

ID=74235010

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119542 WO2022068640A1 (en) 2020-09-29 2021-09-22 Method and device for broadcasting voice information in multi-user voice call

Country Status (2)

Country Link
CN (1) CN112261337B (en)
WO (1) WO2022068640A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112261337B (en) * 2020-09-29 2023-03-31 上海连尚网络科技有限公司 Method and equipment for playing voice information in multi-person voice
CN115550600A (en) * 2022-09-27 2022-12-30 阿里巴巴(中国)有限公司 Method for identifying sound source of audio data, storage medium and electronic device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090169037A1 (en) * 2007-12-28 2009-07-02 Korea Advanced Institute Of Science And Technology Method of simultaneously establishing the call connection among multi-users using virtual sound field and computer-readable recording medium for implementing the same
CN107066102A (en) * 2017-05-09 2017-08-18 北京奇艺世纪科技有限公司 Support the method and device of multiple VR users viewing simultaneously
CN108881784A (en) * 2017-05-12 2018-11-23 腾讯科技(深圳)有限公司 Virtual scene implementation method, device, terminal and server
CN109086029A (en) * 2018-08-01 2018-12-25 北京奇艺世纪科技有限公司 A kind of audio frequency playing method and VR equipment
CN112261337A (en) * 2020-09-29 2021-01-22 上海连尚网络科技有限公司 Method and equipment for playing voice information in multi-person voice

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2349055B (en) * 1999-04-16 2004-03-24 Mitel Corp Virtual meeting rooms with spatial audio
JP2001339799A (en) * 2000-05-29 2001-12-07 Alpine Electronics Inc Virtual meeting apparatus
US6850496B1 (en) * 2000-06-09 2005-02-01 Cisco Technology, Inc. Virtual conference room for voice conferencing
CN102724604B (en) * 2012-06-06 2014-11-26 北京中自投资管理有限公司 Sound processing method for video meeting
GB201211512D0 (en) * 2012-06-28 2012-08-08 Provost Fellows Foundation Scholars And The Other Members Of Board Of The Method and apparatus for generating an audio output comprising spartial information
EP3254456B1 (en) * 2015-02-03 2020-12-30 Dolby Laboratories Licensing Corporation Optimized virtual scene layout for spatial meeting playback
CN106131355B (en) * 2016-07-05 2019-10-25 华为技术有限公司 A kind of sound playing method and device
JP6884854B2 (en) * 2017-04-10 2021-06-09 ヤマハ株式会社 Audio providing device, audio providing method and program
WO2019121864A1 (en) * 2017-12-19 2019-06-27 Koninklijke Kpn N.V. Enhanced audiovisual multiuser communication
CN110035250A (en) * 2019-03-29 2019-07-19 维沃移动通信有限公司 Audio-frequency processing method, processing equipment, terminal and computer readable storage medium
CN110149332B (en) * 2019-05-22 2022-04-22 北京达佳互联信息技术有限公司 Live broadcast method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN112261337A (en) 2021-01-22
CN112261337B (en) 2023-03-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21874301

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28.08.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21874301

Country of ref document: EP

Kind code of ref document: A1