WO2024060856A1 - Data processing method, apparatus, electronic device, storage medium and program product


Info

Publication number
WO2024060856A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
objects
information
media stream
stream
Prior art date
Application number
PCT/CN2023/111133
Other languages
English (en)
French (fr)
Inventor
黄柳文
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2024060856A1 publication Critical patent/WO2024060856A1/zh

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40: Support for services or applications
    • H04L65/403: Arrangements for multi-party communication, e.g. for conferences
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60: Network streaming of media packets
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21: Server components or server architectures
    • H04N21/218: Source of audio or video content, e.g. local disk arrays
    • H04N21/2187: Live feed
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431: Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47: End-user applications
    • H04N21/478: Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788: Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/15: Conference systems

Definitions

  • This application relates to the fields of multimedia and cloud technology. Specifically, this application relates to data processing.
  • In online interactive scenarios involving multiple people (at least two), there are usually both anchors and viewers, and some users may be both. For example, in a multi-person video conference scenario, each participant can be both host and audience. Although existing real-time audio and video technology has brought great convenience to daily life and can meet basic needs, in current multi-person interaction scenarios the media data displayed to all viewers or to all anchors is undifferentiated, which degrades the user's experience; when the number of users participating in the interaction is large, the interaction effect suffers severely.
  • the purpose of the embodiments of this application is to provide a data processing method, device, electronic device and storage medium that can better meet the needs of practical applications and effectively improve the interaction effect.
  • the technical solutions provided by the embodiments of this application are as follows:
  • embodiments of the present application provide a data processing method, which includes:
  • obtaining the related information of a first object corresponding to a target application and the related information of each second object in a candidate object set;
  • determining, according to the related information of the first object and the related information of each second object, the display attribute information corresponding to each second object, wherein the display attribute information corresponding to one second object is used to identify the display attribute of that second object's media stream relative to the first object;
  • generating a target data stream according to the display attribute information corresponding to each second object, wherein the target data stream includes the media stream of at least one target object, and the at least one target object is a second object determined from the second objects;
  • sending the target data stream to the target terminal corresponding to the first object, so that the target terminal displays the media stream in the target data stream to the first object.
  • embodiments of the present application provide a data processing device, which includes:
  • the related information acquisition module is used to obtain the related information of the first object corresponding to the target application program and the related information of each second object in the candidate object set, wherein the related information includes at least one of location information or object attribute information.
  • the first object and each of the second objects are objects corresponding to the same streaming media identifier;
  • a display attribute information determination module, configured to determine the display attribute information corresponding to each second object according to the related information of the first object and the related information of each second object, wherein the display attribute information corresponding to one second object is used to identify the display attribute of that second object's media stream relative to the first object;
  • a target data generation module, configured to generate a target data stream according to the display attribute information corresponding to each second object, wherein the target data stream includes the media stream of at least one target object, and the at least one target object is a second object determined from the second objects;
  • a data transmission module configured to send the target data stream to the target terminal corresponding to the first object, so that the target terminal displays the media stream in the target data stream to the first object.
  • embodiments of the present application also provide a data processing method, which includes:
  • acquiring a media data acquisition triggering operation of a first object corresponding to a target application;
  • in response to the triggering operation, displaying to the first object the media stream of at least one target object among the second objects in a candidate object set, wherein each second object has corresponding display attribute information, the at least one target object is a second object determined from the second objects, and the media stream of the at least one target object matches the display attribute information corresponding to the at least one target object;
  • the display attribute information corresponding to one second object is used to identify the display attribute of that second object's media stream relative to the first object, and is adapted to the relevant information of the first object and the relevant information of the second object, where the relevant information includes at least one of location information or object attribute information;
  • the first object and each second object are objects corresponding to the same streaming media identifier of the target application.
  • embodiments of the present application also provide a data processing device, which includes:
  • an acquisition module, used to acquire a media data acquisition triggering operation of a first object corresponding to a target application;
  • a display module, configured to display, in response to the media data acquisition triggering operation, the media stream of at least one target object among the second objects in the candidate object set to the first object, wherein each second object has corresponding display attribute information, the at least one target object is a second object determined from the second objects, and the media stream of the at least one target object matches the display attribute information corresponding to the at least one target object;
  • the display attribute information corresponding to one second object is used to identify the display attribute of that second object's media stream relative to the first object, and is adapted to the relevant information of the first object and the relevant information of the second object, where the relevant information includes at least one of position information or object attribute information;
  • the first object and each second object are objects corresponding to the same streaming media identifier of the target application.
  • embodiments of the present application also provide an electronic device.
  • the electronic device includes a memory and a processor.
  • a computer program is stored in the memory.
  • the processor executes the computer program to implement the method provided in any optional embodiment of the present application.
  • embodiments of the present application also provide a computer-readable storage medium, which stores a computer program.
  • when the computer program is executed by a processor, the method provided in any optional embodiment of the present application is implemented.
  • embodiments of the present application also provide a computer program product.
  • the computer program product includes a computer program.
  • when the computer program is executed by a processor, the method provided in any optional embodiment of the present application is implemented.
  • the data processing method provided by the embodiment of the present application can be applied to any multi-person online interaction scenario.
  • For the objects corresponding to the same streaming media identifier (such as the identifier of the same virtual room), media streams are no longer provided undifferentiated: instead of directly providing all media streams corresponding to the second objects to each first object, the display attribute information of each second object's media stream relative to the first object is determined based on the relevant information of the first object and of each second object, so that a target data stream adapted to the first object can be generated according to the display attribute information corresponding to each second object's media stream and provided to the first object.
  • Figure 1 is a schematic flow chart of a data processing method provided by an embodiment of the present application.
  • Figure 2 is a schematic structural diagram of a data processing system provided by an embodiment of the present application.
  • Figure 3 is a schematic architectural diagram of a real-time audio and video system provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of the acquisition principle of location information provided by an embodiment of the present application.
  • Figure 5 is a schematic diagram of the principle of a data processing method based on the system shown in Figure 3, provided by an embodiment of the present application.
  • Figure 6 is a schematic diagram of an environmental map provided in the example of this application.
  • Figure 7 is a schematic diagram of a display method of audio data provided in the example of this application.
  • Figure 8 is a schematic diagram of an environmental map provided in the example of this application.
  • Figure 9 is a schematic diagram of a video screen display method provided in the example of this application.
  • Figure 10 is a schematic structural diagram of a data processing device provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • "Connection" or "coupled" as used herein may include wireless connections or wireless couplings.
  • the term "and/or" is used herein to indicate at least one of the items it connects; for example, "A and/or B" can be implemented as "A", as "B", or as "A and B".
  • a description involving multiple items may refer to one, more or all of those items; for example, the description "parameter A includes A1, A2, and A3" can be implemented as parameter A including A1, A2 or A3, or as parameter A including at least two of the three parameters A1, A2, and A3.
  • the embodiments of this application propose a data processing method intended to better improve the user experience in multi-person online interactive applications.
  • this method can be applied to, but is not limited to, real-time audio and video scenarios.
  • Real-time audio and video communication is a low-latency, high-quality audio and video communication service that can provide users with stable, reliable and low-cost audio and video transmission capabilities.
  • This service can be used to quickly build audio and video applications such as video calls, online education, online live broadcasts and online conferencing. Therefore, how to better improve the experience of online users has always been one of the important issues studied by those skilled in the art.
  • the method provided by the embodiments of this application can achieve the effect of improving user perception from one or more dimensions.
  • the data processing involved in the method provided by the embodiments of this application can be implemented based on cloud technology.
  • the data involved in the embodiments of this application can be stored in a cloud storage manner, and the data computing operations involved can be implemented using cloud computing technology.
  • Cloud computing is a computing model that distributes computing tasks across a resource pool composed of a large number of computers, enabling various application systems to obtain computing power, storage space and information services as needed.
  • the network that provides resources is called a "cloud".
  • From the user's point of view, the resources in the "cloud" can be infinitely expanded, obtained at any time, used on demand, expanded at any time, and paid for according to use.
  • Cloud storage is a new concept extended and developed from the concept of cloud computing.
  • A distributed cloud storage system (hereinafter referred to as a storage system) refers to a storage system that, through functions such as cluster applications, grid technology and distributed storage file systems, brings together a large number of different types of storage devices (also called storage nodes) in the network to work together via application software or application interfaces, jointly providing data storage and business access functions externally.
  • Cloud conferencing is an efficient, convenient and low-cost conference format based on cloud computing technology. Users only need to perform simple, easy-to-use operations through an Internet interface to quickly and efficiently share voice, data files and video with teams and customers around the world, while complex technologies such as data transmission and processing in meetings are handled by the cloud conferencing service provider on the user's behalf. At present, domestic cloud conferencing mainly focuses on service content based on the SaaS (Software as a Service) model, including telephone, network, video and other service forms; video conferences based on cloud computing are called cloud conferences.
  • Cloud education refers to education platform services based on the cloud computing business model. On the cloud platform, all educational institutions, training institutions, enrollment service agencies, publicity agencies, industry associations, management agencies, industry media, legal institutions and so on are integrated into a resource pool in the cloud, where resources are displayed and interact with each other and communicate on demand to achieve their intentions, thereby reducing education costs and improving efficiency.
  • FIG. 1 shows a schematic flowchart of a data processing method provided by an embodiment of the present application.
  • the method can be executed by a server, which can be an independent physical server, a server cluster composed of multiple physical servers, a distributed system, or a cloud server that provides cloud computing services.
  • the server can receive the media stream uploaded by the terminal, and can also send the received media stream to the corresponding receiving terminal.
  • the terminal (also called user terminal) can be a smartphone, tablet, laptop, desktop computer, intelligent voice interaction device (such as a smart speaker), wearable electronic device (such as a smart watch), vehicle-mounted terminal, smart home appliance (such as a smart TV), AR/VR device, etc., but is not limited to these.
  • the terminal and the server can be connected directly or indirectly through wired or wireless communication methods, which is not limited in this application.
  • the server may be an application server of the target application, and the application server may provide multimedia data services for users of the target application.
  • the user terminal running the target application may be a push end or a pull end.
  • the push end can push live media streams (such as live content, audio streams or video streams of online meetings, etc.) to the server, and the pull end can obtain media streams from the server.
  • the method provided by the embodiments of the present application can also be executed by the user terminal. For example, the server can send the media stream of each second object to the user terminal of the first object, and the user terminal of the first object can execute the method provided by the embodiments of the present application to display the received media streams to the first object.
  • the data processing method provided by the embodiment of the present application may include the following S110 to S140.
  • S110: Obtain the relevant information of the first object corresponding to the target application program and the relevant information of each second object in the candidate object set.
  • the target application can be any application that can provide media streaming services for users, such as applications that provide online conference services, applications that provide online education functions, etc.
  • the media stream can be any form of media stream such as an audio stream, a video stream or a multimedia file stream.
  • the streaming media identifier can be used to uniquely identify a streaming media.
  • the streaming media identifier can be the identifier (room number) of a live broadcast room in a live broadcast scenario, the game match identifier in a game scenario (at this time, each player participating in the same game match can be understood as an object corresponding to the same streaming media identifier), the conference number of a conference in an online conference scenario (such as a conference ID), and the multi-person group identifier relied on in a multi-person audio and video call scenario.
  • the virtual live broadcast room in the live broadcast scenario, the virtual conference room in the online conference scenario (each conference ID corresponds to a virtual conference room), the group in the multi-person call scenario, the virtual game environment in the same game match in the game scenario, etc. can be understood as the virtual environment corresponding to the application program.
  • Based on the streaming media identifier, the server can determine which objects are currently in the same virtual environment.
  • In the following, the streaming media identifier will be explained taking the virtual room identifier as an example.
  • the above-mentioned first object and each second object are objects corresponding to the same streaming media identifier; that is to say, they correspond to the same (virtual) room, where the first object can be any pull-stream user in the room, and the second objects can be all or part of the push-stream users in the room. In other words, a second object is a user who sends a media stream to the server, and the first object is a user who receives media streams sent by the server.
  • For example, the first object and the second objects can respectively be a viewer and the anchors in the same live broadcast room.
  • In some scenarios, interactions between viewers or between viewers and anchors are also possible, in which case both viewers and anchors can be used as second objects; in other scenarios, only the anchor can be used as the second object.
  • an object can be both a pull-stream user and a push-stream user.
  • users participating in the conference can be both anchors and viewers at the same time.
  • all current players of a game are both viewers and anchors.
  • For ease of description, in the following, viewers will be used as the first object and anchors as the second objects.
  • the above-mentioned related information may include at least one of location information or object attribute information.
  • the location information of the object may be at least one of real location information of the object or virtual location information of the object corresponding to the virtual scene of the target application.
  • the location information can be location information in the real environment/space, or the location information of the object in the virtual environment corresponding to the target application.
  • For example, in a game scenario, the virtual location information of an object can be the location of the player's virtual game character in the game map; in an online conference scenario, it can be the location of the object in the virtual conference room.
  • As another example, in an online education scenario, the target application can present a picture of a virtual classroom containing multiple virtual seats; students (objects) participating in the online class can choose one of the virtual seats to join the class, and each student's position in the virtual classroom (such as the position corresponding to the chosen virtual seat) can be used as the student's location information.
  • the above location information may include the coordinates and orientation of the object, such as the object's actual longitude/latitude coordinates and orientation, or virtual coordinates and orientation, such as coordinates and orientation in the metaverse, in the virtual conference room, in the virtual classroom, or in the virtual game map.
  • the above coordinates may be two-dimensional coordinates or three-dimensional coordinates.
  • With the user's authorization, the user terminal can report the object's location information to the server at preset intervals.
  • For example, the user terminal can report the object's location information every second according to a heartbeat signal (or heartbeat message), or report the location information only when it changes, for example when the distance between the user's current location and the previously reported location exceeds a set distance.
  • Alternatively, the user terminal can report the location information to an intermediate node at preset intervals; the intermediate node compares the currently reported location information with the previously reported location information, and reports the current location information to the server only when it determines that the user's location has changed.
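  • As a hedged illustration of the reporting strategy above, the following Python sketch combines a heartbeat period with change-based reporting; the interval, the distance threshold, and the get_current_location/report_to_server hooks are illustrative assumptions, not interfaces defined by this application.

```python
# Illustrative sketch only: the constants and the two callables are
# hypothetical stand-ins, not APIs defined by this application.
import math
import time

REPORT_INTERVAL_S = 1.0   # assumed heartbeat period
MIN_MOVE_DISTANCE = 5.0   # assumed minimum movement before re-reporting

def run_location_reporter(get_current_location, report_to_server):
    """Sample the object's (user-authorized) location; report it on change."""
    last = None
    while True:
        x, y, heading = get_current_location()
        if last is None or math.dist((x, y), last) > MIN_MOVE_DISTANCE:
            report_to_server({"x": x, "y": y, "heading": heading})
            last = (x, y)
        time.sleep(REPORT_INTERVAL_S)
```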
  • the object attribute information of an object can be understood as some attributes objectively possessed by the object itself, which can be some personalized parameters of the object, and can include but is not limited to information such as the object's interests and hobbies.
  • the object attribute information may include various information related to the object's media data preferences, such as what types of audio and video the object likes or dislikes.
  • the relevant information of the object is obtained under the premise of authorization of the object.
  • the specific method of obtaining the object attribute information is not limited in this embodiment. It can be set by the user through the client of the target application, or it can be obtained by the server based on statistical analysis of the user's historical usage of the target application.
  • S120: Determine the display attribute information corresponding to each second object based on the relevant information of the first object and the relevant information of each second object.
  • the display attribute information corresponding to a second object is used to identify the display attribute of the media stream of the second object relative to the first object.
  • S130: Generate a target data stream according to the display attribute information corresponding to each second object.
  • the target data stream includes a media stream of at least one target object, and the at least one target object is a second object determined from each of the second objects.
  • S140: Send the target data stream to the target terminal corresponding to the first object, so that the target terminal displays the media stream in the target data stream to the first object.
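  • To make the flow of S110 to S140 concrete, here is a minimal server-side skeleton in Python; the DisplayAttributes fields and the derive/build_stream/send hooks are illustrative assumptions, not interfaces defined by the patent.

```python
from dataclasses import dataclass

@dataclass
class DisplayAttributes:
    should_display: bool   # "first information": provide this media stream or not
    display_mode: str      # "second information": how to present the stream

def handle_first_object(first_info, second_infos, derive, build_stream, send):
    """S110-S140 skeleton; derive, build_stream and send are assumed hooks.

    first_info: related info of the first object (fetched in S110).
    second_infos: {object_id: related info} for the candidate second objects.
    """
    # S120: display attribute info of each second object relative to the first
    attrs = {oid: derive(first_info, info) for oid, info in second_infos.items()}
    # S130: second objects flagged for display become the target objects
    targets = [oid for oid, a in attrs.items() if a.should_display]
    target_stream = build_stream(targets, attrs)
    # S140: deliver the target data stream to the first object's terminal
    send(target_stream)
```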
  • the media stream of a second object refers to the media data pushed by the user terminal of the second object to the server, i.e., the media data corresponding to the second object obtained by the server. It may be, for example, the live content of an anchor in a live broadcast scenario, the conference video data of a participant in a multi-person online conference scenario, or, in a game scenario, the virtual scene picture of the virtual character corresponding to a player, or the player's audio stream or video picture.
  • the media stream may include but is not limited to audio stream, video stream or other multimedia data stream, etc.
  • the display attribute information corresponding to a second object can be determined based on the relevant information of the second object and the relevant information of the first object; it refers to the display attribute information of the second object's media stream relative to the first object, that is, information related to how the second object's media stream is displayed, where which display attributes the media stream has can be configured according to actual needs.
  • the display attribute information may include, but is not limited to, at least one of attributes such as whether to display the media stream and how to display it.
  • For example, the display attribute information may include whether to provide the media stream of the second object to the first object and, when the media stream is provided, how the audio stream is played to the first object, etc.
  • the display attribute information may be display attribute information of the media stream, that is, in what form the media stream of the second object is displayed to the first object.
  • the display attribute information of the media stream may include at least one of first information or second information, wherein the first information is used to determine whether to provide the media stream of the second object to the first object, and the second information is used to determine a display mode for displaying the media stream to the first object. That is to say, the first information identifies whether the second object is a target object corresponding to the first object: if it is a target object, the server provides the media stream of the second object to the first object; if it is not a target object, then even though the first object and the second object correspond to the same room, the media stream of the second object will not be provided to the first object.
  • the second information is used to determine how to display the media stream to the first object. For example, if the media stream is an audio stream, the second information can determine how the audio stream is played to the first object; if the media stream is a video stream, the second information can determine how the video picture of the video stream is displayed to the first object.
  • the target objects corresponding to the first object may be all of the second objects, or some of the second objects.
  • For example, in a multi-person conference scenario, all participants can be anchors; for any one participant (as the first object), all other participants are second objects, and each second object can directly be used as a target object corresponding to the first object.
  • In this case, the above-mentioned display attribute information may not include the first information; that is, there is no need to determine which objects are the target objects. The display attribute information may include the second information, and the server can determine, based on the related information of the first object and of each second object, how to display the media stream of each second object to the first object.
  • Alternatively, if the target objects are only some of the second objects, the server provides the first object with the media streams of only some or all of these target objects.
  • In this case, the above-mentioned display attribute information includes the first information: according to the first information corresponding to each second object, the server can determine the target objects from the second objects.
  • the server can also determine, based on the relevant information of the first object and the relevant information of each target object, the second information corresponding to each target object (in this case only the second information corresponding to the target objects needs to be determined, not that of all second objects), so as to send the media stream of each target object to the target terminal of the first object; each target object's media stream can have its own display method, and the target terminal displays each media stream to the first object according to its display method.
  • the server can generate a target data stream corresponding to the first object based on the display attribute information corresponding to each second object, and send the target data stream to the target terminal of the first object.
  • the target data stream at least includes the media stream corresponding to each target object, so that the media streams in the data stream can be displayed to the first object.
  • the target data stream can also include display mode prompt information for the media stream corresponding to each target object, so that the target terminal can display each media stream to the first object according to its corresponding display mode.
  • Optionally, the method may also include: if the current application scenario corresponding to the target application meets preset conditions, the above display attribute information includes the second information and does not include the first information; if the current application scenario does not meet the preset conditions, the above display attribute information includes at least one of the first information or the second information.
  • the above-mentioned current application scenario may refer to at least one of the program type of the target application or the number of all objects corresponding to the above-mentioned streaming media identifier.
  • For example, if the program type of the target application is a first type, or the number of all objects corresponding to the streaming media identifier is less than a set value, the display attribute information corresponding to the second object includes the second information and does not include the first information; otherwise (for example, if the program type is not the first type), the display attribute information corresponding to the second object includes at least one of the first information or the second information.
  • That is, if the target application is a specified type of program (such as an online meeting), only the display mode of the media stream may be determined, with all second objects used as target objects; likewise, if the number of second objects in the current room (such as the number of anchors in the live broadcast room) is small, all second objects can also be used as target objects.
  • the server can determine the display attribute information of a second object relative to the first object based on the related information of the first object and the related information of the second object, so that it can provide each first object with media stream data adapted to that object's related information. It can be understood that the media stream of the same second object may have different display attribute information for different first objects.
  • Suppose the above-mentioned display attribute information includes the first information. The server can determine, based on the object-related information of the first object and of each second object, which second objects are the target objects corresponding to the first object, that is, which second objects' media streams are provided to the first object.
  • For example, suppose a live broadcast room includes M anchors, M ≥ 1. For a viewer, the server can filter out, based on the relevant information of the viewer and of each anchor, which anchors' live content will be transmitted to the viewer's target terminal.
  • In an optional embodiment, determining the display attribute information corresponding to each second object based on the relevant information of the first object and the relevant information of each second object includes: for each second object, determining the degree of association between the first object and the second object based on the relevant information of the first object and the relevant information of the second object.
  • Correspondingly, generating a target data stream based on the display attribute information corresponding to each second object may include: determining at least one target object from the second objects according to the degree of association corresponding to each second object, and generating the target data stream according to the media stream of the at least one target object.
  • the degree of association between a second object and the first object represents the matching degree between the first object and that object; it may be calculated based on the location information and/or object attribute information of the first object and the second object. The greater the degree of association between a second object and the first object, the better the second object matches the first object, and the more likely it is that the second object is a target object of the first object.
  • the above-mentioned related information may include location information. In this case, for each second object, determining the degree of association between the first object and the second object based on the related information of the first object and the related information of the second object includes: determining the distance between the first object and the second object, where the distance is used to represent the degree of association between the first object and the second object, and the distance is negatively correlated with the degree of association.
  • Taking the live broadcast scenario as an example, the server can determine the target anchors who provide live content to a viewer based on the distance between the viewer and each anchor: the greater the distance, the lower the matching degree between the anchor and the viewer. With this solution, the media streams of anchors with a higher matching degree (that is, anchors closer to the first object) can be provided to the viewer, so that different viewers' terminals receive different media streams; and if the location information of the same viewer changes, the received media streams may also change.
  • For example, if the above position information is virtual position information in the virtual scene corresponding to the first object (such as the position of the virtual game object controlled by a game player in the game map), then as that position changes, the target objects corresponding to the first object may also change.
  • Optionally, the degree of association between objects can be calculated as follows: a first correlation between the first object and the second object is calculated based on their location information, a second correlation between the first object and the second object is calculated based on the object attribute information of the first object and the object attribute information of the second object, and the degree of association between the first object and the second object is obtained from the two correlations.
  • the first correlation represents the degree of correlation between the first object and the second object in terms of location: the closer the distance, the greater the correlation. The second correlation represents the degree of correlation between the first object and the second object in terms of preference: the closer the preferences, the greater the correlation.
  • the degree of association between the first object and the second object can be obtained by combining the two correlations; for example, the first correlation and the second correlation can be added or averaged. Alternatively, a first weight corresponding to the location information and a second weight corresponding to the object attribute information can be obtained, and a weighted sum of the first correlation and the second correlation computed using the two weights to obtain the degree of association.
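  • A hedged sketch of this combination is given below; the normalization distance, the use of tag overlap for the preference correlation, and the weights are illustrative assumptions rather than values fixed by this application.

```python
import math

def location_correlation(pos_a, pos_b, max_distance=100.0):
    """First correlation: the closer the distance, the greater the value."""
    return max(0.0, 1.0 - math.dist(pos_a, pos_b) / max_distance)

def preference_correlation(tags_a, tags_b):
    """Second correlation: overlap of interest tags (one possible choice)."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def association_degree(pos_a, tags_a, pos_b, tags_b, w_loc=0.6, w_attr=0.4):
    """Weighted sum of the two correlations; the weights are illustrative."""
    return (w_loc * location_correlation(pos_a, pos_b)
            + w_attr * preference_correlation(tags_a, tags_b))
```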
  • the data processing method provided in the embodiments of the present application can thus provide a viewer with the media streams of target anchors that have a high degree of association with that viewer, according to the degree of association between the viewer and the anchors in the same room; different viewers can receive different media data (such as the live content of different anchors).
  • the target data stream corresponding to the viewer can also include display mode prompt information for the media stream of each target anchor (such as the above-mentioned second information or prompt information corresponding to the second information). After receiving the media streams of the target anchors and the display mode prompt information, the target terminal can display the media stream of each target anchor to the viewer in its respective display mode, achieving differentiated display of different anchors' media data and better enhancing the user's experience.
  • In different application scenarios the form of the media stream may differ: in some scenarios there is only audio data, while in others there is video data, and the video data may contain only video pictures or both video pictures and audio data.
  • For audio, no matter from which direction relative to the first object the sound comes, the first object can hear it within a certain hearing distance (audible distance). For video data, whether the first object can see the video picture is related not only to the first object's visual distance but also to its current orientation (sight direction).
  • Optionally, the above-mentioned position information includes the orientation and position coordinates of the object, and the method may also include: determining the viewing angle deviation of each second object relative to the first object based on the position information of the first object and the position coordinates of the second object.
  • Correspondingly, determining at least one target object from the second objects based on the distance corresponding to each second object includes: determining at least one target object from the second objects according to the distances and viewing angle deviations corresponding to the second objects.
  • For example, if a second object is within the visual distance and viewing angle range of the first object, the second object can be used as a target object.
  • the viewing angle of the object may be a preset angle, which can be configured by default on the server or set by the first object itself.
  • the distance between the first object and the second object can be calculated based on the position coordinates of the first object and the position coordinates of the second object, and the viewing angle deviation of the second object relative to the first object can be calculated based on the position information (coordinates and orientation) of the first object and the position coordinates of the second object. If the angle deviation is within the viewing angle range of the first object, the second object is within the first object's field of view: for example, if the viewing angle range of the first object is 150 degrees, i.e., 75 degrees to each side of the first object's line of sight, and the angle between the second object and the first object's current line of sight does not exceed 75 degrees, the second object is considered within the first object's viewing angle range.
  • If, in addition, the distance between the two objects is not greater than the visual distance, the second object can be used as a target object.
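  • The distance and viewing-angle check described above can be sketched as follows; the 150-degree field of view matches the example, while the 2D coordinate convention and the visual distance are assumptions.

```python
import math

def is_within_view(viewer_pos, viewer_heading_deg, subject_pos,
                   max_distance=50.0, fov_deg=150.0):
    """True if the subject is both within visual distance and within the
    viewer's viewing angle range (here 75 degrees to each side)."""
    dx = subject_pos[0] - viewer_pos[0]
    dy = subject_pos[1] - viewer_pos[1]
    if math.hypot(dx, dy) > max_distance:            # beyond visual distance
        return False
    bearing = math.degrees(math.atan2(dy, dx))       # direction to the subject
    deviation = abs((bearing - viewer_heading_deg + 180.0) % 360.0 - 180.0)
    return deviation <= fov_deg / 2.0
```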
  • Optionally, at least one target object is determined from the second objects based on the distance corresponding to each second object as follows: among the distances corresponding to the second objects, the distances not greater than a preset value are determined as target distances, and the second objects corresponding to the target distances are determined as target objects. That is, the live content of anchors whose distance to the viewer is within a certain range is provided to the viewer.
  • Determining the distances not greater than the preset value among the distances corresponding to the second objects as target distances, and determining the second objects corresponding to the target distances as target objects, may include the following:
  • Method 1: if the number of target distances does not meet a preset quantity requirement, adjust the above preset value to determine, from the distances corresponding to the second objects, target distances that meet the preset quantity requirement, and determine the second objects corresponding to the target distances as target objects;
  • Method 2: if the number of target distances is greater than a set number, use each second object corresponding to a target distance as a candidate object, obtain the quality of the media stream corresponding to each candidate object, and determine at least one target object from the candidate objects based on the quality.
  • the above-mentioned preset quantity requirement can be a preset quantity range, which can be a set value or a value interval; the interval can include multiple positive integers.
  • the set value may include at least one of an upper limit value or a lower limit value, where the upper limit value limits the maximum number of target objects, the lower limit value limits the minimum number of target objects, and a value interval requires that the final number of target objects falls within the interval.
  • If the number of target objects filtered out using the initial preset value does not meet the preset quantity requirement, the preset value can be adjusted one or more times, and the target objects re-determined based on the adjusted preset value.
  • For example, suppose the preset quantity requirement includes a lower limit value, which can be a positive integer not less than 1, say 1. If the distance between every second object and the first object is greater than the initial preset value, the number of target objects is 0, which does not meet the requirement; the preset value can then be increased by a preset interval, and the target objects re-determined based on the increased preset value.
  • Conversely, if the preset quantity requirement includes an upper limit value, and the number of second objects whose distance to the first object is not greater than the initial preset value exceeds the upper limit, the preset value can be reduced by a certain interval and the target objects re-determined based on the reduced preset value.
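  • One way to realize this adjustment is sketched below; the initial radius, the step size, the bounds and the fallback are illustrative assumptions.

```python
def select_by_distance(distances, preset=30.0, lower=1, upper=8,
                       step=10.0, max_rounds=10):
    """distances: {object_id: distance to the first object}. Adjusts the
    preset value until the target count falls within [lower, upper]."""
    for _ in range(max_rounds):
        targets = [oid for oid, d in distances.items() if d <= preset]
        if len(targets) < lower:
            preset += step        # too few targets: enlarge the radius
        elif len(targets) > upper:
            preset -= step        # too many targets: shrink the radius
        else:
            return targets
    # assumed fallback: the `upper` closest objects if no radius fits
    return sorted(distances, key=distances.get)[:upper]
```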
  • the quality of the media streams of different second objects usually differs; for example, the live video streams of different anchors will differ in picture clarity, smoothness, lag and so on.
  • If the number of target objects filtered out based on the distances between the first object and the second objects and the above preset value is too large (greater than the set number), these objects can be further filtered based on the quality of their media streams; for example, a set number of target objects with better media stream quality can be selected.
  • the embodiments of the present application do not limit the specific evaluation and determination method of the quality of the media stream, and any existing method of evaluating the quality of media data can be used to determine it.
  • For example, the quality of the media stream corresponding to a second object can be evaluated based on information such as the data transmission delay and packet loss rate between the second object's terminal and the server, or a trained neural network model can be used to predict the quality of the media stream corresponding to each second object.
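  • As one hedged example of scoring quality from such transmission statistics (the weighting and the 0-to-1 scaling below are assumptions, not part of the patent):

```python
def stream_quality(delay_ms, packet_loss_rate, max_delay_ms=400.0):
    """Score in [0, 1], higher is better; equal weighting is an assumption."""
    delay_score = max(0.0, 1.0 - delay_ms / max_delay_ms)
    loss_score = max(0.0, 1.0 - packet_loss_rate)   # loss rate given in 0..1
    return 0.5 * delay_score + 0.5 * loss_score
```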
  • In this way, the situation where the first object cannot obtain any media stream can be avoided, providing too many media streams to the first object can be avoided, and the first object can be provided with media streams of better quality as far as possible, such as video streams or audio streams with relatively good quality.
  • Optionally, the method may also include: selecting which second objects' media streams are provided to the first object based on the quality of each second object's media stream, thereby realizing automatic quality-based screening so that the first object obtains media streams of guaranteed quality.
  • For example, the second objects whose media stream quality is not lower than a preset quality can directly be used as target objects; or, in order of quality from high to low, a certain number of top-ranked second objects can be determined as target objects.
  • the target objects can also be determined jointly from the degree of association between the first object and each second object and the quality of each second object's media stream. As an optional method, a correction coefficient for the degree of association between a second object and the first object can be determined based on the quality of the second object's media stream, the degree of association corrected using the correction coefficient, and the target objects determined from the second objects based on the corrected degrees of association corresponding to the second objects.
  • the correction coefficient can be a value belonging to a certain range, and the correction coefficient corresponding to each second object is positively related to the quality of that object's media stream. For example, the quality of each media stream can be normalized into the value range of the correction coefficient, and the normalized quality used as the correction coefficient; for each second object, the correction coefficient corresponding to the object can then be added to or multiplied by the degree of association between the object and the first object to obtain the corrected degree of association.
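  • A minimal sketch of the multiplicative variant, with quality min-max normalized into an assumed [0.5, 1.5] coefficient range:

```python
def corrected_associations(assoc, quality):
    """assoc and quality map object_id to a value; returns corrected degrees."""
    lo, hi = min(quality.values()), max(quality.values())
    span = (hi - lo) or 1.0                  # avoid dividing by zero
    coeff = {oid: 0.5 + (q - lo) / span for oid, q in quality.items()}
    # better streams get coefficients near 1.5 and boost the association degree
    return {oid: assoc[oid] * coeff[oid] for oid in assoc}
```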
  • the target object matching the first object can be determined from each second object through, but not limited to, the above-mentioned specific implementations.
  • In practical applications, every second object can also be used as a target object. For example, in a multi-person online video conference or voice conference scenario, if all participants have the right to speak, then all participants are both viewers and anchors; if the current number of participants is small, then for any participant, all other participants can be the target objects corresponding to that participant.
  • the embodiment of the present application does not limit the specific number of target objects finally selected, and can be configured and adjusted according to the actual application scenario.
  • the number of target objects can be determined based on the total number of users in the current room to achieve adaptive adjustment of the number of target objects.
  • the above total number is positively correlated with the number of target objects: the larger the total number, the larger the number of target objects.
  • the above total number may refer to the total number of second objects corresponding to the current streaming media identifier, that is, the total number of anchors in the current room. In actual implementation, if the total number of anchors in the current room is less than a certain number, all anchors can also directly be used as target objects.
  • Otherwise, a corresponding number of target objects can be screened out from the second objects based on the relevant information of the first object and of each second object; for example, based on the degrees of association between the objects, a corresponding number of second objects with relatively high degrees of association are selected as target objects, and the media streams of these objects are provided to the first object.
  • Optionally, the above related information may include location information, and the method may further include: for each target object, determining orientation indication information between the target object and the first object based on the position information of the first object and the position information of the target object, and using the orientation indication information as the second information corresponding to the target object.
  • Correspondingly, generating a target data stream based on the display attribute information corresponding to each second object may include generating a target data stream that also contains the orientation indication information corresponding to each target object, which is used to instruct the target terminal to display the media stream corresponding to each target object to the first object according to the corresponding orientation indication information.
  • When the server sends the media stream of each target object to the first object, it can generate the target data stream directly from the media stream of each target object (or from the media stream together with the orientation indication information) and send it to the target terminal.
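  • For illustration, the orientation indication information could be a relative bearing computed from the two positions, which the terminal might use, for example, to pan audio; the 2D coordinate and heading conventions below are assumptions.

```python
import math

def orientation_indication(first_pos, first_heading_deg, target_pos):
    """Bearing of the target relative to the first object's line of sight,
    in degrees; negative means to the left, positive to the right."""
    dx = target_pos[0] - first_pos[0]
    dy = target_pos[1] - first_pos[1]
    bearing = math.degrees(math.atan2(dy, dx))
    return (bearing - first_heading_deg + 180.0) % 360.0 - 180.0
```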
  • Optionally, the method may also include: sending a target object list to the target terminal to display the target object list to the first object through the target terminal, where the target object list includes the object identifier of each target object; and receiving object selection feedback information returned by the target terminal, where the media stream included in the target data stream is the media stream of the at least one target object corresponding to the object selection feedback information.
  • That is, before the media streams of the target objects are sent to the first object, the first object can choose which target objects' media streams it wants to receive: the first object can select some or all of the objects in the list according to its own needs, and the server then sends the media stream data corresponding to the selected target objects to the target terminal.
  • In practical applications, the server may first display a media data display mode option interface to the first object through the first object's target terminal. The options may include a first mode and a second mode: if the first object selects the first mode, the server provides media streams to the first object by performing the method shown in S110 to S140 above; if the first object selects the second mode, the media streams of the second objects can be provided to the first object in other ways.
  • In the first mode, the above target object list can also be provided to the first object, and the first object itself selects which target objects' media streams are provided to it.
  • Since the target objects corresponding to the first object may change, when they change, the latest target object list can be sent to the first object for re-selection.
  • Optionally, an optional-object list may also be sent to the target terminal. The list may include the object identifier of each second object, or the object identifiers of the target objects, or the object identifiers of the target objects together with those of second objects that are not target objects, giving the first object more choices.
  • For example, the optional object identifiers may include the identifiers of all target objects and the identifier of at least one non-target object; the identifiers of target objects and non-target objects can also be displayed with different prompt information, which can inform the first object of the difference between a target object and a non-target object, such as the distance range between the object and the first object.
  • the data processing method provided in the embodiments of this application can intelligently recommend anchors to users based on factors such as object location information, object attribute information or media stream quality, and can also intelligently display media data to users based on one or more of these factors, realizing a real-time audio and video solution for realistic, immersive communication that better meets user needs and improves the user's experience.
  • As described above, the data processing method provided by the embodiments of the present application can also be executed by a user terminal. In this case, the data processing method may include: acquiring a media data acquisition triggering operation of a first object corresponding to a target application; and, in response to the triggering operation, displaying to the first object the media stream of at least one target object among the second objects in the candidate object set.
  • Each second object has corresponding display attribute information; the at least one target object is a second object determined from the second objects; the media stream of the at least one target object matches the display attribute information corresponding to the at least one target object; the display attribute information corresponding to a second object is used to identify the display attribute of that second object's media stream relative to the first object, and is adapted to the related information of the first object and the related information of the second object; the related information includes at least one of location information or object attribute information; and the first object and each second object are objects corresponding to the same streaming media identifier of the target application.
  • the media data acquisition trigger operation is an operation triggered by the first object to acquire media data from the server of the target application, which may include but is not limited to the trigger operation of the first object joining the virtual scene corresponding to the same streaming media identifier, such as the operation of the first object clicking to join the live broadcast room on the user interface of the live broadcast application, the operation of clicking to start a game in the game application, the operation of joining a meeting in the online meeting scene, etc.
  • after the user terminal of the first object obtains the operation of the first object, it can display the media stream of at least one second object to the first object according to the relative relationship between the first object and each second object, so as to realize the display of media data adapted to the first object.
  • specifically, the user terminal of the first object can send a media data acquisition request to the server; the server can obtain the relevant information of the first object and of each second object, determine the display attribute information of each second object's media stream relative to the first object, generate a corresponding target data stream based on the display attribute information corresponding to each second object, and send it to the user terminal.
  • the user terminal can then display the media data sent by the server, which is adapted to the relevant information of the first object, to the first object.
  • the media data adapted to the relevant information of the first object may refer to the media stream of at least one matching second object, or to the display manner in which the media streams of the second objects are presented to the first object.
  • that is, the media data displayed to the first object by its user terminal matches the display attribute information, relative to the first object, of each stream-pushing object corresponding to the streaming media identifier (i.e., each second object, such as the anchors of a live broadcast room).
  • the display attribute information of a second object relative to the first object can reflect the relative relationship (such as the degree of association) between the second object and the first object.
  • the media data provided to the first object can be determined according to the relative relationship between each second object and the first object.
  • the media data includes a media stream of at least one second object.
  • the display attribute information includes first information used to represent whether to display the media stream of the second object to the first object. If the first information corresponding to a second object indicates that its media stream is not to be displayed, the media data displayed to the first object by the first object's user terminal does not include the media stream of that second object.
  • the method may further include: in response to a change in the display attribute information corresponding to each second object, displaying the media stream of at least one target object that matches the changed display attribute information corresponding to each second object to the first object.
  • when the display attribute information of a second object relative to the first object changes
  • the media data presented to the first object by the user terminal of the first object may also change.
  • this change may be that the target objects have changed, or that the display manner of the target objects' media streams has changed.
  • the change of the display attribute information corresponding to the second object is caused by a change in the related information of the first object or of the second object. For example, the relative position relationship (such as the distance) between the first object and the second object can be determined based on the position information of the first object and the position information of the second object.
  • the display attribute information may then be determined based on the relative position relationship between the second object and the first object.
  • for example, the target objects corresponding to the first object may be the second objects whose distance from the first object is less than a preset value, and the user terminal of the first object can display the media streams of these target objects to the first object.
  • if the current target objects change relative to the target objects of the previous period, only the media streams of the current target objects are displayed to the first object.
  • for example, the current target objects are object A and object B, and the target objects in the previous period were object A and object C.
  • then the media streams of object A and object B are provided to the first object.
  • the above-mentioned display attribute information includes first information, and the first information is used to determine whether to provide the media stream of the second object to the first object.
  • the above-mentioned display of the media stream of at least one target object among the second objects to the first object includes:
  • a second object located within the visual range of the first object and/or within the audible distance of the first object is taken as the target object, and the media stream of the target object is displayed to the first object.
  • that is, the second objects located within the visual range and/or the audible distance of the first object belong to the target objects, and the user terminal can display only the media streams of these target objects to the first object, as in the sketch below.
  • a second object located within the visual range of the first object means at least one of the following: the distance between the second object and the first object is within the visual distance of the first object, or the second object is located within the visual angle of the first object.
  • a second object located within the audible distance of the first object means that the distance between the second object and the first object is less than the set distance.
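  • For illustration only, the following minimal Python sketch shows one way the audible-distance and visual-range tests described above could be implemented. The orientation convention (a rotation angle measured against the Y-axis) follows this document; the function names, the field names, and the threshold values audible_dist, visual_dist, and fov_deg are hypothetical placeholders, not part of the disclosure.

```python
import math
from dataclasses import dataclass

@dataclass
class Obj:
    x: float        # X-axis coordinate
    y: float        # Y-axis coordinate
    heading: float  # orientation: rotation angle relative to the Y-axis, degrees

def within_audible(first: Obj, second: Obj, audible_dist: float) -> bool:
    # Audible: the distance between the two objects is less than the set distance.
    return math.dist((first.x, first.y), (second.x, second.y)) < audible_dist

def within_visual(first: Obj, second: Obj, visual_dist: float, fov_deg: float) -> bool:
    # Visible: within the visual distance AND inside the first object's viewing angle.
    dx, dy = second.x - first.x, second.y - first.y
    if math.hypot(dx, dy) > visual_dist:
        return False
    bearing = math.degrees(math.atan2(dx, dy))           # bearing relative to the Y-axis
    off_axis = (bearing - first.heading + 180) % 360 - 180
    return abs(off_axis) <= fov_deg / 2

def select_targets(first, seconds, audible_dist=50.0, visual_dist=80.0, fov_deg=120.0):
    # First information: keep only the second objects that are audible and/or visible.
    return [s for s in seconds
            if within_audible(first, s, audible_dist)
            or within_visual(first, s, visual_dist, fov_deg)]
```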
  • the above-mentioned display attribute information includes second information, and the second information is used to determine a display method for displaying the media stream to the first object.
  • a personalized media stream display method can be provided for the first object.
  • the media stream of each target object can be displayed to the first object according to the relative position between each target object and the first object. That is, the target terminal of the first object can determine the relative orientation information between each target object and the first object based on the orientation indication information corresponding to each target object, and display the media stream of each target object according to the relative orientation information corresponding to each target object.
  • the orientation indication information between the first object and the target object can be any form of information capable of determining the relative orientation between the two.
  • the orientation indication information can be explicit indication information.
  • for example, the orientation indication information corresponding to each target object may be the relative orientation information between the target object and the first object (such as in which orientation of the first object the target object is located); that is, the server can directly inform the target terminal of the relative orientation between each target object and the first object.
  • the orientation indication information can also be implicit indication information.
  • for example, the orientation indication information can include the location information of the target object; the target terminal can then determine the relative orientation information between each target object and the first object based on the location of the first object and the location of each target object notified by the server.
  • the relative orientation information corresponding to a target object may include but is not limited to the distance between two objects, orientation information of the target object relative to the first object, etc.
  • the target terminal can display the media stream of each target object to the first object in a display manner that matches the orientation indication information of each target object, so that the first object has an immersive perception, which can effectively improve user perception and better meet practical application needs.
  • the relative orientation information between the second object and the first object can determine at least one of the first information or the second information corresponding to the second object.
  • the first information determines whether the media stream of the second object is provided to the first object
  • the second information determines how the media stream of the second object is displayed when it is provided to the first object.
  • the media streams displayed to the first object may be the media streams of some second objects that meet a condition (for example, second objects whose distance from the first object is not greater than a preset value), or the media streams of all second objects.
  • when these media streams are displayed to the first object, they can be displayed according to the relative orientation information between each target object and the first object.
  • for example, spatial audio playback can be used to play the audio stream of each target object to the first object according to the relative orientation of each target object and the first object.
  • the video image of each target object can likewise be displayed on the user terminal of the first object according to the relative position of each target object and the first object.
  • the above-mentioned media stream includes an audio stream
  • the target terminal can display the media stream corresponding to each target object to the first object in the following manner:
  • the audio stream corresponding to each target object is played to the first object using a spatial audio playback method.
  • the target terminal can determine the relative orientation information corresponding to each target object according to the orientation indication information corresponding to each target object, and determine, according to the relative orientation information corresponding to each target object, the spatial playback direction of each target object's audio stream relative to the first object.
  • the orientation indication information may be position information of the target object, relative orientation information between the target object and the first object, or spatial playback direction indication information of the audio stream of the target object.
  • if the orientation indication information is the location information of the target object
  • the target terminal can calculate the relative orientation of the target object and the first object from the location information of the first object and the location information of the target object
  • and determine the playback mode of the target object's audio stream, i.e., the above-mentioned spatial playback direction, based on that relative orientation.
  • for example, the relative orientation can be used directly as the playback direction of the corresponding audio stream; the target terminal can then play the audio stream of each target object to the first object using spatial audio playback, so that the first object has a more realistic and immersive audio experience.
  • for example, suppose the target anchors (target objects) corresponding to user Z are anchors A, B, C, and D, where
  • A is behind Z
  • B is to the right of Z
  • C is to the left of Z
  • D is directly in front of Z.
  • then Z can hear A's voice coming from behind, D's voice from directly in front, B's voice from the right, and C's voice from the left, as illustrated by the sketch below.
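  • As an illustrative aside, the sketch below computes the spatial playback direction of each anchor's audio stream from the coordinates and orientation just described, then maps it to simple stereo panning gains. The listener/anchor positions mirror the A/B/C/D example; constant-power panning is only one possible rendering (plain two-channel panning cannot distinguish front from back; a fuller HRTF renderer would), and all names here are hypothetical.

```python
import math

def relative_azimuth(listener_xy, listener_heading_deg, source_xy):
    """Azimuth of the source relative to the listener's facing direction:
    0 = directly in front, 90 = right, -90 = left, +/-180 = behind."""
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    bearing = math.degrees(math.atan2(dx, dy))  # measured against the Y-axis
    return (bearing - listener_heading_deg + 180) % 360 - 180

def stereo_gains(azimuth_deg):
    # Constant-power panning: map the azimuth to left/right channel gains.
    pan = math.sin(math.radians(azimuth_deg))   # -1 (left) .. +1 (right)
    theta = (pan + 1) * math.pi / 4
    return math.cos(theta), math.sin(theta)     # (left_gain, right_gain)

# Z at the origin facing due north (heading 0 relative to the Y-axis):
z, anchors = (0.0, 0.0), {"A": (0, -10), "B": (10, 0), "C": (-10, 0), "D": (0, 10)}
for name, pos in anchors.items():
    az = relative_azimuth(z, 0.0, pos)
    print(name, round(az), [round(g, 2) for g in stereo_gains(az)])
# A -> -180 (behind), B -> 90 (right), C -> -90 (left), D -> 0 (front)
```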
  • the above-mentioned media stream includes a video stream
  • the target terminal can display the media stream corresponding to each target object to the first object in the following manner:
  • the video stream corresponding to each target object is displayed to the first object.
  • the target terminal can determine the relative orientation information of each target object relative to the first object based on the orientation indication information corresponding to each target object, and can determine, based on the relative orientation information corresponding to each target object, the position where the video stream of each target object will be displayed on the target terminal.
  • the target terminal can play the video stream of each target object to the first object according to the relative position between the first object and the target object.
  • the position indication information can be the location information of the target object, the relative orientation information between the target object and the first object, or display position indication information for each target object's video stream on the target terminal.
  • that is, the specific display position of each target video stream can be determined by the target terminal itself according to the received orientation indication information, or the server can determine it based on the location information of the first object and each second object and then inform the target terminal.
  • the target terminal may include at least two display areas (such as multiple display screens); it may determine the target display area corresponding to the video stream of each target object based on the relative orientation information between each target object and the first object, and display each target object's video stream in its corresponding target display area.
  • for example, the target terminal has two display screens placed left and right, and there are three target objects, of which target objects a and b are both to the left of the first object and target object c is to the right of the first object.
  • then the video images of target objects a and b can be displayed on the left display screen, and the video image of target object c on the right display screen.
  • further, if target object a is to the upper left of the first object
  • and target object b is to the lower left of the first object
  • the video images of target object a and target object b can be displayed in the upper and lower parts of the left display screen, respectively.
  • the video stream may be a video stream that only includes images, or it may be a video stream that includes both images and audio.
  • when the video stream includes both images and audio, the relevant information of the first object and the second objects can determine at least one of the display mode of the picture or the playback mode of the audio in the target object's video stream, and the video stream of each target object is displayed to the first object according to the determined mode; a layout sketch follows below.
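  • The following sketch illustrates, under assumed names and coordinates, how a terminal with two left/right display screens might map each target object's video stream to a display area from its position relative to the first object, matching the a/b/c example above. The mapping of the forward/backward offset to the upper/lower slot is an assumption for illustration.

```python
def assign_display_area(first_xy, target_xy):
    # Choose the screen from the left/right offset, and the slot within the
    # screen from the forward/backward offset, relative to the first object.
    dx = target_xy[0] - first_xy[0]
    dy = target_xy[1] - first_xy[1]
    screen = "left-screen" if dx < 0 else "right-screen"
    slot = "upper" if dy > 0 else "lower"
    return screen, slot

# First object at the origin; a upper-left, b lower-left, c to the right.
positions = {"a": (-5.0, 4.0), "b": (-6.0, -3.0), "c": (7.0, 1.0)}
layout = {name: assign_display_area((0.0, 0.0), xy) for name, xy in positions.items()}
print(layout)
# {'a': ('left-screen', 'upper'), 'b': ('left-screen', 'lower'),
#  'c': ('right-screen', 'upper')}
```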
  • the solutions provided by the embodiments of this application can be applied to many application scenarios, including but not limited to online live broadcast, online games (including but not limited to board games, online shooting, and other types of games), online social networking, online education, the metaverse, online recruitment, online contract signing, and other online communication scenarios.
  • the following uses an online live-broadcast tourism scenario as an example for explanation.
  • Figure 2 shows a schematic structural diagram of an implementation environment of a real-time audio and video system applicable to this scenario embodiment.
  • the implementation environment may include an application server 110 and a user terminal, where,
  • the application server 110 is a server of the target application program and can provide online live broadcast services for users of the target application program.
  • the second object corresponding to the target application is the anchor
  • the first object is the audience in the live broadcast room.
  • the anchor can be a real anchor or an AI anchor (that is, a virtual anchor).
  • the user terminal can be any terminal running the target application.
  • the user terminal includes the user terminal of the audience and the user terminal of the anchor.
  • Figure 2 schematically shows n streaming terminals in the same live broadcast room.
  • each pull terminal 120 and push terminal 130 can communicate with the application server 110 through a wired network or a wireless network.
  • the end user can initiate an online live broadcast as an anchor through the target application running on the user terminal, or can join the live broadcast room as a viewer, where a live broadcast room can include multiple anchors at the same time.
  • the push end can collect the video stream (i.e., live content) on the anchor side through the corresponding video collection device (which can be the terminal itself or a collection device connected to the terminal), and then compress, encapsulate, and push the collected video stream to the application server 110; for example, the terminal 131 sends the live broadcast content (video stream 1) of anchor 1 to the application server.
  • the pull end can pull existing live broadcast content from the server to the audience's user terminal and display the pulled live broadcast content to the audience.
  • FIG. 3 shows a schematic structural diagram of a real-time audio and video system to which the method provided by the embodiment of the present application is applicable.
  • the devices in the real-time audio and video system other than the users' user terminals can be understood as service nodes in the application server; these service nodes work together to provide real-time audio and video services to users.
  • FIG. 3 schematically shows two different regions, namely region A and region B.
  • set A-set3 in the figure is a set of region A, and set B-set1 is a set of region B.
  • each media access service set can include a media management node and media access machines deployed with a QoS (Quality of Service) module.
  • as shown in FIG. 3, the media management node can include the in-set room management module and the in-set load management module, and the media access machines can adopt a distributed deployment.
  • the total number of required media access machines can be calculated based on the total number of users in the region, and the media access machines can also be allocated among operators according to the operator ratio. For example, if region A requires 100 media access machines and the ratio of operators a, b, and c is 4:4:2, then region A can deploy 40 media access machines corresponding to operator a, 40 corresponding to operator b, and 20 corresponding to operator c; these, together with the in-set load management and in-set room management, form a set. If a region has many users, multiple sets can also be deployed in that region. A proportional split of this kind is sketched below.
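  • A proportional split like the 4:4:2 example can be computed as below; this is only a sketch, and the function name and the tie-breaking rule for leftover machines are assumptions.

```python
def allocate_access_machines(total: int, operator_ratio: dict) -> dict:
    # Split the required media access machines among operators in proportion
    # to the configured ratio, e.g. 100 machines at a:b:c = 4:4:2 -> 40/40/20.
    weight_sum = sum(operator_ratio.values())
    alloc = {op: total * w // weight_sum for op, w in operator_ratio.items()}
    remainder = total - sum(alloc.values())     # leftovers from integer division
    for op in sorted(operator_ratio, key=operator_ratio.get, reverse=True):
        if remainder == 0:
            break
        alloc[op] += 1
        remainder -= 1
    return alloc

print(allocate_access_machines(100, {"a": 4, "b": 4, "c": 2}))
# {'a': 40, 'b': 40, 'c': 20}
```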
  • the system architecture shown in Figure 3 can adopt distributed load management and distributed room management.
  • each set has its own load management service, that is, the above-mentioned in-set load management module.
  • the in-set load management module collects the load of each module in the set (including the media access machines).
  • each set has its own room management module, which is responsible for managing the rooms and users of that set; when the room management module crashes, it can be restored through the media access machines and the in-set room management module.
  • a distributed QoS module is deployed in the media access machine to control QoS for local users.
  • this system architecture can adopt unified media access.
  • each media access machine can serve both the anchor's client and the audience's client, with no need to distinguish between interface machines and proxy machines; anchors and audience are connected to media access machines in a unified way, so after switching roles a viewer does not need to go through the process of checking out of the room and re-entering through an interface machine, and can directly upload audio and video data.
  • the system can also include upper-level room management modules (room management in Figure 3)
  • and upper-level load management modules (load management in Figure 3).
  • the above-mentioned in-set load management module and in-set room management module can be called second-level management modules.
  • the upper-level room management module and load management module can be called first-level management modules.
  • the second-level management modules are used to manage the rooms and loads of the equipment in the set.
  • the first-level management modules are used to manage each region, and can receive reports from the second-level management modules of each region to manage and control each region.
  • for example, the first-level load management module can obtain the load information of each region (such as the number of connected users, the load status of the media access sets, equipment resource occupancy, etc.) from each second-level load management module.
  • the first-level load management module can send the load information of each region to the scheduling module, and the scheduling module can perform media access scheduling for each region based on the received information.
  • the process may include the following steps:
  • the client sends a media access allocation request to the forwarding layer (forwarding node).
  • the media access request can include the streaming media identifier and the client identifier (user identifier).
  • the forwarding layer determines the cluster to which the user belongs based on the user identifier or room number, and forwards the access request to the scheduling module of the corresponding cluster (the scheduling shown in the figure).
  • the forwarding layer can be pre-configured with clusters corresponding to multiple streaming media identifier intervals, such as cluster 1 to cluster N shown in Figure 3, where different clusters correspond to different streaming media identifier intervals.
  • when the forwarding layer receives a media access request, it can determine the streaming media identifier interval in which the request's streaming media identifier falls, determine the target cluster based on that interval, and forward the access request to the scheduling module corresponding to the target cluster.
  • the scheduling module allocates media access machines to media access requests.
  • when the scheduling module allocates media access machines, it no longer needs to perform single-machine-level aggregation scheduling, but set-level aggregation scheduling: it can select a set based on the load of nearby sets, and then select a media access machine within the set in a weighted random manner based on each machine's load. Since the processing capacity of a single set (each set can have multiple media access machines) is a hundred times that of a single media access machine, this can more easily and effectively withstand high concurrency and prevent excessive aggregation on a single machine. The entire scheduling and allocation process can be completed locally, greatly simplifying the process. Finally, the allocated media access machine is returned to the user terminal, for example by sending the IP address of the media access machine to the user terminal.
  • after the client obtains the address of the media access machine, it connects to the media access machine.
  • if the connection succeeds, the live broadcast room entry process is completed; otherwise, corresponding processing can be performed according to a preconfigured policy. For example, the in-set room management module and the scheduling module can let the user finally enter the live broadcast room according to the room entry policy, or the user can be prompted that the live broadcast room does not exist and can re-execute the room entry process. A sketch of this routing and scheduling flow follows.
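  • The sketch below illustrates the two scheduling decisions described above: routing a request to a cluster by the streaming media identifier interval, then set-level selection of a media access machine (least-loaded set first, then weighted random by spare capacity). The interval bounds, set names, and load figures are invented for illustration.

```python
import bisect
import random

# Hypothetical pre-configured mapping: each cluster serves one streaming media
# identifier interval (cluster 1..N, different clusters = different intervals).
INTERVAL_STARTS = [0, 10_000, 20_000, 30_000]
CLUSTERS = ["cluster-1", "cluster-2", "cluster-3", "cluster-4"]

def route_to_cluster(stream_id: int) -> str:
    # The forwarding layer picks the target cluster from the interval
    # in which the streaming media identifier falls.
    return CLUSTERS[bisect.bisect_right(INTERVAL_STARTS, stream_id) - 1]

def pick_access_machine(sets: dict) -> str:
    # Set-level aggregation scheduling: choose the least-loaded set, then a
    # machine inside it, weighted towards machines with more spare capacity.
    set_name = min(sets, key=lambda s: sum(load for _, load in sets[s]) / len(sets[s]))
    machines = sets[set_name]
    weights = [max(1.0 - load, 0.01) for _, load in machines]
    return random.choices([ip for ip, _ in machines], weights=weights, k=1)[0]

sets = {
    "A-set1": [("10.0.1.1", 0.7), ("10.0.1.2", 0.5)],
    "A-set2": [("10.0.2.1", 0.2), ("10.0.2.2", 0.3), ("10.0.2.3", 0.9)],
}
print(route_to_cluster(12_345))   # cluster-2
print(pick_access_machine(sets))  # usually a lightly loaded machine in A-set2
```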
  • the media access machine can obtain the attribute information of each user in the live broadcast room, such as the user's preference information, etc., subject to the user's authorization.
  • the client of the target application can also provide users with various personalized setting options. Users can configure some data sending and receiving methods when participating in live broadcast according to their own needs.
  • setting options may include but are not limited to a hearing distance setting item, a sight distance setting item, a viewing angle setting item, and so on. These options may have their own default values, and users can modify their corresponding hearing distance range, visual range, etc. through these options.
  • tag information can also be set for users through statistical analysis based on each user's historical operation information and the interactions between users during real-time audio and video; this tag information can also be used as the user's attribute information. For example, users who have chatted with each other for a long time can be given common tags, and users who are chatted with for a long time can accumulate popularity values. When many target objects are selected, in addition to considering the quality of each object's media stream, the target objects can be further filtered based on their popularity.
  • Figure 4 shows a schematic diagram of the principle of real-time location reporting based on the system structure shown in Figure 3. In this scenario embodiment, the streaming media identifier may include but is not limited to the room identifier of the live broadcast room.
  • the user in Figure 4 may be any user in the live broadcast room, including the anchor and the audience.
  • the user's terminal can send a heartbeat signal (coordinate heartbeat) containing the user's location information to the media access machine it accesses, for example once per second.
  • the location information can include {X-axis coordinate, Y-axis coordinate, orientation (such as the rotation angle relative to the Y-axis)}.
  • the selection, acquisition, and setting of the coordinate system are not limited in the embodiments of this application.
  • after the media access machine obtains the real-time location heartbeat of each user, it can aggregate the location information of each user and pass it to the in-set room management module; the in-set room management module further aggregates it and then passes it to the first-level room management module.
  • the media access machine can aggregate the received location information and send it to the in-set room management module at preset intervals, or it can judge the user's location information and report only when the user's current location has changed from the last received location, for example when the distance between a user's two adjacent locations is greater than a preset distance. Similarly, the in-set room management module can also report to the first-level room management module based on changes in the aggregated user location information. A sketch of such change-triggered reporting follows.
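  • A change-triggered coordinate heartbeat of this kind could look like the daemon sketch below; get_location and send_to_access_machine are hypothetical callbacks, and the one-second period and two-unit movement threshold are assumed values.

```python
import math
import time

REPORT_INTERVAL_S = 1.0   # coordinate heartbeat period (once per second)
MIN_MOVE_DISTANCE = 2.0   # assumed preset distance before re-reporting

def run_heartbeat(get_location, send_to_access_machine):
    """get_location() -> (x, y, heading); send_to_access_machine(dict)
    uploads one heartbeat. Reports only when the user has moved enough."""
    last_xy = None
    while True:
        x, y, heading = get_location()
        moved = last_xy is None or math.dist(last_xy, (x, y)) > MIN_MOVE_DISTANCE
        if moved:
            send_to_access_machine({"x": x, "y": y, "heading": heading,
                                    "ts": time.time()})
            last_xy = (x, y)
        time.sleep(REPORT_INTERVAL_S)
```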
  • the location information mentioned above can be actual longitude/latitude coordinates and orientation, or virtual coordinates.
  • the travel environment can be a real environment or a virtual environment.
  • for example, the user can control his or her virtual character to travel in the virtual environment
  • or the anchor can provide live broadcast services to the audience in a real environment and take the audience on an online travel experience.
  • the application server 110 may store related information (related information about viewers, related information about hosts) of objects corresponding to each terminal that accesses the application server 110 into the database 140 .
  • based on the information stored in the database 140, the application server 110 (for example, the media access machine) can provide each viewer in the live broadcast room with media data that better suits that viewer, according to the relevant information of the viewer and the relevant information of each anchor.
  • for example, according to the relevant information of audience 1 and the m anchors, the application server 110 sends the video stream 13, containing the live content of anchor 1 and anchor 3, to the terminal 121.
  • according to the relative positional relationship between audience 1 and anchor 1 (the relative orientation information may include the relative positional relationship) and the relative positional relationship between audience 1 and anchor 3, the live broadcast screen 1 of anchor 1 and the live broadcast screen 3 of anchor 3 are displayed on the terminal 121; specifically, live broadcast screen 1 of anchor 1 is displayed on the left side of the terminal 121, and live broadcast screen 3 of anchor 3 on the right side.
  • similarly, the application server 110 sends the video stream 23, containing the live broadcast content of anchor 2 and anchor 3, to the terminal 122.
  • the terminal 122 displays live broadcast screen 2 of anchor 2 and live broadcast screen 3 of anchor 3.
  • the terminal 12n displays, from the video stream 13 sent by the server, live broadcast screen 1 of anchor 1 and live broadcast screen 3 of anchor 3 to audience n.
  • as shown in Figure 2, the display manner of live broadcast screen 1 and live broadcast screen 3 on the terminal 12n is different from that on the terminal 121.
  • the first-level room management module can add the anchors' attribute information and real-time location information to the anchor list (full anchor list) signaling pushed to the in-set room management modules of the same room.
  • the in-set room management module then spreads it to all media access machines serving the same room in the set.
  • based on the relevant information of the anchors and viewers in the same room, the media access machine can customize the anchor list according to each user's characteristics and deliver it to the user, which can include but is not limited to the following methods:
  • location-based intelligent range recommendation: based on the number of users in the current room, ensure that users can see and hear relatively nearby anchors; if there are many anchors within the sight or hearing range, closer anchors can be given priority.
  • quality-based intelligent range recommendation: based on the current number of room users, ensure that users can see and hear anchors with relatively good quality; for example, when there are many anchors within the sight or hearing range, priority can be given to anchors with smaller delay, louder volume, and clearer image, that is, anchors with better media stream quality.
  • Example 1 Location-based audio anchor list recommendation
  • Figure 6 shows a schematic diagram of a tourism environment.
  • the environment can be a real environment, a virtual environment, or a combination of virtual and real environments.
  • User Z in the figure is any viewer in the live broadcast room in this example.
  • each user shown in the figure other than user Z is an anchor.
  • the hearing distance range of user Z, i.e., the maximum hearing distance, is radius d.
  • Audio scene 1: based on the solution provided by the embodiment of this application, as shown in Figure 6, when user Z or user Z's virtual character moves to position Z1 near scenic spot 1, there are 4 anchors within the hearing distance range (the range covered by the dotted circle centered on Z1 in the figure, whose radius d is the default value in this example), i.e., objects whose distance from user Z is not greater than the default value, namely A, B, C, and D. At this time, the anchor list (target object list) delivered by the media access machine to user Z's user terminal contains the relevant information of the four anchors A, B, C, and D. User Z can choose to automatically listen to all 4 anchors or conversations, or manually choose to listen to some of the anchors or conversations.
  • the media access machine can then send the audio data of the selected anchors to the user terminal of user Z.
  • the user terminal can play the audio data of each anchor to Z using spatial audio playback.
  • for example, the angle between Z's orientation and the Y-axis is 0 (facing due north on the map)
  • A is behind Z, so Z hears A's voice from behind
  • D is directly in front of Z, so Z hears D's voice from directly in front
  • B is to the right of Z, so Z hears B's voice on the right
  • C is to the left of Z, so Z hears C's voice on the left.
  • in this way, user Z can listen to the audio data of different anchors from different directions.
  • the dotted lines in the figure represent audio data in different directions.
  • Audio scene 2: when user Z moves to position Z2 near scenic spot 2 on the map, there are multiple anchors within the hearing range; as shown in Figure 6, there are 7 in total. At this time, the anchor list delivered by the media access machine can contain all 7 anchors. If this feels too noisy, the best anchors can also be selected based on the above location-based intelligent range recommendation, quality-based intelligent range recommendation, attribute-based intelligent range recommendation, or a comprehensive quality range, to ensure the optimal listening experience for user Z. For example, the 7 anchors can be used as candidates, and based on the distance between each of the 7 anchors and Z, the 4 closest anchors can be selected as the final target objects and their audio data sent to user Z. Similarly, after Z receives the audio of each anchor, spatial audio playback can be used to play the audio data of each anchor to user Z.
  • Audio scene 3: when user Z moves to position Z3 on the map, there is no anchor within user Z's default listening distance range (the small circle centered on Z3).
  • in this case, the listening distance range can be intelligently enlarged, that is, the distance can be adjusted beyond the default value (to the larger circle centered on Z3).
  • after appropriately enlarging the listening distance range, as shown in Figure 6, there are two anchors within it; then, referring to the processing of audio scene 1 above, the audio data of these two anchors can be provided to user Z and played using spatial audio playback. The three audio scenes are sketched together below.
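  • The three audio scenes reduce to one selection routine: filter anchors by the hearing radius, cap the list at the k closest when it is crowded, and enlarge the radius when it is empty. The sketch below assumes this reading; radius, max_count, grow, and max_tries are invented parameters.

```python
import math

def recommend_anchors(user_xy, anchors, radius=50.0, max_count=4,
                      grow=1.5, max_tries=3):
    """anchors: {name: (x, y)}. Returns the anchor list for the user:
    anchors within the hearing radius (scene 1); if too many, the
    max_count closest (scene 2); if none, enlarge the radius (scene 3)."""
    for _ in range(max_tries):
        by_distance = sorted(
            (math.dist(user_xy, pos), name) for name, pos in anchors.items())
        in_range = [name for dist, name in by_distance if dist <= radius]
        if in_range:
            return in_range[:max_count]
        radius *= grow  # no anchor audible: intelligently enlarge the range
    return []

anchors = {"A": (3, 4), "B": (10, 0), "C": (-20, 15), "D": (0, -30),
           "E": (45, 45), "F": (-5, -6), "G": (25, -25)}
print(recommend_anchors((0, 0), anchors))  # the 4 closest anchors within 50
```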
  • Video scene 1: as shown in Figure 8, when user Z moves to position Z1 on the map and the angle between user Z's orientation and the Y-axis is 0°, there are three anchors within user Z's sight distance range and viewing angle (the range covered by the dashed arc in the figure), namely B, C, and D. At this time, the anchor list delivered to user Z can include the three anchors B, C, and D. User Z can choose to automatically watch all three anchors or conversations, or manually choose to watch some of the anchors or conversations. After Z receives the videos of each anchor, the screen layout can be arranged according to the orientations of B, C, and D; as shown in Figure 9, the live broadcast screen of anchor C is on Z's left, anchor D's live broadcast screen is directly in front of Z, and anchor B's live broadcast screen is on Z's right.
  • Video scene 2: when user Z moves to position Z2 on the map and the angle between Z's orientation and the Y-axis is 90°, there are 6 anchors within the sight range. At this time, the delivered anchor list can contain all 6 anchors. If the business side or the user wants to further select among the anchors (the user can be provided with an anchor-count setting: in the target application on the terminal, the user can set how many anchors' media data can be received at the same time), the best anchors can also be selected based on the above location-based intelligent range recommendation, quality-based intelligent range recommendation, attribute-based intelligent range recommendation, or a comprehensive quality range, to ensure the optimal experience for user Z. After Z receives the videos of each anchor sent by the media access machine, user Z's target terminal can also arrange the screen layout according to each anchor's sight distance and orientation relative to Z.
  • Video scene 3: when user Z moves to position Z3 on the map and the angle between Z's orientation and the Y-axis is 0°, there is no anchor within user Z's default sight distance range; the sight distance range can then be intelligently enlarged. After appropriately enlarging the sight distance range, two anchors fall within user Z's sight distance range, and the video streams can be provided to user Z by referring to the processing of video scene 1 above. A sketch of the sight-range test used in these scenes follows.
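  • The video scenes additionally gate anchors by the viewing-angle arc. A sketch of that test, using the document's Y-axis orientation convention and invented sight_dist/fov_deg values, follows; the left/center/right labels indicate roughly where each visible anchor's screen could be placed.

```python
import math

def visible_anchors(user_xy, heading_deg, anchors, sight_dist=60.0, fov_deg=120.0):
    """Keep anchors inside the sight distance AND the viewing-angle arc.
    heading_deg is the angle between the user's orientation and the Y-axis."""
    result = []
    for name, (ax, ay) in anchors.items():
        dx, dy = ax - user_xy[0], ay - user_xy[1]
        if math.hypot(dx, dy) > sight_dist:
            continue
        bearing = math.degrees(math.atan2(dx, dy))      # relative to the Y-axis
        off = (bearing - heading_deg + 180) % 360 - 180
        if abs(off) <= fov_deg / 2:
            side = "left" if off < -15 else "right" if off > 15 else "center"
            result.append((name, side))
    return result

# Z at Z1 facing 0 degrees sees B, C, D (cf. FIG. 8/9): C left, D center, B right.
print(visible_anchors((0, 0), 0.0,
                      {"B": (20, 25), "C": (-20, 25), "D": (0, 30), "E": (0, -40)}))
# [('B', 'right'), ('C', 'left'), ('D', 'center')]
```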
  • the audio-based anchor list generation method and the video-based anchor list generation method described above can be used together.
  • video images can be arranged and audio signals can be played based on the relative position of the anchor and user Z.
  • in this scenario embodiment, the user terminal of the first object can provide the media stream of at least one anchor to the first object according to the relative relationship between each anchor and the first object.
  • for example, the media streams of some anchors whose degree of association with the first object meets a condition can be provided, or the media streams of some or all anchors can be displayed to the user according to the relative orientation information between each anchor and the first object.
  • in a specific implementation, the server can, according to the relevant information of the first object and of each anchor, send the media streams of some anchors whose association with the first object meets the condition to the user terminal of the first object for display to the first object.
  • alternatively, the server can send some or all of the anchors' media streams together with display mode prompt information for each media stream to the user terminal of the first object, and the terminal presents each media stream to the first object according to the received display mode prompt information.
  • based on the solution provided by the embodiment of this application, the media data received by different users in the same live broadcast room is no longer identical, but media data better matched to each user, which can improve user perception.
  • the best anchors can be intelligently recommended based on the anchors' audio and video quality, the positional relationship between the user and the anchors, common interests and hobbies, and so on, realizing a more realistic and immersive metaverse-style real-time audio and video solution.
  • the pictures a user sees are no longer undifferentiated, but are displayed according to positional relationships; the sounds a user hears are no longer undifferentiated, with the position and direction of each anchor's voice determined by the actual positional relationship and orientation. This solution can realize more realistic and immersive online classes, meetings, games, live broadcasts, and other scenarios.
  • the embodiment of the present application also provides a data processing device, as shown in Figure 10, the data processing device 100 includes a relevant information acquisition module 110, a display attribute information determination module 120, a target data generation module 130 and a data transmission module 140.
  • the related information acquisition module 110 is used to obtain the related information of the first object corresponding to the target application program and the related information of each second object in the candidate object set, where the related information includes at least one of location information or object attribute information,
  • the first object and each second object are objects corresponding to the same streaming media identifier;
  • the display attribute information determination module 120 is used to determine the display attribute information corresponding to each second object according to the related information of the first object and the related information of each second object, wherein the display attribute information corresponding to a second object is used to identify the display attribute of that second object's media stream relative to the first object;
  • a target data generating module 130 configured to generate a target data stream according to the display attribute information corresponding to each second object, wherein the target data stream includes a media stream of at least one target object, and the at least one target object is a second object determined from each of the second objects;
  • the data transmission module 140 is used to send the target data stream to the target terminal corresponding to the first object, so that the target terminal displays the media stream in the target data stream to the first object.
  • the above display attribute information includes at least one of first information or second information.
  • the first information is used to determine whether to provide the media stream of the second object to the first object.
  • the second information is used to determine how the media stream of the second object is displayed to the first object.
  • the display attribute information includes the first information; when the display attribute information determination module determines the display attribute information corresponding to each second object according to the relevant information of the first object and the relevant information of each second object, it can be used to:
  • For each second object determine the degree of association between the first object and the second object based on the related information of the first object and the related information of the second object, and use the degree of association as the first information corresponding to the second object;
  • the target data generation module may be configured to: determine at least one target object matching the first object from each second object according to the degree of association corresponding to each second object, and generate a target data stream based on the media stream of the at least one target object.
  • the relevant information includes location information; for each second object, the display attribute information determination module can be used to:
  • according to the location information of the first object and the location information of the second object, the distance between the first object and the second object is determined, and the distance is used to represent the degree of association between the first object and the second object, where the distance is negatively correlated with the degree of association.
  • the media stream includes a video stream
  • the location information includes the object's orientation and position coordinates.
  • the target data generation module can be used to:
  • for each second object, determining the viewing angle deviation of the second object relative to the first object based on the position coordinates of the second object and the position information of the first object;
  • at least one target object is then determined from the second objects.
  • the target data generation module may be used to determine the second object corresponding to a target distance that is not greater than a preset value among the distances corresponding to each second object as the target object.
  • the target data generation module can also be used for:
  • adjusting the preset value, determining, from the distances corresponding to the second objects, target distances that meet a preset quantity requirement, and determining the second objects corresponding to the target distances as the target objects; or
  • using each second object corresponding to a target distance as a candidate object, obtaining the quality of the media stream corresponding to each candidate object, and determining at least one target object from the candidate objects based on the quality of each candidate object's media stream.
  • the target data generation module can also be used to: obtain the quality of the media stream of each second object; and determine at least one target object from each second object according to the quality of the media stream of each second object.
  • the relevant information includes location information
  • the display attribute information includes second information
  • the display attribute information determination module may be used to:
  • For each target object determine the orientation indication information between the target object and the first object based on the position information of the first object and the position information of the target object, and use the orientation indication information as the second information corresponding to the target object;
  • the target data generation module may be used to: generate a target data stream based on the orientation indication information corresponding to each target object and the media stream corresponding to each target object.
  • the target data stream also includes the orientation indication information corresponding to each target object, for instructing the target terminal to display the media stream corresponding to each target object to the first object according to the orientation indication information corresponding to each target object.
  • the target data generation module can also be used to: determine the total number of objects corresponding to the streaming media identifier; and determine the number of target objects based on the total number.
  • the data transmission module can also be used for:
  • before the target data stream is generated, sending the target object list to the target terminal, so as to display the target object list to the first object through the target terminal, where the target object list includes the object identifier of each target object;
  • receiving object selection feedback information returned by the target terminal, where the media stream included in the target data stream is the media stream of at least one target object corresponding to the object selection feedback information.
  • the location information of any object includes at least one of the following:
  • the real location information of the object, or the virtual location information of the object in the virtual scene corresponding to the target application.
  • the embodiment of the present application also provides a data processing device.
  • the data processing device may be a user terminal.
  • the data processing device may include:
  • the acquisition module is used to obtain the media data acquisition trigger operation of the first object corresponding to the target application
  • a display module configured to display the media stream of at least one target object among the second objects in the candidate object set to the first object in response to the media data acquisition triggering operation
  • each of the second objects has corresponding display attribute information
  • the at least one target object is a second object determined from the second objects, and the first object and each second object are objects corresponding to the same streaming media identifier of the target application
  • the media stream of the at least one target object matches the display attribute information corresponding to the at least one target object, and the display attribute information corresponding to a second object is used to identify the display attribute of that second object's media stream relative to the first object
  • the display attribute information is adapted to the related information of the first object and the related information of the second object, and the related information includes at least one of location information or object attribute information.
  • the display attribute information includes at least one of first information or second information; the first information is used to determine whether to provide the media stream of the second object to the first object, and the second information is used to determine the presentation mode in which the media stream is displayed to the first object.
  • the display module may be configured to: use a second object located within the visual range of the first object and/or within the audible distance of the first object as the target object, and display the target object's media stream to the first object.
  • the display module can be used to:
  • the media stream of each target object is displayed to the first object.
  • the media stream includes audio stream
  • the display module can be used for:
  • the audio stream corresponding to each target object is played to the first object using a spatial audio playback method.
  • the media stream includes video stream
  • the display module can be used for:
  • the video stream corresponding to each target object is displayed to the first object.
  • the above display module can also be used to: in response to a change in the display attribute information corresponding to each second object, display the media stream of at least one target object that matches the changed display attribute information to the first object.
  • the device of the embodiment of the present application can execute the method provided by the embodiment of the present application, and its implementation principle is similar.
  • the actions performed by each module of the device of the embodiment of the present application correspond to the steps of the method of the embodiment of the present application.
  • the embodiment of the present application also provides an electronic device, including a memory, a processor, and a computer program stored in the memory.
  • when the processor executes the computer program stored in the memory, it can implement the method provided in any optional embodiment of the present application.
  • Figure 11 shows a schematic structural diagram of an electronic device applicable to the embodiment of the present application.
  • the electronic device can be a server or a user terminal, and can be used to implement the method provided in any embodiment of the present application.
  • the electronic device 2000 can mainly include at least one processor 2001 (one is shown in Figure 11), a memory 2002, a communication module 2003, an input/output interface 2004, and other components, and the components can be connected through a bus 2005.
  • the memory 2002 can be used to store operating systems and application programs.
  • Application programs can include computer programs that implement the methods shown in the embodiments of this application when called by the processor 2001, and can also include programs used to implement other functions or services.
  • the memory 2002 may be a ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), or a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compressed optical discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.).
  • the processor 2001 is connected to the memory 2002 through the bus 2005, and implements corresponding functions by calling the application program stored in the memory 2002.
  • the processor 2001 can be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and can implement or execute the various exemplary logical blocks, modules, and circuits described in conjunction with the disclosure of this application.
  • the processor 2001 may also be a combination that implements computing functions, such as a combination of one or more microprocessors, a combination of a DSP and a microprocessor, etc.
  • the electronic device 2000 can be connected to the network through the communication module 2003 (which may include but is not limited to components such as a network interface) to communicate with other devices (such as a user terminal or a server, etc.) through the network to achieve data interaction, such as sending data to other devices or receiving data from other devices.
  • the communication module 2003 may include a wired network interface and/or a wireless network interface, etc., that is, the communication module may include at least one of a wired communication module or a wireless communication module.
  • the electronic device 2000 can connect any required input/output devices, such as a keyboard or a display device, through the input/output interface 2004.
  • the electronic device 2000 may itself have a display device, and may also be connected to external display devices through the interface 2004.
  • optionally, a storage device, such as a hard disk, can also be connected through the interface 2004, so that data in the electronic device 2000 can be stored in the storage device, data in the storage device can be read, and data in the storage device can also be stored in the memory 2002.
  • the input/output interface 2004 can be a wired interface or a wireless interface.
  • the device connected to the input/output interface 2004 may be a component of the electronic device 2000 or an external device connected to the electronic device 2000 when needed.
  • the bus 2005 used to connect various components may include a path to transfer information between the above-mentioned components.
  • the bus 2005 can be a PCI (Peripheral Component Interconnect, Peripheral Component Interconnect Standard) bus or an EISA (Extended Industry Standard Architecture) bus, etc. According to different functions, bus 2005 can be divided into address bus, data bus, control bus, etc.
  • the memory 2002 can be used to store a computer program for executing the solution of the present application, to be run by the processor 2001.
  • when the processor 2001 runs the computer program, the method provided by the embodiments of the present application is implemented.
  • the embodiment of the present application provides a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program.
  • when the computer program is executed by a processor, the foregoing methods can be implemented.
  • Embodiments of the present application also provide a computer program product, which includes a computer program.
  • when the computer program is executed by a processor, the corresponding content of the foregoing method embodiments can be realized.
  • although each operation step is indicated by arrows in the flowcharts of the embodiments of the present application, the order in which these steps are implemented is not limited to the order indicated by the arrows.
  • the implementation steps in each flowchart may be executed in other orders according to requirements.
  • some or all of the steps in each flowchart are based on actual implementation scenarios and may include multiple sub-steps or multiple stages. Some or all of these sub-steps or stages may be executed at the same time, and each of these sub-steps or stages may also be executed at different times. In scenarios with different execution times, the execution order of these sub-steps or stages can be flexibly configured according to needs, and the embodiments of the present application do not limit this.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Information Transfer Between Computers (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Embodiments of this application provide a data processing method and apparatus, an electronic device, a storage medium, and a program product, relating to the fields of multimedia technology and cloud technology. The method includes: obtaining related information of a first object and related information of each second object, the related information including at least one of position information or object attribute information; determining, according to the related information of the first object and of each second object, display attribute information corresponding to each second object, the display attribute information being used to determine the display attribute of the second object's media stream with respect to the first object; and, according to the display attribute information corresponding to each second object, generating a target data stream that includes the media stream of at least one target object and sending it to the target terminal corresponding to the first object, where the at least one target object is a second object determined from among the second objects. Based on the method provided by the embodiments of this application, media streams can be provided to the first object in a better-adapted manner.

Description

Data processing method and apparatus, electronic device, storage medium, and program product
This application claims priority to the Chinese patent application No. 202211146126.0, filed with the China National Intellectual Property Administration on September 20, 2022 and entitled "Data processing method and apparatus, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the fields of multimedia and cloud technology, and in particular to data processing.
Background
With the rapid development of technology and the improvement of living standards, applications providing all kinds of services have become an indispensable part of daily life. Real-time audio and video technologies with interactive capabilities can provide people with a wide variety of online interactive functions, such as online live streaming, online classrooms, online meetings, group live streaming, multiplayer online games, and video sharing.
In online interaction scenarios involving multiple (at least two) participants, there are usually both anchors and viewers, and some users may be both anchor and viewer. For example, in a multi-party video conference, every participant can be both an anchor and a viewer. Although existing real-time audio and video technologies have brought great convenience to daily life and can meet basic needs, in current multi-party interaction scenarios the media data presented to all viewers or all anchors is undifferentiated. This affects the users' experience, and when many users participate in the interaction it severely degrades the interaction effect.
Summary
The embodiments of this application aim to provide a data processing method, apparatus, electronic device, and storage medium that better satisfy practical application needs and effectively improve the interaction effect. To achieve the above objective, the technical solutions provided by the embodiments of this application are as follows:
In one aspect, an embodiment of this application provides a data processing method, the method including:
obtaining related information of a first object corresponding to a target application and related information of each second object in a candidate object set, where the related information includes at least one of position information or object attribute information, and the first object and each second object are objects corresponding to the same streaming-media identifier;
determining, according to the related information of the first object and of each second object, display attribute information corresponding to each second object, where the display attribute information corresponding to one second object identifies the display attribute of that second object's media stream with respect to the first object;
generating a target data stream according to the display attribute information corresponding to each second object, where the target data stream includes the media stream of at least one target object, and the at least one target object is a second object determined from among the second objects;
sending the target data stream to a target terminal corresponding to the first object, so that the target terminal presents the media streams in the target data stream to the first object.
In another aspect, an embodiment of this application provides a data processing apparatus, the apparatus including:
a related-information obtaining module, configured to obtain related information of a first object corresponding to a target application and related information of each second object in a candidate object set, where the related information includes at least one of position information or object attribute information, and the first object and each second object are objects corresponding to the same streaming-media identifier;
a display-attribute-information determining module, configured to determine, according to the related information of the first object and of each second object, display attribute information corresponding to each second object, where the display attribute information corresponding to one second object identifies the display attribute of that second object's media stream with respect to the first object;
a target-data generating module, configured to generate a target data stream according to the display attribute information corresponding to each second object, where the target data stream includes the media stream of at least one target object, and the at least one target object is a second object determined from among the second objects;
a data transmission module, configured to send the target data stream to the target terminal corresponding to the first object, so that the target terminal presents the media streams in the target data stream to the first object.
In another aspect, an embodiment of this application further provides a data processing method, the method including:
in response to a media-data acquisition trigger operation of a first object corresponding to a target application, presenting to the first object the media stream of at least one target object among the second objects in a candidate object set;
where each second object has corresponding display attribute information; the at least one target object is a second object determined from among the second objects; the media stream of the at least one target object matches the display attribute information corresponding to the at least one target object; the display attribute information corresponding to one second object identifies the display attribute of that second object's media stream with respect to the first object and is adapted to the related information of the first object and of that second object; the related information includes at least one of position information or object attribute information; and the first object and each second object are objects corresponding to the same streaming-media identifier of the target application.
In another aspect, an embodiment of this application further provides a data processing apparatus, the apparatus including:
an obtaining module, configured to obtain a media-data acquisition trigger operation of a first object corresponding to a target application;
a display module, configured to present, in response to the media-data acquisition trigger operation, the media stream of at least one target object among the second objects in a candidate object set to the first object;
where each second object has corresponding display attribute information; the at least one target object is a second object determined from among the second objects; the media stream of the at least one target object matches the display attribute information corresponding to the at least one target object; the display attribute information corresponding to one second object identifies the display attribute of that second object's media stream with respect to the first object and is adapted to the related information of the first object and of that second object; the related information includes at least one of position information or object attribute information; and the first object and each second object are objects corresponding to the same streaming-media identifier of the target application.
In another aspect, an embodiment of this application further provides an electronic device including a memory and a processor, where the memory stores a computer program and the processor executes the computer program to implement the method provided in any optional embodiment of this application.
In another aspect, an embodiment of this application further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the method provided in any optional embodiment of this application.
In another aspect, an embodiment of this application further provides a computer program product including a computer program that, when executed by a processor, implements the method provided in any optional embodiment of this application.
The beneficial effects of the technical solutions provided by the embodiments of this application are as follows:
The data processing method provided by the embodiments of this application is applicable to any multi-party online interaction scenario. For the objects corresponding to the same streaming-media identifier (such as the identifier of the same virtual room), all media streams of the second objects are no longer delivered indiscriminately to every first object. Instead, the display attribute information of each second object's media stream with respect to the first object is determined according to the related information of the first object and of each second object, so that a target data stream adapted to the first object can be generated according to the display attribute information corresponding to each second object's media stream and provided to the first object. Since the related information of different first objects usually differs, the method can provide each object with media data in a manner adapted to that object, achieving more flexible, personalized recommendation, effectively improving the user's experience, and better meeting practical application needs.
Brief Description of the Drawings
Fig. 1 is a schematic flowchart of a data processing method provided by an embodiment of this application;
Fig. 2 is a schematic structural diagram of a data processing system provided by an embodiment of this application;
Fig. 3 is a schematic architecture diagram of a real-time audio/video system provided by an embodiment of this application;
Fig. 4 is a schematic diagram of the principle of obtaining position information, provided by an embodiment of this application;
Fig. 5 is a schematic diagram of the principle of a data processing method based on the system shown in Fig. 3, provided by an embodiment of this application;
Fig. 6 is a schematic diagram of an environment map provided in an example of this application;
Fig. 7 is a schematic diagram of a presentation manner of audio data provided in an example of this application;
Fig. 8 is a schematic diagram of an environment map provided in an example of this application;
Fig. 9 is a schematic diagram of a presentation manner of video pictures provided in an example of this application;
Fig. 10 is a schematic structural diagram of a data processing apparatus provided by an embodiment of this application;
Fig. 11 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Detailed Description
Embodiments of this application are described below with reference to the drawings. It should be understood that the implementations set forth below in conjunction with the drawings are exemplary descriptions for explaining the technical solutions of the embodiments of this application and do not limit those technical solutions.
Those skilled in the art can understand that, unless specifically stated, the singular forms "a", "an", "the", and "this" used here may also include the plural forms. It should be further understood that the terms "include" and "comprise" used in the embodiments of this application mean that the corresponding features can be implemented as the presented features, information, data, steps, operations, elements and/or components, without excluding implementation as other features, information, data, steps, operations, elements, components and/or combinations thereof supported in the art. It should be understood that when we say one element is "connected" or "coupled" to another element, the element may be directly connected or coupled to the other element, or the two elements may establish a connection through an intermediate element; moreover, "connected" or "coupled" as used here may include wireless connection or wireless coupling. The term "and/or" used here indicates at least one of the items it qualifies; for example, "A and/or B" can be implemented as "A", as "B", or as "A and B". When multiple (two or more) items are described and the relationship among them is not explicitly qualified, the reference may be to one, more, or all of the items; for example, the description "parameter A includes A1, A2, A3" can be implemented as parameter A including A1, A2, or A3, and also as parameter A including at least two of the three items A1, A2, and A3.
The embodiments of this application propose a data processing method for better improving the user experience in multi-party online interactive applications. The method is applicable to, but not limited to, real-time audio/video scenarios. Real-Time Communication (RTC) is a low-latency, high-quality audio/video communication service that provides users with stable, reliable, low-cost audio/video transmission capabilities; with this service, audio/video applications such as video calls, online education, online live streaming, and online meetings can be built quickly. How to better improve the experience of online users has therefore long been one of the important problems studied by those in the field, and the method provided by the embodiments of this application can improve user perception along one or more dimensions.
Optionally, the data processing involved in the method provided by the embodiments of this application can be implemented based on cloud technology. Optionally, the data involved in the embodiments of this application can be stored using cloud storage, and the data-computation operations involved can be implemented using cloud-computing technology.
Cloud computing is a computing model that distributes computing tasks over a resource pool formed by a large number of computers, enabling various application systems to obtain computing power, storage space, and information services as needed. The network providing the resources is called the "cloud". To users, the resources in the "cloud" appear infinitely expandable and can be obtained at any time, used on demand, expanded at any time, and paid for by use. Cloud storage is a new concept extended and developed from the concept of cloud computing: a distributed cloud storage system (hereinafter the storage system) is a storage system that, through functions such as cluster applications, grid technology, and distributed file systems, brings together a large number of storage devices of different types in the network (storage devices are also called storage nodes) via application software or application interfaces to work cooperatively and jointly provide data storage and service access externally.
Optionally, the method provided by the embodiments of this application is applicable to cloud-conference scenarios. A cloud conference is an efficient, convenient, low-cost form of meeting based on cloud-computing technology. Users need only perform simple, easy operations through an Internet interface to quickly and efficiently share voice, data files, and video with teams and customers around the world, while complex technologies such as the transmission and processing of meeting data are operated for the user by the cloud-conference service provider. At present, domestic cloud conferencing mainly focuses on service content centred on the SaaS (Software as a Service) model, including telephone, network, and video service forms; video conferencing based on cloud computing is called cloud conferencing.
Optionally, the method provided by the embodiments of this application is also applicable to cloud education (Cloud Computing Education, CCEDU), which refers to education platform services based on the cloud-computing business model. On the cloud platform, all education institutions, training institutions, enrollment service agencies, publicity agencies, industry associations, management agencies, industry media, legal structures, and so on are aggregated in the cloud into a resource pool, where resources display themselves to and interact with one another and communicate on demand to reach agreements, thereby reducing education costs and improving efficiency.
Of course, it can be understood that the method provided by the embodiments of this application is equally applicable to multi-party online interactive applications that do not use cloud technology.
The technical solutions of the embodiments of this application, and the technical effects they produce, are described below through several exemplary implementations. The following implementations may refer to, draw on, or be combined with one another, and identical terms, similar features, and similar implementation steps in different implementations are not described repeatedly.
Fig. 1 is a schematic flowchart of a data processing method provided by an embodiment of this application. The method may be executed by a server, which may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing cloud computing services. The server can receive media streams uploaded by terminals and send received media streams to the corresponding receiving terminals. A terminal (also called a user terminal / user equipment / terminal device) may be, but is not limited to, a smartphone, tablet computer, laptop, desktop computer, smart voice-interaction device (such as a smart speaker), wearable electronic device (such as a smartwatch), in-vehicle terminal, smart home appliance (such as a smart TV), or AR/VR device. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which this application does not limit here.
In the embodiments of this application, the server may be the application server of the target application and can provide multimedia data services for the users of the target application. A user terminal running the target application may be a stream-pushing end or a stream-pulling end: the pushing end pushes its media stream (such as live-streaming content, or the audio or video stream of an online meeting) to the server, and the pulling end obtains media streams from the server.
Optionally, the method provided by the embodiments of this application may also be executed by a user terminal: the server can send the media streams of the second objects to the first object's user terminal, and the first object's user terminal can present the received media streams to the first object by executing the method provided by the embodiments of this application.
As shown in Fig. 1, the data processing method provided by the embodiments of this application may include the following S110 to S140.
S110: obtain related information of a first object corresponding to a target application and related information of each second object in a candidate object set.
In the embodiments of this application, the target application may be any application that provides media-stream services for users, such as an application providing online meeting services or an application providing online education functions.
A media stream may be an audio stream, a video stream, a multimedia file stream, or any other form of media stream. A streaming-media identifier can uniquely identify one streaming-media session. For example, it may be the identifier (room number) of a live room in a live-streaming scenario, a game-match identifier in a game scenario (the players participating in the same match can then be understood as objects corresponding to the same streaming-media identifier), the meeting number (such as a meeting ID) of a meeting in an online-meeting scenario, or the group identifier on which a multi-party audio/video call relies. In the examples of these application scenarios, the virtual live room in live streaming, the virtual meeting room in online meetings (each meeting ID corresponds to one virtual meeting room), the group in a multi-party call, or the virtual game environment of one game match can be understood as the virtual environment corresponding to the application; from the streaming-media identifier, the server can determine which objects currently belong to the same virtual environment. For convenience, in some of the embodiments below the streaming-media identifier is described using a virtual-room identifier as an example.
In the embodiments of this application, the first object and each second object are objects corresponding to the same streaming-media identifier; that is, they correspond to the same (virtual) room. The first object may be any stream-pulling user in the room, and the second objects may be all or some of the stream-pushing users in the room; that is, a second object is a user who sends a media stream to the server, while the first object is a user who receives media streams sent by the server. For example, in a live-streaming scenario, the first object and the second objects may respectively be a viewer and the anchors of the same live room. Optionally, in some live-streaming scenarios viewers can also interact with each other or with anchors (such as voice interaction); in those scenarios both viewers and anchors can serve as second objects, or only the anchors can. It can be understood that in some application scenarios one object can be both a pulling-end user and a pushing-end user: for example, in a multi-party meeting the participants can be anchors and viewers at the same time, and in a game scenario all current players of one match are both viewers and anchors. In some of the embodiments below, "user" or "viewer" is used in place of the first object, and "anchor" in place of the second object.
For any object, the related information may include at least one of position information or object attribute information. The position information of an object may be at least one of the object's real position information or the object's virtual position information in the virtual scene of the target application.
That is, the position information may be position information in the real environment/space, or the object's position information in the virtual environment of the target application. For example, in a game scenario, an object's virtual position information may be the position of the user's player character in the game environment; in a multi-party online-meeting scenario, it may be the object's position in the virtual meeting room; and in an online-education scenario, the target application may present a virtual classroom containing multiple virtual seats, a student (object) joining the online class may choose one of the virtual seats, and each student's position in the virtual classroom (such as the position of the chosen virtual seat) can be used as that student's position information.
Optionally, the position information may include the object's coordinates and heading, such as the object's actual latitude-longitude coordinates and heading, or virtual coordinates and heading, such as those in a metaverse, a virtual meeting room, a virtual classroom, or a virtual game map. The coordinates may be two-dimensional or three-dimensional.
The embodiments of this application do not limit how an object's position information is obtained; it can be configured according to actual application needs. For example, with the user's authorization, the user terminal may report it to the server periodically at a preset interval, e.g. once per second in a heartbeat signal (heartbeat message); or report it only when the user's position changes, e.g. when the distance between the current and previous positions exceeds a set distance; or the user terminal may report periodically to an intermediate node, and the intermediate node reports the current position to the server only when it determines, from the currently and previously reported positions, that the user's position has changed. A minimal client-side sketch of such reporting follows.
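The change-triggered heartbeat reporting just described could look roughly as follows on the client side. This is a minimal Python sketch for illustration only: the one-second period, the distance threshold, and the get_position/send_heartbeat callables are assumptions, not details fixed by the original disclosure.

    import math
    import time

    REPORT_INTERVAL_S = 1.0   # heartbeat period (the text mentions once per second as an example)
    MIN_MOVE_DISTANCE = 5.0   # assumed threshold: report only after moving at least this far

    def run_position_reporter(get_position, send_heartbeat):
        # get_position() -> (x, y, heading); send_heartbeat(payload) uploads to the server.
        # Reports on a fixed heartbeat, but suppresses uploads while the position
        # stays within MIN_MOVE_DISTANCE of the last reported one, matching the
        # change-triggered variant described above.
        last_reported = None
        while True:
            x, y, heading = get_position()
            if last_reported is None or math.dist((x, y), last_reported) >= MIN_MOVE_DISTANCE:
                send_heartbeat({"x": x, "y": y, "heading": heading})
                last_reported = (x, y)
            time.sleep(REPORT_INTERVAL_S)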
An object's object attribute information can be understood as attributes the object itself objectively has; it may be personalized parameters of the object, including but not limited to the object's interests and hobbies. Optionally, the object attribute information may include information related to the object's media-data preferences, such as which types of audio or video the object likes or dislikes.
It should be noted that in the embodiments of this application, an object's related information is obtained with the object's authorization. The embodiments of this application do not limit how object attribute information is specifically obtained: it may be set by the user through the client of the target application, or obtained by the server through statistical analysis of the user's historical use of the target application.
S120: determine, according to the related information of the first object and of each second object, display attribute information corresponding to each second object.
Here, the display attribute information corresponding to one second object identifies the display attribute of that second object's media stream with respect to the first object.
S130: generate a target data stream according to the display attribute information corresponding to each second object.
Here, the target data stream includes the media stream of at least one target object, and the at least one target object is a second object determined from among the second objects.
S140: send the target data stream to the target terminal corresponding to the first object, so that the target terminal presents the media streams in the target data stream to the first object.
A second object's media stream refers to the media data that the second object's user terminal pushes to the server, or the media data corresponding to the second object that the server obtains, such as an anchor's live content in a live-streaming scenario, a participant's meeting video data in a multi-party online meeting, or, in a game scenario, the virtual-scene picture of the player's virtual character or the player's audio stream or video picture. In the embodiments of this application, a media stream may include, but is not limited to, an audio stream, a video stream, or another multimedia data stream.
In the embodiments of this application, for each second object, the display attribute information corresponding to that second object can be determined according to that second object's related information and the first object's related information. The display attribute information is the display attribute information of that second object's media stream with respect to the first object, i.e. information related to how that second object's media stream is presented; which attributes the display attributes of a media stream specifically include can be configured according to actual needs. In the embodiments of this application, the display attribute information may include, but is not limited to, at least one of whether to present and how to present. For example, for a media stream including an audio stream, the display attribute information may include whether to provide the second object's media stream to the first object and, when the stream is provided, how the audio stream is played to the first object. For a media stream including images (such as a video stream), the display attribute information may be the display attribute information of the media stream, i.e. in what form the second object's media stream is displayed to the first object.
In an optional embodiment of this application, the display attribute information of a media stream may include at least one of first information or second information, where the first information is used to determine whether to provide the second object's media stream to the first object, and the second information is used to determine the manner of presenting the media stream to the first object. In other words, the first information identifies whether the second object is a target object for the first object: if it is a target object, the server can provide that second object's media stream to the first object; if it is not, the stream is not provided to the first object even though the first and second objects correspond to the same room. The second information determines the manner of presenting the media stream to the first object; for example, for an audio stream it can determine how the audio stream is played to the first object, and for a video stream it can determine how the video picture is presented to the first object.
In the embodiments of this application, the target objects corresponding to the first object may be all of the second objects or only some of them. For example, in a multi-party online video meeting, every participant can be an anchor; for any participant (the first object), all other participants are second objects and can all directly serve as that first object's target objects. In that example, the display attribute information may omit the first information — there is no need to determine which objects are target objects — and may include the second information, determined from the related information of the first object and of each second object, to decide how each second object's media stream is presented to the first object.
Optionally, it is also possible to first determine, from the related information of the first object and of each second object, which second objects are target objects; the server may then provide the first object with the media streams of only some or all of those target objects. In this case the display attribute information includes the first information, and according to the first information corresponding to each second object the server can determine the target objects from among the second objects.
Optionally, when the target objects are filtered out of the second objects, after the target objects are determined the server may further determine, according to the related information of the first object and of each target object, the second information corresponding to each target object (at this point only the target objects' second information needs to be determined, not that of all second objects), so that when the target objects' media streams are sent to the first object's target terminal, each media stream can have its own presentation manner and the target terminal can present each stream to the first object according to each target object's media-stream presentation manner.
After determining the display attribute information corresponding to each second object, the server can generate the target data stream corresponding to the first object according to that information and provide it to the first object through the first object's target terminal. The target data stream includes at least the media streams corresponding to the target objects; after receiving the target data stream, the first object's target terminal can present the media streams in it to the first object. Optionally, the target data stream may also carry presentation-manner hint information for each target object's media stream, and the target terminal can present each stream to the first object according to the corresponding presentation-manner hint information.
As an optional solution of this application, the method may further include: if the current application scenario corresponding to the target application satisfies a preset condition, the display attribute information includes the second information and not the first information; if the current application scenario does not satisfy the preset condition, the display attribute information includes at least one of the first information or the second information.
Optionally, the current application scenario may refer to at least one of the program type of the target application or the number of all objects corresponding to the streaming-media identifier. For example, if at least one of the following holds — the program type of the target application is a first type, or the number of all objects corresponding to the streaming-media identifier is less than a set value — then the display attribute information corresponding to a second object includes the second information and not the first information; otherwise (e.g. the program type is not the first type), the display attribute information corresponding to a second object includes at least one of the first information or the second information.
That is, when the target application is a program of a specified type (such as online meetings), only the presentation manner of the media streams may be determined and all second objects may be taken as target objects; likewise, when the number of second objects in the current room (such as the number of anchors in a live room) is small, all second objects may be taken as target objects.
When there are multiple first objects corresponding to the same streaming-media identifier — for example multiple viewers in a live-streaming scenario — the related information of different first objects is likely to differ. Based on the above solution of this application, for each first object the server can determine, from that first object's related information and the second objects' related information, the display attribute information of each second object with respect to that first object, and thus provide each first object with media-stream data adapted to its own related information. It can be understood that the same second object's media stream may have different display attribute information with respect to different first objects.
In an optional embodiment of this application, the display attribute information includes the first information. The server can then determine, from the object-related information of the first object and of each second object, which second objects are the first object's target objects, i.e. which second objects' media streams are provided to the first object. For example, if a live room contains M anchors, M ≥ 1, then for any viewer (first object) the server can select, according to the related information of that viewer and of each anchor, which anchors' live content is transmitted to that viewer's target terminal. Determining the display attribute information corresponding to each second object according to the related information of the first object and of each second object includes:
for each second object, determining the degree of association between the first object and that second object according to the first object's related information and that second object's related information, and taking the degree of association as the first information corresponding to that second object;
generating the target data stream according to the display attribute information corresponding to each second object may include:
determining, from the second objects according to the degree of association corresponding to each second object, at least one target object matching the first object;
generating the target data stream based on the media stream corresponding to the at least one target object.
For any second object, its degree of association with the first object characterizes how well the first object and that object match. The degree of association may be computed from the position information and/or object attribute information of the first object and that second object: the greater a second object's degree of association with the first object, the better the match and the more likely that second object is the first object's target object.
As an optional solution, the related information may include position information. For each second object, determining the degree of association between the first object and that second object according to their related information includes:
determining the distance between the first object and that second object according to the first object's position information and that second object's position information, and using the distance to characterize the degree of association between the first object and that second object, where the distance is negatively correlated with the degree of association.
In this optional solution, for any viewer in a virtual room the server can determine, according to the distances between that viewer and the anchors, the target anchors whose live content is provided to that viewer; a greater distance means a lower degree of match between anchor and viewer. With this solution, the media streams of anchors that match the viewer well (i.e. anchors relatively close to the first object) can be provided to that viewer, so different viewers' terminals may receive different media streams, and when the same viewer's position information changes, the received media streams may change too. For example, when the position information is virtual position information in the virtual scene corresponding to the first object (such as the position of a player-controlled virtual game object in the game map), the first object's target objects may change when at least one of the first object's virtual position information or a second object's virtual position information changes.
As another optional solution, the degree of association between objects can be computed as follows:
computing a first correlation between the first object and the second object according to the first object's position information and the second object's position information;
computing a second correlation between the first object and the second object according to the first object's object attribute information and the second object's object attribute information;
obtaining the degree of association between the first object and that second object according to the first correlation and the second correlation.
The first correlation represents how associated the first and second objects are in position: the closer the distance, the greater the correlation. The second correlation represents how associated they are in preferences: the greater the correlation, the closer the preferences. After obtaining the two objects' degrees of association in these two dimensions, the overall degree of association can be obtained by combining the two correlations — for example, adding the first and second correlations or taking their mean — or a first weight corresponding to position information and a second weight corresponding to object attribute information can be obtained, and the degree of association computed as the weighted sum of the first and second correlations using the first and second weights. A sketch of such a weighted combination follows.
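As a concrete illustration of combining a distance-based first correlation with a preference-based second correlation through a weighted sum, consider the following minimal Python sketch. The field names (pos, tags), the reciprocal mapping from distance to correlation, the Jaccard tag overlap, and the 0.6/0.4 weights are illustrative assumptions rather than values given in the text.

    import math

    def association_score(first, second, w_pos=0.6, w_attr=0.4, scale=100.0):
        # First correlation: closer objects are more strongly associated
        # (distance is negatively correlated with the score).
        distance = math.dist(first["pos"], second["pos"])
        pos_corr = scale / (scale + distance)   # in (0, 1], 1 when co-located

        # Second correlation: overlap of interest tags as a stand-in for
        # preference similarity.
        tags_a, tags_b = first["tags"], second["tags"]
        union = tags_a | tags_b
        attr_corr = len(tags_a & tags_b) / len(union) if union else 0.0

        # Weighted sum, as in the weighted variant described above.
        return w_pos * pos_corr + w_attr * attr_corr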
With the above data processing method provided by the embodiments of this application, the media streams of the target anchors with a high degree of association with a viewer in the same room can be provided to that viewer. The method can provide different viewers in the same room with different media data (such as different anchors' live content), making data processing more flexible and improving user perception. Optionally, for any viewer, the corresponding target data stream may also include presentation-manner hint information for each target anchor's media stream (such as the second information above, or hint information corresponding to it); after receiving the target anchors' media streams and the presentation hints, the target terminal can present each target anchor's media stream to the viewer in its own manner, achieving differentiated presentation of different anchors' media data and further improving user perception.
The form of the media stream may differ between application scenarios: some have only audio data, some have video data, and video data may contain only pictures or both pictures and audio. In practice, for audio, the first object can hear a sound within a certain auditory (audible) distance no matter which direction it comes from; for video, whether the first object can see the picture depends not only on the first object's visual distance but also on its current heading (line-of-sight direction). Taking this factor into account, in an optional embodiment of this application, when the media stream includes a video stream, the position information includes the object's heading and position coordinates, and the method may further include:
for each second object, determining that second object's viewing-angle deviation relative to the first object according to that second object's position coordinates and the first object's position information;
the above determining at least one target object from the second objects according to the distances corresponding to the second objects includes:
determining at least one target object from the second objects according to the distances and the viewing-angle deviations corresponding to the second objects.
Based on this solution, for a video stream, whether a second object lies within the first object's visible range can be judged in two dimensions — visual distance and viewing angle — and if it lies within the visible range, the second object can be taken as a target object. The object's viewing angle may be a preset angle, configured by default on the server or set by the first object itself.
With this solution provided by this application, the distance between the first and second objects can be computed from their position coordinates, and the second object's viewing-angle deviation with respect to the first object can be computed from the first object's position information (coordinate position and heading) and the second object's position coordinates. If that angular deviation lies within the first object's viewing-angle range — for example, a range of 150 degrees, i.e. 75 degrees to each side of the first object's line of sight, so a second object whose angle from the first object's current line-of-sight direction does not exceed 75 degrees is considered within range — and the distance between the second object and the first object lies within the first object's visual distance, the second object can be taken as a target object.
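A minimal sketch of this two-dimensional visibility test (distance plus viewing angle) might look as follows; the 50-unit visual distance, the 150-degree field of view, and the convention that a heading of 0 degrees faces the +Y axis are illustrative assumptions consistent with the coordinate description used later.

    import math

    def is_visible(viewer_pos, viewer_heading_deg, target_pos,
                   max_distance=50.0, fov_deg=150.0):
        # Distance check: the target must lie within the visual distance.
        dx = target_pos[0] - viewer_pos[0]
        dy = target_pos[1] - viewer_pos[1]
        if math.hypot(dx, dy) > max_distance:
            return False
        # Angle check: the bearing offset from the viewer's heading must not
        # exceed half the field of view (150 degrees -> 75 degrees per side).
        bearing = math.degrees(math.atan2(dx, dy))                  # 0 deg = +Y axis
        offset = (bearing - viewer_heading_deg + 180.0) % 360.0 - 180.0
        return abs(offset) <= fov_deg / 2.0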
In an optional embodiment of this application, when the distance between objects is used to characterize their degree of association, determining at least one target object from the second objects according to the corresponding distances may be: determining, among the distances corresponding to the second objects, those not greater than a preset value as target distances, and determining the second objects corresponding to the target distances as target objects — that is, providing the viewer with the live content of anchors within a certain distance of the viewer.
To ensure that the first object can obtain at least one second object's media stream, or to avoid providing too many second objects' media streams to the first object and thereby degrading the object's perception, in the embodiments of this application, determining the distances not greater than the preset value as target distances and the second objects corresponding to the target distances as target objects may include:
Method one: if the number of target distances does not meet a preset count requirement, adjusting the preset value so as to determine, from the distances corresponding to the second objects, target distances that meet the preset count requirement, and determining the second objects corresponding to those target distances as target objects;
Method two: if the number of target distances is greater than a set count, taking the second objects corresponding to the target distances as candidate objects, obtaining the quality of each candidate object's media stream, and determining at least one target object from the candidate objects according to the quality of each candidate object's media stream.
For method one, the preset count requirement may be a preset count range, which may be a set value or a value interval containing multiple positive integers. The set value may include at least one of an upper limit or a lower limit on the count: the upper limit bounds the maximum number of target objects, the lower limit bounds the minimum, and a value interval requires the final number of target objects to fall within the interval. According to the preset count requirement, if the number of target objects selected with the initial preset value does not meet the requirement, the preset value can be adjusted one or more times and the target objects re-determined with the adjusted value, until the number of determined target objects meets the preset count requirement. For example, suppose the requirement includes a lower limit, a positive integer not less than 1, say 1: if the distances between all second objects and the first object exceed the initial preset value, the number of target objects is 0, which does not meet the requirement, so the preset value can be increased by a preset step and the target objects re-determined with the increased value. Likewise, if the requirement includes an upper limit and the number of distances not greater than the initial preset value exceeds the upper limit, the preset value can be decreased by a certain step and the target objects re-determined with the decreased value.
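Method one's adjust-and-retry loop could be sketched as below; the count bounds, step size, and round limit are illustrative assumptions.

    def pick_by_distance(distances, radius, min_count=1, max_count=8,
                         step=10.0, max_rounds=16):
        # distances: {second_object_id: distance to the first object}.
        chosen = []
        for _ in range(max_rounds):
            chosen = [o for o, d in distances.items() if d <= radius]
            if len(chosen) < min_count:
                radius += step      # too few targets: enlarge the range
            elif len(chosen) > max_count:
                radius -= step      # too many targets: tighten the range
            else:
                break
        return chosen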
For method two: since a second object's media stream is received by the server from the second object's user terminal, and terminal performance, the data transmission link between the user terminal and the server, and the network quality on the terminal side all differ, the quality of different second objects' media streams usually differs too. For example, in a live-streaming scenario, the quality of different anchors' live video streams varies — picture clarity, stuttering, and so on differ. For these reasons, if the number of target objects selected from the distances between the first object and the second objects and the preset value is too large (greater than the set count), these objects can be further filtered according to the quality of their media streams, for example selecting from them the set count of target objects with the better media-stream quality.
The embodiments of this application do not limit how media-stream quality is specifically evaluated and determined; any existing way of evaluating media-data quality can be used. Optionally, the quality of a second object's media stream can be evaluated from information such as the data transmission delay and packet loss rate between the target server and the second object's server, or a trained neural network model can be used to predict the quality of each second object's media stream.
Through the above optional solutions, the problem that the first object might fail to obtain any media stream can be avoided, providing too many media streams to the first object can be avoided, and the first object can be provided with media streams of the best possible quality, such as video or audio streams of relatively good quality.
As another optional solution of this application, the method may further include:
obtaining the quality of each second object's media stream; and determining at least one target object from the second objects according to the quality of each second object's media stream.
In this solution, which second objects' media streams are provided to the first object can be chosen according to the quality of each second object's media stream, achieving automated quality-based filtering, so that the first object obtains media streams of assured quality and media streams of poor quality are not provided to the object, which would degrade the user's perception. Optionally, the second objects whose media-stream quality is not lower than a preset quality can directly be taken as target objects; or, in descending order of quality, the second objects corresponding to a certain number of the top-ranked qualities can be determined as target objects.
Optionally, when filtering target objects from the second objects, the determination can also combine the degree of association between the first and second objects with the quality of the second objects' media streams. As one optional manner, a correction coefficient for the degree of association between a second object and the first object can be determined from the quality of that second object's media stream; the degree of association is corrected by the correction coefficient, and the target objects are determined from the second objects based on the corrected degrees of association corresponding to the second objects. For example, the correction coefficient may be an integer within a certain value range, with each second object's coefficient positively correlated with that second object's media-stream quality: after the media-stream qualities of the second objects are computed, they can be normalized into the coefficient's value range based on that range, each stream's normalized quality taken as its correction coefficient, and, for each second object, that object's coefficient added to or multiplied with the degree of association between that object and the first object to obtain the corrected degree of association.
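One way the normalize-then-correct step could look is sketched below, here using multiplicative correction over a real-valued coefficient range rather than integers; the 0.5–1.5 range and the dictionary shapes are assumptions for illustration.

    def corrected_scores(scores, qualities, coeff_range=(0.5, 1.5)):
        # scores / qualities: {second_object_id: float}.
        lo, hi = coeff_range
        q_min, q_max = min(qualities.values()), max(qualities.values())
        span = (q_max - q_min) or 1.0                       # avoid divide-by-zero
        coeff = {o: lo + (hi - lo) * (q - q_min) / span     # normalise into the range
                 for o, q in qualities.items()}
        # Correct each association score by its stream-quality coefficient.
        return {o: s * coeff.get(o, 1.0) for o, s in scores.items()}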
In the solutions provided by this application, the target objects matching the first object can be determined from the second objects through, but not limited to, the specific implementations above. In some application scenarios, filtering may not be needed at all, and every second object can be taken as a target object. For example, in a multi-party online video or voice meeting where all participants have the right to speak, every participant is both viewer and anchor; if the current number of participants is small, then for any participant, all other participants can serve as that participant's target objects.
The embodiments of this application do not limit the specific number of finally selected target objects; it can be configured and adjusted according to the actual application scenario. Optionally, the number of target objects can be determined as follows:
determining the total number of objects corresponding to the streaming-media identifier;
determining the number of target objects according to the total number.
Based on this optional solution, the number of target objects can be determined from the total number of users in the current room, achieving adaptive adjustment of the target-object count. Optionally, the total number and the number of target objects are positively correlated: the larger the total, the larger the number of target objects may be. As another option, the total number may refer to the total number of second objects corresponding to the current streaming-media identifier, i.e. the total number of anchors in the current room; in practice, if the total number of anchors in the current room is below a certain count, all anchors can directly be taken as target objects. In embodiments where the number of target objects is determined first, the corresponding number of target objects can be selected from the second objects according to the related information of the first object and of each second object — for example, selecting, by degree of association, the corresponding number of second objects with relatively high association as target objects and providing those objects' media streams to the first object.
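A tiny sketch of such a positively correlated, clamped mapping from room size to target-object count; the floor, cap, and ratio are invented for illustration.

    def target_count(total_objects, floor=2, cap=8, ratio=0.1):
        # Grows with the room's total object count, within illustrative bounds.
        return max(floor, min(cap, round(total_objects * ratio)))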
When the first object's target objects have been determined — that is, it is known which second objects' media streams are to be provided to the first object — in an optional embodiment of this application the related information may include position information, and the display attribute information includes the second information, i.e. information used to determine the presentation manner of the media streams. Determining the display attribute information corresponding to each second object according to the related information of the first object and of each second object may include:
for each target object, determining orientation indication information between that target object and the first object according to the first object's position information and that target object's position information, and taking the orientation indication information as the second information corresponding to that target object;
generating the target data stream according to the display attribute information corresponding to each second object may include:
generating the target data stream according to the orientation indication information corresponding to each target object and the media stream corresponding to each target object, where the target data stream further includes the orientation indication information corresponding to each target object, used to instruct the target terminal to present each target object's media stream to the first object according to the orientation indication information corresponding to each target object.
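When the orientation indication is sent explicitly (the text also allows simply forwarding raw position information), a server-side sketch computing one target object's hint might look like this; the distance-plus-bearing shape and the 0-degrees-equals-+Y convention are assumptions consistent with the examples below.

    import math

    def bearing_indication(first_pos, target_pos):
        # Orientation hint for one target object as seen from the first object.
        dx = target_pos[0] - first_pos[0]
        dy = target_pos[1] - first_pos[1]
        return {
            "distance": math.hypot(dx, dy),
            "bearing_deg": math.degrees(math.atan2(dx, dy)) % 360.0,  # 0 = +Y axis
        }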
In the embodiments of this application, after the first object's target objects are determined, the server can send the target objects' media streams to the first object; optionally, the target data stream can be generated directly from the target objects' media streams (or from the media streams and the orientation indication information) and sent to the target terminal. As another optional solution, before generating the target data stream, the method may further include:
sending a target-object list to the target terminal so that the target terminal presents the target-object list to the first object, the list including each target object's object identifier;
receiving object-selection feedback information for the target-object list sent by the target terminal, the feedback being generated by the target terminal in response to the first object's selection operation on the target-object list;
where the media streams included in the target data stream are the media streams of the at least one target object corresponding to the object-selection feedback information.
That is, before the target objects' media streams are sent to the first object, the first object can choose which of these target objects' media streams it wishes to receive; the first object can select some or all of the objects in the list as needed, and the server can then send the media-stream data of some or all target objects to the target terminal according to the first object's selection.
As another optional solution, when receiving the first object's media-data acquisition trigger operation (such as entering a live room), the server may first present, through the first object's target terminal, a media-data presentation-mode option interface to the first object, whose options may include a first mode and a second mode. If the first object selects the first mode, the server can provide media streams to the first object by executing the method shown in S110 to S140 above; if the first object selects the second mode, the second objects' media streams can be provided to the first object in another manner. Optionally, the target-object list can also be provided to the first object, who chooses which target objects' media streams are provided. Optionally, since the first object's target objects may change, the latest target-object list can be sent to the first object whenever they change, and the first object can select again. Optionally, a selectable-object list can also be sent to the target terminal; it may contain the object identifiers of all second objects, or of the target objects, or of the target objects plus some second objects that are not target objects, giving the first object more choices. For example, the selectable identifiers may include all target objects' identifiers and at least one non-target object's identifier, and the identifiers of target and non-target objects may be displayed with different hint information, telling the first object the difference between target and non-target objects, such as that target objects are objects within a certain distance range of the first object.
The data processing method provided in the embodiments of this application can intelligently recommend anchors to users based on factors such as the objects' position information, object attribute information, or media-stream quality, and can also, based on one or more of these factors, provide users with intelligent presentation of media data, achieving a lifelike, immersive real-time audio/video solution for mutual communication that better satisfies user needs and improves user perception.
The data processing method provided by the embodiments of this application may also be executed by a user terminal. When executed by a user terminal, the data processing method may include:
in response to a media-data acquisition trigger operation of a first object corresponding to a target application, presenting to the first object the media stream of at least one target object among the second objects in a candidate object set;
where each second object has corresponding display attribute information; the at least one target object is a second object determined from among the second objects; the media stream of the at least one target object matches the display attribute information corresponding to the at least one target object; the display attribute information corresponding to one second object identifies the display attribute of that second object's media stream with respect to the first object and is adapted to the related information of the first object and of that second object; the related information includes at least one of position information or object attribute information; and the first object and each second object are objects corresponding to the same streaming-media identifier of the target application.
In the embodiments of this application, the media-data acquisition trigger operation is an operation triggered by the first object to obtain media data from the target application's server. It may include, but is not limited to, a trigger operation by which the first object joins the virtual scene corresponding to the same streaming-media identifier, such as the first object tapping to join a live room on the user interface of a live-streaming application, tapping to start a game match in a game application, or joining a meeting in an online-meeting scenario. When the first object's user terminal obtains this operation, it can present at least one second object's media stream to the first object according to the relative relationship between the first object and each second object, achieving presentation of media data adapted to the first object.
After obtaining the first object's media-data acquisition trigger operation, the first object's user terminal can send a media-data acquisition request to the server. The server can obtain the related information of the first object and of each second object, determine the display attribute information of each second object's media stream with respect to the first object, generate a corresponding target data stream according to the display attribute information corresponding to each second object, and send it to the user terminal, which can then present to the first object the media data, sent by the server, that is adapted to the first object's related information. In the embodiments of this application, media data adapted to the first object's related information may mean the media stream of at least one second object adapted to the first object, or the media streams of at least one second object together with the presentation manner of each second object's media stream.
Based on this method provided by this application, for each pulling-end object under the same streaming-media identifier (i.e. the first object, such as a viewer of the same live room), the media data that object's user terminal presents to it matches the display attribute information, with respect to that object, of each pushing-end object corresponding to that streaming-media identifier (i.e. the second objects, such as the room's anchors). The display attribute information of a second object with respect to the first object can reflect the relative relationship between them (such as a degree of association); based on this method, the media data provided to the first object — which includes at least one second object's media stream — can be decided according to each second object's relative relationship with the first object. For example, if the display attribute information includes the first information, which characterizes whether a second object's media stream is presented to the first object, and the first information corresponding to a second object indicates that its media stream is not presented to the first object, then the media streams the first object's user terminal presents to the first object do not include that second object's media stream.
The method may further include: in response to a change in the display attribute information corresponding to the second objects, presenting to the first object the media stream of at least one target object matching the changed display attribute information corresponding to the second objects.
That is, if the display attribute information of the second objects' media streams with respect to the first object changes, the media data the first object's user terminal presents to the first object may also change: the target objects may have changed, or the presentation manner of the target objects' media streams may have changed.
A change in a second object's display attribute information results from a change in the first object's related information or the second object's related information. For example, the relative position relationship (such as distance) between the first object and a second object can be determined from their position information, and the display attribute information can be determined from that relative position relationship; the first object's target objects may then be the second objects whose distance to the first object is less than a preset value, and the first object's user terminal can present those target objects' media streams to the first object. Optionally, if the current target objects have changed relative to those of the previous period, only the current target objects' media streams may be presented to the first object: for example, if the current target objects are objects A and B while the previous period's target objects were A and C, the media streams of A and B are now provided to the first object.
In an optional embodiment of this application, the display attribute information includes first information, used to determine whether to provide a second object's media stream to the first object.
Optionally, presenting the media stream of at least one target object among the second objects to the first object includes:
taking the second objects that lie within the first object's visible range and/or within the first object's audible distance as the target objects, and presenting the target objects' media streams to the first object.
That is, second objects lying within the first object's visible range and/or audible distance are target objects, and the user terminal may present only those target objects' media streams to the first object. A second object lies within the first object's visible range when at least one of the following holds: the distance between the second object and the first object lies within the first object's visual distance, or the second object lies within the first object's viewing angle. Lying within the first object's audible distance means that the distance between the second object and the first object is less than a set distance.
In an optional embodiment of this application, the display attribute information includes second information, used to determine the manner of presenting the media stream to the first object.
Based on this optional solution of this application, a personalized media-stream presentation manner can be provided for the first object. Specifically, each target object's media stream can be presented to the first object according to the relative orientation between that target object and the first object: the first object's target terminal can determine, from the orientation indication information corresponding to each target object, the relative orientation information between each target object and the first object, and present each target object's media stream to the first object according to the relative orientation information corresponding to each target object.
For any target object, the orientation indication information between the first object and that target object can be any form of information from which their relative orientation can be determined. Optionally, the orientation indication information may be explicit indication information — for example, each target object's orientation indication information may be the relative orientation information between that target object and the first object (such as in which direction of the first object the second object lies), i.e. the server directly tells the target terminal the relative orientation between each target object and the first object — or implicit indication information, such as the target object's position information, from which the target terminal can determine each target object's relative orientation information with respect to the first object using the first object's position and the target objects' positions told by the server. A target object's relative orientation information may include, but is not limited to, the distance between the two objects and the target object's direction with respect to the first object. Based on each target object's orientation indication information, the target terminal can present each target object's media stream in a presentation manner matching the orientation indication information, giving the first object an immersive, on-the-scene perception, effectively improving the user's experience and better satisfying practical application needs.
In other words, the relative orientation information between a second object and the first object can determine at least one of the second object's first information or second information: the first information decides whether the second object's media stream is provided to the first object, and the second information decides, when the second object's media stream is provided to the first object, the presentation manner of that media stream. The media streams presented to the first object may be those of the second objects satisfying a condition (for example, second objects whose distance to the first object is not greater than a preset value) or those of all second objects; when presenting these media streams to the first object, they can be presented according to each target object's relative orientation information with respect to the first object. For example, for audio streams, each target object's audio stream can be played to the first object in a spatial-audio playback manner according to each target object's relative orientation with respect to the first object; for video streams, each target object's video image can be displayed on the first object's user terminal according to each target object's relative orientation with respect to the first object.
In an optional embodiment of this application, the media stream includes an audio stream, and the target terminal can present each target object's media stream to the first object as follows:
determining, according to the orientation indication information corresponding to each target object, the spatial playback direction of each target object's audio stream relative to the first object;
playing each target object's audio stream to the first object in a spatial-audio playback manner according to the spatial playback direction corresponding to each target object.
Optionally, the target terminal can determine each target object's relative orientation information from the orientation indication information corresponding to each target object, and then determine, according to the relative orientation information corresponding to each target object, the spatial playback direction of each target object's audio stream relative to the first object.
Optionally, the orientation indication information may be the target object's position information, the relative orientation information between the target object and the first object, or spatial-playback-direction indication information for the target object's audio stream. When the orientation indication information is the target object's position information, after receiving each target object's audio stream and position information from the server, the target terminal can, for each target object, compute that target object's relative orientation with respect to the first object from the first object's position information and the target object's position information, and from that relative orientation determine how the audio stream is played, i.e. the spatial playback direction above — for example, directly taking the relative orientation as the sound playback direction of the corresponding audio stream. The target terminal can then play each target object's audio stream to the first object in a spatial-audio manner, giving the first object a more lifelike, immersive audio experience.
As an example, for any viewer Z (first object), suppose Z's target anchors (target objects) are anchors A, B, C, and D, with A behind Z, B to Z's right, C to Z's left, and D directly in front of Z. With this method provided by this application, Z hears A's voice coming from behind, D's voice directly in front, B's voice on the right, and C's voice on the left.
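True spatial audio would typically be rendered with HRTFs; as a minimal stand-in, the following sketch maps a relative bearing to constant-power stereo gains, which already reproduces the left/right/front behaviour of the example above (a source directly behind collapses to the centre in plain stereo panning). The panning law and angle convention are illustrative assumptions.

    import math

    def stereo_gains(relative_bearing_deg):
        # relative_bearing_deg: direction of the anchor relative to the
        # listener's heading (0 = straight ahead, 90 = right, -90 = left).
        pan = math.sin(math.radians(relative_bearing_deg))   # pan in [-1, 1]
        theta = (pan + 1.0) * math.pi / 4.0                  # 0 .. pi/2
        return math.cos(theta), math.sin(theta)              # (left_gain, right_gain)

    # For anchor D straight ahead (0 deg) both gains are equal; for B at 90 deg
    # the right gain dominates; for C at -90 deg the left gain dominates.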
In an optional embodiment of this application, the media stream includes a video stream, and the target terminal can present each target object's media stream to the first object as follows:
determining, according to the orientation indication information corresponding to each target object, the video display position of each target object's video stream on the target terminal's user interface;
displaying each target object's video stream to the first object according to the video display position corresponding to each target object.
Optionally, the target terminal can determine each target object's relative orientation information with respect to the first object from the orientation indication information corresponding to each target object, and then determine, according to the relative orientation information corresponding to each target object, the video display position of each target object's video stream on the target terminal.
For video streams, the target terminal can play each target object's video stream to the first object according to the relative orientation between the first object and the target object. Optionally, the orientation indication information may be the target object's position information, the relative orientation information between the target object and the first object, or display-position indication information for each target object's video stream on the target terminal. That is, the specific display manner of each target video stream may be determined by the target terminal itself from the received orientation indication information, or determined by the server from the position information of the first object and of each second object and then communicated to the target terminal.
As an optional solution, the target terminal may include at least two display areas (such as multiple screens); the target display area corresponding to each target object's video stream can be determined from the relative orientation information between each target object and the first object, and each target object's video stream displayed to the first object in its corresponding target display area. For example, suppose the target terminal has two screens placed left and right and there are three target objects, with target objects a and b both to the first object's left and target object c to the first object's right: the video pictures of a and b can be displayed on the left screen and the video picture of c on the right screen. As another example, if target object a is to the first object's upper left and target object b to the first object's lower left, the video pictures of a and b can be displayed correspondingly at the top and bottom of the left screen (a minimal layout sketch follows this subsection).
It can be understood that in practice a video stream may include only images, or both images and audio. For the latter, based on the solutions provided by the embodiments of this application, at least one of the presentation manner of the pictures or of the audio in a target object's video stream can be determined from the related information of the first and second objects, and each target object's video stream presented to the first object according to the presentation manner of at least one of the video pictures or the audio.
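A simple sketch of the bearing-based layout assignment across display regions follows; the region names and angle cuts are invented for illustration.

    def assign_display_regions(relative_bearings):
        # relative_bearings: {target_object_id: bearing in degrees, 0 = straight ahead}.
        layout = {}
        for target_id, bearing in relative_bearings.items():
            b = (bearing + 180.0) % 360.0 - 180.0   # normalise to [-180, 180)
            if -30.0 <= b <= 30.0:
                layout[target_id] = "front"
            elif b < -30.0:
                layout[target_id] = "left"          # includes rear-left here
            else:
                layout[target_id] = "right"         # includes rear-right here
        return layout

    # e.g. {"C": -60, "D": 0, "B": 75} -> C on the left, D in front, B on the
    # right, matching the arrangement of Fig. 9.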
The solutions provided by the embodiments of this application are applicable to many application scenarios, including but not limited to online live streaming, online games (including but not limited to board games, online shooters, and other game types), online social networking, online education, the metaverse, online recruitment, online contract signing, and other online communication scenarios. To better explain the method provided by the embodiments of this application and its practical value, the method is described below with reference to a specific scenario embodiment, which uses an online live-streamed tourism scenario as an example.
Optionally, Fig. 2 shows a schematic structural diagram of an implementation environment of a real-time audio/video system to which this scenario embodiment applies. As shown in Fig. 2, the implementation environment may include an application server 110 and user terminals, where the application server 110 is the server of the target application and can provide online live-streaming services for the target application's users. In this scenario embodiment, the second objects corresponding to the target application are anchors and the first object is a viewer of the live room; note that an anchor may be a real person or an AI anchor (i.e. a virtual anchor). A user terminal may be any terminal running the target application; in this scenario embodiment, the user terminals include the viewers' terminals and the anchors' terminals. Fig. 2 schematically shows n pulling ends of the same live room (terminal 121 of viewer 1, terminal 122 of viewer 2, ..., and terminal 12n of viewer n) and m pushing ends 130 (terminal 131 of anchor 1, terminal 132 of anchor 2, ..., and terminal 13m of user m). Each pulling end 120 and pushing end 130 can communicate with the application server 110 through wired or wireless networks. A terminal user can start an online live stream as an anchor through the target application running on the user terminal, or join a live room as a viewer; one live room can contain multiple anchors at the same time.
A pushing end can capture the anchor-side video stream (i.e. the live content) through a corresponding video capture device (which may be the terminal itself or a capture device connected to the terminal), compress and packetize the captured video stream, and push it to the application server 110; for example, terminal 131 sends anchor 1's live content (video stream 1) to the application server. A pulling end can pull the live content held by the server to the viewer's user terminal and present the pulled live content to the viewer.
The embodiments of this application do not limit the system architecture used by the backend service of the real-time audio/video system; in principle, any real-time audio/video communication architecture can be used. As an optional solution, Fig. 3 shows a schematic structural diagram of a real-time audio/video system provided by an embodiment of this application. For a user, all devices in the real-time audio/video system other than that user's own user terminal can be understood as service nodes of the application server; these service nodes work together to provide the user with real-time audio/video services.
In the real-time audio/video system architecture shown in Fig. 3, media access services can be provided for users region by region. Fig. 3 schematically shows two different regions, region A and region B. For one region, if it has particularly many users, it can be further partitioned, with multiple media-access service sub-regions configured for it: region A set3 in the figure is one sub-region (set) of region A, and region B set1 is one set of region B. Each media-access service set may include a media management node and media access machines deployed with a QoS (Quality of Service) module; the media management node may include the intra-set room management module and intra-set load management module shown in Fig. 3, and the media access machines can be deployed in a distributed manner.
Taking region A in Fig. 3 as an example, when deploying in this region, the total number of media access machines needed can be computed from the region's total user count, and the share of media access machines per network operator can be allocated by operator ratio. Suppose region A needs 100 media access machines and the ratio of operators a, b, and c is 4:4:2; then 40 media access machines for operator a, 40 for operator b, and 20 for operator c can be deployed in region A, which together with the intra-set load management and intra-set room management form one set. If a region has particularly many users, multiple sets can also be deployed in that region.
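The ratio-based split above (100 machines at 4:4:2 giving 40/40/20) can be captured in a few lines; the rule for distributing rounding remainders is an assumption.

    def allocate_by_operator(total_machines, ratio):
        # e.g. allocate_by_operator(100, {"a": 4, "b": 4, "c": 2})
        #      -> {"a": 40, "b": 40, "c": 20}, as in the region-A example.
        total_weight = sum(ratio.values())
        alloc = {op: total_machines * w // total_weight for op, w in ratio.items()}
        leftover = total_machines - sum(alloc.values())
        # Hand any rounding remainder to the operators with the largest weights.
        for op in sorted(ratio, key=ratio.get, reverse=True)[:leftover]:
            alloc[op] += 1
        return alloc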
The system architecture shown in Fig. 3 can use distributed load management and distributed room management. Each set has its own load management service, the intra-set load management module above, which collects the load of each module in the set (including the media access machines). Each set has its own room management module, responsible for managing the set's rooms and users. If a room management module crashes, recovery can be performed by migration through the media access machines and the intra-set room management module. A distributed QoS module is deployed in each media access machine for QoS regulation of that machine's users.
The service system of this architecture can use unified media access: every media access machine can accept connections from both anchors' clients and viewers' clients, with no further distinction between interface machines and proxy machines. Anchors and viewers all connect uniformly to media access machines, and a viewer taking the microphone does not need to go through the flow of leaving the room and re-entering to an interface machine; after switching roles, audio/video data can be sent upstream directly.
The architecture is also configured with upper-level room management and load management modules ("room management" and "load management" in Fig. 3). The intra-set load management and room management modules above can be called second-level management modules, and the upper-level room management and load management modules first-level management modules. The second-level modules manage rooms and load for the devices within a set, while the first-level modules manage the individual regions; the first-level modules can receive reports from each region's second-level modules to manage and control the regions. For example, the first-level load management module can obtain each region's load information from the second-level load management modules (such as the number of connected users, the load of the media-access sets, and device resource usage) and send the load information of each region to the dispatch module, which can dispatch media access for each region according to the received information.
The flow of a user joining a live room is briefly described below with reference to the system structure shown in Fig. 3. The flow may include the following steps:
a) The user end sends a media-access allocation request to the packet-forwarding layer (forwarding node). The media-access request may include the streaming-media identifier and the client identifier (user identifier).
b) The packet-forwarding layer determines the cluster the user belongs to from the user identifier or the room number, and forwards the access request to that cluster's dispatch module ("dispatch" in the figure).
Optionally, the packet-forwarding layer can be preconfigured with clusters corresponding to multiple streaming-media-identifier ranges, such as cluster 1 to cluster N shown in Fig. 3, with different clusters corresponding to different identifier ranges. On receiving a media-access request, the forwarding layer can determine, from the streaming-media identifier in the request, the identifier range it falls into, determine the target cluster from that identifier range, and forward the access request to the target cluster's dispatch module.
c) The dispatch module allocates a media access machine for the media-access request (a minimal sketch of this two-stage selection follows the flow).
With the architecture shown in Fig. 3, when allocating media access machines the dispatch module no longer needs single-machine-level aggregation scheduling, but performs set-level aggregation scheduling: specifically, a set can be chosen according to the load of the nearby sets, and then a media access machine chosen within the set by load-weighted random selection. Since a single set (each set can contain multiple media access machines) has a hundred times the processing capacity of a single media access machine, this resists high-concurrency room entry more simply and effectively and prevents excessive aggregation on single machines. The whole dispatch-allocation process can be completed locally, greatly simplifying the flow. Finally, the allocated media access machine is returned to the user end, e.g. the media access machine's IP address is sent to the user terminal.
d) Having obtained the media access machine's address, the user end connects to the media access machine ("media access | QoS" in the figure).
If the room the user end requests to enter exists on the media access machine, this live-room entry flow is complete; otherwise, corresponding processing can be performed according to a preconfigured policy — for example, the intra-set room management module and the dispatch module can follow a room-entry policy so that the user finally enters the room, or the user can be told that the live room does not exist and the room-entry flow can be executed again.
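A minimal sketch of the two-stage, set-level dispatch referenced in step c); the data shapes, the lowest-load set rule, and the headroom weighting are assumptions (proximity filtering of candidate sets is taken as already done upstream).

    import random

    def pick_media_access(candidate_sets):
        # candidate_sets: list of {"set_id": ..., "load": float in [0, 1],
        #                          "machines": [{"ip": ..., "load": float}, ...]}
        best_set = min(candidate_sets, key=lambda s: s["load"])   # set-level choice
        machines = best_set["machines"]
        # Load-weighted random choice within the set: lighter machines more likely.
        weights = [max(1.0 - m["load"], 0.05) for m in machines]
        chosen = random.choices(machines, weights=weights, k=1)[0]
        return chosen["ip"]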
In this scenario embodiment, after a user enters a live room, with the user's authorization the media access machine can obtain the attribute information of each user in the room, for example the users' preference information. Optionally, in the embodiments of this application, the client of the target application can also offer users various personalized setting options, letting users configure aspects of data sending and receiving during live participation as needed. Optionally, the setting options may include, but are not limited to, an audible-distance setting, a visual-distance setting, and a viewing-angle setting; these options can each have default values, and through them users can modify their own audible range, visible range, and so on. With the user's authorization and consent, tag information can also be set for users through statistical analysis of users' historical operations during real-time audio/video sessions and of interactions between users; these tags can also serve as user attribute information. For example, users who chat with each other for a long time can be assigned common tags, and users who are chatted with for a long time can accumulate a popularity value; when many target objects are selected, the target objects can be further filtered based on the target objects' popularity in addition to the quality of the objects' media streams.
For position information, a user terminal can report its position to the media access machine it is connected to at a certain time interval. Fig. 4 shows a schematic diagram of real-time position reporting based on the system structure of Fig. 3; in this scenario embodiment the streaming-media identifier may include, but is not limited to, the live room's room identifier, and the user in Fig. 4 may be any user of the live room, including anchors and viewers. Optionally, as shown in Fig. 4, a user's terminal can send a heartbeat signal containing the user's position information (a heartbeat with coordinates) to its media access machine once per second, where the position information may be {X coordinate, Y coordinate, heading (e.g. rotation angle relative to the Y axis)}; the embodiments of this application do not limit the choice and definition of the coordinate system. After receiving every user's real-time position heartbeat, the media access machine can aggregate the users' position information and pass it to the intra-set room management module, which aggregates it further and passes it to the first-level room management module. Optionally, the media access machine may aggregate and send the received user positions to the intra-set room management module at a preset interval, or may first check each user's position information and report to the intra-set room management module only when the user's current position has changed relative to the previously received one, e.g. only when the distance between a user's two consecutive positions exceeds a preset distance. Likewise, the intra-set room management module may, after aggregation, report to the first-level room management module only when user positions have changed.
The position information may be actual latitude-longitude coordinates and heading, or virtual coordinates. In an online tourism live-streaming scenario, for example, the tourism environment may be a real environment or a virtual one: in a virtual environment, the user can control a target virtual character to move and tour within the virtual environment, while in a real environment, the anchor can provide live-streaming services for viewers in the real environment, taking viewers on an online tourism experience.
Optionally, the application server 110 can store the related information of the objects corresponding to each terminal connected to the application server 110 (the viewers' related information and the anchors' related information) in the database 140. For any viewer of the same live room, based on the information stored in the database 140 and by executing the data processing method provided by the embodiments of this application, the media access machine can provide that viewer with media data better adapted to them according to the viewer's related information and the related information of each anchor in the room. As shown in Fig. 2, the application server 110, according to viewer 1's related information and the m anchors' related information, sends video stream 13, containing the live content of anchor 1 and anchor 3, to terminal 121. Terminal 121 can display anchor 1's live picture 1 and anchor 3's live picture 3 according to the relative position relationship between anchor 1 and viewer 1 (relative orientation information can include relative position relationships) and the relative position relationship between viewer 1 and anchor 3: specifically, anchor 1's live picture 1 is shown on the left side of terminal 121 and anchor 3's live picture 3 on the right side. For viewer 2, the application server 110 sends video stream 23, containing anchor 2's live content and anchor 3's live content, to terminal 122, which displays anchor 2's and anchor 3's live pictures 2 and 3. Terminal 12n displays to viewer n anchor 1's live picture 1 and anchor 3's live picture 3 from the video stream 13 the server sent to that terminal, where, as shown in Fig. 2, the display manner of live pictures 1 and 3 on terminal 12n differs from their display manner on terminal 121.
Optionally, as shown in Fig. 5, the first-level room management module can add the anchors' attribute information and real-time position information to the signaling that pushes the anchor list (the full anchor list) to the intra-set room management modules of the same room, and the intra-set room management module then spreads it to all media access machines of the same room within the set. A media access machine can customize the anchor list according to the related information of the anchors and viewers in the same room, tailored to the user's characteristics, and deliver it to the user, in ways including but not limited to the following:
1) Fixed-range recommendation: for audio, the maximum distance the user can hear can be set, and anchors within this audible radius are included in the customized anchor list; for video, the visual distance and angle at which the user can see other users can be set, e.g. a visual radius of d and a visual angle of 150° centred straight ahead of the user's heading, i.e. 75° to each side.
2) Position-based intelligent range recommendation: according to the number of users in the current room, ensure the user can see and hear relatively close anchors. If there are many users within the visual or audible range, closer anchors can be given priority.
3) Quality-based intelligent range recommendation: according to the number of users in the current room, ensure the user can see and hear anchors of relatively good quality. If there are many users within the visual or audible range, anchors with lower latency, higher volume, and clearer pictures — i.e. anchors with better media-stream quality — can be chosen preferentially.
4) Object-attribute intelligent range recommendation: according to the attribute information the user carries when entering the room, preferentially recommend anchors the user is likely to find more interesting.
5) Comprehensive intelligent range recommendation: the anchor list can be customized by combining recommendation strategies 1) to 4) above.
The above recommendation strategies are introduced below with two examples; note that the two examples can also be used in combination.
Example 1: position-based audio anchor-list recommendation
Fig. 6 shows a schematic diagram of a tourism environment, which may be a real environment, a virtual environment, or a mixed real-virtual environment. The tourism environment in this example has four gates; user Z in the figure is any viewer of the live room in this example, and the users shown other than user Z are anchors. Suppose user Z's audible range, i.e. maximum audible distance, is radius d.
Audio scene one: based on the solution provided by the embodiments of this application, as shown in Fig. 6, when user Z or user Z's virtual character moves to position Z1 near scenic spot 1, there are four anchors within Z's audible range (the range covered by the dashed circle centred on Z1 in the figure, with radius d, d being the preset value in this example) — that is, four objects whose distance to user Z is not greater than the preset value: A, B, C, and D. The anchor list (target-object list) the media access machine delivers to user Z's user terminal then contains the related information of the four anchors A, B, C, and D. User Z can choose to automatically listen to all four anchors or conversations, or manually choose to listen to some of them.
Based on user Z's selection, the media access machine can send the selected anchors' audio data to user Z's user terminal. After receiving each anchor's audio, the user terminal can play each anchor's audio data to Z using spatial-audio playback. For example, if the angle between Z's heading and the Y axis is 0 (facing due north on the map of Fig. 6): A is behind Z, so Z hears A's voice coming from behind; D is directly in front of Z, so Z hears D's voice in front; B is to Z's right, so Z hears B's voice on the right; and C is to Z's left, so Z hears C's voice on the left. As shown in Fig. 7, user Z can hear the audio data coming from the anchors in different directions; the dashed parts of the figure indicate audio data from different directions.
Audio scene two: when user Z moves to position Z2 near scenic spot 2 on the map, there are multiple anchors within the audible range — seven in total, as shown in Fig. 6. The anchor list delivered by the media access machine can then contain all seven anchors; if that feels too noisy, the best anchors can instead be selected according to the position-based, quality-based, attribute-based, or comprehensive intelligent range recommendations above, to guarantee user Z the best listening experience. For example, the seven anchors can be taken as candidate objects and, according to each of the seven anchors' distance to Z, the four nearer anchors chosen as the final target objects, whose audio data is sent to user Z; likewise, after Z receives each anchor's audio, each anchor's audio data can be played to user Z using spatial-audio playback.
Audio scene three: when user Z moves to position Z3 on the map, there is no anchor within user Z's default audible range (the small circle centred on Z3). Optionally, the audible range can be intelligently enlarged, i.e. the distance preset value adjusted (the large circle centred on Z3). After the audible range is suitably enlarged, as shown in Fig. 6, two anchors fall within the audible range; processing can then follow audio scene one above, providing these two anchors' audio data to user Z and playing it to the user in a spatial-audio manner.
Example 2: position-based video anchor-list recommendation
In this example, suppose user Z's visual radius (the farthest distance Z can see) is d and the visual angle is α (the angular range of α/2 to each side of the user's line of sight).
Video scene one: as shown in Fig. 8, when user Z moves to position Z1 on the map and the angle between user Z's heading and the Y axis is 0°, there are three anchors within user Z's visual range and visual angle (the range covered by the dashed arc in the figure): B, C, and D. The anchor list delivered to user Z can then contain the three anchors B, C, and D. User Z can choose to automatically watch all three anchors or conversations, or manually choose to watch some of them. After Z receives each anchor's video, the picture layout can be arranged according to the directions of B, C, and D: as shown in Fig. 9, anchor C's live picture (picture C in the figure) is on Z's left, D's live picture is directly in front of Z, and B's live picture is on Z's right.
Video scene two: when user Z moves to position Z2 on the map and the angle between Z's heading and the Y axis is 90°, there are six anchors within the visual range. The delivered anchor list can then contain all six anchors; if the business side or the user wants a more curated selection of anchors (users can be offered an anchor-count setting, configured on their terminal for the target application, for at most how many anchors' media data they accept at one time), the best anchors can also be selected according to the position-based, quality-based, attribute-based, or comprehensive intelligent range recommendations above, to guarantee user Z the best experience. After Z receives the anchors' video sent by the media access machine, user Z's target terminal can likewise arrange the picture layout according to each anchor's visual direction relative to Z.
Video scene three: when user Z moves to position Z3 on the map and the angle between Z's heading and the Y axis is 0°, there is no anchor within user Z's default visual range; the visual range can be intelligently enlarged, and after suitable enlargement two anchors fall within user Z's visual range. Processing can then follow video scene one above to provide user Z with video streams.
It can be understood that for application scenarios with both audio and video, the above audio-based anchor-list generation and video-based anchor-list generation can be combined and used together: for example, the video pictures can be laid out, and the audio signals played, according to each anchor's relative orientation with respect to user Z.
From the first object's perspective, based on the solution provided by the embodiments of this application, the first object's user terminal can provide at least one anchor's media stream to the first object according to each anchor's relative relationship with that first object: for example, the media streams of the anchors whose association with the object satisfies a certain condition (such as anchors whose distance to the object is less than a set value) can be provided to the object, or the media streams of some or all anchors can be presented to the user in a manner following each anchor's relative orientation information with respect to the object. From the server's perspective, the server can, according to the related information of the first object and of each anchor, send to the first object's user terminal — and present to the object — the media streams of the anchors whose association with the first object satisfies a certain condition; or the server can send some or all anchors' media streams together with each stream's presentation-manner hint information to the first object's user terminal, which presents each received media stream to the first object according to each stream's presentation-manner hint information.
With the optional implementations provided by this application, the media data received by different users of the same live room is no longer all identical, but media data better matched to each user, which can improve user perception. For example, in large multi-user rooms of thousands or tens of thousands of people, the best anchors can be recommended intelligently according to the anchors' audio/video quality, the position relationships between the user and the anchors, shared interests, and so on, achieving a more lifelike, immersive metaverse real-time audio/video solution. The pictures users see are no longer undifferentiated but can be laid out according to the users' mutual position relationships; the sounds users hear are no longer undifferentiated, the position and direction of an anchor's voice being determined by the actual position relationship and direction. This solution enables more lifelike, immersive online classrooms, meetings, games, live streaming, and other scenarios.
Based on the same principle as the method provided by the embodiments of this application, an embodiment of this application further provides a data processing apparatus. As shown in Fig. 10, the data processing apparatus 100 includes a related-information obtaining module 110, a display-attribute-information determining module 120, a target-data generating module 130, and a data transmission module 140.
The related-information obtaining module 110 is configured to obtain related information of a first object corresponding to a target application and related information of each second object in a candidate object set, where the related information includes at least one of position information or object attribute information, and the first object and each second object are objects corresponding to the same streaming-media identifier;
the display-attribute-information determining module 120 is configured to determine, according to the related information of the first object and of each second object, display attribute information corresponding to each second object, where the display attribute information corresponding to one second object identifies the display attribute of that second object's media stream with respect to the first object;
the target-data generating module 130 is configured to generate a target data stream according to the display attribute information corresponding to each second object, where the target data stream includes the media stream of at least one target object, and the at least one target object is a second object determined from among the second objects;
the data transmission module 140 is configured to send the target data stream to the target terminal corresponding to the first object, so that the target terminal presents the media streams in the target data stream to the first object.
Optionally, the display attribute information includes at least one of first information or second information; the first information is used to determine whether to provide a second object's media stream to the first object, and the second information is used to determine the manner of presenting a media stream to the first object.
Optionally, the display attribute information includes the first information; when determining the display attribute information corresponding to each second object according to the related information of the first object and of each second object, the display-attribute-information determining module can be configured to:
for each second object, determine the degree of association between the first object and that second object according to the first object's related information and that second object's related information, and take the degree of association as the first information corresponding to that second object;
the target-data generating module can be configured to: determine, from the second objects according to the degree of association corresponding to each second object, at least one target object matching the first object, and generate the target data stream based on the at least one target object's media stream.
Optionally, the related information includes position information; for each second object, the display-attribute-information determining module can be configured to:
determine the distance between the first object and that second object according to the first object's position information and that second object's position information, and use the distance to characterize the degree of association between the first object and that second object, where the distance is negatively correlated with the degree of association.
Optionally, the media stream includes a video stream, the position information includes the object's heading and position coordinates, and the target-data generating module can be configured to:
for each second object, determine that second object's viewing-angle deviation relative to the first object according to that second object's position coordinates and the first object's position information;
determine at least one target object from the second objects according to the distances and viewing-angle deviations corresponding to the second objects.
Optionally, the target-data generating module can be configured to: determine as target objects the second objects corresponding to the target distances, i.e. those among the second objects' distances not greater than a preset value.
Optionally, the target-data generating module can further be configured to:
if the number of target distances does not meet a preset count requirement, adjust the preset value so as to determine, from the distances corresponding to the second objects, target distances that meet the preset count requirement, and determine the second objects corresponding to those target distances as target objects;
if the number of target distances is greater than a set count, take the second objects corresponding to the target distances as candidate objects, obtain the quality of each candidate object's media stream, and determine at least one target object from the candidate objects according to the quality of each candidate object's media stream.
Optionally, the target-data generating module can further be configured to: obtain the quality of each second object's media stream, and determine at least one target object from the second objects according to the quality of each second object's media stream.
Optionally, the related information includes position information and the display attribute information includes the second information; the display-attribute-information determining module can be configured to:
for each target object, determine orientation indication information between that target object and the first object according to the first object's position information and that target object's position information, and take the orientation indication information as the second information corresponding to that target object;
the target-data generating module can be configured to: generate the target data stream according to the orientation indication information corresponding to each target object and each target object's media stream, the target data stream further including the orientation indication information corresponding to each target object, used to instruct the target terminal to present each target object's media stream to the first object according to the orientation indication information corresponding to each target object.
Optionally, the target-data generating module can further be configured to: determine the total number of objects corresponding to the streaming-media identifier, and determine the number of target objects according to the total number.
Optionally, the data transmission module can further be configured to:
before the target data stream is generated, send a target-object list to the target terminal so that the target terminal presents the target-object list to the first object, the list including each target object's object identifier;
receive object-selection feedback information for the target-object list sent by the target terminal, the feedback being generated by the target terminal in response to the first object's selection operation on the target-object list;
where the media streams included in the target data stream are the media streams of the at least one target object corresponding to the object-selection feedback information.
Optionally, any object's position information includes at least one of the following:
that object's real position information; that object's virtual position information in the virtual scene of the target application.
An embodiment of this application further provides a data processing apparatus; optionally, the data processing apparatus may be a user terminal, and the apparatus may include:
an obtaining module, configured to obtain a media-data acquisition trigger operation of a first object corresponding to a target application;
a display module, configured to present, in response to the media-data acquisition trigger operation, the media stream of at least one target object among the second objects in a candidate object set to the first object;
where each second object has corresponding display attribute information; the at least one target object is a second object determined from among the second objects; the first object and each second object are objects corresponding to the same streaming-media identifier of the target application; the media stream of the at least one target object matches the display attribute information corresponding to the at least one target object; the display attribute information corresponding to one second object identifies the display attribute of that second object's media stream with respect to the first object and is adapted to the related information of the first object and of that second object; and the related information includes at least one of position information or object attribute information.
Optionally, the display attribute information includes at least one of first information or second information; the first information is used to determine whether to provide a second object's media stream to the first object, and the second information is used to determine the manner of presenting a media stream to the first object.
Optionally, the display module can be configured to: take the second objects within the first object's visible range and/or within the first object's audible distance as the target objects, and present the target objects' media streams to the first object.
Optionally, the display module can be configured to:
determine, according to the orientation indication information corresponding to each target object, the relative orientation information between each target object and the first object;
present each target object's media stream to the first object according to the relative orientation information corresponding to each target object.
Optionally, the media stream includes an audio stream, and the display module can be configured to:
determine, according to the orientation indication information corresponding to each target object, the spatial playback direction of each target object's audio stream relative to the first object;
play each target object's audio stream to the first object in a spatial-audio playback manner according to the spatial playback direction corresponding to each target object.
Optionally, the media stream includes a video stream, and the display module can be configured to:
determine, according to the orientation indication information corresponding to each target object, the video display position of each target object's video stream on the target terminal's user interface;
display each target object's video stream to the first object according to the video display position corresponding to each target object.
Optionally, the display module can further be configured to: in response to a change in the display attribute information corresponding to the second objects, present to the first object the media stream of at least one target object matching the changed display attribute information corresponding to the second objects.
It can be understood that the apparatuses of the embodiments of this application can execute the methods provided by the embodiments of this application, and their implementation principles are similar; the actions executed by the modules of the apparatuses of the embodiments correspond to the steps of the methods of the embodiments, and for detailed functional descriptions of the apparatuses' modules, reference can be made to the descriptions of the corresponding methods shown above, which are not repeated here.
An embodiment of this application further provides an electronic device including a memory, a processor, and a computer program stored in the memory; when executing the computer program stored in the memory, the processor can implement the method of any optional embodiment of this application.
Fig. 11 shows a schematic structural diagram of an electronic device to which the embodiments of this application apply. As shown in Fig. 11, the electronic device may be a server or a user terminal, and the electronic device can be used to implement the method provided in any embodiment of this application.
As shown in Fig. 11, the electronic device 2000 may mainly include components such as at least one processor 2001 (one is shown in Fig. 11), a memory 2002, a communication module 2003, and an input/output interface 2004; optionally, the components can be communicatively connected through a bus 2005. It should be noted that the structure of the electronic device 2000 shown in Fig. 11 is only schematic and does not limit the electronic devices to which the method provided by the embodiments of this application applies.
The memory 2002 can be used to store the operating system, applications, and the like; the applications may include a computer program that implements the method shown in the embodiments of this application when called by the processor 2001, and may also include programs for implementing other functions or services. The memory 2002 may be a ROM (Read-Only Memory) or another type of static storage device capable of storing static information and instructions, a RAM (Random Access Memory) or another type of dynamic storage device capable of storing information and computer programs; it may also be an EEPROM (Electrically Erasable Programmable Read-Only Memory), a CD-ROM (Compact Disc Read-Only Memory) or other optical disc storage, optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
The processor 2001 is connected to the memory 2002 through the bus 2005 and implements the corresponding functions by calling the application programs stored in the memory 2002. The processor 2001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application-Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or another programmable logic device, transistor logic device, hardware component, or any combination thereof, and can implement or execute the various exemplary logic blocks, modules, and circuits described in connection with the disclosure of this application. The processor 2001 may also be a combination implementing computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The electronic device 2000 can be connected to a network through the communication module 2003 (which may include, but is not limited to, components such as a network interface) to communicate with other devices (such as user terminals or servers) over the network and achieve data interaction, such as sending data to other devices or receiving data from other devices. The communication module 2003 may include a wired network interface and/or a wireless network interface, i.e. the communication module may include at least one of a wired or a wireless communication module.
The electronic device 2000 can connect needed input/output devices, such as keyboards and display devices, through the input/output interface 2004; the electronic device 2000 may have a display device of its own and can also connect other external display devices through the interface 2004. Optionally, a storage apparatus such as a hard disk can also be connected through the interface 2004, so that data in the electronic device 2000 can be stored in the storage apparatus, data in the storage apparatus can be read, and data in the storage apparatus can also be stored in the memory 2002. It can be understood that the input/output interface 2004 may be a wired or a wireless interface. Depending on the actual application scenario, the device connected to the input/output interface 2004 may be a component of the electronic device 2000 or an external device connected to the electronic device 2000 when needed.
The bus 2005 used to connect the components may include a path for transferring information between the above components. The bus 2005 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. According to function, the bus 2005 can be divided into an address bus, a data bus, a control bus, and so on.
Optionally, for the solutions provided by the embodiments of this application, the memory 2002 can be used to store a computer program executing the solution of this application, to be run by the processor 2001; when the processor 2001 runs the computer program, the actions of the method or apparatus provided by the embodiments of this application are implemented.
Based on the same principle as the method provided by the embodiments of this application, an embodiment of this application provides a computer-readable storage medium storing a computer program that, when executed by a processor, can implement the corresponding contents of the foregoing method embodiments.
An embodiment of this application further provides a computer program product including a computer program that, when executed by a processor, can implement the corresponding contents of the foregoing method embodiments.
It should be noted that the terms "first", "second", "third", "fourth", "1", "2", etc. (if present) in the specification, claims, and drawings of this application are used to distinguish similar objects and need not describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of this application described here can be implemented in orders other than those illustrated or described in text.
It should be understood that although the flowcharts of the embodiments of this application indicate the operation steps with arrows, the order of implementing these steps is not limited to the order indicated by the arrows. Unless explicitly stated herein, in some implementation scenarios of the embodiments of this application the implementation steps in the flowcharts may be executed in other orders as needed. Moreover, depending on the actual implementation scenario, some or all of the steps in the flowcharts may include multiple sub-steps or multiple stages; some or all of these sub-steps or stages may be executed at the same moment, and each of them may also be executed at a different moment. Where execution moments differ, the execution order of these sub-steps or stages can be flexibly configured as needed, which the embodiments of this application do not limit.
The above are only optional implementations of some implementation scenarios of this application. It should be pointed out that, for those of ordinary skill in the art, other similar implementation means based on the technical ideas of this application, adopted without departing from the technical concept of the solutions of this application, likewise fall within the protection scope of the embodiments of this application.

Claims (21)

  1. A data processing method, the method being executed by a server and comprising:
    obtaining related information of a first object corresponding to a target application and related information of each second object in a candidate object set, wherein the related information comprises at least one of position information or object attribute information, and the first object and each second object are objects corresponding to a same streaming-media identifier;
    determining, according to the related information of the first object and the related information of each second object, display attribute information corresponding to each second object, wherein the display attribute information corresponding to one second object identifies a display attribute of that second object's media stream with respect to the first object;
    generating a target data stream according to the display attribute information corresponding to each second object, wherein the target data stream comprises a media stream of at least one target object, and the at least one target object is a second object determined from among the second objects;
    sending the target data stream to a target terminal corresponding to the first object, so that the target terminal presents the media streams in the target data stream to the first object.
  2. The method according to claim 1, wherein the display attribute information comprises at least one of first information or second information, the first information being used to determine whether to provide a second object's media stream to the first object, and the second information being used to determine a manner of presenting a media stream to the first object.
  3. The method according to claim 2, wherein the display attribute information comprises the first information;
    the determining, according to the related information of the first object and the related information of each second object, display attribute information corresponding to each second object comprises:
    for each second object, determining a degree of association between the first object and that second object according to the related information of the first object and the related information of that second object, and taking the degree of association as the first information corresponding to that second object;
    the generating a target data stream according to the display attribute information corresponding to each second object comprises:
    determining, from the second objects according to the degree of association corresponding to each second object, at least one target object matching the first object;
    generating the target data stream based on the media stream of the at least one target object.
  4. The method according to claim 3, wherein the related information comprises position information;
    for each second object, the determining a degree of association between the first object and that second object according to the related information of the first object and the related information of that second object comprises:
    determining a distance between the first object and that second object according to the position information of the first object and the position information of that second object, and using the distance to characterize the degree of association between the first object and that second object, wherein the distance is negatively correlated with the degree of association.
  5. The method according to claim 4, wherein the media stream comprises a video stream, the position information comprises an object's heading and position coordinates, and the method further comprises:
    for each second object, determining that second object's viewing-angle deviation relative to the first object according to that second object's position coordinates and the first object's position information;
    the determining at least one target object from the second objects according to the distances corresponding to the second objects comprises:
    determining at least one target object from the second objects according to the distances and the viewing-angle deviations corresponding to the second objects.
  6. The method according to claim 4, wherein the determining at least one target object from the second objects according to the distances corresponding to the second objects comprises:
    determining, among the distances corresponding to the second objects, the distances not greater than a preset value as target distances, and determining the second objects corresponding to the target distances as target objects.
  7. The method according to claim 6, wherein the determining, among the distances corresponding to the second objects, the distances not greater than a preset value as target distances, and determining the second objects corresponding to the target distances as target objects comprises:
    if the number of target distances does not meet a preset count requirement, adjusting the preset value so as to determine, from the distances corresponding to the second objects, target distances meeting the preset count requirement, and determining the second objects corresponding to the target distances as the target objects;
    if the number of target distances is greater than a set count, taking the second objects corresponding to the target distances as candidate objects, obtaining a quality of a media stream corresponding to each candidate object, and determining at least one target object from the candidate objects according to the quality of the media stream corresponding to each candidate object.
  8. The method according to claim 1, further comprising:
    obtaining a quality of each second object's media stream;
    determining at least one target object from the second objects according to the quality of each second object's media stream.
  9. The method according to any one of claims 2 to 5, wherein the related information comprises position information and the display attribute information comprises the second information;
    the determining, according to the related information of the first object and the related information of each second object, display attribute information corresponding to each second object comprises:
    for each target object, determining orientation indication information between that target object and the first object according to the position information of the first object and the position information of that target object, and taking the orientation indication information as the second information corresponding to that target object;
    the generating a target data stream according to the display attribute information corresponding to each second object comprises:
    generating the target data stream according to the orientation indication information corresponding to each target object and the media stream corresponding to each target object, wherein the target data stream further comprises the orientation indication information corresponding to each target object, used to instruct the target terminal to present the media stream corresponding to each target object to the first object according to the orientation indication information corresponding to each target object.
  10. The method according to any one of claims 1 to 5, further comprising, before the generating a target data stream:
    sending a target-object list to the target terminal, so that the target terminal presents the target-object list to the first object, the target-object list comprising an object identifier of each target object;
    receiving object-selection feedback information for the target-object list sent by the target terminal, the object-selection feedback information being generated by the target terminal in response to a selection operation of the first object on the target-object list;
    wherein the media streams comprised in the target data stream are the media streams of the at least one target object corresponding to the object-selection feedback information.
  11. A data processing method, the method being executed by a user terminal and comprising:
    in response to a media-data acquisition trigger operation of a first object corresponding to a target application, presenting to the first object a media stream of at least one target object among second objects in a candidate object set;
    wherein each second object has corresponding display attribute information; the at least one target object is a second object determined from among the second objects; the media stream of the at least one target object matches the display attribute information corresponding to the at least one target object; the display attribute information corresponding to one second object identifies a display attribute of that second object's media stream with respect to the first object and is adapted to the related information of the first object and the related information of that second object; the related information comprises at least one of position information or object attribute information; and the first object and each second object are objects corresponding to a same streaming-media identifier of the target application.
  12. The method according to claim 11, wherein the presenting to the first object a media stream of at least one target object among the second objects in a candidate object set comprises:
    taking second objects within the first object's visible range and/or within the first object's audible distance as the target objects, and presenting the target objects' media streams to the first object.
  13. The method according to claim 11, wherein the presenting to the first object a media stream of at least one target object among the second objects in a candidate object set comprises:
    determining, according to orientation indication information between each target object and the first object, relative orientation information between each target object and the first object;
    presenting each target object's media stream to the first object according to the relative orientation information corresponding to each target object.
  14. The method according to claim 13, wherein the media stream comprises an audio stream, and the presenting each target object's media stream to the first object according to the relative orientation information corresponding to each target object comprises:
    determining, according to the relative orientation information corresponding to each target object, a spatial playback direction of each target object's audio stream relative to the first object;
    playing each target object's audio stream to the first object in a spatial-audio playback manner according to the spatial playback direction corresponding to each target object.
  15. The method according to claim 13, wherein the media stream comprises a video stream, and the presenting each target object's media stream to the first object according to the relative orientation information corresponding to each target object comprises:
    determining, according to the relative orientation information corresponding to each target object, a video display position of each target object's video stream on the user interface of the target terminal;
    displaying each target object's video stream to the first object according to the video display position corresponding to each target object.
  16. The method according to claim 11 or 12, further comprising:
    in response to a change in the display attribute information corresponding to the second objects, presenting to the first object the media stream of at least one target object matching the changed display attribute information corresponding to the second objects.
  17. A data processing apparatus, comprising:
    a related-information obtaining module, configured to obtain related information of a first object corresponding to a target application and related information of each second object in a candidate object set, wherein the related information comprises at least one of position information or object attribute information, and the first object and each second object are objects corresponding to a same streaming-media identifier;
    a display-attribute-information determining module, configured to determine, according to the related information of the first object and the related information of each second object, display attribute information corresponding to each second object, wherein the display attribute information corresponding to one second object identifies a display attribute of that second object's media stream with respect to the first object;
    a target-data generating module, configured to generate a target data stream according to the display attribute information corresponding to each second object, wherein the target data stream comprises a media stream of at least one target object, and the at least one target object is a second object determined from among the second objects;
    a data transmission module, configured to send the target data stream to a target terminal corresponding to the first object, so that the target terminal presents the media streams in the target data stream to the first object.
  18. A data processing apparatus, the apparatus comprising:
    an obtaining module, configured to obtain a media-data acquisition trigger operation of a first object corresponding to a target application;
    a display module, configured to present, in response to the media-data acquisition trigger operation, a media stream of at least one target object among second objects in a candidate object set to the first object;
    wherein each second object has corresponding display attribute information; the at least one target object is a second object determined from among the second objects; the media stream of the at least one target object matches the display attribute information corresponding to the at least one target object; the display attribute information corresponding to one second object identifies a display attribute of that second object's media stream with respect to the first object and is adapted to the related information of the first object and the related information of that second object; the related information comprises at least one of position information or object attribute information; and the first object and each second object are objects corresponding to a same streaming-media identifier of the target application.
  19. An electronic device, comprising a memory and a processor, wherein the memory stores a computer program and the processor executes the computer program to implement the method according to any one of claims 1-10, or to implement the method according to any one of claims 11-16.
  20. A computer-readable storage medium storing a computer program that, when executed by a processor, implements the method according to any one of claims 1-10, or implements the method according to any one of claims 11-16.
  21. A computer program product comprising a computer program that, when run on a computer, causes the computer to execute the method according to any one of claims 1-10, or to implement the method according to any one of claims 11-16.
PCT/CN2023/111133 2022-09-20 2023-08-04 数据处理方法、装置、电子设备、存储介质和程序产品 WO2024060856A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211146126.0A CN117793279A (zh) 2022-09-20 2022-09-20 数据处理方法、装置、电子设备及存储介质
CN202211146126.0 2022-09-20

Publications (1)

Publication Number Publication Date
WO2024060856A1 true WO2024060856A1 (zh) 2024-03-28

Family

ID=90395101

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/111133 WO2024060856A1 (zh) 2022-09-20 2023-08-04 数据处理方法、装置、电子设备、存储介质和程序产品

Country Status (2)

Country Link
CN (1) CN117793279A (zh)
WO (1) WO2024060856A1 (zh)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107124662A (zh) * 2017-05-10 2017-09-01 腾讯科技(上海)有限公司 视频直播方法、装置、电子设备及计算机可读存储介质
WO2021143255A1 (zh) * 2020-01-13 2021-07-22 腾讯科技(深圳)有限公司 数据处理方法、装置、计算机设备以及可读存储介质
WO2021217385A1 (zh) * 2020-04-28 2021-11-04 深圳市大疆创新科技有限公司 一种视频处理方法和装置
CN112887653A (zh) * 2021-01-25 2021-06-01 联想(北京)有限公司 一种信息处理方法和信息处理装置
CN114331495A (zh) * 2021-12-02 2022-04-12 腾讯科技(深圳)有限公司 多媒体数据处理方法、装置、设备及存储介质

Also Published As

Publication number Publication date
CN117793279A (zh) 2024-03-29

Similar Documents

Publication Publication Date Title
US10579243B2 (en) Theming for virtual collaboration
TWI533198B (zh) 於虛擬區域及實體空間之間通訊的技術
US8191001B2 (en) Shared virtual area communication environment based apparatus and methods
US9571793B2 (en) Methods, systems and program products for managing resource distribution among a plurality of server applications
US8849900B2 (en) Method and system supporting mobile coalitions
US9621958B2 (en) Deferred, on-demand loading of user presence within a real-time collaborative service
CN110856011B (zh) 一种分组进行直播互动的方法、电子设备及存储介质
KR20070005690A (ko) 네트워크 채팅 환경에서의 채팅 부하 관리 시스템 및 방법
WO2012055315A1 (zh) 一种提供和管理互动服务的系统和方法
US9485596B2 (en) Utilizing a smartphone during a public address system session
CN111741351B (zh) 一种视频数据处理方法、装置及存储介质
CN104618785A (zh) 音视频播放方法、装置及系统
US11838572B2 (en) Streaming video trunking
WO2017045590A1 (zh) 一种多媒体信息交互方法及系统
WO2019096307A1 (zh) 视频播放方法、装置、计算设备及存储介质
US11431770B2 (en) Method, system, apparatus, and electronic device for managing data streams in a multi-user instant messaging system
WO2024060856A1 (zh) 数据处理方法、装置、电子设备、存储介质和程序产品
CN108668140B (zh) 音视频交互状态同步方法及装置
WO2022143255A1 (zh) 实时信息交互方法及装置、设备、存储介质
US20230412762A1 (en) Systems and methods for video conferencing
KR20180082672A (ko) 영상 회의 방청 서비스 제공 방법 및 장치
CN114173162B (zh) 互动直播系统、发布-订阅关系的维护方法及相关设备
JP7376035B1 (ja) レコメンデーションのためのシステム、方法、及びコンピュータ可読媒体
US20210320959A1 (en) System and method for real-time massive multiplayer online interaction on remote events
US20240037478A1 (en) Virtual event platform event engagement

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23867151

Country of ref document: EP

Kind code of ref document: A1