WO2024027272A1 - Multimedia resource transmission method, device, electronic device, and storage medium - Google Patents

Multimedia resource transmission method, device, electronic device, and storage medium

Info

Publication number
WO2024027272A1
Authority
WO
WIPO (PCT)
Prior art keywords
resources
server
direct
direct link
voice call
Prior art date
Application number
PCT/CN2023/094123
Other languages
English (en)
French (fr)
Other versions
WO2024027272A9 (zh
Inventor
薛政
周煜
唐思宇
郭泽辉
黄晓萍
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2024027272A1
Publication of WO2024027272A9

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066: Session management
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066: Session management
    • H04L65/1069: Session establishment or de-establishment
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40: Support for services or applications
    • H04L65/403: Arrangements for multi-party communication, e.g. for conferences
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16: Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/164: Adaptation or special uses of UDP protocol

Definitions

  • Embodiments of the present application relate to the field of multimedia technology, and in particular to a multimedia resource transmission method, device, electronic device, and storage medium.
  • Common communication software includes instant messaging software and conferencing software.
  • multimedia resources can be transmitted between devices participating in voice calls in the form of data streams.
  • This application provides a multimedia resource transmission method, device, electronic device, and storage medium.
  • the technical solution includes the following contents.
  • a multimedia resource transmission method includes:
  • in response to the role of the first device changing to presenter during a voice call, the first device determines a second device among at least one terminal device participating in the same voice call as the first device, where the presenter refers to a device used to output multimedia resources during the voice call;
  • the first device creates a direct link between the first device and the second device
  • the first device sends the multimedia resources of the first device to the second device through the direct link;
  • the first device sends the multimedia resources of the first device to the server, so that the server forwards the multimedia resources of the first device to the second device.
  • a multimedia resource transmission method includes:
  • in response to the direct connection request of the first device forwarded by the server, the second device creates a direct link between the second device and the first device, where the second device is determined, among at least one terminal device participating in the same voice call as the first device, when the role of the first device changes to presenter during the voice call, and the presenter refers to a device used to output multimedia resources during the voice call;
  • the second device receives the multimedia resources of the first device sent by the first device through the direct link
  • the second device receives the multimedia resources of the first device forwarded by the server.
  • a multimedia resource transmission device which is provided in the first device, and the device includes:
  • a determining module configured to determine, in response to the role of the first device changing to presenter during the voice call, a second device among at least one terminal device participating in the same voice call as the first device, where the presenter refers to a device used to output multimedia resources during the voice call;
  • a creation module configured to create a direct link between the first device and the second device
  • a sending module configured to send the multimedia resources of the first device to the second device through the direct link
  • the sending module is also configured to send the multimedia resources of the first device to the server, so that the server forwards the multimedia resources of the first device to the second device.
  • a multimedia resource transmission device which is provided in the second device, and the device includes:
  • a creation module configured to create, in response to the direct connection request of the first device forwarded by the server, a direct link between the second device and the first device, where the second device is determined, among at least one terminal device participating in the same voice call as the first device, when the role of the first device changes to presenter during the voice call, and the presenter refers to a device used to output multimedia resources during the voice call;
  • a receiving module configured to receive the multimedia resources of the first device sent by the first device through the direct link
  • the receiving module is also configured to receive the multimedia resources of the first device forwarded by the server.
  • a multimedia resource transmission system includes a first device, a second device and a server;
  • the first device is configured to perform the functions performed by the first device in the multimedia resource transmission method shown in the above aspect;
  • the second device is configured to perform the functions performed by the second device in the multimedia resource transmission method shown in the above aspect;
  • the server is configured to perform functions performed by the server in the multimedia resource transmission method shown in the above aspect.
  • an electronic device includes a processor and a memory. At least one computer program is stored in the memory. The at least one computer program is loaded and executed by the processor, so that the electronic device implements any one of the above multimedia resource transmission methods.
  • a computer-readable storage medium is also provided. At least one computer program is stored in the computer-readable storage medium. The at least one computer program is loaded and executed by a processor, so that an electronic device implements any one of the above multimedia resource transmission methods.
  • a computer program or computer program product is also provided. At least one computer program is stored in the computer program or computer program product. The at least one computer program is loaded and executed by a processor, so that an electronic device implements any one of the above multimedia resource transmission methods.
  • when the role of the first device changes to presenter during the voice call, the second device is determined among at least one terminal device participating in the same voice call as the first device, and a direct link between the first device and the second device is created, which realizes the dynamic creation of the direct link.
  • the multimedia resources of the first device are sent to the second device through a direct link to reduce the transmission delay and packet loss rate of the multimedia resources.
  • the multimedia resources of the first device are sent to the server, so that the server forwards the multimedia resources of the first device to the second device, thereby improving the transmission quality of the multimedia resources.
  • Figure 1 is a schematic diagram of the implementation environment of a multimedia resource transmission method provided by an embodiment of the present application
  • Figure 2 is a flow chart of a multimedia resource transmission method provided by an embodiment of the present application.
  • Figure 3 is a flow chart of a multimedia resource transmission method provided by an embodiment of the present application.
  • Figure 4 is a flow chart of another multimedia resource transmission method provided by an embodiment of the present application.
  • Figure 5 is a flow chart of yet another multimedia resource transmission method provided by an embodiment of the present application.
  • Figure 6 is a flow chart of yet another multimedia resource transmission method provided by an embodiment of the present application.
  • Figure 7 is a schematic diagram of role switching provided by an embodiment of the present application.
  • Figure 8 is a schematic diagram of the creation and closure of a direct link provided by an embodiment of the present application.
  • Figure 9 is a schematic diagram of the transmission of multimedia resources provided by an embodiment of the present application.
  • Figure 10 is a schematic structural diagram of a multimedia resource transmission device provided by an embodiment of the present application.
  • Figure 11 is a schematic structural diagram of another multimedia resource transmission device provided by an embodiment of the present application.
  • Figure 12 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • Figure 13 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • multimedia resources include audio resources. Any device participating in the voice call can forward audio resources to other devices participating in the voice call through the relay server. That is to say, any device first sends audio resources to the relay server, and then the relay server forwards the audio resources to other devices.
  • This method of transmitting audio resources through transit links has large transmission delays and packet loss rates, which affects the transmission quality of audio resources.
  • Figure 1 is a schematic diagram of an implementation environment of a multimedia resource transmission method provided by an embodiment of the present application.
  • the implementation environment includes at least two terminal devices and a server (the server includes a relay server mentioned below).
  • the multimedia resource transmission method in the embodiment of the present application can be executed by any terminal device.
  • Direct links are created between each terminal device and the server.
  • the transit link between two terminal devices includes a direct link between one terminal device and the server and a direct link between the other terminal device and the server. That is to say, transmitting multimedia resources between two terminal devices through a transit link means: one terminal device sends multimedia resources to the server through its direct link with the server, and the server forwards the multimedia resources to the other terminal device through the server's direct link with that terminal device.
  • any terminal device can send multimedia resources to the server, and the server can forward the multimedia resources to other terminal devices, thereby realizing that any terminal device can forward the multimedia resources to other terminal devices through the transfer link.
  • the terminal device 1 can send multimedia resources to the server, and the server can forward the multimedia resources to the terminal device 2 and the terminal device 3, so that the terminal device 1 forwards the multimedia resources to terminal device 2 and terminal device 3 through the transit link.
  • any terminal device can create a direct link with other terminal devices, so that the terminal device can directly send multimedia resources to other terminal devices.
  • terminal device 1 has created direct links with terminal device 2 and terminal device 3 respectively, so that terminal device 1 can directly send multimedia resources to terminal device 2 and terminal device 3.
  • the terminal device can be a smartphone, a game console, a desktop computer, a tablet computer, a laptop computer, a smart TV, a smart car device, a smart voice interaction device, a smart home appliance, etc.
  • the server may be one server, a server cluster composed of multiple servers, or any one of a cloud computing platform and a virtualization center, which is not limited in the embodiments of this application.
  • the server can communicate with the terminal device through a wired network or a wireless network.
  • the server may have functions such as data processing, data storage, and data sending and receiving, which are not limited in the embodiments of this application.
  • the number of terminal devices and servers is not limited and can be one or more.
  • Transit link: the transit link is also called the Selective Forwarding Unit (SFU) transit link.
  • SFU is a transmission architecture that consists of a transit server and multiple terminal devices arranged in a star structure. Each terminal device sends the multimedia resources it wants to share to the transit server, and the transit server forwards the multimedia resources to the other terminal devices.
  • In a voice call, when multimedia resources are transmitted between two terminal devices through a transit link, they need to be forwarded by a transit server. That is, one terminal device can send multimedia resources to another terminal device through a transit link, and can also receive multimedia resources sent by another terminal device through a transit link.
  • the establishment success rate and availability of the transit link are high, but the transmission cost is also high.
  • the transit link belongs to the basic transmission link.
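  • The star-shaped forwarding described above can be illustrated with a minimal sketch. The class name `RelayServer`, its methods, and the use of in-memory lists in place of network links are all assumptions made for illustration; the source does not specify an implementation.

```python
from collections import defaultdict


class RelayServer:
    """Hypothetical transit server: star topology, one link per terminal device."""

    def __init__(self):
        # One inbox per terminal device, standing in for the device-to-server link.
        self.inboxes = defaultdict(list)
        self.members = set()

    def join(self, device_id):
        self.members.add(device_id)

    def forward(self, sender_id, packet):
        """Forward a packet to every member except the sender; return the copy count."""
        receivers = [d for d in sorted(self.members) if d != sender_id]
        for device_id in receivers:
            self.inboxes[device_id].append((sender_id, packet))
        return len(receivers)


relay = RelayServer()
for dev in ("terminal-1", "terminal-2", "terminal-3"):
    relay.join(dev)

# terminal-1 uploads one copy; the server fans it out to the other two devices.
copies = relay.forward("terminal-1", b"audio-frame-0")
```

  • In this sketch the sender uploads a single copy and the server produces one copy per receiver, which is why every packet takes two hops before it is played.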
  • Direct link: the direct link is also called the wireless mesh network (Mesh) direct link.
  • Mesh is a network structure formed by connecting multiple terminal devices in pairs. For example, three terminal devices, terminal device A, terminal device B, and terminal device C, are connected in pairs.
  • When terminal device A wants to share multimedia resources (such as audio resources and video resources), terminal device A sends the multimedia resources to terminal device B and terminal device C respectively.
  • When terminal device B wants to share multimedia resources, it needs to send the multimedia resources to terminal device A and terminal device C respectively, and so on.
  • the Mesh network architecture requires the establishment of direct links between each terminal device and all other terminal devices, which is highly complex and difficult.
  • a direct link between two terminal devices is dynamically created and closed based on the role of the terminal device in the voice call process. Therefore, the direct link is an auxiliary transmission link.
  • When the transit link is congested, using direct links to transmit multimedia resources can effectively improve transmission quality.
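  • The complexity difference between the two topologies can be made concrete by counting links. The following is a small sketch; the function names are hypothetical, and the counts follow only from the definitions above (every pair connected in a full Mesh, one link per terminal device to the transit server).

```python
def mesh_link_count(n: int) -> int:
    """Direct links needed when every pair of n terminal devices is connected."""
    return n * (n - 1) // 2


def transit_link_count(n: int) -> int:
    """Device-to-server links in a star topology with one transit server."""
    return n


def mesh_upload_copies(n: int) -> int:
    """Copies a sharing device must send itself in a full Mesh (one per peer)."""
    return n - 1


# Three devices A, B, C connected in pairs: A-B, A-C, B-C.
links_for_three = mesh_link_count(3)
links_for_ten = mesh_link_count(10)
```

  • The pairwise count grows quadratically with the number of terminal devices, which is consistent with creating direct links only for presenters rather than maintaining a full Mesh.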
  • NAT: Network Address Translation (NAT) is a technology for translating Internet Protocol (IP) addresses. NAT can alleviate the shortage of Internet Protocol version 4 (IPv4) addresses, and can effectively avoid attacks from outside the network by hiding and protecting computers inside the network.
  • NAT is divided into full cone type, restricted type, port restricted type, symmetric type, etc. Among them, the port restricted type and the symmetric type have the highest security level and the most demanding connection conditions, and are the most widely used.
  • NAT penetration: also called NAT hole punching. NAT maps internal IPv4 addresses to external network addresses. In addition, after NAT receives packets from the external network, it filters them according to certain rules. Both behaviors complicate communication between two hosts located behind NATs. NAT penetration technology is used to break through NAT barriers and establish direct links between two hosts on NAT intranets.
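  • The inbound filtering rules of the four NAT types can be sketched as follows. This is a simplified model based on the commonly used classification, not something specified by the source; the names `NatType` and `accepts_inbound` are hypothetical, a mapping is assumed to exist already, and the per-destination mapping of the symmetric type is represented only by the fact that its `contacted` set holds a single endpoint.

```python
from enum import Enum


class NatType(Enum):
    FULL_CONE = "full cone"
    RESTRICTED = "restricted"
    PORT_RESTRICTED = "port restricted"
    SYMMETRIC = "symmetric"


def accepts_inbound(nat_type, contacted, source):
    """Return True if the NAT passes an inbound packet to the internal host.

    contacted: set of (ip, port) endpoints the internal host has already sent
               packets to through this mapping.
    source:    (ip, port) of the inbound packet.
    """
    src_ip, _ = source
    if nat_type is NatType.FULL_CONE:
        return True  # any external endpoint may use an existing mapping
    if nat_type is NatType.RESTRICTED:
        return any(ip == src_ip for ip, _ in contacted)  # filter on IP only
    # Port restricted and symmetric: filter on both IP and port.
    return source in contacted


peer = ("203.0.113.7", 40000)
contacted = set()
blocked_before = accepts_inbound(NatType.PORT_RESTRICTED, contacted, peer)
contacted.add(peer)  # the internal host sends a packet to the peer first
allowed_after = accepts_inbound(NatType.PORT_RESTRICTED, contacted, peer)
```

  • The before/after pair shows the idea behind hole punching: once each host has sent a packet toward the other, each NAT's filter has an entry that lets the peer's packets in.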
  • communication software is a common software. At least two terminal devices installed with communication software can participate in the same voice call. In the process of at least two terminal devices participating in a voice call, any terminal device can collect multimedia resources such as sounds and images, and send the multimedia resources to other terminal devices participating in the voice call.
  • any terminal device sends multimedia resources to a transfer server, and the transfer server forwards the multimedia resources to other terminal devices, thereby enabling any terminal device to send multimedia resources to other terminal devices through a transfer link. Since multimedia resources need to be forwarded by a transit server, the above technology has a large transmission delay and packet loss rate, resulting in poor transmission quality of multimedia resources.
  • the embodiments of the present application provide a multimedia resource transmission method, which can be used to solve the above problems.
  • the method provided by the embodiments of the present application can be applied in the above implementation environment.
  • this method can be executed by any terminal device (such as terminal device 1) in FIG. 1 .
  • Since at least two terminal devices are involved in the above implementation environment, for convenience of description, the terminal device that performs the multimedia resource transmission method provided by the embodiments of the present application is called the first device.
  • Among the at least two terminal devices, the terminal devices other than the first device are called other terminal devices.
  • the method at least includes steps 201 to 204 as shown below.
  • Step 201 In response to the first device changing its role to the presenter during the voice call, the first device determines the second device among at least one terminal device participating in the same voice call with the first device.
  • the presenter refers to a device used to output multimedia resources during the voice call.
  • Step 202 The first device creates a direct link between the first device and the second device.
  • Step 203 The first device sends the multimedia resources of the first device to the second device through a direct link.
  • Step 204 The first device sends the multimedia resources of the first device to the server, so that the server forwards the multimedia resources of the first device to the second device.
  • the first device sends multimedia resources of the first device to the second device through a direct link or a server.
  • the multimedia resources of the first device include at least one of audio resources or video resources. Audio resources and video resources can be sent in different ways. The embodiment in Figure 3 below explains in detail how to send audio resources, and the embodiment in Figure 4 below explains in detail how to send video resources.
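  • Steps 202 to 204 can be sketched from the first device's point of view. This is a minimal illustration under stated assumptions: the class `Presenter`, its method names, and the `use_relay` flag are hypothetical, in-memory lists stand in for real links, and whether both paths are used at once or only one is chosen is not fixed by this sketch.

```python
class Presenter:
    """Hypothetical first device holding direct links and a link to the server."""

    def __init__(self, device_id):
        self.device_id = device_id
        self.direct_links = {}   # peer id -> list standing in for an open direct link
        self.relay_outbox = []   # list standing in for the link to the server

    def create_direct_link(self, peer_id):
        # Step 202: create a direct link to the second device.
        self.direct_links.setdefault(peer_id, [])

    def close_direct_link(self, peer_id):
        self.direct_links.pop(peer_id, None)

    def send(self, frame, use_relay=True):
        """Send a frame over every open direct link and, optionally, to the server."""
        paths = []
        for peer_id, link in self.direct_links.items():
            link.append(frame)                    # step 203: direct link
            paths.append(("direct", peer_id))
        if use_relay:
            self.relay_outbox.append(frame)       # step 204: server forwards it
            paths.append(("relay", "server"))
        return paths


first = Presenter("first-device")
first.create_direct_link("second-device")
paths_with_link = first.send(b"frame-0")
first.close_direct_link("second-device")
paths_without_link = first.send(b"frame-1")
```

  • Once the direct link is closed, the sketch falls back to the server path alone, matching the description of the transit link as the basic transmission link and the direct link as an auxiliary one.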
  • this method can be executed by any terminal device (such as terminal device 1) in Figure 1. Since at least two terminal devices are involved in the above implementation environment, for convenience of description, the terminal device that performs the multimedia resource transmission method provided by the embodiments of the present application is called the first device. Among the at least two terminal devices, the terminal devices other than the first device are called other terminal devices. As shown in Figure 3, the method at least includes steps 301 to 304 as shown below. The embodiment of Figure 3 takes sending audio resources as an example for description.
  • Step 301 In response to the first device changing its role to the presenter during the voice call, the first device determines the second device among at least one terminal device participating in the same voice call as the first device.
  • the presenter refers to a device used to output multimedia resources during the voice call.
  • At least two terminal devices participate in the same voice call.
  • the at least two terminal devices can be divided into a first device and other terminal devices, where the other terminal devices correspond to the "at least one terminal device participating in the same voice call as the first device" in step 301.
  • the first device and other terminal devices are installed with the same communication software. Through this communication software, the first device and other terminal devices can participate in the same voice call.
  • When the communication software is instant messaging software, the first device and other terminal devices can participate in a two-person voice call.
  • When the communication software is conference software, the first device and other terminal devices can participate in an audio and video conference. In audio and video conferencing, two or more individuals or groups in different places can transmit sounds, images, files and other resources to each other through transmission links between terminal devices, thereby achieving instant interactive communication.
  • the embodiment of the present application adopts an adaptive role switching strategy to set the role of the terminal device participating in the voice call in the voice call process.
  • the role of the terminal device in the voice call process is related to the device status of the terminal device.
  • the role of the first device in the voice call process is related to the device status of the first device.
  • the presenter is a role of the terminal device in the voice call process. The terminal device serving as presenter is responsible for outputting multimedia resources, and other terminal devices are responsible for receiving the multimedia resources it outputs.
  • Multimedia resources here include but are not limited to at least one of text, image, audio, video, file and other resources. Among them, when the multimedia resources include audio resources and video resources, the multimedia resources may be called audio and video resources. During a voice call, there may be multiple presenters at the same time.
  • step 301 also includes: in response to the device status of the first device changing to a video resource collection status, the first device determines that the role of the first device during the voice call is changed to the presenter.
  • For example, when the first device turns on video resource collection, that is, when the first device turns on screen sharing (at which point the first device can collect the information on the screen) or turns on the camera (at which point the first device can collect the information captured by the camera), the device status of the first device changes from the status of video resource collection being turned off to the status of video resource collection.
  • The server can detect change information in real time, where the change information indicates that the device status of the first device has changed from the status of video resource collection being turned off to the status of video resource collection; alternatively, the first device can send this change information to the server. Based on the change information, the server determines that the role of the first device during the voice call is changed to presenter, and sends the first device notification information indicating this change. When the first device receives the notification information, it determines that its role during the voice call is changed to presenter.
  • the role of the first device in the voice call process may be briefly described as the role of the first device.
  • step 301 also includes: in response to the device status of the first device changing to the audio resource collection status, the first device acquires the audio resources it collects, and, in response to detecting object audio in the collected audio resources, determines that the role of the first device during the voice call is changed to presenter.
  • The server can detect change information in real time, where the change information indicates that the device status of the first device has changed from the status of audio resource collection being turned off to the status of audio resource collection; alternatively, the first device can send this change information to the server. Based on this change information, the server first determines that the role of the first device is changed to the intermediate state, and sends the first device notification information indicating this change. When the first device receives the notification information, it determines that its role during the voice call is changed to the intermediate state.
  • the intermediate state is a role of the terminal device in the voice call process. On the one hand, a terminal device in the intermediate state can receive multimedia resources sent by other terminal devices whose role is presenter. On the other hand, the terminal device can collect audio resources, but the audio resources have not yet been sent to terminal devices whose role is presenter or listener. The intermediate state can change to presenter or return to listener.
  • the first device can collect audio resources in the intermediate state.
  • the first device performs Voice Activity Detection (VAD) on the audio resource, obtains the detection result, and sends the detection result to the server.
  • When the detection result indicates that object audio is detected, the server determines that the role of the first device is changed from the intermediate state to presenter, and sends the first device notification information that its role is changed to presenter. When the first device receives the notification information, it determines that its role during the voice call is changed to presenter.
  • When the detection result indicates that no object audio is detected, the server determines that the role of the first device is changed from the intermediate state to listener, and sends the first device notification information that its role is changed to listener. When the first device receives the notification information, it determines that its role during the voice call is changed to listener.
  • the audience is a role of the terminal device in the voice call process, and is responsible for receiving and playing the multimedia resources output by the terminal device as the speaker.
  • the listener can become the presenter, and the presenter can also become the listener.
  • Alternatively, the first device sends the audio resource to the server, and the server performs voice activity detection on the audio resource.
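  • The source does not say which voice activity detection algorithm is used. As a stand-in for illustration only, the following sketch uses a simple frame-energy threshold; the function names, the assumption that samples are floats in [-1, 1], and the -40 dB threshold are all hypothetical, and a production detector would typically be considerably more elaborate.

```python
import math


def frame_level_db(samples):
    """RMS level of one frame of float samples in [-1, 1], in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")


def detect_voice(samples, threshold_db=-40.0):
    """Energy-threshold stand-in for VAD: True if the frame is louder than the threshold."""
    return frame_level_db(samples) > threshold_db


# 10 ms frames at an assumed 16 kHz sample rate (160 samples per frame).
silence = [0.0] * 160
faint_noise = [0.001] * 160
tone = [0.5 * math.sin(2 * math.pi * 400 * i / 16000) for i in range(160)]

results = [detect_voice(silence), detect_voice(faint_noise), detect_voice(tone)]
```

  • The boolean result plays the part of the detection result that either the first device reports to the server or the server computes itself.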
  • the device status of the first device changes to the audio resource collection status
  • the role of the first device may be a listener, may have no role, or may be in an intermediate state. Before the role of the first device is changed to an intermediate state, the role of the first device can be a listener or has no role.
  • The above describes the cases where the role of the first device is changed to presenter. The role of the first device can first be changed to the intermediate state and then changed to presenter or listener. The following briefly describes other cases in which the role of the first device is changed to listener (denoted as case 13 and case 14), and the case where the role of the first device is changed to no role (denoted as case 15).
  • No role refers to a device that does not participate in the voice call. When the device status of the first device is not participating in the voice call, the first device cannot play any role in the voice call process, so the role of the first device is no role.
  • the first device can create and participate in a voice call. At this time, the device status of the first device is changed from a status of not participating in the voice call to a status of participating in the voice call.
  • Alternatively, the first device that does not participate in the voice call can send the server a request to participate in the voice call. When the first device receives the server's response to the request, the first device participates in the voice call, and the device status of the first device changes from the status of not participating in the voice call to the status of participating in the voice call.
  • Alternatively, another terminal device sends invitation information for participating in the voice call to the first device that does not participate in the voice call. When the first device responds to the invitation information, the first device participates in the voice call, and the device status of the first device changes from the status of not participating in the voice call to the status of participating in the voice call.
  • The server can detect this change information in real time, or the first device can send this change information to the server. Based on this change information, the server determines that the role of the first device during the voice call is changed from no role to listener, and sends the first device notification information that its role is changed to listener.
  • Case 14 When the device status of the first device is changed from a multimedia resource collection state to a multimedia resource collection closed state, it is determined that the role of the first device is changed to a listener.
  • the state of multimedia resource collection here includes at least one of the state of audio resource collection or the state of video resource collection.
  • the state of turning off multimedia resource collection here includes the state of turning off audio resource collection and the state of turning off video resource collection.
  • The server can detect this change information in real time, or the first device can send this change information to the server. If the server finds that the device status of the first device has remained in the state of multimedia resource collection being turned off for a period of time (such as 1 minute), the server determines that the role of the first device is changed from presenter or intermediate state to listener, and sends the first device notification information that its role is changed to listener.
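  • The time-window check in case 14 can be sketched as a single predicate. The function name, the use of plain numeric timestamps in seconds, and the 60-second default (taken from the "such as 1 minute" example) are assumptions for illustration.

```python
def should_become_listener(collection_off_since, now, window_seconds=60.0):
    """True once multimedia collection has stayed off for at least window_seconds.

    collection_off_since: timestamp (seconds) at which both audio and video
                          collection were turned off, or None if any collection
                          is currently on.
    now:                  current timestamp (seconds).
    """
    if collection_off_since is None:
        return False
    return now - collection_off_since >= window_seconds


still_presenter = should_become_listener(collection_off_since=100.0, now=130.0)
demoted = should_become_listener(collection_off_since=100.0, now=160.0)
collecting = should_become_listener(collection_off_since=None, now=160.0)
```

  • Waiting out a window before changing the role avoids switching roles when a participant briefly mutes and unmutes.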
  • the device status of the first device changes from the status of participating in the voice call to the status of not participating in the voice call.
  • The server can detect this change in real time, or the first device can send the change information to the server. The server then determines that the role of the first device has changed from speaker, intermediate state, or listener to no role, and sends the first device notification information indicating that its role has changed to no role. When the first device receives the notification information, the first device exits the voice call.
  • the first device plays the role of the speaker during the voice call
  • The first device is a producer of multimedia resources and is mainly used to collect multimedia resources, encode them, and transmit the encoded multimedia resources to other terminal devices.
  • the first device is also a recipient of multimedia resources and can receive multimedia resources sent by the terminal device playing the role of presenter, decode the multimedia resources and play the decoded multimedia resources.
  • the first device When the first device plays the role of a listener during a voice call, the first device is a receiver of multimedia resources and is mainly used to receive multimedia resources sent by a terminal device that plays the role of a presenter. Since multimedia resources usually need to be encoded before transmission, after receiving the multimedia resources, the terminal device can first decode the multimedia resources and then play the decoded multimedia resources.
  • the first device can receive multimedia resources sent by the terminal device whose role is to present, decode the multimedia resources and play the decoded multimedia resources.
  • the first device since the first device has turned on audio collection, the first device can collect audio resources, but the audio resources have not yet been sent to the terminal device that plays the role of presenter or listener.
  • the first device When the role of the first device is no role, the first device does not participate in the voice call, so the first device does not need to output multimedia resources, nor does it need to receive multimedia resources output by other devices.
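The role descriptions above can be summarized as a small capability table. The following is a minimal sketch, not the patent's implementation: the names `Role`, `Capabilities`, and `CAPABILITIES` are hypothetical, and reading the "collects audio but has not yet sent it" description as the intermediate state is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    SPEAKER = "speaker"
    LISTENER = "listener"
    INTERMEDIATE = "intermediate"  # assumed name for the in-between state
    NONE = "none"


@dataclass(frozen=True)
class Capabilities:
    collects: bool  # captures multimedia resources locally
    sends: bool     # transmits encoded resources to other terminal devices
    receives: bool  # receives, decodes, and plays resources from speakers


# Hypothetical mapping derived from the role descriptions in the text.
CAPABILITIES = {
    Role.SPEAKER: Capabilities(collects=True, sends=True, receives=True),
    Role.LISTENER: Capabilities(collects=False, sends=False, receives=True),
    Role.INTERMEDIATE: Capabilities(collects=True, sends=False, receives=True),
    Role.NONE: Capabilities(collects=False, sends=False, receives=False),
}
```

A lookup such as `CAPABILITIES[Role.LISTENER].sends` then tells the client whether it should open its sending path for the current role.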
  • the number of terminal devices serving as speakers and the number of terminal devices serving as listeners in a voice call also differs.
  • The role of either terminal device can be speaker or listener, and each terminal device can switch between speaker and listener in real time; that is, any terminal device may play the role of speaker at one moment and the role of listener at the next.
  • Some of these terminal devices may always play the role of listener, while the role of other terminal devices can switch in real time from presenter to listener or from listener to presenter.
  • For online classroom scenarios, there may generally be dozens or even hundreds of terminal devices. Most of these terminal devices always play the role of listener, and a few always play the role of lecturer. For example, the teacher's terminal device always plays the role of speaker, while each student's terminal device always plays the role of listener.
  • the second device when the role of the first device is changed to the presenter during the voice call, the second device is determined among other terminal devices.
  • the number of the second device is at least one, and the role of the second device can be a lecturer, a listener, or an intermediate state.
  • the principle of the role change of the second device is the same as the principle of the role change of the first device. Please refer to the relevant description of the role change of the first device (such as cases 1 to 5) above, which will not be described again here.
  • Step 302 The first device creates a direct link between the first device and the second device.
  • For any second device, if the second device has already created a direct link with the first device, the first device does not need to create a direct link with the second device; that is, for this second device, the first device may skip step 302 and directly perform steps 303 and 304. If the second device has not created a direct link with the first device, the first device needs to create a direct link with the second device; that is, for this second device, the first device needs to perform step 302.
  • When the device status of the first device changes to the state of participating in a voice call, the first device creates a direct connection User Datagram Protocol socket (UDP Socket).
  • the direct connection to the UDP Socket can be referred to as the direct connection Socket.
  • Direct Socket is a data structure used to store status information of all direct links in the first device.
  • The direct links here include the direct link between the first device and the server (the server includes at least one of a direct hole punching server or a transit server, where the direct hole punching server can determine the external network address of a terminal device and feed it back to that terminal device).
  • The direct link between the first device and the server needs to be determined based on the direct connection Socket, and resource transmission is performed over the determined direct link.
  • the second device when the device status of the second device changes to a state of participating in a voice call, the second device will also create a direct connection Socket. After the first device and the second device create their respective direct connection Sockets, a direct connection link between the first device and the second device is created based on the respective direct connection Sockets of the first device and the second device.
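As a rough illustration of the direct connection Socket described above, the sketch below wraps one UDP socket together with a table of direct-link states, and shows the "skip step 302 if a link already exists" check. The class name `DirectSocket`, its methods, and the stored fields are hypothetical; the text only says that the structure stores status information for all direct links of the device.

```python
import socket
from typing import Dict, Tuple

Address = Tuple[str, int]


class DirectSocket:
    """One UDP socket plus the status of every direct link of this device."""

    def __init__(self) -> None:
        # A plain UDP socket from the standard library.
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # peer identifier -> link status (fields are illustrative only)
        self.links: Dict[str, dict] = {}

    def needs_link(self, peer_id: str) -> bool:
        """True when step 302 still has to be performed for this peer."""
        return peer_id not in self.links

    def record_link(self, peer_id: str, external_addr: Address) -> None:
        self.links[peer_id] = {"addr": external_addr, "state": "created"}

    def close(self) -> None:
        self.sock.close()
```

In this sketch a device would call `needs_link(peer_id)` before starting link creation and go straight to sending when it returns `False`.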
  • The first device creating a direct link between the first device and the second device includes: the first device sends the server a direct connection request for the second device, where the direct connection request carries the external network address of the first device; and the first device receives the direct connection response forwarded by the server.
  • The direct connection response is the response information sent by the second device to the server for the direct connection request.
  • The direct connection request is forwarded by the server to the second device.
  • The direct connection response carries the external network address of the second device.
  • The first device extracts the external network address of the second device from the direct connection response, and creates a direct link between the first device and the second device based on the external network address of the first device and the external network address of the second device.
  • the first device determines the direct link between the first device and the direct hole punching server, and sends a UDP data packet to the direct hole punching server through the direct link.
  • the first device generates an original UDP data packet and sends the original UDP data packet to the router.
  • the router fills in the external network address of the first device in the original UDP packet, obtains the target UDP packet, and sends the target UDP packet to the direct hole punching server.
  • The external network address of the first device includes the Internet Protocol address (IP address) of the first device and the port address of the first device.
  • The IP address of the first device is used to locate the first device, and the port address of the first device is used to locate the application on the first device.
  • the direct connection hole punching server After receiving the target UDP data packet, the direct connection hole punching server parses the target UDP data packet, thereby parsing out the external network address of the first device, and sends the external network address of the first device to the first device. Therefore, the first device can obtain and store its own external network address. Based on the same principle as the first device, the second device can also obtain and store its own external network address.
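The address-discovery exchange above can be mimicked without real networking. This is a simplified in-memory sketch: the packet layout, the function names `router_rewrite` and `punch_server_handle`, and the example addresses are all hypothetical.

```python
from typing import Dict, Tuple

Address = Tuple[str, int]


def router_rewrite(original_packet: Dict, external_addr: Address) -> Dict:
    """Router fills the sender's external address into the original packet."""
    target_packet = dict(original_packet)  # leave the original untouched
    target_packet["src"] = external_addr
    return target_packet


def punch_server_handle(target_packet: Dict) -> Dict:
    """Hole punching server parses the external address and feeds it back."""
    return {"your_external_address": target_packet["src"]}


# The first device only knows its private address when it builds the packet.
original = {"src": ("192.168.1.20", 5000), "payload": b"probe"}
target = router_rewrite(original, ("203.0.113.7", 40001))
reply = punch_server_handle(target)

# The first device stores the address reported by the server.
external_address = reply["your_external_address"]
```

The same exchange, run from the second device, gives the second device its own external network address.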
  • the first device sends a direct connection request for the second device to the transit server.
  • The direct connection request carries the external network address of the first device, the identification information of the first device (as the source device), the identification information of the second device (as the target device), and other content.
  • the transfer server parses the direct connection request to obtain the identification information of the second device, and forwards the direct connection request to the second device based on the identification information of the second device.
  • The direct connection request forwarded by the transit server to the second device includes at least the identification information of the first device and the external network address of the first device; it may or may not include the identification information of the second device.
  • After receiving the direct connection request, the second device parses it to obtain the identification information of the first device and the external network address of the first device, and stores both. In addition, the second device sends a direct connection response for the direct connection request to the transit server.
  • The direct connection response carries the external network address of the second device, the identification information of the first device (as the target device), and the identification information of the second device (as the source device).
  • After receiving the direct connection response, the transit server parses it to obtain the identification information of the first device, and forwards the direct connection response to the first device based on the identification information of the first device.
  • The direct connection response forwarded by the transit server to the first device includes at least the identification information of the second device and the external network address of the second device; it may or may not include the identification information of the first device.
  • the first device After receiving the direct connection response, the first device parses the direct connection response to obtain the identification information of the second device and the external network address of the second device, and stores the identification information of the second device and the external network address of the second device. From then on, both the first device and the second device can obtain and store the other party's external network address. After the first device and the second device obtain each other's external network address, a direct link between the first device and the second device is equivalent to being created.
  • the first device m sends a direct connection request for the second device n to the transit server, and the direct connection request carries (IP_m, port_m), id(m) and id(n).
  • (IP_m, port_m) is the external network address of the first device m, where IP_m is the IP address of the first device m, port_m is the port address of the first device m, id(m) is the identification information of the first device m, and id(n) is the identification information of the second device n.
  • The second device n parses out id(m) and (IP_m, port_m) and stores them, and the second device n sends the direct connection response for the first device m to the transit server.
  • The direct connection response carries (IP_n, port_n), id(m) and id(n).
  • (IP_n, port_n) is the external network address of the second device n, where IP_n is the IP address of the second device n, and port_n is the port address of the second device n.
  • the transfer server forwards the direct connection response to the first device m
  • the first device m parses out the id(n) and (IP_n, port_n) and stores them.
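The request and response exchange between device m and device n can be sketched as an in-memory relay. This is only an illustration of the message flow: the classes `Signal`, `Device`, and `TransitServer`, the field names, and the addresses are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Address = Tuple[str, int]


@dataclass
class Signal:
    kind: str            # "request" or "response"
    source_id: str       # id of the sending device
    target_id: str       # id of the device the message is for
    source_addr: Address  # external (IP, port) of the sending device


class Device:
    def __init__(self, device_id: str, external_addr: Address) -> None:
        self.device_id = device_id
        self.external_addr = external_addr
        self.peers: Dict[str, Address] = {}  # peer id -> external address

    def on_signal(self, msg: Signal, relay: "TransitServer") -> None:
        # Parse and store the peer's id and external address.
        self.peers[msg.source_id] = msg.source_addr
        if msg.kind == "request":
            # Answer with our own external address through the relay.
            relay.forward(Signal("response", self.device_id,
                                 msg.source_id, self.external_addr))


class TransitServer:
    def __init__(self) -> None:
        self.devices: Dict[str, Device] = {}

    def register(self, device: Device) -> None:
        self.devices[device.device_id] = device

    def forward(self, msg: Signal) -> None:
        # Route purely on the target's identification information.
        self.devices[msg.target_id].on_signal(msg, self)


relay = TransitServer()
m = Device("m", ("203.0.113.7", 40001))
n = Device("n", ("198.51.100.9", 40002))
relay.register(m)
relay.register(n)
relay.forward(Signal("request", m.device_id, n.device_id, m.external_addr))
```

After the single `forward` call, each device holds the other's external network address, which is the precondition for the hole punching step.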
  • The first device creating a direct link between the first device and the second device based on the external network address of the first device and the external network address of the second device includes: sending multiple direct connection data packets to the second device based on the external network address of the second device, and, upon receiving a reception response sent by the second device, determining that a direct link between the first device and the second device has been created. The reception response refers to the response information that the second device feeds back for each direct connection data packet based on the external network address of the first device.
  • the first device may send multiple direct connection data packets to the second device based on the external network address of the second device.
  • the direct connection data packets are the above-mentioned UDP data packets.
  • If the first device receives the reception response that the second device feeds back for each direct connection data packet, it means that the external network address of the second device obtained by the first device is correct, and that the second device has also successfully received the external network address of the first device.
  • the first device determines that a direct link between the first device and the second device has been successfully created.
  • the second device can send multiple direct connection data packets to the first device based on the external network address of the first device.
  • When the first device receives one or more direct connection data packets, it feeds back a reception response for those direct connection data packets to the second device based on the external network address of the second device.
  • If the second device receives the reception response that the first device feeds back for each direct connection data packet, it means that the external network address of the first device obtained by the second device is correct, and that the first device has also successfully received the external network address of the second device.
  • The second device then determines that a direct link between the second device and the first device has been successfully created.
  • the above-mentioned method by which the first device and the second device each determine that a direct link between the two devices has been successfully created is also called direct hole punching.
  • Direct hole punching makes it possible to test whether a direct link has been successfully created between the first device and the second device and to test the stability of the direct link, which facilitates the subsequent transmission of multimedia resources through the direct link and improves the transmission success rate of multimedia resources.
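The confirmation step of hole punching can be sketched as a function that sends several probes and checks the responses. This is an assumption-laden sketch: `punch`, `send_probe`, and the probe count are hypothetical, and requiring a response for every probe is only one reading of the text.

```python
from typing import Callable


def punch(send_probe: Callable[[int], bool], probe_count: int = 5) -> bool:
    """Send `probe_count` direct connection packets to the peer.

    `send_probe(seq)` is a stand-in for sending one UDP packet to the
    peer's external address and waiting for its reception response; it
    returns True when the response arrives.
    """
    responses = [send_probe(seq) for seq in range(probe_count)]
    # The link counts as created only if every probe was acknowledged.
    return all(responses)


def reachable_peer(seq: int) -> bool:
    return True  # every probe is acknowledged


def lossy_peer(seq: int) -> bool:
    return seq != 3  # the probe with sequence number 3 is never acknowledged
```

A client that wanted to tolerate occasional loss could replace `all(responses)` with a minimum-count check; the text does not specify such a policy.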
  • Step 303 The first device sends the audio resources of the first device to the second device through a direct link.
  • the first device when the first device plays the role of a speaker, the first device collects the audio resources of the first device and sends the audio resources of the first device to the second device through a direct link.
  • the number of second devices is at least one, and the first device sends the audio resources of the first device to each second device through a direct link.
  • every second device that has established a direct link with the first device can quickly and accurately obtain the audio resources of the first device, making full use of the direct link to reduce audio Resource transmission delay and packet loss rate.
  • Step 304: The first device sends the audio resources of the first device to the server, so that the server forwards the audio resources of the first device to the second device. Figure 2 shows that step 304 is executed after step 303. Alternatively, step 304 can also be executed before step 303, or in parallel with step 303, which is not limited here.
  • After the first device collects the audio resources of the first device, it sends them to the server so that the server forwards the audio resources of the first device to the second device.
  • the second device can not only receive the audio resources of the first device through the direct link, but also receive the audio resources of the first device through the transit link, reducing the risk of congestion on a certain link. The situation that causes the second device to be unable to receive the audio resources of the first device improves the transmission quality of the audio resources.
  • the server can forward the audio resources of the first device to the second device within a set time period, that is, the server unconditionally forwards the audio resources of the first device to the second device within the set time period.
  • the server conditionally forwards the audio resources of the first device to the second device.
  • the server forwards the audio resources of the first device to the second device if the forwarding conditions are met, and the server stops forwarding the audio resources of the first device to the second device if the forwarding conditions are not met.
  • The server determines whether the second device meets the forwarding condition. If the second device meets the forwarding condition, the server forwards the audio resources of the first device to the second device. If the second device does not meet the forwarding condition, the server stops forwarding the audio resources of the first device to the second device.
  • The forwarding condition includes that the transit data packet utilization rate of the second device is greater than the first utilization threshold. The transit data packet utilization rate of the second device represents the proportion of the data packets used by the second device that were forwarded by the server, that is, the ratio of the number of server-forwarded data packets used by the second device to the total number of data packets used by the second device.
  • The total number of data packets used by the second device is equal to the sum of the number of server-forwarded data packets used by the second device and the number of direct-link data packets used by the second device.
  • the first device sends the audio resource of the first device to the server.
  • After the server receives the audio resources of the first device, for any second device, if the transit data packet utilization rate of the second device for the first device is greater than the first utilization threshold, the server forwards the audio resources of the first device to the second device.
  • the audio resources of the first device are continuously sent to the server and the second device in the form of a stream, and the server also continuously forwards the audio resources of the first device to the second device in the form of a stream.
  • the data packet of the first device includes the audio resources of the first device.
  • the second device continuously receives data packets sent by the first device through the direct link; on the other hand, the second device continuously receives data packets forwarded by the server through the transit link.
  • the second device can obtain the sequence number of the data packet. If the second device has received the data packet with this sequence number, the second device discards the data packet. If the second device has not received the data packet with this sequence number, the second device can use the data packet, that is, the second device The device parses the data packet and plays the corresponding audio resource.
  • The second device deduplicates the data packets received through the direct link and the transit link, and counts the number of data packets it used that were received through the transit link and the number of data packets it used that were received through the direct link. From these two counts, the transit data packet utilization rate of the second device for the first device can be obtained.
  • the second device uses the data packets received through the transit link, which means the second device uses the data packets forwarded through the server.
  • The transit data packet utilization rate of the second device for the first device is equal to the number of data packets used by the second device that were received through the transit link, divided by the total number of data packets used by the second device. The total number of data packets used by the second device is equal to the sum of the number of used data packets received through the transit link and the number of used data packets received through the direct link. That is to say, the transit data packet utilization rate of the second device for the first device satisfies the following formula (1).
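The formula itself is not reproduced in this text. A form consistent with the definition just given is shown below, using assumed notation: $i$ denotes the first device, $j$ the second device, $N^{\mathrm{transit}}_{ij}$ and $N^{\mathrm{direct}}_{ij}$ the numbers of data packets from $i$ that $j$ used after receiving them through the transit link and the direct link respectively, and $U^{\mathrm{transit}}_{ij}$ the transit data packet utilization rate. These symbols are hypothetical and may differ from the original notation.

```latex
U^{\mathrm{transit}}_{ij}
  = \frac{N^{\mathrm{transit}}_{ij}}
         {N^{\mathrm{transit}}_{ij} + N^{\mathrm{direct}}_{ij}}
```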
  • the data packets in formula (1) are the data packets sent by the first device i
  • the second device may send the transit data packet utilization rate for the first device to the server, so that the server determines whether to forward the data packet of the first device to the second device based on the transit data packet utilization rate.
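The receiver-side bookkeeping described above (deduplicate by sequence number, count which link delivered each used packet, derive the utilization rate) can be sketched as follows. The class `PacketStats` and its method names are hypothetical, and returning 0.0 before any packet has been used is an assumption the text does not address.

```python
class PacketStats:
    """Per-sender statistics kept by the second device."""

    def __init__(self) -> None:
        self.seen = set()  # sequence numbers already used
        self.used = {"direct": 0, "transit": 0}

    def on_packet(self, seq: int, path: str) -> bool:
        """Return True if the packet is used, False if it is a duplicate."""
        if seq in self.seen:
            return False  # already received on the other link: discard
        self.seen.add(seq)
        self.used[path] += 1  # credit the link that delivered it first
        return True

    def transit_utilization(self) -> float:
        total = self.used["direct"] + self.used["transit"]
        # Assumption: report 0.0 when nothing has been used yet.
        return self.used["transit"] / total if total else 0.0


stats = PacketStats()
stats.on_packet(1, "direct")   # used
stats.on_packet(1, "transit")  # duplicate, discarded
stats.on_packet(2, "transit")  # used, the transit copy arrived first
stats.on_packet(2, "direct")   # duplicate, discarded
stats.on_packet(3, "direct")   # used
```

With these five arrivals, two used packets came over the direct link and one over the transit link, so the rate reported to the server would be one third.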
  • The server can first forward the data packets of the first device to the second device, so that the second device can calculate its transit data packet utilization rate for the first device within a set time period and send that rate to the server. Afterwards, if the server determines that the transit data packet utilization rate of the second device is greater than the first utilization threshold, the server forwards the data packets of the first device to the second device. If the server determines that the transit data packet utilization rate of the second device is not greater than the first utilization threshold, the server does not forward the data packets of the first device to the second device.
  • the first device sends the audio resources of the first device to the second device through a dual-transmission strategy of a transit link and a direct link.
  • If the transit data packet utilization rate is not greater than the first utilization threshold, the second device will most likely use the data packets received through the direct link (which also shows that the transmission quality and stability of the direct link are high), and the data packets forwarded by the server will most likely be discarded. The data packets sent by the server are therefore likely to be invalid transmissions.
  • When the transit data packet utilization rate of the second device is not greater than the first utilization threshold, the server stops forwarding the data packets of the first device to the second device, which reduces invalid transmission and saves downlink resources.
  • the transit data packet utilization rate of the second device is greater than or equal to 0 and less than or equal to 1.
  • The first utilization threshold is an adjustable coefficient with a value between 0 and 1.
  • When the transit data packet utilization rate of the second device is greater than the first utilization threshold, the server can forward the data packets of the first device to the second device; when the rate is equal to the first utilization threshold, the server may or may not forward the data packets of the first device to the second device. In a possible implementation, whenever the rate is greater than or equal to the first utilization threshold, the server forwards the audio resources of the first device to the second device.
  • When the transit data packet utilization rate of the second device is less than the first utilization threshold, the server does not forward the data packets of the first device to the second device; when the rate is equal to the first utilization threshold, the server may or may not forward the data packets of the first device to the second device. In a possible implementation, whenever the rate is less than or equal to the first utilization threshold, the server does not forward the audio resources of the first device to the second device.
  • For terminal devices that have not established a direct link with the first device, the first device cannot send its audio resources to them through a direct link and can only send them through the transit link. The transit data packet utilization rate of these terminal devices for the first device is therefore 1, which is greater than the first utilization threshold. As a result, the server continues to send the data packets of the first device to these terminal devices, ensuring that terminal devices that have not established a direct link with the first device can also receive the audio resources of the first device.
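The server-side forwarding decision can be expressed as a small predicate. This is a sketch under stated assumptions: the function name `should_forward` and the `forward_on_equal` flag are hypothetical, the flag simply exposes the choice the text leaves open for the equality case, and the threshold value 0.2 used in the checks is illustrative only.

```python
def should_forward(transit_utilization: float,
                   first_threshold: float,
                   forward_on_equal: bool = False) -> bool:
    """Decide whether the server keeps forwarding to one receiver.

    transit_utilization: rate reported by the receiver, in [0, 1].
    first_threshold: the adjustable first utilization threshold.
    forward_on_equal: policy for the equality case, which the text
        allows to go either way.
    """
    if transit_utilization > first_threshold:
        return True
    if transit_utilization == first_threshold:
        return forward_on_equal
    return False
```

A receiver without any direct link reports a rate of 1, so `should_forward(1.0, 0.2)` is true and the server keeps relaying to it; a receiver that almost always uses direct-link packets, for example `should_forward(0.05, 0.2)`, is dropped from relaying.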
  • any terminal device participating in the same voice call with the first device may have a direct link with the first device, and the terminal device may be the second device mentioned above.
  • the terminal device may not have a direct link established with the first device.
  • C_ij can also be called direct reachability.
  • a strategy of dual transmission of direct links and transit links is adopted.
  • the first device i sends the audio and video resources of the first device to the terminal device j
  • link resources are fully utilized and the transmission quality of audio resources is improved.
  • the multimedia resources also include video resources, please refer to Figure 4.
  • Figure 4 is a flow chart of another multimedia resource transmission method provided by an embodiment of the present application. The method also includes step 305. Figure 4 shows that step 305 is executed after step 302. Alternatively, step 305 may also be executed after step 303 or step 304, which is not limited here.
  • Step 305: The first device determines the video bit rate of the video resource of the first device and, if the first sending condition is met, sends the video resource of the first device to the second device through the direct link.
  • The first sending condition means that the available uplink bandwidth of the first device is not less than the product of the video bit rate of the video resource of the first device and the reference number.
  • When the first device plays the role of a lecturer, the first device can collect the video resources of the first device. After acquiring the video resources, the first device determines their video bit rate.
  • The video bit rate represents the number of data bits transmitted per unit time when transmitting the video resources of the first device.
  • The video bit rate can also be called the sampling rate, and its unit is usually kilobits per second (kbps) or megabits per second (Mbps).
  • the first device calculates a product between the video bit rate of the video resource of the first device and the reference number.
  • the reference number may be a set data, for example, the reference number is the number of terminal devices participating in the voice call, or the reference number is the number of second devices. In this embodiment of the present application, any terminal device that has a direct link with the first device is the second device mentioned above. Therefore, the number of the second device is at least one.
  • the first device also determines the available uplink bandwidth of the first device.
  • the available uplink bandwidth of the first device is the total uplink bandwidth of the first device minus the utilized uplink bandwidth of the first device.
  • The utilized uplink bandwidth of the first device includes the uplink bandwidth required by the first device to send audio resources to the server, the uplink bandwidth required by the first device to send audio resources to the second device through the direct link, the uplink bandwidth required by the first device to send video resources to the server, and so on.
  • The available uplink bandwidth of the first device can be roughly estimated as the difference between the total uplink bandwidth of the first device and the uplink bandwidth required by the first device to send video resources to the server.
  • When the first sending condition is met, the first device sends the video resources of the first device to the second device through the direct link between the first device and the second device.
  • An available uplink bandwidth that is not less than the product indicates that the available uplink bandwidth of the first device is large enough to support sending the video resources of the first device to each second device through the direct link with that second device, thereby reducing the transmission delay and packet loss rate of video resources.
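The first sending condition is a single comparison, sketched below. The function names and the kbps figures in the checks are hypothetical; the bandwidth estimate follows the "total minus utilized" description in the text.

```python
def available_uplink_kbps(total_kbps: float, utilized_kbps: float) -> float:
    """Available uplink = total uplink minus the uplink already in use."""
    return max(total_kbps - utilized_kbps, 0.0)


def meets_first_sending_condition(available_kbps: float,
                                  video_bitrate_kbps: float,
                                  reference_number: int) -> bool:
    """True when the uplink can carry one video copy per reference device."""
    return available_kbps >= video_bitrate_kbps * reference_number
```

For example, with 6000 kbps of total uplink, 1000 kbps already in use, a 1000 kbps video stream and four second devices, 5000 kbps is available against a required 4000 kbps, so the video can be sent to every second device over its direct link.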
  • The method also includes step 306. Figure 4 shows that step 306 is executed after step 302.
  • step 306 may also be executed after step 303 or step 304, which is not limited here.
  • Step 306: The first device determines the video bit rate of the video resource of the first device and, if the second sending condition is met, sends the video resource of the first device to the second device through the direct link.
  • The second sending condition means that the available uplink bandwidth of the first device is less than the product of the video bit rate of the video resource of the first device and the reference number, and that the second device meets the transmission conditions.
  • If the available uplink bandwidth of the first device is less than the product of the video bit rate and the reference number, the available uplink bandwidth of the first device is not large enough to support sending the video resources of the first device to each second device through the direct link with that second device. That is to say, the first device cannot send its video resources to all second devices through direct links. In this case, for each second device, the first device needs to determine whether the second device meets the transmission conditions.
  • The second device meeting the transmission conditions includes: the direct data packet utilization rate of the second device is not less than the second utilization threshold. The direct data packet utilization rate of the second device represents the proportion of the data packets used by the second device that were sent through the direct link, that is, the ratio of the number of direct-link data packets used by the second device to the total number of data packets used by the second device.
  • The total number of data packets used by the second device is equal to the sum of the number of server-forwarded data packets used by the second device and the number of direct-link data packets used by the second device.
  • The second device deduplicates the data packets received through the direct link and the transit link, then counts the data packets it uses that were received through the transit link and the data packets it uses that were received through the direct link. From these counts it obtains its direct data packet utilization rate for the first device.
  • The direct data packet utilization rate of the second device for the first device equals the number of used data packets received through the direct link divided by the total number of used data packets, where the total is the sum of the used data packets received through the transit link and those received through the direct link. This relationship is formula (3).
  • The data packets in formula (3) are the data packets sent by the first device i.
  • The direct data packet utilization rate and the transit data packet utilization rate of the second device for the first device sum to 1. Therefore, once the second device has determined either rate for the first device, it has in effect determined the other.
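  • The counting described above can be sketched as follows. This is a minimal illustration, not the implementation of this application: the class name `PacketStats`, the use of sequence numbers for deduplication, and the rule that the first copy to arrive is the one counted as used are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class PacketStats:
    """Counters kept by the second device for packets from one first device."""
    seen: set = field(default_factory=set)  # sequence numbers already used
    direct_used: int = 0    # used packets that arrived over the direct link
    transit_used: int = 0   # used packets that arrived over the transit link

    def on_packet(self, seq: int, via_direct: bool) -> bool:
        """Record one arrival; return False if it duplicates a used packet."""
        if seq in self.seen:
            return False  # assumption: the first copy to arrive is the used one
        self.seen.add(seq)
        if via_direct:
            self.direct_used += 1
        else:
            self.transit_used += 1
        return True

    def direct_utilization(self) -> float:
        """Used direct packets divided by all used packets."""
        total = self.direct_used + self.transit_used
        return self.direct_used / total if total else 0.0

    def transit_utilization(self) -> float:
        """Used transit packets divided by all used packets."""
        total = self.direct_used + self.transit_used
        return self.transit_used / total if total else 0.0


stats = PacketStats()
arrivals = [(1, True), (1, False), (2, False), (2, True), (3, True), (4, True)]
for seq, via_direct in arrivals:
    stats.on_packet(seq, via_direct)
print(stats.direct_utilization(), stats.transit_utilization())  # 0.75 0.25
```

  • In this example three of the four used packets arrived first over the direct link, so the two rates are 0.75 and 0.25 and sum to 1.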
  • The second device may send its direct data packet utilization rate for the first device to the first device through the direct link between them, so that the first device obtains the direct data packet utilization rate of the second device.
  • The first device then checks whether this utilization rate is not less than the second utilization threshold. If so, the second device meets the transmission condition, and the first device sends its video resources to the second device through the direct link, thereby reducing the transmission delay and packet loss rate of the video resources.
  • The second utilization threshold may be a value set from experience; for example, the second utilization threshold is 0.5.
  • Alternatively, the second utilization threshold may be determined from the utilization rates reported by the second devices. There is at least one second device, and each second device can send its direct data packet utilization rate for the first device to the first device through the direct link between them, so that the first device obtains the direct data packet utilization rate of every second device.
  • The first device sorts these utilization rates and determines the second utilization threshold from the sorting result. For example, it first computes the ratio of its available uplink bandwidth to the video bit rate of its video resources and rounds the ratio down to obtain the maximum number of devices. It then sorts the direct data packet utilization rates of the second devices in descending order and takes the utilization rate ranked at the position equal to the maximum number of devices as the second utilization threshold.
  • The utilization rates can also be sorted in ascending order, in which case the utilization rate at that position counted from the end of the ranking is taken as the second utilization threshold.
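  • The threshold selection described above can be sketched as follows. The function name and the handling of the edge cases (no stream fits, or every device fits) are assumptions added for illustration; the text above only covers the ordinary case.

```python
import math


def second_utilization_threshold(uplink_bw, video_bitrate, utilizations):
    """Pick the threshold from the reported direct data packet utilization rates.

    uplink_bw and video_bitrate must use the same unit (for example kbit/s).
    """
    if not utilizations:
        raise ValueError("at least one second device is required")
    # Maximum number of devices: bandwidth / bit rate, rounded down.
    max_devices = math.floor(uplink_bw / video_bitrate)
    if max_devices <= 0:
        return math.inf  # assumption: no device can qualify for direct video
    ranked = sorted(utilizations, reverse=True)  # descending order
    if max_devices >= len(ranked):
        return ranked[-1]  # assumption: every device qualifies
    return ranked[max_devices - 1]


threshold = second_utilization_threshold(2500, 1000, [0.9, 0.2, 0.6, 0.4])
print(threshold)  # 0.6
```

  • With 2500 kbit/s of uplink and a 1000 kbit/s video stream, two devices fit, so the second-highest utilization rate (0.6) becomes the threshold. If several devices report the same rate, more devices than fit may reach the threshold; how ties are broken is not specified here.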
  • The method may further include step 307. Figure 3 shows step 307 executed after step 302; in practice, step 307 may also be executed after step 303 or step 304, which is not limited here.
  • Step 307: The first device determines the video bit rate of its video resources and, when the third sending condition is met, sends the video resources of the first device to the server, so that the server forwards the video resources of the first device to the second device.
  • The third sending condition means that the available uplink bandwidth of the first device is less than the product of the video bit rate of the video resources of the first device and the reference number, and the second device does not meet the transmission condition.
  • The first device may send the identification information of the second device to the server.
  • The first device sends its video resources to the server, so that the server can forward them to the second device based on the identification information of the second device. In this way, a second device whose direct data packet utilization rate is low can still receive the video resources of the first device through the transit link.
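  • The three sending conditions for video resources can be summarized in a small decision function. This is a sketch only; the function name, parameter names, and the string return values are hypothetical.

```python
def video_path(uplink_bw, video_bitrate, reference_number,
               direct_utilization, threshold):
    """Return "direct" or "transit" for one second device."""
    if uplink_bw >= video_bitrate * reference_number:
        return "direct"   # first sending condition
    if direct_utilization >= threshold:
        return "direct"   # second sending condition
    return "transit"      # third sending condition


print(video_path(4000, 1000, 3, 0.1, 0.5))  # direct
print(video_path(2000, 1000, 3, 0.7, 0.5))  # direct
print(video_path(2000, 1000, 3, 0.2, 0.5))  # transit
```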
  • For terminal devices that have not established a direct link with the first device, the first device may send the identification information of these terminal devices to the server.
  • Based on this identification information, the server forwards the video resources that the first device sent to the server to these terminal devices, ensuring that a terminal device can still receive the video resources of the first device even if it has not established a direct link with the first device.
  • Any terminal device participating in the same voice call as the first device may have a direct link with the first device, in which case the terminal device may be the second device mentioned above.
  • Alternatively, the terminal device may not have established a direct link with the first device.
  • C ij 1
  • For video resources, a strategy of transmitting over direct links as much as possible is adopted to reduce the equipment cost and bandwidth cost of the server. Assume that Q terminal devices participate in the voice call, that the role of the first device i is presenter, that the available uplink bandwidth of the first device i is B, and that the video bit rate of the video resources of the first device i is b. When either of the following cases 21 and 22 occurs, a transit link needs to be used to transmit video resources.
  • Case 22: All Q-1 other terminal devices have created direct links with the first device, and B < b*(Q-1) is satisfied. This means that the available uplink bandwidth of the first device i cannot support sending its video resources to the other Q-1 terminal devices through direct links simultaneously. In this case, the first device i needs to use the broadcast capability of the server to help complete the transmission of the video resources.
  • In this case, the available uplink bandwidth of the first device i can still support sending its video resources to T terminal devices through T direct links.
  • The direct data packet utilization rates of the Q-1 terminal devices are sorted from high to low, and the terminal devices corresponding to the top T utilization rates serve as the second devices that meet the transmission condition. The first device sends its video resources to these second devices through direct links.
  • The remaining Q-T-1 terminal devices serve as second devices that do not meet the transmission condition. The first device sends its video resources to the server, so that the server forwards the video resources of the first device to these terminal devices.
  • In this way, the utilization of the direct links is maximized and the transmission quality of the video resources is improved without exceeding the available uplink bandwidth of the first device. If neither case 21 nor case 22 occurs, the first device does not use the transit link to transmit its video resources, which effectively reduces the equipment cost and bandwidth cost of the server.
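  • The split between direct and transit delivery can be sketched as follows. Two points are assumptions rather than statements from this application: T is taken to be B/b rounded down (matching the maximum number of devices described earlier), and terminal devices without a direct link are always served through the server. The function and parameter names are hypothetical.

```python
def plan_video_delivery(B, b, utilization, has_direct_link):
    """Split the other terminal devices into direct and transit receivers.

    B: available uplink bandwidth of the first device i.
    b: video bit rate of its video resources (same unit as B).
    utilization: device id -> direct data packet utilization rate.
    has_direct_link: device id -> True if a direct link to i exists.
    """
    linked = [d for d, ok in has_direct_link.items() if ok]
    unlinked = [d for d, ok in has_direct_link.items() if not ok]
    T = int(B // b)  # assumption: number of direct streams the uplink supports
    if T >= len(linked):
        return linked, unlinked  # every linked device is served directly
    # Case 22: keep the T devices with the highest direct utilization.
    ranked = sorted(linked, key=lambda d: utilization.get(d, 0.0), reverse=True)
    return ranked[:T], ranked[T:] + unlinked


links = {"j1": True, "j2": True, "j3": True, "j4": True}
rates = {"j1": 0.9, "j2": 0.3, "j3": 0.7, "j4": 0.5}
direct, transit = plan_video_delivery(3000, 1000, rates, links)
print(direct, transit)  # ['j1', 'j3', 'j4'] ['j2']
```

  • Here Q = 5, so Q-1 = 4 receivers compete for T = 3 direct streams, and the remaining Q-T-1 = 1 receiver is served through the server.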
  • The data of the first device can be divided into two types.
  • The first type is critical data, which has relatively high requirements on transmission quality (such as packet loss rate and transmission delay) and a small data volume. For example, audio resources and instruction data (such as the encoding parameters used when encoding audio resources and video resources) are critical data.
  • The other type is non-critical data, which has less demanding transmission quality requirements and a large data volume. For example, video resources are non-critical data.
  • Critical data is sent over both the transit link and the direct link; see the above description of the audio resources of the first device.
  • Non-critical data is transmitted over the direct link as much as possible; see the above description of the video resources of the first device.
  • Other transmission methods can also be used for critical data and non-critical data; for example, encoding parameters can be transmitted through signaling.
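  • The two data classes and their transmission strategies can be sketched as a lookup. The names `DataClass`, `DATA_CLASS`, and `links_for` are hypothetical, and the flag `direct_ok` is an assumed shorthand for "a direct link exists and, for video, the sending conditions allow its use".

```python
from enum import Enum


class DataClass(Enum):
    CRITICAL = "critical"          # small volume, strict quality requirements
    NON_CRITICAL = "non-critical"  # large volume, looser quality requirements


DATA_CLASS = {
    "audio": DataClass.CRITICAL,
    "instruction": DataClass.CRITICAL,
    "video": DataClass.NON_CRITICAL,
}


def links_for(kind: str, direct_ok: bool) -> list:
    """Return the links on which data of the given kind is sent."""
    if DATA_CLASS[kind] is DataClass.CRITICAL:
        # Dual transmission: transit link plus direct link when available.
        return ["transit", "direct"] if direct_ok else ["transit"]
    # Non-critical data: direct link whenever possible, otherwise transit.
    return ["direct"] if direct_ok else ["transit"]


print(links_for("audio", True))   # ['transit', 'direct']
print(links_for("video", True))   # ['direct']
print(links_for("video", False))  # ['transit']
```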
  • Step 308 may also be included after step 302. In practice, step 308 may also be performed after any one of steps 303 to 307.
  • Step 308: In response to a change in the role of the first device during the voice call, the first device sends a direct connection close request to the second device through the direct link, receives the direct connection close response to that request sent by the second device through the direct link, and closes the direct link based on the direct connection close response.
  • A direct link has been successfully created between the first device and the second device. Since this application adopts an adaptive role switching strategy, when the device state of the first device changes, the role of the first device may change accordingly. The first device then needs to close the direct link with the second device, that is, it needs to send a direct connection close request to the second device through the direct link. This covers at least case 31 and case 32 below.
  • Case 31: Sending the direct connection close request in response to the role change includes: in response to its device state changing to the state of turning off multimedia resource collection, the first device determines that its role during the voice call has changed to listener; if the role of the second device during the voice call is also listener, the first device sends a direct connection close request to the second device through the direct link.
  • A listener is a device used to receive the multimedia resources output by the presenter during the voice call.
  • Here the device state of the first device changes from the state of multimedia resource collection to the state of turning off multimedia resource collection, and the role of the first device changes from presenter to listener. As a listener, the first device is a recipient of multimedia resources and is mainly used for receiving them.
  • The first device therefore sends a direct connection close request to the second device through the direct link, to close the direct link with the second device whose role is listener.
  • Case 32: Sending the direct connection close request in response to the role change includes: in response to its device state changing to the state of exiting the voice call, the first device determines that its role during the voice call has changed to no role, and sends a direct connection close request to the second device through the direct link. No role refers to a device that is not participating in the voice call.
  • The first device can exit a voice call in which it is participating. Its device state then changes from the state of participating in the voice call to the state of exiting the voice call, and its role changes from listener, presenter, or the intermediate state to no role.
  • With no role, the first device is neither a recipient nor a producer of multimedia resources; that is, it needs neither to receive nor to collect multimedia resources.
  • The first device may therefore send a direct connection close request to the second device through the direct link, to close the direct link with the second device whose role is listener.
  • To close the direct link with the second device, the first device sends a direct connection close request to the second device through that direct link. After receiving the request, the second device sends a direct connection close response to the first device through the direct link and updates the link state of the direct link between the first device and the second device from the open state to the closed state.
  • After the first device receives the direct connection close response sent by the second device through the direct link, the first device likewise updates the link state of the direct link from the open state to the closed state. At this point, the direct link between the first device and the second device is closed.
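  • The close handshake can be sketched as an in-process simulation. The class `Endpoint`, the message string, and the direct method call standing in for a message on the direct link are all assumptions made for illustration.

```python
from enum import Enum


class LinkState(Enum):
    OPEN = "open"
    CLOSED = "closed"


class Endpoint:
    """A terminal device that tracks the state of its direct links."""

    def __init__(self, name):
        self.name = name
        self.link_state = {}  # peer name -> LinkState

    def open_link(self, peer):
        self.link_state[peer.name] = LinkState.OPEN
        peer.link_state[self.name] = LinkState.OPEN

    def on_close_request(self, sender_name):
        """Receiver side: mark the link closed and answer with a response."""
        self.link_state[sender_name] = LinkState.CLOSED
        return "direct_connection_close_response"

    def request_close(self, peer):
        """Initiator side: send the request, close on receiving the response."""
        response = peer.on_close_request(self.name)
        if response == "direct_connection_close_response":
            self.link_state[peer.name] = LinkState.CLOSED


first_device = Endpoint("first")
second_device = Endpoint("second")
first_device.open_link(second_device)
first_device.request_close(second_device)
print(first_device.link_state["second"], second_device.link_state["first"])
```

  • After the exchange both endpoints report `LinkState.CLOSED`, matching the two state updates described above.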
  • It should be noted that the information (including but not limited to user equipment information and user personal information), data (including but not limited to data used for analysis, stored data, and displayed data), and signals involved in this application are all authorized by the user or fully authorized by all parties, and the collection, use, and processing of the relevant data must comply with the relevant laws, regulations, and standards of the relevant countries and regions.
  • For example, the audio resources and video resources involved in this application were obtained with full authorization.
  • In the above method, when the role of the first device changes to presenter during the voice call, the second device is determined from among at least one terminal device participating in the same voice call as the first device, and a direct link between the first device and the second device is created, so that direct links are created dynamically.
  • The multimedia resources of the first device are sent to the second device through the direct link, which reduces the transmission delay and packet loss rate of the multimedia resources.
  • The multimedia resources of the first device are also sent to the server, so that the server forwards them to the second device, which improves the transmission quality of the multimedia resources.
  • This method can be executed by any terminal device (such as terminal device 2) in FIG. 1.
  • The terminal device that performs the multimedia resource transmission method provided by the embodiments of the present application is called the first device, and the terminal devices other than that terminal device are called other terminal devices.
  • The method includes at least steps 501 to 503 below.
  • Step 501: In response to a direct connection request of the first device forwarded by the server, the second device creates a direct link between the second device and the first device.
  • The second device is determined, when the role of the first device changes to presenter during the voice call, from among at least one terminal device participating in the same voice call as the first device.
  • The presenter is a device used to output multimedia resources during the voice call.
  • Step 502: The second device receives the multimedia resources of the first device sent by the first device through the direct link.
  • Step 503: The second device receives the multimedia resources of the first device forwarded by the server.
  • The second device thus receives the multimedia resources sent by the first device through the direct link or through the server.
  • The multimedia resources of the first device include at least one of audio resources or video resources, and audio resources and video resources can be sent in different ways.
  • Steps 601 to 603 below describe how audio resources are sent, and steps 604 to 606 describe how video resources are sent.
  • This method can be executed by any terminal device (such as terminal device 2) in FIG. 1.
  • The terminal device that performs the multimedia resource transmission method provided by the embodiments of the present application is called the first device, and the terminal devices other than that terminal device are called other terminal devices.
  • The method includes at least steps 601 to 603 below.
  • Step 601: In response to a direct connection request of the first device forwarded by the server, the second device creates a direct link between the second device and the first device.
  • The second device is determined, when the role of the first device changes to presenter during the voice call, from among at least one terminal device participating in the same voice call as the first device.
  • The presenter is a device used to output multimedia resources during the voice call.
  • For a description of step 601, refer to the descriptions of step 301 and step 302 above; the implementation principles are the same and are not repeated here.
  • The second device creating a direct link between the second device and the first device includes: receiving the direct connection request for the second device forwarded by the server, where the direct connection request was sent by the first device to the server and carries the external network address of the first device.
  • The second device sends a direct connection response to the direct connection request to the server, and the direct connection response carries the external network address of the second device. Based on the external network address of the first device and the external network address of the second device, the direct link between the second device and the first device is created.
  • Creating the direct link based on the two external network addresses includes: the second device sends multiple direct connection data packets to the first device based on the external network address of the first device and, upon receiving a reception response sent by the first device, determines that the direct link between the second device and the first device has been created. The reception response is the response information that the first device feeds back for each direct connection data packet based on the external network address of the second device.
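  • The probe-and-response exchange can be sketched with two UDP sockets. This is a single-process simulation on the loopback interface, so no NAT is actually traversed; the function name, the `PROBE`/`ACK` payloads, and the retry count are assumptions and not the packet format of this application.

```python
import socket


def probe_direct_link(prober, responder, responder_addr, attempts=3, timeout=1.0):
    """Send probe packets; return True once a reception response comes back."""
    prober.settimeout(timeout)
    responder.settimeout(timeout)
    # Prober side: send several direct connection data packets.
    for seq in range(attempts):
        prober.sendto(b"PROBE %d" % seq, responder_addr)
    # Responder side: answer each probe at the address it arrived from.
    for _ in range(attempts):
        try:
            data, sender_addr = responder.recvfrom(1024)
        except socket.timeout:
            break
        responder.sendto(b"ACK " + data, sender_addr)
    # Prober side: any reception response means the direct link works.
    try:
        reply, _ = prober.recvfrom(1024)
    except socket.timeout:
        return False
    return reply.startswith(b"ACK ")


first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
first.bind(("127.0.0.1", 0))   # stands in for the first device's external address
second.bind(("127.0.0.1", 0))  # stands in for the second device's external address
link_created = probe_direct_link(second, first, first.getsockname())
first.close()
second.close()
print(link_created)
```

  • Sending several probes rather than one is what makes the exchange tolerant of individual packet loss; in a real deployment the two sides run on different hosts and the addresses come from the signaling exchange through the server.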
  • After step 601, the method further includes: the second device receives a direct connection close request sent by the first device through the direct link, where the request is sent to the second device when the role of the first device changes during the voice call.
  • The second device sends a direct connection close response to the direct connection close request to the first device through the direct link, and closes the direct link based on the direct connection close response.
  • Step 602: The second device receives the audio resources of the first device sent by the first device through the direct link.
  • For a description of step 602, refer to the description of step 303 above; the implementation principles are the same and are not repeated here.
  • Step 603: The second device receives the audio resources of the first device forwarded by the server.
  • For a description of step 603, refer to the description of step 304 above; the implementation principles are the same and are not repeated here.
  • The method further includes step 604: the second device receives the video resources of the first device sent by the first device through the direct link, where the video resources are sent through the direct link when the first sending condition is met. The first sending condition means that the available uplink bandwidth of the first device is not less than the product of the video bit rate of the video resources of the first device and the reference number.
  • For a description of step 604, refer to the description of step 305 above; the implementation principles are the same and are not repeated here.
  • The method also includes step 605: the second device receives the video resources of the first device sent by the first device through the direct link, where the video resources are sent through the direct link when the second sending condition is met.
  • The second sending condition means that the available uplink bandwidth of the first device is less than the product of the video bit rate of the video resources of the first device and the reference number, and the second device meets the transmission condition.
  • The second device meets the transmission condition when its direct data packet utilization rate is not less than the second utilization threshold, where the direct data packet utilization rate of the second device is the proportion of the data packets used by the second device that were sent over the direct link.
  • For a description of step 605, refer to the description of step 306 above; the implementation principles are the same and are not repeated here.
  • The method also includes step 606: the second device receives the video resources of the first device forwarded by the server, where the video resources are forwarded by the server when the third sending condition is met.
  • The third sending condition means that the available uplink bandwidth of the first device is less than the product of the video bit rate of the video resources of the first device and the reference number, and the second device does not meet the transmission condition.
  • The second device meets the transmission condition when its direct data packet utilization rate is not less than the second utilization threshold, where the direct data packet utilization rate of the second device is the proportion of the data packets used by the second device that were sent over the direct link.
  • For a description of step 606, refer to the description of step 307 above; the implementation principles are the same and are not repeated here.
  • In the above method, when the role of the first device changes to presenter during the voice call, the second device is determined from among at least one terminal device participating in the same voice call as the first device, and a direct link between the first device and the second device is created, so that direct links are created dynamically.
  • The multimedia resources of the first device are sent to the second device through the direct link, which reduces the transmission delay and packet loss rate of the multimedia resources.
  • The multimedia resources of the first device are also sent to the server, so that the server forwards them to the second device, which improves the transmission quality of the multimedia resources.
  • An embodiment of the present application also provides a multimedia resource transmission system, which includes a first device, a second device, and a server.
  • The first device is used to perform the multimedia resource transmission method related to Figures 2, 3, and 4; the second device is used to perform the multimedia resource transmission method related to Figures 5 and 6; and the server is used to perform the functions performed by the server in relation to Figures 2 to 6.
  • The above describes the multimedia resource transmission method provided by the embodiments of the present application from the perspective of method steps; it is further described below with reference to Figures 7 to 9.
  • Multiple terminal devices participate in the same voice call. Any terminal device among them that participates in the voice call is regarded as the first device, and the terminal devices other than the first device are regarded as other terminal devices.
  • The first device plays different roles during the voice call, and the role of the first device describes the part it plays in the call.
  • When the device state of the first device changes, the role of the first device may change accordingly.
  • Figure 7 is a schematic diagram of role switching provided by an embodiment of the present application.
  • When the device state of the first device is the state of not participating in the voice call, the role of the first device is no role.
  • When the device state of the first device changes from the state of not participating in the voice call to the state of participating in the voice call, the role of the first device changes from no role to listener. As a listener, the first device may perform at least one of receiving and playing audio or receiving and playing video.
  • At this time, the device state of the first device may also be described as the state of turning off video resource collection and the state of turning off audio resource collection.
  • When the device state of the first device changes from the state of turning off video resource collection to the state of video resource collection, the role of the first device changes from listener to presenter, and as a presenter the first device can record and send video.
  • When the device state of the first device changes from the state of turning off audio resource collection to the state of audio resource collection, the role of the first device first changes from listener to the intermediate state. In the intermediate state, the first device may perform at least one of receiving and playing video or receiving and playing audio.
  • In the intermediate state, the first device can also record audio and perform voice activity detection on it. The voice activity detection lasts for a period of time (such as one minute). If no object audio is detected during this period, the role of the first device returns from the intermediate state to listener. If object audio is detected during this period, the role of the first device changes from the intermediate state to presenter, and as a presenter the first device can record and send audio.
  • As a presenter, the first device may also perform at least one of receiving and playing video or receiving and playing audio.
  • In some cases the role of the first device changes directly from listener to presenter. If the first device then stops collecting audio or stops collecting video, its role does not change; that is, the role of the first device is still presenter.
  • When the first device stops collecting video, its device state changes from the state of video resource collection to the state of turning off video resource collection.
  • If the device state of the first device is only the state of audio resource collection and the first device stops collecting audio, the device state changes from the state of audio resource collection to the state of turning off audio resource collection.
  • If the device state of the first device is both the state of video resource collection and the state of audio resource collection, and the first device stops collecting both audio and video, the device state changes from the state of video resource collection to the state of turning off video resource collection and from the state of audio resource collection to the state of turning off audio resource collection.
  • The first device can exit the voice call at any time. Its device state then changes from the state of participating in the voice call to the state of not participating in the voice call, and its role changes from listener, presenter, or the intermediate state to no role.
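  • The role switching described above can be sketched as a small state machine. The event names are hypothetical labels for the device state changes, and the sketch assumes that stopping all collection returns a presenter to listener, as in case 31.

```python
from enum import Enum


class Role(Enum):
    NONE = "no role"
    LISTENER = "listener"
    INTERMEDIATE = "intermediate state"
    PRESENTER = "presenter"


# (current role, event) -> next role; event names are illustrative only.
TRANSITIONS = {
    (Role.NONE, "join_call"): Role.LISTENER,
    (Role.LISTENER, "start_video"): Role.PRESENTER,
    (Role.LISTENER, "start_audio"): Role.INTERMEDIATE,
    (Role.INTERMEDIATE, "speech_detected"): Role.PRESENTER,
    (Role.INTERMEDIATE, "detection_timeout"): Role.LISTENER,
    (Role.PRESENTER, "stop_all_collection"): Role.LISTENER,
}


def next_role(role: Role, event: str) -> Role:
    """Return the role after the event; unknown events leave it unchanged."""
    if event == "leave_call":
        return Role.NONE  # exiting is possible from any role
    return TRANSITIONS.get((role, event), role)


role = Role.NONE
for event in ["join_call", "start_audio", "speech_detected", "stop_audio", "leave_call"]:
    role = next_role(role, event)
    print(event, "->", role.value)
```

  • Note that `stop_audio` on its own is not in the table, so a presenter that stops only one kind of collection remains a presenter.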
  • a direct link may exist between the first device and the terminal device whose role is the speaker among other terminal devices, so that the first device can receive and play at least one of the audio or video content.
  • a direct link may exist between the first device and any of the other terminal devices, so that the first device records and sends at least one of audio or video , receiving and playing at least one of audio or video, and the role of this terminal device may be a listener, a presenter, or even an intermediate state.
  • the first device Since the first device has different roles and other terminal devices with direct links to the first device are also different, the first device can create and close the direct links in real time.
  • the first device when the role of the first device is changed from no role to a listener, the first device can create a connection with other terminal devices that have not established a direct link with the first device and whose role is corresponding to the presenter. Direct link.
  • the first device may create a direct link with a terminal device among other terminal devices that has not established a direct link with the first device.
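The role-dependent creation and closing of direct links described above can be sketched as a small reconciliation step. This is only an illustrative sketch: the names `Role`, `desired_peers` and `reconcile` are hypothetical, and two rules are assumptions rather than statements from the text, namely that a listener keeps direct links only with presenters, and that a device in the intermediate state is treated like a presenter.

```python
from enum import Enum


class Role(Enum):
    NONE = "none"                  # not participating in the voice call
    LISTENER = "listener"
    INTERMEDIATE = "intermediate"
    PRESENTER = "presenter"


def desired_peers(my_role, peer_roles):
    """Return the ids of the peers this device should hold a direct link with."""
    if my_role is Role.NONE:
        return set()
    if my_role is Role.LISTENER:
        # Assumption: a listener only needs links to presenters.
        return {p for p, r in peer_roles.items() if r is Role.PRESENTER}
    # Assumption: presenter and intermediate state link with every participant.
    return {p for p, r in peer_roles.items() if r is not Role.NONE}


def reconcile(my_role, peer_roles, existing_links):
    """Compare wanted links with existing ones; return (to_create, to_close)."""
    wanted = desired_peers(my_role, peer_roles)
    return wanted - existing_links, existing_links - wanted
```

Calling `reconcile` after every role change yields the links to open and the links to close in one pass, which matches the "create and close in real time" behaviour without tracking each transition separately.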
  • Figure 8 is a schematic diagram of the creation and closing of a direct link provided by an embodiment of the present application.
  • when the first device participates in the voice call, the first device can create a direct connection socket and send a UDP data packet to the direct connection hole punching server through that socket.
  • the direct connection hole punching server parses the UDP data packet to obtain the external network address of the first device, and sends the external network address of the first device to the first device, so that the first device obtains its external network address. In the same way, the second device can also obtain its external network address.
  • when the first device needs to create a direct link with the second device, the first device sends a direct connection request to the transit server, and the transit server can forward the direct connection request to the second device.
  • the direct connection request carries the external network address of the first device, and the second device obtains the external network address of the first device by parsing the direct connection request, and stores the external network address of the first device.
  • the second device sends a direct connection response to the direct connection request to the transit server, and the transit server can forward the direct connection response to the first device.
  • the direct connection response carries the external network address of the second device, and the first device obtains the external network address of the second device by parsing the direct connection response, and stores the external network address of the second device.
  • direct connection hole punching then begins between the first device and the second device.
  • based on the external network address of the second device, the first device sends multiple direct connection data packets to the second device.
  • Each time the second device receives a direct connection data packet it sends a reception response to the direct connection data packet to the first device.
  • the first device determines that a direct connection link has been successfully created with the second device.
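The signaling and hole punching steps above can be sketched as an in-memory simulation. It does not open real sockets: the `deliver` callback is a hypothetical stand-in for the network, and the names `Device`, `signal_via_transit` and `punch`, the probe count, and the example addresses are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    external_addr: tuple                      # (ip, port) learned from the hole punching server
    known_peers: dict = field(default_factory=dict)
    links: set = field(default_factory=set)


def signal_via_transit(first, second):
    """Exchange external addresses through the transit server."""
    # Direct connection request: carries the first device's external address.
    second.known_peers[first.name] = first.external_addr
    # Direct connection response: carries the second device's external address.
    first.known_peers[second.name] = second.external_addr


def punch(first, second, deliver, probes=5):
    """Send up to `probes` direct connection packets to the peer's external address.

    `deliver(addr, seq)` models the network and returns True when a reception
    response for probe `seq` comes back. The link counts as created on the
    first reception response.
    """
    target = first.known_peers[second.name]
    for seq in range(probes):
        if deliver(target, seq):
            first.links.add(second.name)
            return True
    return False
```

Sending several probes rather than one reflects the text's "multiple direct connection data packets": early packets may be dropped before the NAT mappings on both sides are open.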
  • the first device can send the multimedia resources of the first device to the second device through the direct link between the first device and the second device, so that the second device receives the multimedia resources of the first device.
  • the second device can also send the multimedia resources of the second device to the first device through the direct link between the first device and the second device, so that the first device receives the multimedia resources of the second device.
  • the multimedia resources here include audio resources and/or video resources.
  • when the first device needs to close the direct link with the second device, the first device sends a direct connection close request to the second device through the direct link between the first device and the second device. After receiving the direct connection close request, the second device sends a direct connection close response to the first device through the direct link between the first device and the second device, and determines to close the direct link between the first device and the second device. When receiving the direct connection close response, the first device determines to close the direct link between the first device and the second device.
  • Figure 9 is a schematic diagram of the transmission of multimedia resources provided by an embodiment of the present application.
  • the first device may send the multimedia resources of the first device to the second device through the direct link between the first device and the second device, so that the second device receives the multimedia resources of the first device through the direct link.
  • the first device can also send the multimedia resources of the first device to the transit server, and the transit server forwards the multimedia resources of the first device to the second device, so that the second device receives the multimedia resources of the first device through the transit link.
  • the multimedia resources here are sent in the form of streams, therefore, the multimedia resources correspond to data packets.
  • because the second device can receive the multimedia resources of the first device through both the direct link and the transit link, the second device may receive each data packet twice.
  • the second device uses the data packet received first and discards the data packet received later, thereby deduplicating the received data packet.
  • the utilization rate of direct data packets and the utilization rate of transit data packets can be obtained, and the utilization rate of direct data packets is sent to the first device, and the utilization rate of transit data packets is sent to the transit server.
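A minimal sketch of the first-arrival-wins deduplication and the two utilization rates follows. It assumes packets carry a sequence number and that the utilization rate of a path is the number of packets used from that path divided by the number received on it; the class name `Deduplicator` and the path labels are hypothetical.

```python
from collections import Counter


class Deduplicator:
    def __init__(self):
        self.seen = set()          # sequence numbers already used
        self.received = Counter()  # packets received, per path
        self.used = Counter()      # packets actually used, per path

    def on_packet(self, seq, path):
        """Return True if this packet is used, False if it is a late duplicate."""
        self.received[path] += 1
        if seq in self.seen:
            return False           # the other path delivered it first: discard
        self.seen.add(seq)
        self.used[path] += 1
        return True

    def utilization(self, path):
        """Fraction of packets received on `path` that were actually used."""
        total = self.received[path]
        return self.used[path] / total if total else 0.0
```

A path that usually delivers packets first ends up with a utilization close to 1, and a path that usually loses the race ends up close to 0, which is the signal the sender and the transit server act on.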
  • multimedia resources of the first device include audio resources and video resources.
  • This embodiment of the present application adopts different transmission strategies for audio resources and video resources.
  • the first device sends audio resources of the first device to the second device through a direct link.
  • the first device sends the audio resource of the first device to the transit server, and the transit server determines whether the transit data packet utilization of the second device is not greater than the first utilization threshold. If the transfer data packet utilization rate of the second device is not greater than the first utilization threshold, the transfer server stops sending the audio resources of the first device to the second device. If the transfer data packet utilization rate of the second device is greater than the first utilization threshold, the transfer server sends the audio resources of the first device to the second device. In this way, it is ensured that the second device can receive the audio resources of the first device, while reducing invalid transmission and saving downlink resources.
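The audio forwarding decision at the transit server can be sketched as follows. The class name `TransitServer`, the method names, and the threshold value used in the example are hypothetical; forwarding to a receiver that has not reported yet is an assumption of this sketch, not something the text specifies.

```python
class TransitServer:
    def __init__(self, first_threshold):
        self.first_threshold = first_threshold
        self.transit_utilization = {}   # receiver id -> last reported utilization

    def report(self, receiver, utilization):
        """Store the transit data packet utilization reported by a receiver."""
        self.transit_utilization[receiver] = utilization

    def audio_receivers(self, receivers):
        """Receivers that should still get forwarded audio.

        Audio is forwarded only while the reported utilization is greater than
        the first utilization threshold. Receivers with no report yet default
        to 1.0 (assumption) so that they are served until they report.
        """
        return [
            r for r in receivers
            if self.transit_utilization.get(r, 1.0) > self.first_threshold
        ]
```

For example, with `first_threshold=0.1`, a receiver reporting 0.05 stops receiving forwarded audio while one reporting 0.4 keeps receiving it. The text does not say how forwarding resumes once it has stopped; this sketch simply applies the latest report.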
  • For video resources, a strategy of transmitting over direct links as much as possible is adopted. Assume that Q terminal devices participate in the same voice call, the role of the first device m is presenter, the available uplink bandwidth of m is B, the video bit rate of m's video resources is b, the number of other terminal devices is Q-1, and direct links exist between the first device and the other terminal devices. Then, when B ≥ b*(Q-1), the first device sends the video resources of the first device to each other terminal device through the direct links. When B < b*(Q-1), the first device sorts the direct connection data packet utilization rates of the other terminal devices from large to small and determines the top T other terminal devices based on the sorting result.
  • on the one hand, the first device sends the video resources of the first device to the T other terminal devices through direct links; on the other hand, the first device sends the video resources of the first device to the transit server, and the transit server sends the video resources of the first device to the remaining Q-T-1 terminal devices.
  • the transfer server will also send the video resources of the first device to the terminal device.
  • each other terminal device participating in the voice call can receive the video resources of the first device through a direct link or a transit link.
  • the equipment cost and bandwidth cost of the transfer server can be reduced, and the transmission quality of video resources can be improved.
  • the direct link is fully utilized and the overall transmission quality of the direct link is improved.
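The video strategy can be sketched as a routing plan. The text does not say how T is chosen, so the rule used here is an assumption: T is the number of streams that fit in the uplink after reserving one stream for the upload to the transit server. The function name `plan_video_routes` and its parameters are hypothetical.

```python
def plan_video_routes(uplink_bandwidth, video_bitrate, direct_utilization):
    """Split the Q-1 peers into (direct, via_transit) lists.

    direct_utilization maps each peer id to its direct connection data packet
    utilization rate.
    """
    peers = list(direct_utilization)
    # B >= b*(Q-1): the uplink can carry one direct stream per peer.
    if uplink_bandwidth >= video_bitrate * len(peers):
        return peers, []
    # B < b*(Q-1): rank peers by direct utilization, largest first.
    ranked = sorted(peers, key=lambda p: direct_utilization[p], reverse=True)
    # Assumption: one stream of uplink is reserved for the transit server.
    t = max(0, int(uplink_bandwidth // video_bitrate) - 1)
    return ranked[:t], ranked[t:]
```

With four peers, a bit rate of 1 and an uplink of 3, the two peers with the highest direct utilization are served directly and the other two (Q-T-1 = 2) are served by the transit server.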
  • Figure 10 is a schematic structural diagram of a multimedia resource transmission device provided by an embodiment of the present application. It is provided in the first device. As shown in Figure 10, the device includes:
  • Determination module 1001, configured to determine, in response to the role of the first device during the voice call changing to presenter, the second device among at least one terminal device participating in the same voice call as the first device.
  • the presenter refers to a device used to output multimedia resources during the voice call;
  • Creation module 1002 configured to create a direct link between the first device and the second device
  • the sending module 1003 is configured to send the multimedia resources of the first device to the second device through a direct link
  • the sending module 1003 is also configured to send the multimedia resources of the first device to the server, so that the server forwards the multimedia resources of the first device to the second device.
  • the device further includes:
  • An acquisition module configured to acquire the audio resources collected by the first device in response to the device status of the first device changing to the audio resource collection status
  • the determination module 1001 is also configured to determine that the role of the first device during the voice call is changed to the presenter in response to detecting the object audio in the collected audio resources.
  • the determining module 1001 is also configured to determine that the role of the first device during the voice call is changed to the presenter in response to the device status of the first device changing to a video resource collection status.
  • the creation module 1002 is configured to send a direct connection request for the second device to the server, where the direct connection request carries the external network address of the first device; and receive a direct connection response forwarded by the server, where the direct connection response is response information to the direct connection request sent by the second device to the server.
  • the direct connection request is forwarded to the second device by the server.
  • the direct connection response carries the external network address of the second device; the external network address of the second device is extracted from the direct connection response.
  • based on the external network address of the first device and the external network address of the second device, a direct link between the first device and the second device is created.
  • the creation module 1002 is configured to send multiple direct connection data packets to the second device based on the external network address of the second device; and, upon receiving a reception response sent by the second device, determine that a direct link between the first device and the second device has been created, where the reception response refers to response information for each direct connection data packet fed back by the second device based on the external network address of the first device.
  • the multimedia resources include audio resources;
  • the server forwards the audio resources of the first device to the second device when the forwarding condition is met.
  • the forwarding condition means that the transit data packet utilization rate of the second device is greater than the first utilization threshold, where the transit data packet utilization rate of the second device represents the proportion of data packets forwarded by the server that the second device uses.
  • the server stops forwarding the audio resources of the first device to the second device when the forwarding condition is not met.
  • the sending module 1003 is configured to send the audio resources of the first device to the second device through the direct link; determine the video code rate of the video resources of the first device; and, when the first sending condition is satisfied, send the video resources of the first device to the second device through the direct link.
  • the first sending condition means that the available uplink bandwidth of the first device is not less than the product of the video code rate of the video resources of the first device and the reference quantity.
  • the sending module 1003 is configured to send the audio resources of the first device to the second device through the direct link; determine the video code rate of the video resources of the first device; and, when the second sending condition is satisfied, send the video resources of the first device to the second device through the direct link.
  • the second sending condition means that the available uplink bandwidth of the first device is less than the product of the video code rate of the video resources of the first device and the reference quantity, and the second device meets the transmission condition;
  • the sending module 1003 is also configured to send the audio resources of the first device to the server, so that the server forwards the audio resources of the first device to the second device; and, when the third sending condition is met, send the video resources of the first device to the server, so that the server forwards the video resources of the first device to the second device.
  • the third sending condition means that the available uplink bandwidth of the first device is less than the product of the video code rate of the video resources of the first device and the reference quantity, and the second device does not meet the transmission condition.
  • the second device meeting the transmission condition includes: the direct connection data packet utilization rate of the second device is not less than the second utilization threshold, where the direct connection data packet utilization rate of the second device represents the proportion of data packets sent over the direct link that the second device uses.
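The three sending conditions can be written compactly as follows. The symbols are shorthand introduced here for illustration: B is the available uplink bandwidth of the first device, b is the video code rate, N is the reference quantity, u_d is the direct connection data packet utilization rate of the second device, and θ_2 is the second utilization threshold.

```latex
\begin{aligned}
\text{first sending condition:}  &\quad B \ge b \cdot N \\
\text{second sending condition:} &\quad B < b \cdot N \ \text{and}\ u_{\mathrm{d}} \ge \theta_2 \\
\text{third sending condition:}  &\quad B < b \cdot N \ \text{and}\ u_{\mathrm{d}} < \theta_2
\end{aligned}
```

Under the first and second conditions the video resources go over the direct link; under the third they go through the server. The three conditions are mutually exclusive and cover every case, so each second device is assigned exactly one video path.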
  • the sending module 1003 is also configured to send, in response to the role of the first device changing during the voice call, a direct connection close request to the second device through the direct link;
  • the device further includes:
  • the receiving module is configured to receive a direct connection close response sent by the second device in response to the direct connection close request through the direct connection link, and close the direct connection link based on the direct connection close response.
  • the sending module 1003 is configured to, in response to the device status of the first device changing to a status of turning off multimedia resource collection, determine that the role of the first device during the voice call has changed to listener, and, if the role of the second device during the voice call is listener, send a direct connection close request to the second device through the direct link.
  • the listener refers to a device used to receive the multimedia resources output by the presenter during the voice call.
  • a direct connection close request is sent to the second device through the direct link, where no role means that the device does not participate in the voice call.
  • when the role of the first device is changed to presenter during the voice call, the second device is determined among at least one terminal device participating in the same voice call as the first device, and a direct link between the first device and the second device is created, which realizes the dynamic creation of the direct link.
  • the multimedia resources of the first device are sent to the second device through a direct link to reduce the transmission delay and packet loss rate of the multimedia resources.
  • the multimedia resources of the first device are sent to the server, so that the server forwards the multimedia resources of the first device to the second device, thereby improving the transmission quality of the multimedia resources.
  • FIG 11 is a schematic structural diagram of a multimedia resource transmission device provided by an embodiment of the present application. As shown in Figure 11, the device includes:
  • Creation module 1101, configured to create a direct link between the second device and the first device in response to receiving the direct connection request of the first device forwarded by the server.
  • the second device is determined among at least one terminal device participating in the same voice call as the first device when the role of the first device during the voice call changes to presenter.
  • the presenter refers to a device used to output multimedia resources during the voice call;
  • the receiving module 1102 is configured to receive the multimedia resources of the first device sent by the first device through the direct link;
  • the receiving module 1102 is also configured to receive the multimedia resources of the first device forwarded by the server.
  • the creation module 1101 is configured to receive a direct connection request for the second device forwarded by the server.
  • the direct connection request is sent by the first device to the server and carries the external network address of the first device; the creation module sends a direct connection response to the direct connection request to the server, where the direct connection response carries the external network address of the second device; and, based on the external network address of the first device and the external network address of the second device, creates a direct link between the second device and the first device.
  • the creation module 1101 is configured to send multiple direct connection data packets to the first device based on the external network address of the first device; and, upon receiving a reception response sent by the first device, determine that a direct link between the second device and the first device has been created, where the reception response refers to response information for each direct connection data packet fed back by the first device based on the external network address of the second device.
  • the receiving module 1102 is also configured to receive audio resources of the first device sent by the first device through the direct link; and receive video resources of the first device sent by the first device through the direct link, where the video resources are sent through the direct link when the first sending condition is met.
  • the first sending condition means that the available uplink bandwidth of the first device is not less than the product of the video code rate of the video resources of the first device and the reference quantity.
  • the receiving module 1102 is also configured to receive audio resources of the first device sent by the first device through the direct link; and receive video resources of the first device sent by the first device through the direct link.
  • the video resources are sent through the direct link when the second sending condition is met.
  • the second sending condition means that the available uplink bandwidth of the first device is less than the product of the video code rate of the video resources of the first device and the reference quantity, and the second device meets the transmission condition.
  • the receiving module 1102 is also configured to receive the audio resources of the first device forwarded by the server; and receive the video resources of the first device forwarded by the server.
  • the video resources are forwarded by the server when the third sending condition is met.
  • the third sending condition means that the available uplink bandwidth of the first device is less than the product of the video code rate of the video resources of the first device and the reference quantity, and the second device does not meet the transmission condition.
  • the second device meeting the transmission condition includes: the direct connection data packet utilization rate of the second device is not less than the second utilization threshold, where the direct connection data packet utilization rate of the second device represents the proportion of data packets sent over the direct link that the second device uses.
  • the receiving module 1102 is also configured to receive a direct connection close request sent by the first device through the direct link.
  • the direct connection close request is sent to the second device when the role of the first device changes during the voice call;
  • the device further includes:
  • the sending module is configured to send a direct connection close response to the direct connection close request to the first device through the direct connection link, and close the direct connection link based on the direct connection close response.
  • when the role of the first device is changed to presenter during the voice call, the second device is determined among at least one terminal device participating in the same voice call as the first device, and a direct link between the first device and the second device is created, which realizes the dynamic creation of the direct link.
  • the multimedia resources of the first device are sent to the second device through a direct link to reduce the transmission delay and packet loss rate of the multimedia resources.
  • the multimedia resources of the first device are sent to the server, so that the server forwards the multimedia resources of the first device to the second device, thereby improving the transmission quality of the multimedia resources.
  • Figure 12 shows a structural block diagram of a terminal device 1200 provided by an exemplary embodiment of the present application.
  • the terminal device 1200 includes: a processor 1201 and a memory 1202.
  • the processor 1201 may include one or more processing cores, such as a 4-core processor, an 8-core processor, etc.
  • the processor 1201 can be implemented in at least one hardware form among DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array).
  • the processor 1201 can also include a main processor and a co-processor.
  • the main processor is a processor used to process data in the wake-up state, also called a CPU (Central Processing Unit); the co-processor is a low-power processor used to process data in the standby state.
  • the processor 1201 may be integrated with a GPU (Graphics Processing Unit), and the GPU is responsible for rendering and drawing the content that needs to be displayed on the display screen.
  • the processor 1201 may also include an AI (Artificial Intelligence, artificial intelligence) processor, which is used to process computing operations related to machine learning.
  • Memory 1202 may include one or more computer-readable storage media, which may be non-transitory. Memory 1202 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash memory storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 1202 is used to store at least one computer program, and the at least one computer program is executed by the processor 1201 to implement the multimedia resource transmission method provided by the method embodiments in this application.
  • the terminal device 1200 optionally further includes: a peripheral device interface 1203 and at least one peripheral device.
  • the processor 1201, the memory 1202 and the peripheral device interface 1203 may be connected through a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 1203 through a bus, a signal line, or a circuit board.
  • the peripheral device includes: at least one of a radio frequency circuit 1204, a display screen 1205, a camera assembly 1206, and an audio circuit 1207.
  • the peripheral device interface 1203 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 1201 and the memory 1202 .
  • in some embodiments, the processor 1201, the memory 1202, and the peripheral device interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202, and the peripheral device interface 1203 can be implemented on separate chips or circuit boards, which is not limited in this embodiment.
  • the radio frequency circuit 1204 is used to receive and transmit RF (Radio Frequency, radio frequency) signals, also called electromagnetic signals. Radio frequency circuit 1204 communicates with communication networks and other communication devices through electromagnetic signals. The radio frequency circuit 1204 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 1204 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a user identity module card, and the like. Radio frequency circuitry 1204 can communicate with other terminals through at least one wireless communication protocol.
  • the wireless communication protocol includes but is not limited to: World Wide Web, metropolitan area network, intranet, mobile communication networks of all generations (2G, 3G, 4G and 5G), wireless LAN and/or WiFi (Wireless Fidelity, Wireless Fidelity) network.
  • the radio frequency circuit 1204 may also include NFC (Near Field Communication) related circuits, which is not limited in this application.
  • the display screen 1205 is used to display UI (User Interface, user interface).
  • the UI can include graphics, text, icons, videos, and any combination thereof.
  • when the display screen 1205 is a touch display screen, the display screen 1205 also has the ability to collect touch signals on or above the surface of the display screen 1205.
  • the touch signal can be input to the processor 1201 as a control signal for processing.
  • the display screen 1205 can also be used to provide virtual buttons and/or virtual keyboards, also called soft buttons and/or soft keyboards.
  • the display screen 1205 may be a flexible display screen, disposed on a curved surface or a folding surface of the terminal device 1200. The display screen 1205 can even be set in a non-rectangular irregular shape, that is, a special-shaped screen.
  • the display screen 1205 can be made of LCD (Liquid Crystal Display, liquid crystal display), OLED (Organic Light-Emitting Diode, organic light-emitting diode) and other materials.
  • the camera component 1206 is used to capture images or videos.
  • the camera assembly 1206 includes a front camera and a rear camera.
  • the front camera is set on the front panel of the terminal, and the rear camera is set on the back of the terminal.
  • there are at least two rear cameras, each of which is one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be combined to realize the background blur function.
  • camera assembly 1206 may also include a flash.
  • the flash can be a single color temperature flash or a dual color temperature flash. Dual color temperature flash refers to a combination of warm light flash and cold light flash, which can be used for light compensation under different color temperatures.
  • Audio circuitry 1207 may include a microphone and speakers.
  • the microphone is used to collect sound waves from the user and the environment, and convert the sound waves into electrical signals that are input to the processor 1201 for processing, or to the radio frequency circuit 1204 to implement voice communication. For the purpose of stereo collection or noise reduction, there may be multiple microphones, which are respectively arranged at different parts of the terminal device 1200 .
  • the microphone can also be an array microphone or an omnidirectional collection microphone.
  • the speaker is used to convert electrical signals from the processor 1201 or the radio frequency circuit 1204 into sound waves.
  • the loudspeaker can be a traditional membrane loudspeaker or a piezoelectric ceramic loudspeaker.
  • audio circuitry 1207 may also include a headphone jack.
  • the structure shown in FIG. 12 does not constitute a limitation on the terminal device 1200, which may include more or fewer components than shown, combine certain components, or adopt a different component arrangement.
  • FIG. 13 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • the server 1300 may vary greatly due to different configurations or performance, and may include one or more processors 1301 and one or more memories 1302, where at least one computer program is stored in the one or more memories 1302, and the at least one computer program is loaded and executed by the one or more processors 1301 to implement the multimedia resource transmission method provided by the above method embodiments.
  • the processor 1301 is the CPU.
  • the server 1300 may also have components such as wired or wireless network interfaces, keyboards, and input and output interfaces for input and output.
  • the server 1300 may also include other components for implementing device functions, which will not be described again here.
  • a computer-readable storage medium is also provided. At least one computer program is stored in the storage medium, and the at least one computer program is loaded and executed by a processor to enable an electronic device to implement any of the above multimedia resource transmission methods.
  • the above computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
  • a computer program or computer program product is also provided. At least one computer program is stored in the computer program or computer program product, and the at least one computer program is loaded and executed by a processor, so that an electronic device implements any of the above multimedia resource transmission methods.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Security & Cryptography (AREA)
  • Telephonic Communication Services (AREA)

Abstract

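A multimedia resource transmission method and apparatus, an electronic device, and a storage medium, belonging to the field of multimedia technology. The method includes: in response to the role of a first device during a voice call changing to presenter, the first device determines a second device among at least one terminal device participating in the same voice call as the first device, where the presenter refers to a device used to output multimedia resources during the voice call (201); the first device creates a direct link between the first device and the second device (202); the first device sends the multimedia resources of the first device to the second device through the direct link (203); and the first device sends the multimedia resources of the first device to a server, so that the server forwards the multimedia resources of the first device to the second device (204). By transmitting multimedia resources over two channels, the direct link and the server, the present application improves the transmission quality of the multimedia resources.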

Description

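Multimedia resource transmission method and apparatus, electronic device, and storage medium
This application claims priority to Chinese patent application No. 202210927973.4, filed on August 3, 2022 and entitled "Multimedia resource transmission method and apparatus, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present application relate to the field of multimedia technology, and in particular to a multimedia resource transmission method and apparatus, an electronic device, and a storage medium.
Background
With the continuous development of terminals and the Internet, more and more users choose to communicate using communication software. Common communication software includes instant messaging software, conferencing software, and the like. For such software, multimedia resources can be transmitted in the form of data streams between the devices participating in a voice call.
Summary
The present application provides a multimedia resource transmission method and apparatus, an electronic device, and a storage medium. The technical solution includes the following content.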
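In one aspect, a multimedia resource transmission method is provided, the method including:
in response to the role of a first device during a voice call changing to presenter, determining, by the first device, a second device among at least one terminal device participating in the same voice call as the first device, where the presenter refers to a device used to output multimedia resources during the voice call;
creating, by the first device, a direct link between the first device and the second device;
sending, by the first device, the multimedia resources of the first device to the second device through the direct link;
sending, by the first device, the multimedia resources of the first device to a server, so that the server forwards the multimedia resources of the first device to the second device.
In another aspect, a multimedia resource transmission method is provided, the method including:
in response to a direct connection request of a first device forwarded by a server, creating, by a second device, a direct link between the second device and the first device, where the second device is determined among at least one terminal device participating in the same voice call as the first device when the role of the first device during the voice call changes to presenter, and the presenter refers to a device used to output multimedia resources during the voice call;
receiving, by the second device through the direct link, the multimedia resources of the first device sent by the first device;
receiving, by the second device, the multimedia resources of the first device forwarded by the server.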
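In another aspect, a multimedia resource transmission apparatus is provided, disposed in a first device, the apparatus including:
a determining module, configured to, in response to the role of the first device in a voice call changing to presenter, determine a second device among at least one terminal device participating in the same voice call as the first device, the presenter being a device that outputs multimedia resources during the voice call;
a creating module, configured to create a direct link between the first device and the second device;
a sending module, configured to send the multimedia resources of the first device to the second device over the direct link;
the sending module being further configured to send the multimedia resources of the first device to a server, so that the server forwards the multimedia resources of the first device to the second device.
In another aspect, a multimedia resource transmission apparatus is provided, disposed in a second device, the apparatus including:
a creating module, configured to, in response to a direct-connect request of a first device forwarded by a server, create a direct link between the second device and the first device, the second device being determined, when the role of the first device in a voice call changes to presenter, among at least one terminal device participating in the same voice call as the first device, the presenter being a device that outputs multimedia resources during the voice call;
a receiving module, configured to receive, over the direct link, the multimedia resources of the first device sent by the first device;
the receiving module being further configured to receive the multimedia resources of the first device forwarded by the server.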
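In another aspect, a multimedia resource transmission system is provided, the system including a first device, a second device, and a server;
the first device is configured to perform the functions performed by the first device in the multimedia resource transmission method of the above aspects;
the second device is configured to perform the functions performed by the second device in the multimedia resource transmission method of the above aspects;
the server is configured to perform the functions performed by the server in the multimedia resource transmission method of the above aspects.
In another aspect, an electronic device is provided, including a processor and a memory, the memory storing at least one computer program that is loaded and executed by the processor so that the electronic device implements any of the multimedia resource transmission methods described above.
In another aspect, a computer-readable storage medium is further provided, storing at least one computer program that is loaded and executed by a processor so that an electronic device implements any of the multimedia resource transmission methods described above.
In another aspect, a computer program or computer program product is further provided, storing at least one computer program that is loaded and executed by a processor so that an electronic device implements any of the multimedia resource transmission methods described above.
The technical solution provided by the present application brings at least the following beneficial effects:
When the role of the first device in a voice call changes to presenter, the first device determines a second device among at least one terminal device participating in the same voice call and creates a direct link between the first device and the second device, so that direct links are created dynamically. On the one hand, the first device sends its multimedia resources to the second device over the direct link, which reduces transmission delay and packet loss. On the other hand, the first device also sends its multimedia resources to the server so that the server forwards them to the second device, which improves the transmission quality of the multimedia resources.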
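Brief Description of the Drawings
FIG. 1 is a schematic diagram of an implementation environment of a multimedia resource transmission method provided by an embodiment of the present application;
FIG. 2 is a flowchart of a multimedia resource transmission method provided by an embodiment of the present application;
FIG. 3 is a flowchart of a multimedia resource transmission method provided by an embodiment of the present application;
FIG. 4 is a flowchart of another multimedia resource transmission method provided by an embodiment of the present application;
FIG. 5 is a flowchart of yet another multimedia resource transmission method provided by an embodiment of the present application;
FIG. 6 is a flowchart of yet another multimedia resource transmission method provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of role switching provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of creating and closing a direct link provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of multimedia resource transmission provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a multimedia resource transmission apparatus provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of another multimedia resource transmission apparatus provided by an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a terminal device provided by an embodiment of the present application;
FIG. 13 is a schematic structural diagram of a server provided by an embodiment of the present application.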
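Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
In the related art, multimedia resources include audio resources. Any device participating in a voice call can have its audio resources forwarded to the other participating devices through a relay server. That is, the device first sends the audio resources to the relay server, and the relay server then forwards them to the other devices. Transmitting audio resources over a relay link in this way incurs considerable transmission delay and packet loss, which degrades the transmission quality of the audio resources.
FIG. 1 is a schematic diagram of an implementation environment of a multimedia resource transmission method provided by an embodiment of the present application. As shown in FIG. 1, the implementation environment includes at least two terminal devices and a server (the server includes the relay server mentioned below). When the terminal devices participate in the same voice call through the server, the multimedia resource transmission method of the embodiments of the present application can be performed by any of the terminal devices.
Each terminal device has a direct link to the server. The relay link between two terminal devices consists of the direct link between one terminal device and the server and the direct link between the other terminal device and the server. In other words, transmitting multimedia resources between two terminal devices over the relay link means that one terminal device sends the multimedia resources to the server over its direct link to the server, and the server forwards them to the other terminal device over its direct link to that device.
In the embodiments of the present application, any terminal device can send multimedia resources to the server, and the server can forward them to the other terminal devices, so that any terminal device can deliver multimedia resources to the other terminal devices over the relay link. For example, FIG. 1 shows terminal device 1 sending multimedia resources to the server and the server forwarding them to terminal device 2 and terminal device 3, so that terminal device 1 delivers multimedia resources to terminal devices 2 and 3 over the relay link.
In addition, any terminal device can have direct links to other terminal devices, so that it can send multimedia resources to them directly. For example, FIG. 1 shows terminal device 1 with direct links to terminal device 2 and terminal device 3, so that terminal device 1 can send multimedia resources directly to terminal devices 2 and 3.
The terminal device may be a smartphone, a game console, a desktop computer, a tablet computer, a laptop computer, a smart TV, a smart in-vehicle device, a smart voice interaction device, a smart home appliance, or the like. The server may be a single server, a server cluster composed of multiple servers, or either of a cloud computing platform and a virtualization center, which is not limited in the embodiments of the present application. The server may be connected to the terminal devices through a wired or wireless network. The server may provide functions such as data processing, data storage, and data transmission and reception, which are not limited in the embodiments of the present application. The numbers of terminal devices and servers are not limited and may each be one or more.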
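To facilitate understanding of the embodiments of the present application, the terms that may be involved are explained below.
Relay link: also called a Selective Forwarding Unit (SFU) relay link. SFU is a transmission architecture with a star topology, composed of a relay server and multiple terminal devices. Each terminal device sends the multimedia resources it wants to share to the relay server, and the relay server forwards them to the other terminal devices. During a voice call, multimedia resources transmitted between two terminal devices over the relay link must be forwarded by the relay server. A terminal device can both send multimedia resources to another terminal device and receive multimedia resources from it over the relay link. The relay link has a high establishment success rate and high availability, but its transmission cost is also relatively high. In the embodiments of the present application, the relay link is the basic transmission link.
Direct link: also called a wireless mesh network (Mesh) direct link. A mesh is a network structure formed by pairwise connections among multiple terminal devices, for example terminal devices A, B, and C connected pairwise. When terminal device A wants to share multimedia resources (such as audio or video resources), it sends them to terminal device B and terminal device C separately. Likewise, terminal device B sends its multimedia resources to terminal devices A and C separately, and so on. A mesh architecture requires every terminal device to establish a direct link with every other terminal device, which is complex and difficult. During a voice call, multimedia resources transmitted between two terminal devices over a direct link do not pass through the relay server (although network nodes such as routers and switches may still forward them). Because the direct link bypasses the relay server, its transmission cost is low and it can carry high-volume data (such as audio and video resources). In the embodiments of the present application, the direct link between two terminal devices is dynamically created and closed according to the roles of the terminal devices in the voice call, so the direct link is an auxiliary transmission link. When the relay link becomes congested, transmitting multimedia resources over the direct link can effectively improve transmission quality.
Network Address Translation (NAT): a technique that rewrites at least one of the source IP address, source port, destination IP address, and destination port of an Internet Protocol (IP) packet as it passes through a router or firewall. NAT alleviates the shortage of Internet Protocol version 4 (IPv4) addresses, and it also effectively blocks attacks from outside the network by hiding and protecting the computers inside it. Depending on the mapping and filtering rules, NAT is classified into full-cone, restricted, port-restricted, and symmetric types, among which the port-restricted and symmetric types have the highest security level, the strictest connectivity conditions, and the widest use.
NAT traversal: also called NAT hole punching. NAT maps an internal IPv4 address to an external address, and it filters packets arriving from the external network according to certain rules. Both behaviors complicate communication between two hosts located behind NATs. NAT traversal is used to break through the NAT barrier and establish a direct link between two hosts behind NATs.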
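In the field of multimedia technology, communication software is common, and at least two terminal devices on which the communication software is installed can participate in the same voice call. During the voice call, any terminal device can capture multimedia resources such as sound and images and send them to the other terminal devices participating in the voice call.
In the related art, a terminal device sends multimedia resources to the relay server, and the relay server forwards them to the other terminal devices, so that the terminal device delivers multimedia resources to the other terminal devices over the relay link. Because the multimedia resources must be forwarded by the relay server, this approach incurs considerable transmission delay and packet loss, resulting in poor transmission quality.
The embodiments of the present application provide a multimedia resource transmission method that can be used to solve the above problem and can be applied in the implementation environment described above.
Take the flowchart of a multimedia resource transmission method shown in FIG. 2 as an example. The method can be performed by any terminal device in FIG. 1 (for example, terminal device 1). Because the implementation environment involves at least two terminal devices, for ease of description the terminal device that performs the method is called the first device, and the terminal devices other than the first device are called the other terminal devices. As shown in FIG. 2, the method includes at least steps 201 to 204 below.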
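Step 201: In response to its role in the voice call changing to presenter, the first device determines a second device among at least one terminal device participating in the same voice call as the first device. The presenter is a device that outputs multimedia resources during the voice call.
Step 202: The first device creates a direct link between the first device and the second device.
Step 203: The first device sends its multimedia resources to the second device over the direct link.
Step 204: The first device sends its multimedia resources to the server, so that the server forwards them to the second device.
In the embodiments of the present application, the first device sends its multimedia resources to the second device over the direct link or through the server. The multimedia resources of the first device include at least one of audio resources and video resources, and audio and video resources can be sent in different ways. The embodiment of FIG. 3 below describes in detail how audio resources are sent, and the embodiment of FIG. 4 below describes in detail how video resources are sent.
Take the flowchart of a multimedia resource transmission method shown in FIG. 3 as an example. The method can be performed by any terminal device in FIG. 1 (for example, terminal device 1). Because the implementation environment involves at least two terminal devices, for ease of description the terminal device that performs the method is called the first device, and the terminal devices other than the first device are called the other terminal devices. As shown in FIG. 3, the method includes at least steps 301 to 304 below. The embodiment of FIG. 3 is described using the sending of audio resources as an example.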
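Step 301: In response to its role in the voice call changing to presenter, the first device determines a second device among at least one terminal device participating in the same voice call as the first device. The presenter is a device that outputs multimedia resources during the voice call.
In the embodiments of the present application, at least two terminal devices participate in the same voice call. These terminal devices can be divided into the first device and the other terminal devices, where the other terminal devices correspond to "at least one terminal device participating in the same voice call as the first device" in step 301. The first device and the other terminal devices have the same communication software installed, through which they can participate in the same voice call. For example, when the communication software is instant messaging software, the first device and another terminal device can participate in a two-party voice call. When the communication software is conferencing software, the first device and the other terminal devices can participate in an audio/video conference, in which individuals or groups in two or more locations exchange resources such as sound, images, and files over the transmission links between terminal devices and thereby interact in real time.
The embodiments of the present application use an adaptive role switching strategy to assign each terminal device participating in the voice call its role in the call. The role of a terminal device in the voice call depends on its device state. Taking the first device as an example, as the device state of the first device changes, its role in the voice call may change to presenter. The presenter is one of the roles of a terminal device in a voice call and is responsible for outputting multimedia resources, while the other terminal devices receive the multimedia resources output by the presenter. The multimedia resources here include, but are not limited to, at least one of text, images, audio, video, and files. When the multimedia resources include both audio and video resources, they may be called audio/video resources. Multiple presenters may exist at the same time during a voice call.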
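The role of the first device in the voice call changes to presenter in at least the following case 11 and case 12.
Case 11: Before step 301, the method further includes: in response to its device state changing to the video capture state, the first device determines that its role in the voice call has changed to presenter.
When the first device turns on video capture, for example by starting screen sharing (the first device can then capture what is on the screen) or by turning on the camera (the first device can then capture what the camera records), its device state changes from the video-capture-off state to the video capture state. In one possible implementation, the server detects this change information in real time, or the first device sends the change information to the server, the change information indicating that the device state of the first device has changed from the video-capture-off state to the video capture state. Based on the change information, the server determines that the role of the first device in the voice call has changed to presenter and sends a corresponding notification to the first device. On receiving the notification, the first device determines that its role in the voice call has changed to presenter. For ease of description, the role of the first device in the voice call is referred to below simply as the role of the first device.
Case 12: Before step 301, the method further includes: in response to its device state changing to the audio capture state, the first device obtains the audio resources it captures; and in response to object audio being detected in the captured audio resources, the first device determines that its role in the voice call has changed to presenter.
When the first device turns on audio capture, for example by turning on the microphone (the first device can then capture a user's speech, music played by a music player, and so on), its device state changes from the audio-capture-off state to the audio capture state. In one possible implementation, the server detects this change information in real time, or the first device sends the change information to the server, the change information indicating that the device state of the first device has changed from the audio-capture-off state to the audio capture state. Based on the change information, the server first determines that the role of the first device has changed to the intermediate state and sends a corresponding notification to the first device. On receiving the notification, the first device determines that its role in the voice call has changed to the intermediate state. The intermediate state is one of the roles of a terminal device in a voice call. A terminal device in the intermediate state can receive multimedia resources sent by other terminal devices whose role is presenter, and it can capture audio resources, but the captured audio resources are not yet sent to the terminal devices whose role is presenter or listener. The intermediate state can change to presenter or fall back to listener.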
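Once its role has changed to the intermediate state, the first device can capture audio resources in that state. In one possible implementation, the first device performs Voice Activity Detection (VAD) on the audio resources and sends the detection result to the server. If the detection result indicates that object audio is present in the audio resources, the server determines that the role of the first device changes from the intermediate state to presenter and notifies the first device, which on receiving the notification determines that its role in the voice call has changed to presenter. If the detection result indicates that no object audio is present, the server determines that the role of the first device changes from the intermediate state to listener and notifies the first device, which on receiving the notification determines that its role in the voice call has changed to listener. The listener is one of the roles of a terminal device in a voice call and is responsible for receiving and playing the multimedia resources output by the presenter. During a voice call, a listener can become a presenter and a presenter can become a listener.
In another possible implementation, the first device sends the audio resources to the server, and the server performs voice activity detection on them. When object audio is detected in the audio resources, the server determines that the role of the first device changes from the intermediate state to presenter and notifies the first device accordingly. When no object audio is detected, the server determines that the role of the first device changes from the intermediate state to listener and notifies the first device accordingly.
In the embodiments of the present application, when the device state of the first device changes to the audio capture state, the role of the first device is first changed to the intermediate state, and the audio resources captured in the intermediate state are then checked. Only after object audio is detected is the role of the first device changed to presenter. As a result, when a user turns on the microphone (possibly by mistake) without speaking, the role of the first device is not changed to presenter, which avoids to some extent establishing unnecessary direct links between the first device and the other terminal devices.
It should be noted that before changing to presenter, the role of the first device may be listener, no role, or the intermediate state. Before changing to the intermediate state, the role of the first device may be listener or no role. The description above covers the situation in which the role of the first device first changes to the intermediate state and then to presenter or listener. The following briefly describes other situations in which the role of the first device changes to listener (case 13 and case 14) and the situation in which it changes to no role (case 15). No role refers to a device that is not participating in the voice call. When the device state of the first device is the not-in-call state, the first device is not participating in the voice call and therefore cannot play any role in it, so its role is no role.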
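Case 13: When the device state of the first device changes from the not-in-call state to the in-call state, the role of the first device is determined to change from no role to listener.
The first device may create and join a voice call, in which case its device state changes from the not-in-call state to the in-call state. Alternatively, when the server is participating in a voice call, a first device that is not in the call may send the server a request to join, and when the first device receives the server's response to the request, it joins the call and its device state changes from the not-in-call state to the in-call state. Alternatively, when other terminal devices are participating in a voice call, one of them may send an invitation to a first device that is not in the call, and when the first device accepts the invitation, it joins the call and its device state changes from the not-in-call state to the in-call state.
When the device state of the first device changes from the not-in-call state to the in-call state, the server detects this change information in real time, or the first device sends the change information to the server. Based on the change information, the server determines that the role of the first device in the voice call changes from no role to listener and notifies the first device accordingly.
Case 14: When the device state of the first device changes from the multimedia capture state to the multimedia-capture-off state, the role of the first device is determined to change to listener. The multimedia capture state here includes at least one of the audio capture state and the video capture state. The multimedia-capture-off state here includes both the audio-capture-off state and the video-capture-off state.
When the device state of the first device changes from the multimedia capture state to the multimedia-capture-off state, the server detects this change information in real time, or the first device sends the change information to the server. If the server finds that the device state of the first device remains in the multimedia-capture-off state for a period of time (for example, 1 minute), the server determines that the role of the first device changes from presenter or the intermediate state to listener and notifies the first device accordingly.
Case 15: When the device state of the first device changes from the in-call state to the not-in-call state, the role of the first device is determined to change to no role.
When the first device requests to leave the voice call, its device state changes from the in-call state to the not-in-call state. The server detects this change information in real time, or the first device sends the change information to the server. The server then determines that the role of the first device changes from presenter, the intermediate state, or listener to no role and notifies the first device accordingly. On receiving the notification, the first device leaves the voice call.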
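It should be noted that when the role of the first device in the voice call is presenter, the first device is a producer of multimedia resources: it captures multimedia resources, encodes them, and transmits the encoded multimedia resources to the other terminal devices. The first device is also a receiver of multimedia resources: it can receive multimedia resources sent by other terminal devices whose role is presenter, decode them, and play the decoded multimedia resources.
When the role of the first device in the voice call is listener, the first device is a receiver of multimedia resources and mainly receives the multimedia resources sent by terminal devices whose role is presenter. Because multimedia resources are usually encoded before transmission, a terminal device that receives them first decodes them and then plays the decoded multimedia resources.
When the role of the first device in the voice call is the intermediate state, the first device can receive multimedia resources sent by terminal devices whose role is presenter, decode them, and play the decoded multimedia resources. In addition, because audio capture is turned on, the first device can capture audio resources, but these audio resources are not yet sent to the terminal devices whose role is presenter or listener.
When the role of the first device is no role, the first device is not participating in the voice call, so it neither outputs multimedia resources nor receives multimedia resources output by other devices.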
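The role transitions of cases 11 to 15 can be summarized as a small state machine. The following is a minimal sketch only: the event names (`join_call`, `video_on`, `audio_on`, `voice_detected`, `no_voice`, `capture_off`, `leave_call`) and the function `next_role` are hypothetical labels introduced for illustration, and the sketch ignores the waiting period that the server may apply before demoting a presenter to listener.

```python
from enum import Enum


class Role(Enum):
    NONE = "no role"
    LISTENER = "listener"
    INTERMEDIATE = "intermediate"
    PRESENTER = "presenter"


# (current role, event) -> next role; event names are illustrative assumptions.
_TRANSITIONS = {
    (Role.NONE, "join_call"): Role.LISTENER,                 # case 13
    (Role.LISTENER, "video_on"): Role.PRESENTER,             # case 11
    (Role.INTERMEDIATE, "video_on"): Role.PRESENTER,         # case 11
    (Role.LISTENER, "audio_on"): Role.INTERMEDIATE,          # case 12, microphone opened
    (Role.INTERMEDIATE, "voice_detected"): Role.PRESENTER,   # case 12, VAD positive
    (Role.INTERMEDIATE, "no_voice"): Role.LISTENER,          # VAD negative, fall back
    (Role.PRESENTER, "capture_off"): Role.LISTENER,          # case 14
    (Role.INTERMEDIATE, "capture_off"): Role.LISTENER,       # case 14
}


def next_role(role: Role, event: str) -> Role:
    """Return the role after `event`; unknown pairs leave the role unchanged."""
    if event == "leave_call":
        return Role.NONE                                     # case 15, from any role
    return _TRANSITIONS.get((role, event), role)


if __name__ == "__main__":
    role = Role.NONE
    for event in ["join_call", "audio_on", "voice_detected", "capture_off"]:
        role = next_role(role, event)
        print(event, "->", role.value)
```

Keeping the transitions in a table makes it easy to see that opening the microphone alone never yields the presenter role; only a positive voice activity detection does.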
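The numbers of terminal devices acting as presenters and as listeners in a voice call differ by application scenario. In a two-party voice call, two terminal devices are usually involved, either of which may be a presenter or a listener, and each can switch between presenter and listener in real time, so a terminal device that is a presenter at one moment may be a listener at the next. In a remote conference, several or even more than ten terminal devices are usually involved, some of which may remain listeners throughout, while others switch between presenter and listener in real time. In an online classroom, dozens or even hundreds of terminal devices may be involved, most of which remain listeners throughout while a few remain presenters throughout. For example, the teacher's terminal device is always a presenter and the students' terminal devices are always listeners.
In the embodiments of the present application, when the role of the first device in the voice call changes to presenter, the first device determines a second device among the other terminal devices. There is at least one second device, and the role of a second device may be presenter, listener, or the intermediate state. The role of a second device changes in the same way as the role of the first device. See the description above of the role changes of the first device (cases 11 to 15), which is not repeated here.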
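Step 302: The first device creates a direct link between the first device and the second device.
It should be noted that, for any second device, if a direct link already exists between that second device and the first device, the first device does not need to create one again. For that second device, the first device can skip step 302 and perform steps 303 and 304 directly. If no direct link exists between that second device and the first device, the first device needs to create one, that is, it performs step 302 for that second device.
In the embodiments of the present application, when the device state of the first device changes to the in-call state, the first device creates a direct-connect User Datagram Protocol (UDP) socket, referred to below as the direct-connect socket. The direct-connect socket is a data structure that stores the state information of all direct links of the first device. The direct links here include the direct link between the first device and the server (the server includes at least one of a hole-punching server and a relay server, where the hole-punching server can determine the external address of a terminal device and report it back to the terminal device). When resources are transmitted between the first device and the server, the direct link between them is determined based on the direct-connect socket, and the resources are transmitted over that link.
Similarly, when the device state of the second device changes to the in-call state, the second device also creates a direct-connect socket. After the first device and the second device have created their respective direct-connect sockets, the direct link between the first device and the second device is created based on these sockets.
In one possible implementation, the first device creating a direct link between the first device and the second device includes: the first device sends the server a direct-connect request for the second device, the direct-connect request carrying the external address of the first device; and the first device receives a direct-connect response forwarded by the server, the direct-connect response being the response that the second device sends to the server for the direct-connect request after the server has forwarded the request to the second device, and carrying the external address of the second device. The first device extracts the external address of the second device from the direct-connect response and creates the direct link between the first device and the second device based on the external addresses of the first device and the second device.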
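Based on its direct-connect socket, the first device determines the direct link between itself and the hole-punching server and sends a UDP packet to the hole-punching server over that link. Optionally, the first device generates an original UDP packet and sends it to a router. The router fills in the external address of the first device to obtain a target UDP packet and sends the target UDP packet to the hole-punching server. The external address of the first device includes the Internet Protocol address (IP address) of the first device, which locates the first device, and the port of the first device, which locates the application on the first device.
After receiving the target UDP packet, the hole-punching server parses it to obtain the external address of the first device and sends that address to the first device. The first device can thus obtain and store its own external address. In the same way, the second device can obtain and store its own external address.
The first device sends the relay server a direct-connect request for the second device. The request carries the external address of the first device, the identification information of the first device (as the source device), and the identification information of the second device (as the target device). After receiving the request, the relay server parses it to obtain the identification information of the second device and forwards the request to the second device accordingly. It should be noted that the direct-connect request forwarded to the second device includes at least the identification information and the external address of the first device, and may or may not include the identification information of the second device.
After receiving the direct-connect request, the second device parses it to obtain the identification information and the external address of the first device and stores them. The second device also sends the relay server a direct-connect response to the request, which carries the external address of the second device, the identification information of the first device (as the target device), and the identification information of the second device (as the source device). After receiving the response, the relay server parses it to obtain the identification information of the first device and forwards the direct-connect response to the first device accordingly. It should be noted that the direct-connect response forwarded to the first device includes at least the identification information and the external address of the second device, and may or may not include the identification information of the first device.
After receiving the direct-connect response, the first device parses it to obtain the identification information and the external address of the second device and stores them. At this point, the first device and the second device each hold the other's external address. Once both devices have obtained each other's external address, the direct link between the first device and the second device is, in effect, created.
For example, first device m sends the relay server a direct-connect request for second device n, carrying (IP_m, port_m), id(m), and id(n). (IP_m, port_m) is the external address of first device m, where IP_m is its IP address and port_m is its port, id(m) is the identification information of first device m, and id(n) is the identification information of second device n. After the relay server forwards the request to second device n, second device n parses out and stores id(m) and (IP_m, port_m), and sends the relay server a direct-connect response for first device m, carrying (IP_n, port_n), id(m), and id(n). (IP_n, port_n) is the external address of second device n, where IP_n is its IP address and port_n is its port. After the relay server forwards the response to first device m, first device m parses out and stores id(n) and (IP_n, port_n).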
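The request/response exchange through the relay server can be sketched with an in-memory model. This is an illustration under stated assumptions rather than the actual signalling format: `DirectMessage`, `RelayServer`, `Device`, and their fields are hypothetical names, the relay is modelled as per-device inboxes, and the external addresses are taken as already learned from the hole-punching server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectMessage:
    kind: str           # "request" or "response"
    source_id: str      # id of the sending device
    target_id: str      # id of the device the relay should forward to
    source_addr: tuple  # (ip, port) external address of the sender


class RelayServer:
    """Forwards signalling messages by target id (in-memory stand-in)."""

    def __init__(self):
        self.inboxes = {}

    def register(self, device_id):
        self.inboxes[device_id] = []

    def forward(self, msg):
        self.inboxes[msg.target_id].append(msg)


class Device:
    def __init__(self, device_id, external_addr, relay):
        self.id = device_id
        self.addr = external_addr   # assumed already reported by the hole-punching server
        self.relay = relay
        self.peers = {}             # peer id -> peer external address
        relay.register(device_id)

    def request_direct(self, target_id):
        self.relay.forward(DirectMessage("request", self.id, target_id, self.addr))

    def poll(self):
        """Process forwarded messages: store the peer address, answer requests."""
        inbox = self.relay.inboxes[self.id]
        while inbox:
            msg = inbox.pop(0)
            self.peers[msg.source_id] = msg.source_addr
            if msg.kind == "request":
                self.relay.forward(
                    DirectMessage("response", self.id, msg.source_id, self.addr))


if __name__ == "__main__":
    relay = RelayServer()
    m = Device("m", ("203.0.113.5", 40001), relay)
    n = Device("n", ("198.51.100.7", 40002), relay)
    m.request_direct("n")
    n.poll()   # n stores m's address and replies
    m.poll()   # m stores n's address
    print(m.peers, n.peers)
```

After the two `poll` calls each side holds the other's external address, which is the precondition for the probing described next.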
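Optionally, the first device creating the direct link between the first device and the second device based on their external addresses includes: sending multiple direct-connect packets to the second device based on the external address of the second device; and, when reception responses sent by the second device are received, determining that the direct link between the first device and the second device has been created, a reception response being the response to a direct-connect packet that the second device returns based on the external address of the first device.
The first device can send multiple direct-connect packets to the second device based on the external address of the second device. These direct-connect packets are the UDP packets mentioned above. Each time the second device receives one or more direct-connect packets, it returns to the first device, based on the external address of the first device, a reception response for those packets. When the first device has received the reception responses for all direct-connect packets, the external address of the second device that the first device obtained is correct and the second device has also successfully received the external address of the first device, so the first device determines that the direct link between the first device and the second device has been successfully created.
In the same way, the second device can send multiple direct-connect packets to the first device based on the external address of the first device. Each time the first device receives one or more direct-connect packets, it returns to the second device, based on the external address of the second device, a reception response for those packets. When the second device has received the reception responses for all direct-connect packets, the external address of the first device that the second device obtained is correct and the first device has also successfully received the external address of the second device, so the second device determines that the direct link between the second device and the first device has been successfully created.
The process by which the first device and the second device each confirm that the direct link between them has been successfully created is also called direct-connect hole punching. Hole punching tests whether the direct link between the first device and the second device has been successfully created and how stable it is, which improves the transmission success rate when multimedia resources are later transmitted over the direct link.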
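The probe-and-acknowledge check can be sketched with two UDP sockets on the loopback interface. This is a simplified illustration, not the patented procedure: there is no NAT between the sockets, the `PROBE`/`ACK` payloads and the function names are hypothetical, and both endpoints are driven from one thread so that the example stays self-contained.

```python
import socket


def open_direct_socket():
    """Create a UDP socket bound to an ephemeral loopback port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(1.0)
    return sock


def punch(local, peer, peer_addr, probes=3):
    """Send `probes` direct-connect packets to `peer_addr` and count the replies.

    `peer` plays the remote device: it answers every probe towards the source
    address it observed. Returns True only if every probe was acknowledged.
    """
    acked = 0
    for seq in range(probes):
        local.sendto(b"PROBE %d" % seq, peer_addr)
        try:
            data, sender = peer.recvfrom(64)                     # remote side receives
            peer.sendto(data.replace(b"PROBE", b"ACK"), sender)  # remote side replies
            reply, _ = local.recvfrom(64)                        # local side gets the ack
        except socket.timeout:
            continue                                             # lost packet: not acked
        if reply == b"ACK %d" % seq:
            acked += 1
    return acked == probes


if __name__ == "__main__":
    a, b = open_direct_socket(), open_direct_socket()
    try:
        print("direct link created:", punch(a, b, b.getsockname()))
    finally:
        a.close()
        b.close()
```

In a real deployment the two endpoints run on different hosts behind NATs and both sides probe each other, so the result also depends on the NAT types involved.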
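Step 303: The first device sends its audio resources to the second device over the direct link.
In the embodiments of the present application, when the role of the first device is presenter, the first device captures its audio resources and sends them to the second device over the direct link. There is at least one second device, and the first device sends its audio resources to each second device over the corresponding direct link. Transmission over direct links lets every second device that has a direct link to the first device obtain the audio resources of the first device quickly and accurately, making full use of the direct links to reduce the transmission delay and packet loss of the audio resources.
Step 304: The first device sends its audio resources to the server, so that the server forwards them to the second device. FIG. 3 shows step 304 performed after step 303. Optionally, step 304 may also be performed before step 303 or in parallel with it, which is not limited here.
When the first device captures its audio resources, it sends them to the server so that the server forwards them to the second device. With steps 303 and 304, the second device can receive the audio resources of the first device over both the direct link and the relay link, which reduces the cases in which the second device cannot receive the audio resources because one of the links is congested and thus improves the transmission quality of the audio resources.
In the embodiments of the present application, the server may forward the audio resources of the first device to the second device unconditionally during a set time period. After the set time period, the server forwards them conditionally: optionally, the server forwards the audio resources of the first device to the second device when a forwarding condition is met and stops forwarding them when the forwarding condition is not met. The server determines, for each second device, whether the forwarding condition is met. If it is, the server forwards the audio resources of the first device to that second device. If it is not, the server stops forwarding the audio resources of the first device to that second device.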
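Optionally, the forwarding condition is that the relay packet utilization of the second device is greater than a first utilization threshold. The relay packet utilization of the second device is the proportion of packets used by the second device that were forwarded by the server, that is, the ratio of the number of server-forwarded packets used by the second device to the total number of packets used by the second device. The total number of packets used by the second device equals the number of server-forwarded packets it used plus the number of packets it used that were sent over the direct link.
The first device sends its audio resources to the server. After receiving them, for any second device, if the relay packet utilization of that second device with respect to the first device is greater than the first utilization threshold, the server forwards the audio resources of the first device to that second device.
It should be noted that the audio resources of the first device are sent continuously as a stream to the server and to the second device, and the server likewise forwards them continuously as a stream to the second device. When the audio resources of the first device are transmitted as a stream, the packets of the first device carry those audio resources.
The second device continuously receives packets sent by the first device over the direct link and packets forwarded by the server over the relay link. For each packet it receives, the second device obtains the packet's sequence number. If the second device has already received a packet with this sequence number, it discards the packet. Otherwise, it uses the packet, that is, it parses the packet and plays the corresponding audio resources. In this way, the second device deduplicates the packets received over the direct link and the relay link. By counting the packets it used that were received over the relay link and the packets it used that were received over the direct link, the second device obtains its relay packet utilization with respect to the first device. The packets used by the second device that were received over the relay link are the packets used by the second device that were forwarded by the server.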
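Optionally, the relay packet utilization of the second device with respect to the first device equals the number of packets used by the second device that were received over the relay link divided by the total number of packets used by the second device, where the total equals the number of used packets received over the relay link plus the number of used packets received over the direct link. That is, the relay packet utilization of the second device with respect to the first device satisfies formula (1): $$U^{\mathrm{relay}}_{ij} = \frac{N^{\mathrm{relay}}_{ij}}{N^{\mathrm{relay}}_{ij} + N^{\mathrm{direct}}_{ij}} \tag{1}$$
Here $U^{\mathrm{relay}}_{ij}$ denotes the relay packet utilization of second device j with respect to first device i, and the packets in formula (1) are the packets sent by first device i: $N^{\mathrm{relay}}_{ij}$ is the number of packets used by second device j that were received over the relay link, and $N^{\mathrm{direct}}_{ij}$ is the number of packets used by second device j that were received over the direct link.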
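The deduplication by sequence number and the resulting utilization ratios can be sketched as follows. The class `DedupReceiver` and the path labels `"relay"` and `"direct"` are names chosen for this illustration, and returning `None` before any packet has been used is an assumption, since the ratio is undefined in that case.

```python
class DedupReceiver:
    """Tracks which path delivered each packet first (per sending device)."""

    def __init__(self):
        self.seen = set()                          # sequence numbers already used
        self.used = {"relay": 0, "direct": 0}      # packets used, per path

    def on_packet(self, seq, path):
        """Return True if the packet is used, False if it is a duplicate."""
        if seq in self.seen:
            return False                           # second copy: discard
        self.seen.add(seq)
        self.used[path] += 1
        return True

    def utilization(self, path):
        """Share of used packets that arrived first over `path`."""
        total = self.used["relay"] + self.used["direct"]
        if total == 0:
            return None                            # undefined before any packet is used
        return self.used[path] / total


if __name__ == "__main__":
    rx = DedupReceiver()
    arrivals = [(1, "direct"), (1, "relay"), (2, "relay"),
                (2, "direct"), (3, "direct"), (4, "direct")]
    for seq, path in arrivals:
        rx.on_packet(seq, path)
    print(rx.utilization("relay"), rx.utilization("direct"))
```

Because every used packet is counted exactly once, the two ratios always sum to 1 once at least one packet has been used.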
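The second device can send its relay packet utilization with respect to the first device to the server, so that the server decides based on it whether to forward the packets of the first device to the second device.
Optionally, the server first forwards the packets of the first device to the second device so that the second device can compute its relay packet utilization with respect to the first device over the set time period and send it to the server. After that, if the server determines that the relay packet utilization of the second device is greater than the first utilization threshold, it forwards the packets of the first device to that second device. If the server determines that the relay packet utilization of the second device is not greater than the first utilization threshold, it does not forward the packets of the first device to that second device.
As described above, the first device sends its audio resources to the second device using a dual-send strategy over both the relay link and the direct link. A relay packet utilization not greater than the first utilization threshold indicates that the second device will most likely use the packets received over the direct link (which also indicates that the direct link has high transmission quality and stability) and will most likely discard the packets forwarded by the server, so the packets sent by the server are most likely wasted. When the relay packet utilization of the second device is not greater than the first utilization threshold, the server stops forwarding the packets of the first device to the second device, which reduces wasted transmission and saves downlink resources.
The relay packet utilization of the second device is greater than or equal to 0 and less than or equal to 1. The first utilization threshold is an adjustable coefficient. Denoting it by α, 0≤α≤1.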
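If α=0, the relay packet utilization of the second device is greater than or equal to α.
In this case, when the relay packet utilization of the second device is greater than α, the server forwards the packets of the first device to the second device. When it equals α, the server may or may not forward the packets of the first device to the second device. In one possible implementation, the server forwards the audio resources of the first device to the second device regardless of whether the relay packet utilization is greater than or equal to α.
If α=1, the relay packet utilization of the second device is less than or equal to α.
In this case, when the relay packet utilization of the second device is less than α, the server does not forward the packets of the first device to the second device. When it equals α, the server may or may not forward the packets of the first device to the second device. In one possible implementation, the server does not forward the audio resources of the first device to the second device regardless of whether the relay packet utilization is less than or equal to α.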
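The server-side forwarding decision can be sketched as a single predicate. This is one possible reading, with several assumptions made explicit: the function name, the `warmup_s` default, and the `has_direct_link` flag are hypothetical, the comparison at equality is resolved as "do not forward", and receivers without a direct link are always served so that they are never cut off even when α=1.

```python
def should_forward(relay_utilization, alpha, elapsed_s,
                   has_direct_link=True, warmup_s=5.0):
    """Decide whether the relay server forwards a sender's audio to one receiver.

    relay_utilization: value reported by the receiver, or None if not yet reported
    alpha:             first utilization threshold, 0 <= alpha <= 1
    elapsed_s:         seconds since forwarding to this receiver started
    warmup_s:          length of the unconditional forwarding period (assumed value)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be within [0, 1]")
    if not has_direct_link:
        return True                    # the relay is the receiver's only path
    if elapsed_s < warmup_s or relay_utilization is None:
        return True                    # unconditional forwarding during the set period
    return relay_utilization > alpha   # strict comparison: equality stops forwarding


if __name__ == "__main__":
    print(should_forward(0.10, alpha=0.2, elapsed_s=30))   # direct link is doing the work
    print(should_forward(0.60, alpha=0.2, elapsed_s=30))   # relay copies are still needed
    print(should_forward(None, alpha=0.2, elapsed_s=1))    # still in the warm-up period
```

A deployment that prefers the other behavior at equality (for example, always forwarding when α=0) would replace the strict comparison with `>=`.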
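It should be noted that among the other terminal devices participating in the same voice call as the first device, there may be terminal devices that have no direct link to the first device. The first device cannot send its audio resources to these terminal devices over a direct link and can only send them over the relay link. The relay packet utilization of these terminal devices with respect to the first device is therefore 1, which is greater than the first utilization threshold, so the server keeps sending the packets of the first device to them. This ensures that terminal devices without a direct link to the first device can still receive its audio resources.
In summary, any terminal device participating in the same voice call as the first device either has a direct link to the first device, in which case it can be a second device as described above, or has no direct link to the first device. Denoting the first device by i and any terminal device by j, whether a direct link has been established between first device i and terminal device j can be defined by formula (2): $$C_{ij} = \begin{cases} 1, & \text{a direct link is established between } i \text{ and } j \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
Here $C_{ij}$ may also be called the direct reachability. $C_{ij}=1$ indicates that a direct link has been established between first device i and terminal device j, and $C_{ij}=0$ indicates that no direct link has been established between them.
For audio resources, the dual-send strategy over the direct link and the relay link is used. When first device i sends its audio resources to terminal device j, it transmits them over the relay link and also checks the direct reachability to terminal device j. If $C_{ij}=1$, first device i also transmits the audio resources to terminal device j over the direct link. If $C_{ij}=0$, first device i cannot use a direct link to terminal device j. The dual-send strategy makes full use of the link resources and improves the transmission quality of the audio resources.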
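In one possible implementation, the multimedia resources further include video resources. Referring to FIG. 4, which is a flowchart of another multimedia resource transmission method provided by an embodiment of the present application, the method further includes step 305. FIG. 4 shows step 305 performed after step 302. Optionally, step 305 may also be performed after step 303 or step 304, which is not limited here.
Step 305: The first device determines the video bit rate of its video resources and, when a first sending condition is met, sends its video resources to the second device over the direct link. The first sending condition is that the available uplink bandwidth of the first device is not less than the product of the video bit rate of its video resources and a reference quantity.
In the embodiments of the present application, when the role of the first device is presenter, the first device can capture its video resources. After obtaining them, the first device determines their video bit rate, which is the number of bits transmitted per unit time when the video resources are transmitted. The video bit rate is usually measured in kilobits per second (kbps) or megabits per second (Mbps).
The first device calculates the product of the video bit rate of its video resources and the reference quantity. The reference quantity may be a set value, for example the number of terminal devices participating in the voice call or the number of second devices. In the embodiments of the present application, any terminal device that has a direct link to the first device is a second device as described above, so there is at least one second device.
The first device also determines its available uplink bandwidth, which is its total uplink bandwidth minus its used uplink bandwidth. The used uplink bandwidth includes the uplink bandwidth needed to send audio resources to the server, the uplink bandwidth needed to send audio resources to the second devices over direct links, the uplink bandwidth needed to send video resources to the server, and so on. Optionally, because the uplink bandwidth needed for audio resources is small and can be ignored, the available uplink bandwidth of the first device can be roughly estimated as the difference between its total uplink bandwidth and the uplink bandwidth needed to send video resources to the server.
In the embodiments of the present application, if the available uplink bandwidth of the first device is not less than the product of the video bit rate and the reference quantity, the first device sends its video resources to the second devices over the direct links between the first device and the second devices. An available uplink bandwidth not less than the product means that the bandwidth is large enough to send the video resources to every second device over its direct link, which reduces the transmission delay and packet loss of the video resources.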
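In one possible implementation, referring to FIG. 4, the method further includes step 306. FIG. 4 shows step 306 performed after step 302. Optionally, step 306 may also be performed after step 303 or step 304, which is not limited here.
Step 306: The first device determines the video bit rate of its video resources and, when a second sending condition is met, sends its video resources to the second device over the direct link. The second sending condition is that the available uplink bandwidth of the first device is less than the product of the video bit rate of its video resources and the reference quantity, and the second device meets a transmission condition.
In the embodiments of the present application, an available uplink bandwidth less than the product of the video bit rate and the reference quantity means that the bandwidth is not large enough to send the video resources to every second device over its direct link. That is, the first device cannot send its video resources to all second devices over direct links, so for each second device the first device needs to determine whether that second device meets the transmission condition.
Optionally, the second device meeting the transmission condition includes: the direct packet utilization of the second device is not less than a second utilization threshold. The direct packet utilization of the second device is the proportion of packets used by the second device that were sent over the direct link, that is, the ratio of the number of packets used by the second device that were sent over the direct link to the total number of packets used by the second device. The total equals the number of server-forwarded packets used by the second device plus the number of packets it used that were sent over the direct link.
As mentioned above, the second device deduplicates the packets received over the direct link and the relay link. By counting the packets it used that were received over the relay link and the packets it used that were received over the direct link, the second device obtains its direct packet utilization with respect to the first device.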
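Optionally, the direct packet utilization of the second device with respect to the first device equals the number of packets used by the second device that were received over the direct link divided by the total number of packets used by the second device, where the total equals the number of used packets received over the relay link plus the number of used packets received over the direct link. That is, the direct packet utilization of the second device with respect to the first device satisfies formula (3): $$U^{\mathrm{direct}}_{ij} = \frac{N^{\mathrm{direct}}_{ij}}{N^{\mathrm{relay}}_{ij} + N^{\mathrm{direct}}_{ij}} \tag{3}$$
Here $U^{\mathrm{direct}}_{ij}$ denotes the direct packet utilization of second device j with respect to first device i, and the packets in formula (3) are the packets sent by first device i: $N^{\mathrm{direct}}_{ij}$ is the number of packets used by second device j that were received over the direct link, and $N^{\mathrm{relay}}_{ij}$ is the number of packets used by second device j that were received over the relay link.
The direct packet utilization and the relay packet utilization of the second device with respect to the first device sum to 1, that is, $U^{\mathrm{direct}}_{ij} + U^{\mathrm{relay}}_{ij} = 1$, where $U^{\mathrm{relay}}_{ij}$ is the relay packet utilization of second device j with respect to first device i. Therefore, once the second device has determined its direct packet utilization with respect to the first device, it has in effect determined its relay packet utilization, and vice versa.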
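The second device can send its direct packet utilization with respect to the first device to the first device over the direct link between them, so that the first device obtains the direct packet utilization of the second device. The first device determines whether this direct packet utilization is not less than the second utilization threshold. If so, the second device meets the transmission condition, and the first device sends its video resources to the second device over the direct link, which reduces the transmission delay and packet loss of the video resources. Optionally, the second utilization threshold may be a value set empirically, for example 0.5.
In one possible implementation, there is at least one second device, and each second device sends its direct packet utilization with respect to the first device to the first device over its direct link, so that the first device obtains the direct packet utilization of every second device. The direct packet utilizations of the second devices are sorted, and the second utilization threshold is determined from the sorting result. For example, the ratio of the available uplink bandwidth of the first device to the video bit rate of its video resources is calculated and rounded down to obtain the maximum number of devices, the direct packet utilizations of the second devices are sorted in descending order, and the direct packet utilization at the position equal to the maximum number of devices is taken as the second utilization threshold. Alternatively, the direct packet utilizations may be sorted in ascending order and the one at that position counted from the end is taken as the second utilization threshold.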
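The sort-based choice of the second utilization threshold can be sketched as below. The function name and the handling of the edge cases are assumptions: `None` is returned when the bandwidth cannot carry even one direct video stream, and the lowest utilization is returned when the bandwidth suffices for every second device.

```python
import math


def second_threshold(direct_utilizations, available_uplink_kbps, bitrate_kbps):
    """Pick the threshold so that at most floor(B / b) devices qualify.

    direct_utilizations:   direct packet utilization reported by each second device
    available_uplink_kbps: available uplink bandwidth B of the first device
    bitrate_kbps:          video bit rate b of the first device's video
    """
    max_devices = math.floor(available_uplink_kbps / bitrate_kbps)
    if max_devices <= 0 or not direct_utilizations:
        return None                                # no direct video stream is affordable
    ranked = sorted(direct_utilizations, reverse=True)
    position = min(max_devices, len(ranked))       # clamp when B covers every device
    return ranked[position - 1]


if __name__ == "__main__":
    utils = [0.9, 0.4, 0.7, 0.2]
    threshold = second_threshold(utils, available_uplink_kbps=2500, bitrate_kbps=1000)
    eligible = [u for u in utils if u >= threshold]
    print(threshold, eligible)
```

If several devices report the same utilization as the threshold, more than the maximum number of devices may qualify, so an implementation would still need a tie-breaking rule.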
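In one possible implementation, referring to FIG. 4, the method further includes step 307. FIG. 4 shows step 307 performed after step 302. Optionally, step 307 may also be performed after step 303 or step 304, which is not limited here.
Step 307: The first device determines the video bit rate of its video resources and, when a third sending condition is met, sends its video resources to the server so that the server forwards them to the second device. The third sending condition is that the available uplink bandwidth of the first device is less than the product of the video bit rate of its video resources and the reference quantity, and the second device does not meet the transmission condition.
In the embodiments of the present application, if the first device determines that the direct packet utilization of the second device with respect to the first device is less than the second utilization threshold, the first device can send the identification information of the second device to the server. The first device also sends its video resources to the server, so that the server forwards them to the second device based on that identification information. A second device with a low direct packet utilization thus receives the video resources of the first device over the relay link.
It should be noted that among the other terminal devices participating in the same voice call as the first device, there may be terminal devices that have no direct link to the first device. The first device can send the identification information of these terminal devices to the server. After the first device sends its video resources to the server, the server forwards them to these terminal devices based on their identification information, which ensures that a terminal device can receive the video resources of the first device even without a direct link to it.
In summary, any terminal device participating in the same voice call as the first device either has a direct link to the first device, in which case it can be a second device as described above, or has no direct link to the first device. For first device i and terminal device j, $C_{ij}=1$ if a direct link has been established between them and $C_{ij}=0$ otherwise.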
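For video resources, a strategy of transmitting over direct links wherever possible is used, in order to reduce the equipment cost and bandwidth cost of the server. Suppose Q terminal devices participate in the voice call, the role of first device i is presenter, the available uplink bandwidth of first device i is B, and the video bit rate of its video resources is b. The relay link must be used to transmit video resources when either of the following case 21 and case 22 occurs.
Case 21: $\exists j,\ C_{ij} = 0$.
In case 21, some of the Q terminal devices have no direct link to first device i. Suppose no direct link has been established between first device i and terminal device j, that is, $C_{ij}=0$. First device i then sends its video resources to the server, so that the server forwards them to terminal device j.
Case 22: All of the other Q-1 terminal devices have direct links to the first device, and B<b*(Q-1). This means that the available uplink bandwidth of first device i cannot support sending its video resources to the other Q-1 terminal devices simultaneously over direct links, so first device i needs the broadcast capability of the server to help complete the transmission of the video resources.
Suppose that, apart from the bandwidth needed to send its video resources to the relay link, the available uplink bandwidth of first device i can still support sending its video resources to T terminal devices over T direct links. The direct packet utilizations of the Q-1 terminal devices are sorted in descending order, and the terminal devices corresponding to the top T are selected as the second devices that meet the transmission condition. The first device sends its video resources to these second devices over direct links. The remaining Q-T-1 terminal devices are the second devices that do not meet the transmission condition. For these, the first device sends its video resources to the server, so that the server forwards them to these terminal devices.
In this way, the utilization of the direct links is maximized without exceeding the available uplink bandwidth of the first device, which improves the transmission quality of the video resources. If neither case 21 nor case 22 occurs, the first device does not use the relay link to transmit its video resources, which effectively reduces the equipment cost and bandwidth cost of the server.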
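The direct-first routing of video can be sketched as a planning function that splits the receivers between direct links and the relay. The function and parameter names are hypothetical, and the sketch assumes that `uplink_kbps` is the bandwidth available before any video is sent, so that one copy of the stream towards the relay server costs one bit rate b whenever the relay is needed.

```python
def plan_video_routes(direct_util, unreachable, uplink_kbps, bitrate_kbps):
    """Split receivers into (direct_targets, relay_targets) for one presenter.

    direct_util:  dict of device id -> direct packet utilization, for C_ij = 1
    unreachable:  ids of devices without a direct link to the presenter (C_ij = 0)
    uplink_kbps:  uplink bandwidth available for video (B)
    bitrate_kbps: video bit rate (b)
    """
    peers = list(direct_util)
    relay_targets = list(unreachable)
    # Neither case 21 nor case 22: every receiver can be served directly.
    if not relay_targets and uplink_kbps >= bitrate_kbps * len(peers):
        return peers, []
    # The relay is needed: reserve one stream for the upload to the server,
    # then spend what is left on the T best direct links.
    budget = uplink_kbps - bitrate_kbps
    t = max(0, min(len(peers), int(budget // bitrate_kbps)))
    ranked = sorted(peers, key=lambda d: direct_util[d], reverse=True)
    return ranked[:t], relay_targets + ranked[t:]


if __name__ == "__main__":
    util = {"a": 0.9, "b": 0.2, "c": 0.6}
    print(plan_video_routes(util, [], uplink_kbps=5000, bitrate_kbps=1000))
    print(plan_video_routes(util, [], uplink_kbps=2500, bitrate_kbps=1000))
    print(plan_video_routes(util, ["d"], uplink_kbps=5000, bitrate_kbps=1000))
```

With B = 2500 kbps and b = 1000 kbps, one stream goes to the server and only one direct link remains affordable, so the device with the highest direct packet utilization keeps its direct link and the others fall back to the relay.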
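It should be noted that the data of the first device can be divided into two classes. One class is critical data, which has strict transmission quality requirements (such as packet loss and transmission delay) and a small data volume. During a voice call, audio resources and instruction data (such as the encoding parameters used to encode audio and video resources) are critical data. The other class is non-critical data, which has looser transmission quality requirements and a large data volume. During a voice call, video resources are non-critical data. In the embodiments of the present application, critical data uses the dual-send strategy over the relay link and the direct link, as described above for the audio resources of the first device, while non-critical data is transmitted over direct links wherever possible, as described above for the video resources of the first device. Of course, in practice critical and non-critical data may also be transmitted in other ways, for example by transmitting the encoding parameters through signalling.
In one possible implementation, referring to FIG. 4, step 308 may also follow step 302. Optionally, step 308 may also be performed after any of steps 303 to 307.
Step 308: In response to a change in its role in the voice call, the first device sends a close request to the second device over the direct link, receives over the direct link the close response that the second device sends for the close request, and closes the direct link based on the close response.
In the embodiments of the present application, a direct link has been successfully created between the first device and the second device. Because the present application uses an adaptive role switching strategy, the role of the first device may change when its device state changes. When the role changes, the first device needs to close the direct link to the second device, that is, the first device sends a close request to the second device over the direct link. There are at least the following case 31 and case 32.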
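Case 31: The first device sending a close request to the second device over the direct link in response to a change in its role in the voice call includes: in response to its device state changing to the multimedia-capture-off state, the first device determines that its role in the voice call has changed and has become listener; and if the role of the second device in the voice call is listener, the first device sends the close request to the second device over the direct link. The listener is a device that receives the multimedia resources output by the presenter during the voice call.
When the first device turns off video capture (for example, stops screen sharing or turns off the camera) and/or turns off audio capture (for example, turns off the microphone), its device state changes from the multimedia capture state to the multimedia-capture-off state. The role of the first device then changes from presenter to listener, and this is the change in the role of the first device. As a listener, the first device is a receiver of multimedia resources and mainly receives multimedia resources. A second device whose role is listener does not capture multimedia resources, so if the role of the second device is listener, the first device sends a close request to the second device over the direct link to close the direct link to that second device.
Case 32: The first device sending a close request to the second device over the direct link in response to a change in its role in the voice call includes: in response to its device state changing to the exited-call state, the first device determines that its role in the voice call has changed and has become no role, and sends the close request to the second device over the direct link. No role refers to a device that is not participating in the voice call.
The first device can leave a voice call it has joined, in which case its device state changes from the in-call state to the exited-call state. The role of the first device then changes from listener, presenter, or the intermediate state to no role, and this is the change in the role of the first device. With no role, the first device is neither a receiver nor a producer of multimedia resources, that is, it neither receives nor captures multimedia resources. The first device can send a close request to the second device over the direct link to close the direct link to the second device.
To close the direct link to the second device, the first device sends a close request to the second device over the direct link between them. After receiving the close request, the second device sends a close response to the first device over the direct link and updates the link state of the direct link between the first device and the second device from open to closed. When the first device receives the close response over the direct link, it likewise updates the link state of the direct link between the first device and the second device from open to closed. At this point, the direct link between the first device and the second device is closed.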
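The closing rule of cases 31 and 32 and the request/response handshake can be sketched together. The names `should_close`, `Peer`, and the `"close_ack"` token are hypothetical, and the handshake is modelled as a direct method call rather than packets on the direct link.

```python
def should_close(new_role, peer_role):
    """Decide whether a device whose role just changed closes a direct link."""
    if new_role == "no role":
        return True                              # case 32: the device left the call
    # Case 31: a listener keeps direct links only to peers that may send to it.
    return new_role == "listener" and peer_role == "listener"


class Peer:
    def __init__(self, name):
        self.name = name
        self.link_state = {}                     # peer name -> "open" or "closed"

    def connect(self, other):
        self.link_state[other.name] = "open"
        other.link_state[self.name] = "open"

    def on_close_request(self, requester):
        """Receiver side: mark the link closed and answer with a close response."""
        self.link_state[requester.name] = "closed"
        return "close_ack"

    def close_link(self, other):
        """Requester side: the link is closed only once the response arrives."""
        if other.on_close_request(self) == "close_ack":
            self.link_state[other.name] = "closed"


if __name__ == "__main__":
    first, second = Peer("first"), Peer("second")
    first.connect(second)
    if should_close(new_role="listener", peer_role="listener"):
        first.close_link(second)
    print(first.link_state, second.link_state)
```

Updating the requester's state only after the response mirrors the text: both ends agree that the link is closed before either stops using it.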
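It should be noted that the information (including but not limited to user device information and user personal information), data (including but not limited to data used for analysis, stored data, and displayed data), and signals involved in the present application are all authorized by the user or fully authorized by all parties, and the collection, use, and processing of the relevant data must comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the audio resources and video resources involved in the present application are all obtained with full authorization.
In the above method, when the role of the first device in the voice call changes to presenter, the first device determines a second device among at least one terminal device participating in the same voice call and creates a direct link between the first device and the second device, so that direct links are created dynamically. On the one hand, the first device sends its multimedia resources to the second device over the direct link, which reduces transmission delay and packet loss. On the other hand, the first device also sends its multimedia resources to the server so that the server forwards them to the second device, which improves the transmission quality of the multimedia resources.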
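Take the flowchart of yet another multimedia resource transmission method shown in FIG. 5 as an example. The method can be performed by any terminal device in FIG. 1 (for example, terminal device 2). Because the implementation environment involves at least two terminal devices, for ease of description the terminal device that performs the method is called the second device, and the terminal device whose role has changed to presenter is called the first device. As shown in FIG. 5, the method includes at least steps 501 to 503 below.
Step 501: In response to a direct-connect request of the first device forwarded by the server, the second device creates a direct link between the second device and the first device. The second device is determined, when the role of the first device in the voice call changes to presenter, among at least one terminal device participating in the same voice call as the first device. The presenter is a device that outputs multimedia resources during the voice call.
Step 502: The second device receives, over the direct link, the multimedia resources of the first device sent by the first device.
Step 503: The second device receives the multimedia resources of the first device forwarded by the server.
In the embodiments of the present application, the second device receives the multimedia resources of the first device over the direct link or through the server. The multimedia resources of the first device include at least one of audio resources and video resources, and audio and video resources can be sent in different ways. In the embodiment of FIG. 6 below, steps 601 to 603 describe in detail how audio resources are sent, and steps 604 to 606 describe in detail how video resources are sent.
Take the flowchart of yet another multimedia resource transmission method shown in FIG. 6 as an example. The method can be performed by any terminal device in FIG. 1 (for example, terminal device 2). Because the implementation environment involves at least two terminal devices, for ease of description the terminal device that performs the method is called the second device, and the terminal device whose role has changed to presenter is called the first device. As shown in FIG. 6, the method includes at least steps 601 to 603 below.
Step 601: In response to a direct-connect request of the first device forwarded by the server, the second device creates a direct link between the second device and the first device. The second device is determined, when the role of the first device in the voice call changes to presenter, among at least one terminal device participating in the same voice call as the first device. The presenter is a device that outputs multimedia resources during the voice call.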
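For step 601, see the description of steps 301 and 302 above. The implementation principle is the same and is not repeated here.
In one possible implementation, the second device creating a direct link between the second device and the first device includes: receiving a direct-connect request for the second device forwarded by the server, the direct-connect request being sent by the first device to the server and carrying the external address of the first device; sending the server a direct-connect response to the request, the direct-connect response carrying the external address of the second device; and creating the direct link between the second device and the first device based on the external addresses of the first device and the second device.
For the creation of the direct link between the second device and the first device, see the description above of "creating the direct link between the first device and the second device". The implementation principle is the same and is not repeated here.
Optionally, the second device creating the direct link between the second device and the first device based on their external addresses includes: the second device sends multiple direct-connect packets to the first device based on the external address of the first device; and, when reception responses sent by the first device are received, determines that the direct link between the second device and the first device has been created, a reception response being the response to a direct-connect packet that the first device returns based on the external address of the second device.
For the successful creation of the direct link between the second device and the first device, see the description above of "successfully creating the direct link between the first device and the second device". The implementation principle is the same and is not repeated here.
In one possible implementation, after step 601 the method further includes: the second device receives, over the direct link, a close request sent by the first device, the close request being sent to the second device when the role of the first device in the voice call changes; and the second device sends, over the direct link, a close response to the first device for the close request and closes the direct link based on the close response.
For the closing of the direct link, see the description above of "closing the direct link". The implementation principle is the same and is not repeated here.
Step 602: The second device receives, over the direct link, the audio resources of the first device sent by the first device.
For step 602, see the description of step 303 above. The implementation principle is the same and is not repeated here.
Step 603: The second device receives the audio resources of the first device forwarded by the server.
For step 603, see the description of step 304 above. The implementation principle is the same and is not repeated here.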
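In one possible implementation, the method further includes step 604: the second device receives, over the direct link, the video resources of the first device sent by the first device. The video resources are sent over the direct link when the first sending condition is met, the first sending condition being that the available uplink bandwidth of the first device is not less than the product of the video bit rate of its video resources and the reference quantity.
For step 604, see the description of step 305 above. The implementation principle is the same and is not repeated here.
The method further includes step 605: the second device receives, over the direct link, the video resources of the first device sent by the first device. The video resources are sent over the direct link when the second sending condition is met, the second sending condition being that the available uplink bandwidth of the first device is less than the product of the video bit rate of its video resources and the reference quantity, and the second device meets the transmission condition. Optionally, the second device meeting the transmission condition includes: the direct packet utilization of the second device is not less than the second utilization threshold, the direct packet utilization of the second device being the proportion of packets used by the second device that were sent over the direct link.
For step 605, see the description of step 306 above. The implementation principle is the same and is not repeated here.
The method further includes step 606: the second device receives the video resources of the first device forwarded by the server. The video resources are forwarded by the server when the third sending condition is met, the third sending condition being that the available uplink bandwidth of the first device is less than the product of the video bit rate of its video resources and the reference quantity, and the second device does not meet the transmission condition. Optionally, the second device meeting the transmission condition includes: the direct packet utilization of the second device is not less than the second utilization threshold, the direct packet utilization of the second device being the proportion of packets used by the second device that were sent over the direct link.
For step 606, see the description of step 307 above. The implementation principle is the same and is not repeated here.
In the above method, when the role of the first device in the voice call changes to presenter, the first device determines a second device among at least one terminal device participating in the same voice call and creates a direct link between the first device and the second device, so that direct links are created dynamically. On the one hand, the first device sends its multimedia resources to the second device over the direct link, which reduces transmission delay and packet loss. On the other hand, the first device also sends its multimedia resources to the server so that the server forwards them to the second device, which improves the transmission quality of the multimedia resources.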
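An embodiment of the present application further provides a multimedia resource transmission system, which includes a first device, a second device, and a server.
The first device is configured to perform the multimedia resource transmission methods related to FIG. 2, FIG. 3, and FIG. 4. The second device is configured to perform the multimedia resource transmission methods related to FIG. 5 and FIG. 6. The server is configured to perform the functions performed by the server in relation to FIG. 2 to FIG. 6.
The multimedia resource transmission method provided by the embodiments of the present application has been described above in terms of method steps and is further described below with reference to FIG. 7 to FIG. 9. In the embodiments of the present application, multiple terminal devices participate in the same voice call. Any one of them is regarded as the first device, and the others are regarded as the other terminal devices.
The first device plays different roles during the voice call, and the role of the first device describes the role it plays. When the device state of the first device changes, its role may change accordingly. FIG. 7 is a schematic diagram of role switching provided by an embodiment of the present application.
When the first device is not participating in a voice call, its device state is the not-in-call state and its role is no role. When the first device joins a voice call, its device state changes from the not-in-call state to the in-call state and its role changes from no role to listener. As a listener, the first device can receive and play audio and/or receive and play video. Because the first device is not capturing audio or video at this point, its device state may also be described as the video-capture-off state and the audio-capture-off state.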
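When the first device starts capturing video, its device state changes from the video-capture-off state to the video capture state and its role changes from listener to presenter. As a presenter, the first device can record and send video.
When the first device starts capturing audio, its device state changes from the audio-capture-off state to the audio capture state and its role first changes from listener to the intermediate state. In the intermediate state, the first device can receive and play video and/or receive and play audio. It can also record audio and perform voice activity detection on it, which lasts for a period of time (for example, one minute). If no object audio is detected in the audio during this period, the role of the first device falls back from the intermediate state to listener. If object audio is detected during this period, the role of the first device changes from the intermediate state to presenter, and as a presenter the first device can record and send audio.
It should be noted that as a presenter the first device can still receive and play video and/or receive and play audio. When the first device starts capturing audio and video at the same time, its role changes directly from listener to presenter. If the first device then stops capturing only audio or only video, its role does not change and remains presenter.
When the device state of the first device is only the video capture state and the first device stops capturing video, its device state changes from the video capture state to the video-capture-off state. When the device state of the first device is only the audio capture state and the first device stops capturing audio, its device state changes from the audio capture state to the audio-capture-off state. When the device state of the first device is both the video capture state and the audio capture state and the first device stops capturing both audio and video, its device state changes from the video capture state to the video-capture-off state and from the audio capture state to the audio-capture-off state. All three situations amount to the first device no longer capturing audio or video, at which point the role of the first device changes from presenter to listener.
The first device can leave the voice call at any time during the call. When it does, its device state changes from the in-call state to the not-in-call state and its role changes from listener, presenter, or the intermediate state to no role.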
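When the role of the first device is listener or the intermediate state, direct links can exist between the first device and those of the other terminal devices whose role is presenter, so that the first device can receive and play audio and/or video. When the role of the first device is presenter, a direct link can exist between the first device and any of the other terminal devices, so that the first device can record and send audio and/or video and receive and play audio and/or video. The role of such a terminal device may be listener, presenter, or even the intermediate state.
Because the set of other terminal devices that have direct links to the first device depends on the role of the first device, the first device can create and close direct links in real time.
In the embodiments of the present application, when the role of the first device changes from no role to listener, the first device can create direct links to those of the other terminal devices that have no direct link to it and whose role is presenter. When the role of the first device changes from the intermediate state or listener to presenter, the first device can create direct links to those of the other terminal devices that have no direct link to it. Taking the creation of a direct link between the first device and the second device as an example, FIG. 8 is a schematic diagram of creating and closing a direct link provided by an embodiment of the present application.
When the first device joins the voice call, it can create a direct-connect socket and, based on it, send a UDP packet to the hole-punching server. The hole-punching server parses the UDP packet to obtain the external address of the first device and sends that address to the first device, so that the first device obtains its own external address. In the same way, the second device can obtain its own external address.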
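When the first device needs to create a direct link to the second device, it sends a direct-connect request to the relay server, and the relay server forwards the request to the second device. The direct-connect request carries the external address of the first device. The second device parses the request to obtain the external address of the first device and stores it. The second device also sends the relay server a direct-connect response to the request, and the relay server forwards the response to the first device. The direct-connect response carries the external address of the second device. The first device parses the response to obtain the external address of the second device and stores it.
Next, the first device and the second device start hole punching. Based on the external address of the second device, the first device sends multiple direct-connect packets to the second device. Each time the second device receives a direct-connect packet, it sends the first device a reception response for that packet. When the first device has received the reception responses for all direct-connect packets, it determines that the direct link to the second device has been successfully created.
The first device can then send its multimedia resources to the second device over the direct link between them, so that the second device receives the multimedia resources of the first device. Likewise, the second device can send its multimedia resources to the first device over the direct link, so that the first device receives the multimedia resources of the second device. The multimedia resources here include audio resources and/or video resources.
When the first device needs to close the direct link to the second device, it sends a close request to the second device over the direct link between them. After receiving the close request, the second device sends a close response to the first device over the direct link and determines that the direct link between the first device and the second device is closed. On receiving the close response, the first device determines that the direct link between the first device and the second device is closed.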
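FIG. 9 is a schematic diagram of multimedia resource transmission provided by an embodiment of the present application. In the embodiments of the present application, the first device can send its multimedia resources to the second device over the direct link between them, so that the second device receives them over the direct link. The first device can also send its multimedia resources to the relay server, which forwards them to the second device, so that the second device receives them over the relay link. The multimedia resources are sent as a stream and therefore correspond to packets.
Because the second device can receive the multimedia resources of the first device over both the direct link and the relay link, it may receive each packet twice. The second device uses the copy that arrives first and discards the copy that arrives later, thereby deduplicating the received packets. By counting how the packets are used, the second device obtains the direct packet utilization and the relay packet utilization, sends the direct packet utilization to the first device, and sends the relay packet utilization to the relay server.
It should be noted that the multimedia resources of the first device include audio resources and video resources, for which the embodiments of the present application use different transmission strategies.
For audio resources, the dual-send strategy over the relay link and the direct link is used. The first device sends its audio resources to the second device over the direct link. The first device also sends its audio resources to the relay server, and the relay server determines whether the relay packet utilization of the second device is not greater than the first utilization threshold. If it is not greater than the first utilization threshold, the relay server stops sending the audio resources of the first device to the second device. If it is greater than the first utilization threshold, the relay server sends the audio resources of the first device to the second device. This ensures that the second device can receive the audio resources of the first device while reducing wasted transmission and saving downlink resources.
For video resources, the strategy of transmitting over direct links wherever possible is used. Suppose Q terminal devices participate in the same voice call, the role of first device m is presenter, the available uplink bandwidth of m is B, the video bit rate of the video resources of m is b, the number of other terminal devices is Q-1, and direct links exist between the first device and all of the other terminal devices. When B≥b*(Q-1), the first device sends its video resources to each of the other terminal devices over direct links. When B<b*(Q-1), the first device sorts the direct packet utilizations of the other terminal devices in descending order and determines the top T terminal devices from the sorting result. The first device sends its video resources to the top T terminal devices over direct links, and it also sends its video resources to the relay server, which sends them to the remaining Q-T-1 terminal devices. Of course, if a terminal device has no direct link to the first device, the relay server also sends the video resources of the first device to that terminal device.
In this way, every other terminal device participating in the voice call can receive the video resources of the first device over either a direct link or the relay link. In addition, transmitting over direct links wherever possible reduces the equipment cost and bandwidth cost of the relay server and improves the transmission quality of the video resources. The direct links are fully utilized without exceeding the available uplink bandwidth of the first device, which improves the overall transmission quality over the direct links.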
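FIG. 10 is a schematic structural diagram of a multimedia resource transmission apparatus provided by an embodiment of the present application. The apparatus is disposed in the first device. As shown in FIG. 10, the apparatus includes:
a determining module 1001, configured to, in response to the role of the first device in the voice call changing to presenter, determine a second device among at least one terminal device participating in the same voice call as the first device, the presenter being a device that outputs multimedia resources during the voice call;
a creating module 1002, configured to create a direct link between the first device and the second device;
a sending module 1003, configured to send the multimedia resources of the first device to the second device over the direct link;
the sending module 1003 being further configured to send the multimedia resources of the first device to the server, so that the server forwards the multimedia resources of the first device to the second device.
In one possible implementation, the apparatus further includes:
an obtaining module, configured to, in response to the device state of the first device changing to the audio capture state, obtain the audio resources captured by the first device;
the determining module 1001 being further configured to, in response to object audio being detected in the captured audio resources, determine that the role of the first device in the voice call has changed to presenter.
In one possible implementation, the determining module 1001 is further configured to, in response to the device state of the first device changing to the video capture state, determine that the role of the first device in the voice call has changed to presenter.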
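In one possible implementation, the creating module 1002 is configured to: send the server a direct-connect request for the second device, the direct-connect request carrying the external address of the first device; receive a direct-connect response forwarded by the server, the direct-connect response being the response that the second device sends to the server for the direct-connect request after the server has forwarded the request to the second device, and carrying the external address of the second device; extract the external address of the second device from the direct-connect response; and create the direct link between the first device and the second device based on the external addresses of the first device and the second device.
In one possible implementation, the creating module 1002 is configured to: send multiple direct-connect packets to the second device based on the external address of the second device; and, when reception responses sent by the second device are received, determine that the direct link between the first device and the second device has been created, a reception response being the response to a direct-connect packet that the second device returns based on the external address of the first device.
In one possible implementation, the multimedia resources include audio resources. The server forwards the audio resources of the first device to the second device when the forwarding condition is met, the forwarding condition being that the relay packet utilization of the second device is greater than the first utilization threshold, where the relay packet utilization of the second device is the proportion of packets used by the second device that were forwarded by the server. The server stops forwarding the audio resources of the first device to the second device when the forwarding condition is not met.
In one possible implementation, the sending module 1003 is configured to: send the audio resources of the first device to the second device over the direct link; determine the video bit rate of the video resources of the first device; and, when the first sending condition is met, send the video resources of the first device to the second device over the direct link, the first sending condition being that the available uplink bandwidth of the first device is not less than the product of the video bit rate of its video resources and the reference quantity.
In one possible implementation, the sending module 1003 is configured to: send the audio resources of the first device to the second device over the direct link; determine the video bit rate of the video resources of the first device; and, when the second sending condition is met, send the video resources of the first device to the second device over the direct link, the second sending condition being that the available uplink bandwidth of the first device is less than the product of the video bit rate of its video resources and the reference quantity, and the second device meets the transmission condition;
the sending module 1003 is further configured to: send the audio resources of the first device to the server, so that the server forwards them to the second device; and, when the third sending condition is met, send the video resources of the first device to the server, so that the server forwards them to the second device, the third sending condition being that the available uplink bandwidth of the first device is less than the product of the video bit rate of its video resources and the reference quantity, and the second device does not meet the transmission condition.
In one possible implementation, the second device meeting the transmission condition includes: the direct packet utilization of the second device is not less than the second utilization threshold, the direct packet utilization of the second device being the proportion of packets used by the second device that were sent over the direct link.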
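In one possible implementation, the sending module 1003 is further configured to, in response to a change in the role of the first device in the voice call, send a close request to the second device over the direct link;
the apparatus further includes:
a receiving module, configured to receive, over the direct link, the close response that the second device sends for the close request, and close the direct link based on the close response.
In one possible implementation, the sending module 1003 is configured to: in response to the device state of the first device changing to the multimedia-capture-off state, determine that the role of the first device in the voice call has changed and has become listener, and, if the role of the second device in the voice call is listener, send the close request to the second device over the direct link, the listener being a device that receives the multimedia resources output by the presenter during the voice call; and, in response to the device state of the first device changing to the exited-call state, determine that the role of the first device in the voice call has changed and has become no role, and send the close request to the second device over the direct link, no role referring to a device that is not participating in the voice call.
In the above apparatus, when the role of the first device in the voice call changes to presenter, a second device is determined among at least one terminal device participating in the same voice call as the first device, and a direct link is created between the first device and the second device, so that direct links are created dynamically. On the one hand, the multimedia resources of the first device are sent to the second device over the direct link, which reduces transmission delay and packet loss. On the other hand, the multimedia resources of the first device are also sent to the server so that the server forwards them to the second device, which improves the transmission quality of the multimedia resources.
It should be understood that when the apparatus provided in FIG. 10 implements its functions, the division into the above functional modules is only an example. In practice, the functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus provided by the above embodiment and the method embodiments belong to the same concept. For the specific implementation process, see the method embodiments, which are not repeated here.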
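FIG. 11 is a schematic structural diagram of another multimedia resource transmission apparatus provided by an embodiment of the present application. The apparatus is disposed in the second device. As shown in FIG. 11, the apparatus includes:
a creating module 1101, configured to, in response to receiving a direct-connect request sent by the first device and forwarded by the server, create a direct link between the second device and the first device, the second device being determined, when the role of the first device in the voice call changes to presenter, among at least one terminal device participating in the same voice call as the first device, the presenter being a device that outputs multimedia resources during the voice call;
a receiving module 1102, configured to receive, over the direct link, the multimedia resources of the first device sent by the first device;
the receiving module 1102 being further configured to receive the multimedia resources of the first device forwarded by the server.
In one possible implementation, the creating module 1101 is configured to: receive a direct-connect request for the second device forwarded by the server, the direct-connect request being sent by the first device to the server and carrying the external address of the first device; send the server a direct-connect response to the request, the direct-connect response carrying the external address of the second device; and create the direct link between the second device and the first device based on the external addresses of the first device and the second device.
In one possible implementation, the creating module 1101 is configured to: send multiple direct-connect packets to the first device based on the external address of the first device; and, when reception responses sent by the first device are received, determine that the direct link between the second device and the first device has been created, a reception response being the response to a direct-connect packet that the first device returns based on the external address of the second device.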
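In a possible implementation, the receiving module 1102 is further configured to: receive, through the direct link, the audio resource of the first device sent by the first device; and receive, through the direct link, the video resource of the first device sent by the first device, where the video resource is sent through the direct link when a first sending condition is met, and the first sending condition is that the available uplink bandwidth of the first device is not less than the product of the video bitrate of the video resource of the first device and a reference number.
In a possible implementation, the receiving module 1102 is further configured to: receive, through the direct link, the audio resource of the first device sent by the first device; and receive, through the direct link, the video resource of the first device sent by the first device, where the video resource is sent through the direct link when a second sending condition is met, and the second sending condition is that the available uplink bandwidth of the first device is less than the product of the video bitrate of the video resource of the first device and a reference number, and the second device meets a transmission condition.
In a possible implementation, the receiving module 1102 is further configured to: receive the audio resource of the first device forwarded by the server; and receive the video resource of the first device forwarded by the server, where the video resource is forwarded by the server when a third sending condition is met, and the third sending condition is that the available uplink bandwidth of the first device is less than the product of the video bitrate of the video resource of the first device and a reference number, and the second device does not meet the transmission condition.
In a possible implementation, the second device meeting the transmission condition includes: the direct data packet utilization of the second device is not less than a second utilization threshold, where the direct data packet utilization of the second device characterizes the proportion in which the second device uses data packets sent through the direct link.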
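One way a receiver might measure direct data packet utilization is sketched below. The text does not define how "use" is counted, so this sketch makes an explicit assumption: each packet carries a sequence number, the same packet may arrive over both paths, and the copy that arrives first is the one the receiver uses. The class `PathUsageTracker` and the path labels are hypothetical.

```python
class PathUsageTracker:
    """Counts, per path, how many packets the receiver actually used."""

    def __init__(self) -> None:
        self._seen = set()                      # sequence numbers already used
        self._used = {"direct": 0, "relay": 0}  # used packets per path

    def on_packet(self, seq: int, path: str) -> bool:
        # Returns True if this copy is used, False if it is a late duplicate.
        if path not in self._used:
            raise ValueError(f"unknown path: {path}")
        if seq in self._seen:
            return False
        self._seen.add(seq)
        self._used[path] += 1
        return True

    def direct_utilization(self) -> float:
        # Share of used packets that arrived over the direct link.
        total = self._used["direct"] + self._used["relay"]
        return self._used["direct"] / total if total else 0.0
```

Under this assumption, the value returned by `direct_utilization()` would be compared against the second utilization threshold to decide whether the second device meets the transmission condition.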
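In a possible implementation, the receiving module 1102 is further configured to receive, through the direct link, a close-direct-connection request sent by the first device, where the close-direct-connection request is sent by the first device to the second device when the role of the first device in the voice call undergoes a target change;
The apparatus further includes:
a sending module, configured to send, through the direct link, a close-direct-connection response to the first device in reply to the close-direct-connection request, and to close the direct link based on the close-direct-connection response.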
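In the above apparatus, when the role of the first device in the voice call changes to speaker, the second device is determined from among at least one terminal device participating in the same voice call as the first device, and a direct link between the first device and the second device is created, thereby achieving dynamic creation of the direct link. On the one hand, the multimedia resources of the first device are sent to the second device through the direct link, which reduces the transmission delay and packet loss rate of the multimedia resources. On the other hand, the multimedia resources of the first device are sent to the server so that the server forwards them to the second device, which improves the transmission quality of the multimedia resources.
It should be understood that, when the apparatus provided in FIG. 11 implements its functions, the division into the above functional modules is used only as an example. In practical applications, the above functions may be allocated to different functional modules as required, that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. In addition, the apparatus provided in the above embodiment and the method embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which is not repeated here.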
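FIG. 12 shows a structural block diagram of a terminal device 1200 provided by an exemplary embodiment of this application. The terminal device 1200 includes a processor 1201 and a memory 1202.
The processor 1201 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 1201 may be implemented in at least one hardware form among DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). The processor 1201 may also include a main processor and a coprocessor. The main processor, also called a CPU (Central Processing Unit), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 1201 may integrate a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be shown on the display screen. In some embodiments, the processor 1201 may further include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 1202 may include one or more computer-readable storage media, which may be non-transitory. The memory 1202 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 1202 stores at least one computer program, and the at least one computer program is executed by the processor 1201 to implement the method for transmitting multimedia resources provided by the method embodiments of this application.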
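In some embodiments, the terminal device 1200 optionally further includes a peripheral device interface 1203 and at least one peripheral device. The processor 1201, the memory 1202, and the peripheral device interface 1203 may be connected by a bus or signal lines. Each peripheral device may be connected to the peripheral device interface 1203 by a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 1204, a display screen 1205, a camera assembly 1206, and an audio circuit 1207.
The peripheral device interface 1203 may be used to connect at least one I/O (Input/Output) related peripheral device to the processor 1201 and the memory 1202. In some embodiments, the processor 1201, the memory 1202, and the peripheral device interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202, and the peripheral device interface 1203 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 1204 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1204 communicates with communication networks and other communication devices through electromagnetic signals. The radio frequency circuit 1204 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 1204 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on. The radio frequency circuit 1204 may communicate with other terminals through at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to, the World Wide Web, metropolitan area networks, intranets, mobile communication networks of each generation (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1204 may further include circuits related to NFC (Near Field Communication), which is not limited in this application.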
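The display screen 1205 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1205 is a touch display screen, it is also capable of capturing touch signals on or above its surface. The touch signal may be input to the processor 1201 as a control signal for processing. In this case, the display screen 1205 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1205, arranged on the front panel of the terminal device 1200; in other embodiments, there may be at least two display screens 1205, arranged on different surfaces of the terminal device 1200 or in a folding design; in still other embodiments, the display screen 1205 may be a flexible display screen arranged on a curved or folding surface of the terminal device 1200. The display screen 1205 may even be set to a non-rectangular irregular shape, that is, a special-shaped screen. The display screen 1205 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
The camera assembly 1206 is used to capture images or video. Optionally, the camera assembly 1206 includes a front camera and a rear camera. Usually, the front camera is arranged on the front panel of the terminal and the rear camera on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to achieve background blur, and the main camera and the wide-angle camera can be fused to achieve panoramic shooting, VR (Virtual Reality) shooting, or other fused shooting functions. In some embodiments, the camera assembly 1206 may further include a flash. The flash may be a single color temperature flash or a dual color temperature flash. A dual color temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
The audio circuit 1207 may include a microphone and a speaker. The microphone captures sound waves from the user and the environment and converts them into electrical signals that are input to the processor 1201 for processing, or input to the radio frequency circuit 1204 for voice communication. For stereo capture or noise reduction, there may be multiple microphones arranged at different parts of the terminal device 1200. The microphone may also be an array microphone or an omnidirectional microphone. The speaker converts electrical signals from the processor 1201 or the radio frequency circuit 1204 into sound waves. The speaker may be a conventional thin-film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans, for purposes such as distance measurement. In some embodiments, the audio circuit 1207 may further include a headphone jack.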
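Those skilled in the art will understand that the structure shown in FIG. 12 does not limit the terminal device 1200, which may include more or fewer components than shown, combine certain components, or adopt a different arrangement of components.
FIG. 13 is a schematic structural diagram of a server provided by an embodiment of this application. The server 1300 may vary considerably depending on configuration or performance, and may include one or more processors 1301 and one or more memories 1302, where the one or more memories 1302 store at least one computer program, and the at least one computer program is loaded and executed by the one or more processors 1301 to implement the method for transmitting multimedia resources provided by the above method embodiments. For example, the processor 1301 is a CPU. The server 1300 may of course also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may further include other components for implementing device functions, which are not described in detail here.
In an exemplary embodiment, a computer-readable storage medium is further provided. The storage medium stores at least one computer program, and the at least one computer program is loaded and executed by a processor so that an electronic device implements any of the above methods for transmitting multimedia resources.
Optionally, the above computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, a computer program or computer program product is further provided. The computer program or computer program product stores at least one computer program, and the at least one computer program is loaded and executed by a processor so that an electronic device implements any of the above methods for transmitting multimedia resources.
It should be understood that "multiple" as used herein means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the objects before and after it.
The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments.
The above are only exemplary embodiments of this application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the principles of this application shall fall within its scope of protection.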

Claims (22)

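  1. A method for transmitting multimedia resources, the method comprising:
    determining, by a first device in response to a role of the first device in a voice call changing to speaker, a second device from among at least one terminal device participating in the same voice call as the first device, the speaker being a device that outputs multimedia resources during the voice call;
    creating, by the first device, a direct link between the first device and the second device;
    sending, by the first device, multimedia resources of the first device to the second device through the direct link; and
    sending, by the first device, the multimedia resources of the first device to a server, so that the server forwards the multimedia resources of the first device to the second device.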
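  2. The method according to claim 1, wherein before the first device, in response to the role of the first device in the voice call changing to speaker, determines the second device from among the at least one terminal device participating in the same voice call as the first device, the method further comprises:
    obtaining, by the first device in response to a device state of the first device changing to an audio resource capture state, audio resources captured by the first device; and
    determining, by the first device in response to detecting object audio in the captured audio resources, that the role of the first device in the voice call has changed to speaker.
  3. The method according to claim 1, wherein before the first device, in response to the role of the first device in the voice call changing to speaker, determines the second device from among the at least one terminal device participating in the same voice call as the first device, the method further comprises:
    determining, by the first device in response to the device state of the first device changing to a video resource capture state, that the role of the first device in the voice call has changed to speaker.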
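  4. The method according to claim 1, wherein the creating, by the first device, of the direct link between the first device and the second device comprises:
    sending, by the first device, a direct connection request for the second device to the server, the direct connection request carrying an external network address of the first device;
    receiving, by the first device, a direct connection response forwarded by the server, the direct connection response being response information sent by the second device to the server in reply to the direct connection request, the direct connection request being forwarded by the server to the second device, and the direct connection response carrying an external network address of the second device;
    extracting, by the first device, the external network address of the second device from the direct connection response; and
    creating, by the first device, the direct link between the first device and the second device based on the external network address of the first device and the external network address of the second device.
  5. The method according to claim 4, wherein the creating, by the first device, of the direct link between the first device and the second device based on the external network address of the first device and the external network address of the second device comprises:
    sending, by the first device, multiple direct data packets to the second device based on the external network address of the second device; and
    determining, by the first device upon receiving a reception response sent by the second device, that the direct link between the first device and the second device has been created, the reception response being response information fed back by the second device, based on the external network address of the first device, for each direct data packet.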
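  6. The method according to claim 1, wherein the multimedia resources comprise audio resources, and the server forwards the audio resources of the first device to the second device when a forwarding condition is met, the forwarding condition being that a relay data packet utilization of the second device is greater than a first utilization threshold, and the relay data packet utilization of the second device characterizing the proportion in which the second device uses data packets forwarded by the server; and
    the server stops forwarding the audio resources of the first device to the second device when the forwarding condition is not met.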
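  7. The method according to any one of claims 1 to 6, wherein the multimedia resources comprise audio resources and video resources, and the sending, by the first device, of the multimedia resources of the first device to the second device through the direct link comprises:
    sending, by the first device, the audio resources of the first device to the second device through the direct link;
    determining, by the first device, a video bitrate of the video resources of the first device; and
    sending, by the first device when a first sending condition is met, the video resources of the first device to the second device through the direct link, the first sending condition being that an available uplink bandwidth of the first device is not less than the product of the video bitrate of the video resources of the first device and a reference number.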
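  8. The method according to any one of claims 1 to 6, wherein the multimedia resources comprise audio resources and video resources, and the sending, by the first device, of the multimedia resources of the first device to the second device through the direct link comprises:
    sending, by the first device, the audio resources of the first device to the second device through the direct link;
    determining, by the first device, a video bitrate of the video resources of the first device; and
    sending, by the first device when a second sending condition is met, the video resources of the first device to the second device through the direct link, the second sending condition being that an available uplink bandwidth of the first device is less than the product of the video bitrate of the video resources of the first device and a reference number, and that the second device meets a transmission condition;
    and the sending, by the first device, of the multimedia resources of the first device to the server, so that the server forwards the multimedia resources of the first device to the second device, comprises:
    sending, by the first device, the audio resources of the first device to the server, so that the server forwards the audio resources of the first device to the second device; and
    sending, by the first device when a third sending condition is met, the video resources of the first device to the server, so that the server forwards the video resources of the first device to the second device, the third sending condition being that the available uplink bandwidth of the first device is less than the product of the video bitrate of the video resources of the first device and the reference number, and that the second device does not meet the transmission condition.
  9. The method according to claim 8, wherein the second device meeting the transmission condition comprises:
    a direct data packet utilization of the second device being not less than a second utilization threshold, the direct data packet utilization of the second device characterizing the proportion in which the second device uses data packets sent through the direct link.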
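  10. The method according to any one of claims 1 to 6, wherein after the first device creates the direct link between the first device and the second device, the method further comprises:
    sending, by the first device in response to a change in the role of the first device in the voice call, a close-direct-connection request to the second device through the direct link; and
    receiving, by the first device through the direct link, a close-direct-connection response sent by the second device in reply to the close-direct-connection request, and closing the direct link based on the close-direct-connection response.
  11. The method according to claim 10, wherein the sending, by the first device in response to a change in the role of the first device in the voice call, of the close-direct-connection request to the second device through the direct link comprises:
    determining, by the first device in response to the device state of the first device changing to a state in which multimedia resource capture is turned off, that the role of the first device in the voice call has changed and has become listener, and, if the role of the second device in the voice call is listener, sending the close-direct-connection request to the second device through the direct link, the listener being a device that receives the multimedia resources output by the speaker during the voice call; and
    determining, by the first device in response to the device state of the first device changing to a state of having left the voice call, that the role of the first device in the voice call has changed and has become no role, and then sending the close-direct-connection request to the second device through the direct link, the no role referring to a device that is not participating in the voice call.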
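  12. A method for transmitting multimedia resources, the method comprising:
    creating, by a second device in response to a direct connection request of a first device forwarded by a server, a direct link between the second device and the first device, the second device being determined, when a role of the first device in a voice call changes to speaker, from among at least one terminal device participating in the same voice call as the first device, and the speaker being a device that outputs multimedia resources during the voice call;
    receiving, by the second device through the direct link, multimedia resources of the first device sent by the first device; and
    receiving, by the second device, the multimedia resources of the first device forwarded by the server.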
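  13. The method according to claim 12, wherein the receiving, by the second device through the direct link, of the multimedia resources of the first device sent by the first device comprises:
    receiving, by the second device through the direct link, audio resources of the first device sent by the first device; and
    receiving, by the second device through the direct link, video resources of the first device sent by the first device, the video resources being sent through the direct link when a first sending condition is met, and the first sending condition being that an available uplink bandwidth of the first device is not less than the product of a video bitrate of the video resources of the first device and a reference number.
  14. The method according to claim 12, wherein the receiving, by the second device through the direct link, of the multimedia resources of the first device sent by the first device comprises:
    receiving, by the second device through the direct link, audio resources of the first device sent by the first device; and
    receiving, by the second device through the direct link, video resources of the first device sent by the first device, the video resources being sent through the direct link when a second sending condition is met, and the second sending condition being that an available uplink bandwidth of the first device is less than the product of a video bitrate of the video resources of the first device and a reference number, and that the second device meets a transmission condition.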
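  15. The method according to claim 12, wherein the receiving, by the second device, of the multimedia resources of the first device forwarded by the server comprises:
    receiving, by the second device, audio resources of the first device forwarded by the server; and
    receiving, by the second device, video resources of the first device forwarded by the server, the video resources being forwarded by the server when a third sending condition is met, and the third sending condition being that an available uplink bandwidth of the first device is less than the product of a video bitrate of the video resources of the first device and a reference number, and that the second device does not meet a transmission condition.
  16. The method according to claim 14 or 15, wherein the second device meeting the transmission condition comprises:
    a direct data packet utilization of the second device being not less than a second utilization threshold, the direct data packet utilization of the second device characterizing the proportion in which the second device uses data packets sent through the direct link.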
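  17. The method according to claim 12, wherein after the creating of the direct link between the second device and the first device, the method further comprises:
    receiving, by the second device through the direct link, a close-direct-connection request sent by the first device, the close-direct-connection request being sent by the first device to the second device when the role of the first device in the voice call changes; and
    sending, by the second device through the direct link, a close-direct-connection response to the first device in reply to the close-direct-connection request, and closing the direct link based on the close-direct-connection response.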
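  18. An apparatus for transmitting multimedia resources, arranged in a first device, the apparatus comprising:
    a determining module, configured to determine, in response to a role of the first device in a voice call changing to speaker, a second device from among at least one terminal device participating in the same voice call as the first device, the speaker being a device that outputs multimedia resources during the voice call;
    a creating module, configured to create a direct link between the first device and the second device; and
    a sending module, configured to send multimedia resources of the first device to the second device through the direct link;
    the sending module being further configured to send the multimedia resources of the first device to a server, so that the server forwards the multimedia resources of the first device to the second device.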
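  19. An apparatus for transmitting multimedia resources, arranged in a second device, the apparatus comprising:
    a creating module, configured to create, in response to a direct connection request of a first device forwarded by a server, a direct link between the second device and the first device, the second device being determined, when a role of the first device in a voice call changes to speaker, from among at least one terminal device participating in the same voice call as the first device, and the speaker being a device that outputs multimedia resources during the voice call; and
    a receiving module, configured to receive, through the direct link, multimedia resources of the first device sent by the first device;
    the receiving module being further configured to receive the multimedia resources of the first device forwarded by the server.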
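  20. An electronic device, comprising a processor and a memory, the memory storing at least one computer program, and the at least one computer program being loaded and executed by the processor so that the electronic device implements the method for transmitting multimedia resources according to any one of claims 1 to 17.
  21. A computer-readable storage medium, storing at least one computer program, the at least one computer program being loaded and executed by a processor so that an electronic device implements the method for transmitting multimedia resources according to any one of claims 1 to 17.
  22. A system for transmitting multimedia resources, the system comprising a first device, a second device, and a server;
    the first device being configured to perform the method for transmitting multimedia resources according to any one of claims 1 to 11;
    the second device being configured to perform the method for transmitting multimedia resources according to any one of claims 12 to 17; and
    the server being configured to perform the functions performed by the server in the method for transmitting multimedia resources according to any one of claims 1 to 17.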
PCT/CN2023/094123 2022-08-03 2023-05-15 Method and apparatus for transmitting multimedia resources, electronic device, and storage medium WO2024027272A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210927973.4A CN117560357A (zh) 2022-08-03 2022-08-03 Method and apparatus for transmitting multimedia resources, electronic device, and storage medium
CN202210927973.4 2022-08-03

Publications (2)

Publication Number Publication Date
WO2024027272A1 true WO2024027272A1 (zh) 2024-02-08
WO2024027272A9 WO2024027272A9 (zh) 2024-04-04

Family

ID=89820990

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/094123 WO2024027272A1 (zh) 2022-08-03 2023-05-15 Method and apparatus for transmitting multimedia resources, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN117560357A (zh)
WO (1) WO2024027272A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
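CN101047580A (zh) * 2006-03-28 2007-10-03 腾讯科技(深圳)有限公司 Method for creating a point-to-point data channel
CN102651701A (zh) * 2011-02-28 2012-08-29 腾讯科技(深圳)有限公司 Method and apparatus for establishing an audio/video communication connection
CN104394127A (zh) * 2014-11-11 2015-03-04 海信集团有限公司 Multimedia sharing method, device, and system
CN110022329A (zh) * 2018-01-08 2019-07-16 腾讯科技(深圳)有限公司 File transfer method and apparatus, computer-readable storage medium, and computer device
WO2022154363A1 (ko) * 2021-01-13 2022-07-21 삼성전자 주식회사 Audio device for processing audio data and operating method thereof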

Also Published As

Publication number Publication date
CN117560357A (zh) 2024-02-13
WO2024027272A9 (zh) 2024-04-04

Similar Documents

Publication Publication Date Title
US8817061B2 (en) Recognition of human gestures by a mobile phone
US7921157B2 (en) Duplicating digital streams for digital conferencing using switching technologies
US6269483B1 (en) Method and apparatus for using audio level to make a multimedia conference dormant
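JP4738058B2 (ja) Efficient routing of real-time multimedia information
MX2012011620A (es) Transition between circuit-switched calls and video calls.
US8385234B2 (en) Media stream setup in a group communication system
WO2023125350A1 (zh) Audio data push method, apparatus, and system, electronic device, and storage medium
CN113114688B (zh) Multimedia conference management method and apparatus, storage medium, and electronic device
CN106973253A (zh) Method and apparatus for adjusting media stream transmission
US10462197B2 (en) On Demand in-band signaling for conferences
CN108574689B (zh) Method and apparatus for video calls
CN108630215B (zh) Echo suppression method and apparatus based on video networking
CN109963108B (zh) Method and apparatus for one-to-many intercom
US11290685B2 (en) Call processing method and gateway
CN104427295A (zh) Method and terminal for processing video in a video conference
CN109147812B (zh) Echo cancellation method and apparatus
WO2024027272A1 (zh) Method and apparatus for transmitting multimedia resources, electronic device, and storage medium
WO2019210667A1 (zh) Screen picture transmission method, apparatus, server, and system, and storage medium
KR102069695B1 (ko) Method and apparatus for providing a distributed telepresence service
CN108989737B (zh) Data playback method and apparatus, and electronic device
US9503812B2 (en) Systems and methods for split echo cancellation
CN115606170A (zh) Multi-grouping for immersive teleconferencing and telepresence
CN110557595A (zh) Method and apparatus for a mobile terminal to access a video conference
CN110087020B (zh) Method and system for implementing video networking conferences on iOS devices
CN114915748A (zh) Method, system, and related apparatus for dynamically switching audio/video communication modes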

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23848982

Country of ref document: EP

Kind code of ref document: A1