CN110149305B - Video network-based multi-party audio and video playing method and transfer server - Google Patents


Info

Publication number
CN110149305B
Authority
CN
China
Prior art keywords
video
audio
target
target audio
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910257607.0A
Other languages
Chinese (zh)
Other versions
CN110149305A (en)
Inventor
方小帅
孙亮亮
李云鹏
沈军
Current Assignee
Visionvera Information Technology Co Ltd
Original Assignee
Visionvera Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Visionvera Information Technology Co Ltd filed Critical Visionvera Information Technology Co Ltd
Priority to CN201910257607.0A
Publication of CN110149305A
Application granted
Publication of CN110149305B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/10: Architectures or entities
    • H04L65/1063: Application servers providing network services
    • H04L65/40: Support for services or applications
    • H04L65/403: Arrangements for multi-party communication, e.g. for conferences
    • H04L65/60: Network streaming of media packets
    • H04L65/75: Media network packet handling
    • H04L65/765: Media network packet handling intermediate
    • H04L65/80: Responding to QoS

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The embodiment of the application provides a video network-based multi-party audio and video playing method and a transit server. The method comprises the following steps: the transit server receives an audio and video calling instruction; calls an original audio and video according to the instruction; parses the retrieved original audio and video to obtain a target audio and video and a target audio; and sends the target audio and video to the audio and video address of the video networking terminal and to the video address of the conference scheduling terminal respectively, while sending the target audio to the audio address of the conference scheduling terminal, so that the target audio and video play simultaneously on the video networking terminal and the conference scheduling terminal. This solves the problem that, when a conference scheduling system and a video networking terminal watch a live or recorded video at the same time, no sound can be heard on the conference scheduling system side.

Description

Video network-based multi-party audio and video playing method and transfer server
Technical Field
The present application relates to the field of video networking technologies, and in particular to a video network-based multi-party audio and video playing method and a transit server.
Background
With the rapid development of the video networking, video conferences, video teaching and the like based on it have become widespread in users' lives, work and study. In practice, it is common for one main meeting place and several branch meeting places to hold a video conference simultaneously, or for one main classroom and several branch classrooms to conduct video teaching simultaneously.
The main conference scheduling system in the video network is the PAMIR conference scheduling system. When the conference scheduling system and a video networking terminal are used simultaneously to watch live or recorded video, no sound can be heard on the PAMIR conference scheduling system side.
Disclosure of Invention
In view of the above problems, embodiments of the present application provide a video network-based multi-party audio and video playing method and a corresponding transit server that overcome, or at least partially solve, the above problems.
In a first aspect, an embodiment of the present application discloses a method for playing audio and video in multiple parties based on a video network, where the method includes:
the transfer server receives an audio and video calling instruction, wherein the audio and video calling instruction carries a video address and an audio address of a conference scheduling terminal and an audio and video address of a video networking terminal;
the transfer server calls an original audio and video according to the audio and video calling instruction;
the transit server analyzes the acquired original audio and video to obtain a target audio and video and a target audio;
and the transfer server respectively sends the target audio and video to the audio and video address of the video networking terminal and the video address of the conference scheduling terminal, and sends the target audio to the audio address of the conference scheduling terminal so as to be simultaneously played on the video networking terminal and the conference scheduling terminal.
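The four claimed steps can be sketched as a minimal transit-server flow. All names below (handle_call_instruction, the instruction keys, the injected fetch_original/parse/send callables) are illustrative assumptions; the patent does not specify an implementation.

```python
# Illustrative sketch of the claimed four-step flow; all identifiers are
# hypothetical -- the patent text names no concrete API.

def handle_call_instruction(instruction, fetch_original, parse, send):
    """Process one audio/video calling instruction on the transit server."""
    # Step 1: the instruction carries the conference scheduling terminal's
    # separate video and audio addresses, plus the video networking
    # terminal's combined audio/video address.
    av_addr = instruction["vnet_terminal_av_addr"]
    video_addr = instruction["sched_terminal_video_addr"]
    audio_addr = instruction["sched_terminal_audio_addr"]

    # Step 2: call (fetch) the original audio/video named by the instruction.
    original = fetch_original(instruction["source_id"])

    # Step 3: parse the original into a target audio/video and a bare target audio.
    target_av, target_audio = parse(original)

    # Step 4: the target A/V goes to both video destinations; the bare target
    # audio goes to the scheduling terminal's audio address, so both sides
    # play picture and sound simultaneously.
    send(av_addr, target_av)
    send(video_addr, target_av)
    send(audio_addr, target_audio)
```

Sending the bare audio stream to a dedicated audio address is exactly what restores sound on the conference scheduling side.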
Optionally, the transit server includes a protocol conversion server and a video networking server;
the audio and video calling instruction is sent by the conference scheduling terminal and the video networking terminal to the video networking server, and is forwarded by the video networking server to the protocol conversion server;
the original audio and video is called by the protocol conversion server, which parses it and then sends the result to the video networking server.
Optionally, the step of the transit server analyzing the retrieved original audio and video to obtain a target audio and video and a target audio includes:
the transit server extracts a target audio and a target video from the original audio and video;
transcoding the target video to obtain target videos with different code rates;
and respectively synthesizing the target videos with different code rates and the target audio to obtain target audios and videos with different code rates.
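The three parsing sub-steps (demux, transcode to several bitrates, remux each with the common audio) can be sketched as follows. The helper callables (demux, transcode, mux) are hypothetical stand-ins for whatever codec pipeline the transit server actually uses.

```python
# Hypothetical sketch of the parse step: extract audio and video, transcode
# the video to several bitrates, then synthesize each with the same audio.

def parse_original_av(original_av, demux, transcode, mux, bitrates):
    audio, video = demux(original_av)        # extract target audio + target video
    variants = {}
    for rate in bitrates:                    # one target video per bitrate
        variants[rate] = mux(transcode(video, rate), audio)
    return variants, audio                   # target A/V variants + bare target audio
```

Returning the bare audio alongside the synthesized variants matches the claim: the audio is later sent on its own to the scheduling terminal's audio address.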
Optionally, the audio/video call instruction further carries target audio/video type information of the conference scheduling terminal and target audio/video type information of the video networking terminal;
the synthesizing the target videos with different code rates and the target audio respectively to obtain the target audios and videos with different code rates comprises the following steps:
the transfer server synthesizes a target video and a target audio with code rates corresponding to the target audio and video type information of the conference scheduling terminal to obtain a target audio and video of the conference scheduling terminal, and synthesizes the target video and the target audio with the code rates corresponding to the target audio and video type information of the video network terminal to obtain a target audio and video of the video network terminal;
the transfer server respectively sends the target audio and video to the audio and video address of the video networking terminal and the video address of the conference scheduling terminal, and the method comprises the following steps:
and the transfer server sends the target audio and video of the video networking terminal to the audio and video address of the video networking terminal and sends the target audio and video of the conference scheduling terminal to the video address of the conference scheduling terminal.
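The per-terminal selection described above amounts to a lookup from the target audio/video type information carried in the instruction to a synthesized bitrate variant. The mapping values below are invented for illustration; the patent does not specify concrete types or bitrates.

```python
# Sketch: choose the synthesized variant whose bitrate corresponds to a
# terminal's target audio/video type. The mapping is a hypothetical example.

TYPE_TO_BITRATE = {"sd": 512, "hd": 2048}   # assumed type -> bitrate table

def select_variant(variants, terminal_type):
    """Pick the target A/V matching one terminal's type information."""
    return variants[TYPE_TO_BITRATE[terminal_type]]
```

The scheduling terminal and the video networking terminal can thus each receive a target audio/video at the bitrate their type information calls for.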
In a second aspect, an embodiment of the present application discloses a transit server, including: an instruction receiving module, used for the transit server to receive an audio and video calling instruction, wherein the audio and video calling instruction carries a video address and an audio address of a conference scheduling terminal and an audio and video address of a video networking terminal;
a video calling module, used for the transit server to call the original audio and video according to the audio and video calling instruction;
the analysis module is used for analyzing the acquired original audio and video by the transit server to obtain a target audio and video and a target audio;
and the transmitting module is used for respectively transmitting the target audio and video to the audio and video address of the video networking terminal and the video address of the conference scheduling terminal by the transfer server, and transmitting the target audio to the audio address of the conference scheduling terminal so as to be simultaneously played on the video networking terminal and the conference scheduling terminal.
Optionally, the transit server includes a protocol conversion server and a video networking server;
the audio and video calling instruction is sent by the conference scheduling terminal and the video networking terminal to the video networking server, and is forwarded by the video networking server to the protocol conversion server;
the original audio and video is called by the protocol conversion server, which parses it and then sends the result to the video networking server.
Optionally, the parsing module is specifically configured to:
the transit server extracts a target audio and a target video from the original audio and video;
transcoding the target video to obtain target videos with different code rates;
and respectively synthesizing the target videos with different code rates and the target audio to obtain target audios and videos with different code rates.
Optionally, the audio/video call instruction further carries target audio/video type information of the conference scheduling terminal and target audio/video type information of the video networking terminal;
the synthesizing the target videos with different code rates and the target audio respectively to obtain the target audios and videos with different code rates comprises the following steps:
the transfer server synthesizes a target video and a target audio with code rates corresponding to the target audio and video type information of the conference scheduling terminal to obtain a target audio and video of the conference scheduling terminal, and synthesizes the target video and the target audio with the code rates corresponding to the target audio and video type information of the video network terminal to obtain a target audio and video of the video network terminal;
the transfer server respectively sends the target audio and video to the audio and video address of the video networking terminal and the video address of the conference scheduling terminal, and the method comprises the following steps:
and the transfer server sends the target audio and video of the video networking terminal to the audio and video address of the video networking terminal and sends the target audio and video of the conference scheduling terminal to the video address of the conference scheduling terminal.
In a third aspect, an embodiment of the present application further discloses a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the method of any one of the first aspect is implemented.
In a fourth aspect, an embodiment of the present application further discloses a computer-readable storage medium, where a computer program for executing any one of the methods in the first aspect is stored in the computer-readable storage medium.
In the video network-based multi-party audio and video playing method provided by the embodiment of the application, the transit server first receives an audio and video calling instruction, which carries a video address and an audio address of the conference scheduling terminal and an audio and video address of the video networking terminal; the transit server then calls the original audio and video according to the instruction; next, the transit server parses the retrieved original audio and video to obtain a target audio and video and a target audio; finally, the transit server sends the target audio and video to the audio and video address of the video networking terminal and to the video address of the conference scheduling terminal respectively, and sends the target audio to the audio address of the conference scheduling terminal, so that the target audio and video play simultaneously on the video networking terminal and the conference scheduling terminal. This solves the problem that, when a conference scheduling system and a video networking terminal watch a live or recorded video at the same time, no sound can be heard on the conference scheduling system side.
Drawings
Fig. 1 is a schematic networking diagram of a video network provided in an embodiment of the present application;
fig. 2 is a schematic hardware structure diagram of a node server according to an embodiment of the present application;
fig. 3 is a schematic hardware structure diagram of an access switch according to an embodiment of the present application;
fig. 4 is a schematic hardware structure diagram of an ethernet protocol conversion gateway according to an embodiment of the present application;
FIG. 5 is a flowchart illustrating steps of a method for multi-party audio/video playing based on video networking according to an embodiment of the present application;
FIG. 6 is a diagram illustrating an example of the method provided by an embodiment of the present application;
fig. 7 is a block diagram of a configuration of a transit server according to an embodiment of the present application.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
The video networking is an important milestone in network development. It is a real-time network that can realize real-time transmission of high-definition video, pushing many internet applications toward high definition and face-to-face HD communication.
Using real-time high-definition video exchange technology, the video networking can integrate dozens of required services, such as video, voice, pictures, text, communication and data, on one system platform: high-definition video conferencing, video monitoring, intelligent monitoring analysis, emergency command, digital broadcast television, time-shifted television, network teaching, live broadcast, VOD on demand, television mail, Personal Video Recorder (PVR), intranet (self-office) channels, intelligent video broadcast control, information distribution and more, delivering broadcast-quality high-definition video through a television or a computer.
To better understand the embodiments of the present application, the video networking is described below.
Some of the technologies applied in the video networking are as follows:
network Technology (Network Technology)
Network technology innovation in the video networking improves on traditional Ethernet to face the potentially enormous video traffic on the network. Unlike pure network Packet Switching or network Circuit Switching, the video networking technology employs packet switching to satisfy streaming demands (streaming is a data transmission technique that turns received data into a stable, continuous stream and sends it out continuously, so that the sound or image the user receives is smooth and the user can start viewing before the whole file has been transmitted). The video networking technology has the flexibility, simplicity and low cost of packet switching, together with the quality and security guarantees of circuit switching, realizing seamless connection of whole-network switched virtual circuits and data formats.
Switching Technology (Switching Technology)
The video network adopts Ethernet's two advantages of asynchrony and packet switching, and eliminates Ethernet's defects on the premise of full compatibility. It offers end-to-end seamless connection across the whole network, communicates directly with user terminals, and directly carries IP data packets. User data requires no format conversion anywhere across the network. The video networking is a higher-level form of Ethernet: a real-time exchange platform that can realize whole-network, large-scale, real-time transmission of high-definition video that the existing internet cannot, pushing many network video applications toward high definition and unification.
Server Technology (Server Technology)
The server technology of the video networking and unified video platform differs from that of traditional servers: its streaming media transmission is built on a connection-oriented basis, its data processing capability is independent of traffic and communication time, and a single network layer can carry both signaling and data transmission. For voice and video services, streaming media processing on the video networking and unified video platform is much simpler than general data processing, and efficiency improves more than a hundredfold compared with a traditional server.
Storage Technology (Storage Technology)
To handle media content of very large capacity and throughput, the ultra-high-speed storage technology of the unified video platform adopts the most advanced real-time operating system. Program information in a server instruction is mapped to a specific hard-disk space, and the media content no longer passes through the server but is sent directly and instantly to the user terminal, with a typical user waiting time of less than 0.2 seconds. The optimized sector layout greatly reduces mechanical head-seek movement on the hard disk; resource consumption is only 20% of an IP internet deployment of the same grade, yet concurrent throughput is three times that of a traditional disk array, improving overall efficiency by more than a factor of ten.
Network Security Technology (Network Security Technology)
The structural design of the video networking eliminates, at the structural level, the network security problems that trouble the internet, through per-session service permission control, complete isolation of equipment and user data, and similar mechanisms. It generally requires no antivirus programs or firewalls, avoids hacker and virus attacks, and provides users with a structurally worry-free secure network.
Service Innovation Technology (Service Innovation Technology)
The unified video platform integrates service and transmission: whether for a single user, a private-network user, or an entire network aggregate, only one automatic connection is needed. The user terminal, set-top box, or PC connects directly to the unified video platform to obtain a variety of multimedia video services. The unified video platform uses a menu-style configuration table instead of traditional complex application programming, so complex applications can be realized with very little code, enabling unlimited new service innovation.
Networking of the video network is as follows:
the video network is a centralized control network structure, and the network can be a tree network, a star network, a ring network and the like, but on the basis of the centralized control node, the whole network is controlled by the centralized control node in the network.
As shown in fig. 1, the video network is divided into an access network and a metropolitan network.
The devices of the access network part can be mainly classified into 3 types: node server, access switch, terminal (including various set-top boxes, coding boards, memories, etc.). The node server is connected to an access switch, which may be connected to a plurality of terminals and may be connected to an ethernet network.
The node server is a node which plays a centralized control function in the access network and can control the access switch and the terminal. The node server can be directly connected with the access switch or directly connected with the terminal.
Similarly, devices of the metropolitan network portion may also be classified into 3 types: a metropolitan area server, a node switch and a node server. The metro server is connected to a node switch, which may be connected to a plurality of node servers.
The node server here is the same node server as in the access network part; that is, the node server belongs to both the access network and the metropolitan area network.
The metropolitan area server is a node which plays a centralized control function in the metropolitan area network and can control a node switch and a node server. The metropolitan area server can be directly connected with the node switch or directly connected with the node server.
Therefore, the whole video network is a network structure with layered centralized control, and the network controlled by the node server and the metropolitan area server can be in various structures such as tree, star and ring.
The access network part can form a unified video platform (circled part), and a plurality of unified video platforms can form a video network; each unified video platform may be interconnected via metropolitan area and wide area video networking.
1. Video networking device classification
1.1 devices in the video network of the embodiment of the present application can be mainly classified into 3 types: servers, switches (including ethernet gateways), terminals (including various set-top boxes, code boards, memories, etc.). The video network as a whole can be divided into a metropolitan area network (or national network, global network, etc.) and an access network.
1.2 wherein the devices of the access network part can be mainly classified into 3 types: node servers, access switches (including ethernet gateways), terminals (including various set-top boxes, code boards, memories, etc.).
The specific hardware structure of each access network device is as follows:
a node server:
as shown in fig. 2, the system mainly includes a network interface module 201, a switching engine module 202, a CPU module 203, and a disk array module 204.
The network interface module 201, the CPU module 203, and the disk array module 204 all enter the switching engine module 202; the switching engine module 202 performs an operation of looking up the address table 205 on the incoming packet, thereby obtaining the direction information of the packet; and stores the packet in a queue of the corresponding packet buffer 206 based on the packet's steering information; if the queue of the packet buffer 206 is nearly full, it is discarded; the switching engine module 202 polls all packet buffer queues for forwarding if the following conditions are met: 1) the port send buffer is not full; 2) the queue packet counter is greater than zero. The disk array module 204 mainly implements control over the hard disk, including initialization, read-write, and other operations on the hard disk; the CPU module 203 is mainly responsible for protocol processing with an access switch and a terminal (not shown in the figure), configuring an address table 205 (including a downlink protocol packet address table, an uplink protocol packet address table, and a data packet address table), and configuring the disk array module 204.
The access switch:
as shown in fig. 3, the network interface module (downstream network interface module 301, upstream network interface module 302), the switching engine module 303, and the CPU module 304 are mainly included.
Here, a packet (uplink data) arriving from the downlink network interface module 301 enters the packet detection module 305. The packet detection module 305 checks whether the Destination Address (DA), Source Address (SA), packet type, and packet length of the packet meet the requirements; if so, it allocates a corresponding stream identifier (stream-id) and the packet enters the switching engine module 303; otherwise the packet is discarded. A packet (downlink data) arriving from the uplink network interface module 302 enters the switching engine module 303, as does a data packet arriving from the CPU module 304. The switching engine module 303 looks up the address table 306 for each incoming packet to obtain its direction information. If a packet entering the switching engine module 303 is going from a downlink network interface to an uplink network interface, it is stored in the queue of the corresponding packet buffer 307 in association with its stream-id; if that queue is nearly full, the packet is discarded. If a packet entering the switching engine module 303 is not going from a downlink network interface to an uplink network interface, it is stored in the queue of the corresponding packet buffer 307 according to its direction information; if that queue is nearly full, the packet is discarded.
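The packet-detection step above can be sketched as a simple admission check that either assigns a stream-id or drops the packet. The field names and the set of valid packet types are assumptions for illustration.

```python
# Hypothetical sketch of the packet detection module's check: validate DA,
# SA, packet type and length, then admit the packet with a stream-id.

def detect(packet, next_stream_id):
    ok = (len(packet.get("da", b"")) == 8            # 8-byte destination address
          and len(packet.get("sa", b"")) == 8        # 8-byte source address
          and packet.get("type") in {"protocol", "unicast", "multicast"}
          and packet.get("length", 0) > 0)
    if not ok:
        return None              # packet fails the check and is discarded
    return next_stream_id        # packet enters the switching engine with this id
```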
The switching engine module 303 polls all packet buffer queues, which in this embodiment is divided into two cases:
if the queue is from the downlink network interface to the uplink network interface, the following conditions are met for forwarding: 1) the port send buffer is not full; 2) the queued packet counter is greater than zero; 3) and obtaining the token generated by the code rate control module.
If the queue is not from the downlink network interface to the uplink network interface, the following conditions are met for forwarding: 1) the port send buffer is not full; 2) the queue packet counter is greater than zero.
The rate control module 308 is configured by the CPU module 304, and generates tokens for all packet buffer queues going from downstream network interfaces to upstream network interfaces at programmable intervals, to control the rate of upstream forwarding.
The CPU module 304 is mainly responsible for protocol processing with the node server, configuration of the address table 306, and configuration of the code rate control module 308.
Ethernet protocol conversion gateway
As shown in fig. 4, the apparatus mainly includes a network interface module (a downlink network interface module 401 and an uplink network interface module 402), a switching engine module 403, a CPU module 404, a packet detection module 405, a rate control module 408, an address table 406, a packet buffer 407, a MAC adding module 409, and a MAC deleting module 410.
Here, a data packet arriving from the downlink network interface module 401 enters the packet detection module 405. The packet detection module 405 checks whether the ethernet MAC DA, ethernet MAC SA, ethernet length or frame type, video networking destination address DA, video networking source address SA, video networking packet type, and packet length of the packet meet the requirements; if so, it allocates a corresponding stream identifier (stream-id), the MAC deletion module 410 strips the MAC DA, MAC SA, and length or frame type (2 bytes), and the packet enters the corresponding receiving buffer; otherwise the packet is discarded.
the downlink network interface module 401 detects the sending buffer of the port, and if there is a packet, obtains the ethernet MAC DA of the corresponding terminal according to the video networking destination address DA of the packet, adds the ethernet MAC DA of the terminal, the MAC SA of the ethernet coordination gateway, and the ethernet length or frame type, and sends the packet.
The other modules in the ethernet protocol conversion gateway function similarly to those of the access switch.
A terminal:
the system mainly comprises a network interface module, a service processing module and a CPU module; for example, the set-top box mainly comprises a network interface module, a video and audio coding and decoding engine module and a CPU module; the coding board mainly comprises a network interface module, a video and audio coding engine module and a CPU module; the memory mainly comprises a network interface module, a CPU module and a disk array module.
1.3 devices of the metropolitan area network part can be mainly classified into 3 types: node server, node exchanger, metropolitan area server. The node switch mainly comprises a network interface module, a switching engine module and a CPU module; the metropolitan area server mainly comprises a network interface module, a switching engine module and a CPU module.
2. Video networking packet definition
2.1 Access network packet definition
The data packet of the access network mainly comprises the following parts: destination Address (DA), Source Address (SA), reserved bytes, payload (pdu), CRC.
The fields are laid out in order as follows:
DA (8 bytes) | SA (8 bytes) | Reserved (2 bytes) | Payload | CRC (4 bytes)
the Destination Address (DA) is composed of 8 bytes (byte), the first byte represents the type of the data packet (e.g. various protocol packets, multicast data packets, unicast data packets, etc.), there are at most 256 possibilities, the second byte to the sixth byte are metropolitan area network addresses, and the seventh byte and the eighth byte are access network addresses.
The Source Address (SA) is also composed of 8 bytes (byte), defined as the same as the Destination Address (DA).
The reserved byte consists of 2 bytes.
The length of the payload depends on the type of the datagram: 64 bytes if the datagram is a protocol packet and 1056 bytes if it is a unicast packet, though the payload is not limited to these 2 types.
The CRC consists of 4 bytes and is calculated in accordance with the standard Ethernet CRC algorithm.
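A minimal sketch of building and parsing an access-network packet with the layout above. The type codes, addresses, and the exact region covered by the CRC are illustrative assumptions; only the field widths (8 + 8 + 2 + payload + 4 bytes) and the 64/1056-byte payload sizes come from this document:

```python
import struct
import zlib

PROTOCOL_PACKET = 0x00   # hypothetical type codes for the first DA byte
UNICAST_PACKET = 0x02

def build_packet(pkt_type, metro_addr, access_addr, sa, payload):
    """Builds an access-network packet; the payload is padded to the fixed size."""
    size = 64 if pkt_type == PROTOCOL_PACKET else 1056
    payload = payload.ljust(size, b"\x00")
    da = bytes([pkt_type]) + metro_addr + access_addr   # 1 + 5 + 2 = 8 bytes
    body = da + sa + b"\x00\x00" + payload              # 2 reserved bytes
    crc = struct.pack(">I", zlib.crc32(body))           # standard Ethernet CRC-32
    return body + crc

def parse_da(packet):
    """Splits the 8-byte destination address into its three fields."""
    return packet[0], packet[1:6], packet[6:8]          # type, metro addr, access addr

pkt = build_packet(UNICAST_PACKET, b"\x01\x02\x03\x04\x05", b"\xaa\xbb",
                   b"\x02" + b"\x00" * 7, b"hello")
ptype, metro, access = parse_da(pkt)
```

A unicast packet built this way is always 8 + 8 + 2 + 1056 + 4 = 1078 bytes, matching the fixed-length framing the document describes.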
2.2 metropolitan area network packet definition
The topology of a metropolitan area network is a graph, and there may be 2 or even more connections between two devices; that is, there may be more than 2 connections between a node switch and a node server, between two node switches, or between two node servers. However, the metropolitan area network address of each device is unique, so in order to accurately describe the connection relationship between devices, a parameter is introduced in the embodiment of the present application: the label.
In this specification, the definition of the label is similar to that of a Multi-Protocol Label Switching (MPLS) label. Assuming that there are two connections between device A and device B, a packet from device A to device B has 2 labels, and a packet from device B to device A likewise has 2 labels. Labels are classified into incoming labels and outgoing labels: assuming that the label of a packet entering device A (the incoming label) is 0x0000, the label of the packet when it leaves device A (the outgoing label) may become 0x0001. The network access process of the metropolitan area network is centrally controlled; that is, both address allocation and label allocation are dominated by the metropolitan area server, and the node switch and node server execute passively. This differs from MPLS, in which label allocation is the result of mutual negotiation between the switch and the server.
As shown in the following table, the data packet of the metropolitan area network mainly includes the following parts:
DA | SA | Reserved | Label | Payload | CRC
Namely Destination Address (DA), Source Address (SA), reserved bytes (Reserved), label, Payload (PDU), and CRC. The format of the label may be defined as follows: the label is 32 bits, with the upper 16 bits reserved and only the lower 16 bits used; it is located between the reserved bytes and the payload of the packet.
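The label forwarding described here can be sketched as a lookup-and-rewrite step, similar to an MPLS label swap. The forwarding-table contents, the outgoing-port numbers, and the function names below are assumptions for illustration; only the 32-bit label (low 16 bits used) sitting between the reserved bytes and the payload comes from the definition above:

```python
import struct

# Hypothetical forwarding table for one device: incoming label -> (outgoing port, outgoing label).
# In this scheme the metropolitan area server assigns all labels centrally.
LABEL_TABLE = {
    0x0000: (1, 0x0001),
    0x0002: (2, 0x0003),
}

LABEL_OFFSET = 8 + 8 + 2   # label sits after DA (8 B), SA (8 B), reserved (2 B)

def swap_label(packet):
    """Reads the 32-bit label (only the low 16 bits are used), looks up the
    outgoing port, and rewrites the label in place."""
    (label,) = struct.unpack_from(">I", packet, LABEL_OFFSET)
    port, out_label = LABEL_TABLE[label & 0xFFFF]     # upper 16 bits reserved
    out = bytearray(packet)
    struct.pack_into(">I", out, LABEL_OFFSET, out_label & 0xFFFF)
    return port, bytes(out)

# A packet entering with incoming label 0x0000 leaves on port 1 with label 0x0001
pkt = b"\x00" * LABEL_OFFSET + struct.pack(">I", 0x0000) + b"payload"
port, forwarded = swap_label(pkt)
```

Because allocation is centrally controlled, the table here would be pushed down by the metropolitan area server rather than negotiated hop by hop as in MPLS.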
In the video network, all video networking terminals watching a live broadcast play it from the same parsed audio/video address, whereas the conference scheduling terminal uses separate addresses for audio and video. The prior art therefore cannot fully support watching live/recorded audio and video simultaneously on a video networking terminal and a conference scheduling terminal.
In a preferred implementation, the execution subject of the video-network-based multi-party audio and video playing method is a transit server.
Fig. 5 illustrates a video-network-based multi-party audio and video playing method. The method is applicable to a transit server and includes the following steps:
Step 501: the transfer server receives an audio and video calling instruction, which carries the video address and the audio address of a conference scheduling terminal and the audio and video address of a video networking terminal.
In a possible implementation, the audio/video calling instruction further carries the target audio/video type of the conference scheduling terminal, the target audio/video type of the video networking terminal, the conference scheduling terminal identifier, the video networking terminal identifier, and the like. However, the information carried in the audio/video calling instruction is not specifically limited in the present application; any information that enables the transfer server to identify and locate the target audio/video required by the user is within the protection scope of the embodiments of the present application.
Step 502: the transfer server calls the original audio and video according to the audio and video calling instruction.
In a possible embodiment of the present application, the transfer server obtains the target audio and video as follows: according to the audio/video identifier carried in the received calling instruction, it looks up, in a preset identifier-address correspondence table, the front-end address of the target audio/video source in the device corresponding to that identifier, and then requests the target audio and video from that front end according to the address. It should be noted that this manner of obtaining the audio and video is only an example; any other method that can obtain the target audio and video based on the calling instruction is within the protection scope of the embodiments of the present application.
Step 503: the transfer server analyzes the acquired original audio and video to obtain a target audio and video and a target audio.
In step 503, the transit server extracts a target audio and a target video from the target audio and video; transcoding the target video to obtain target videos with different code rates; and respectively synthesizing the target videos with different code rates and the target audio to obtain target audios and videos with different code rates.
Further, the transfer server synthesizes a target video and a target audio with code rates corresponding to the target audio and video type information of the conference scheduling terminal to obtain a target audio and video of the conference scheduling terminal, and synthesizes the target video and the target audio with the code rates corresponding to the target audio and video type information of the video network terminal to obtain the target audio and video of the video network terminal.
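The extract-transcode-synthesize flow of steps 502 and 503 can be sketched as follows. The helper functions `demux`, `transcode`, and `mux` are hypothetical placeholders for the transit server's real media pipeline, and the bitrate values are invented:

```python
def demux(original_av):
    """Splits the original stream into (video, audio); placeholder logic."""
    return original_av["video"], original_av["audio"]

def transcode(video, bitrate_kbps):
    """Re-encodes the video at the requested bitrate; placeholder logic."""
    return {"stream": video, "bitrate_kbps": bitrate_kbps}

def mux(video, audio):
    """Recombines a transcoded video with the extracted audio."""
    return {"video": video, "audio": audio}

def prepare_targets(original_av, terminal_types):
    """terminal_types maps a terminal name to the bitrate its target
    audio/video type calls for, e.g. {"videonet": 2000, "scheduling": 512}."""
    video, audio = demux(original_av)
    return {name: mux(transcode(video, rate), audio)
            for name, rate in terminal_types.items()}

# One target audio/video per terminal, each at the bitrate its type requires
targets = prepare_targets({"video": "H.264", "audio": "AAC"},
                          {"videonet": 2000, "scheduling": 512})
```

The point of the design is that the audio is extracted once and reused: every per-terminal target audio/video shares the same target audio, which is also what gets sent separately to the conference scheduling terminal's audio address in step 504.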
Step 504: the transfer server sends the target audio and video to the audio and video address of the video networking terminal and to the video address of the conference scheduling terminal, respectively, and sends the target audio to the audio address of the conference scheduling terminal, so that they are played simultaneously on the video networking terminal and the conference scheduling terminal.
In a preferred embodiment, the sending, by the transit server, the target audio and video to the audio and video address of the video networking terminal and the video address of the conference scheduling terminal respectively includes: and the transfer server sends the target audio and video of the video networking terminal to the audio and video address of the video networking terminal and sends the target audio and video of the conference scheduling terminal to the video address of the conference scheduling terminal.
In a preferred embodiment, the transit server comprises a coordination server and a video networking server;
the audio and video calling instruction is sent to the video networking server by the conference scheduling terminal and the video networking terminal, and is forwarded to the coordination server by the video networking server;
the target audio and video is called by the coordination server, parsed, and then sent to the video networking server.
In one implementation, the video network terminal device may be a set-top box (STB), a device that connects a television set to an external signal source and converts a compressed digital signal into television content for display on the television. Generally, the set-top box may be connected to a camera and a microphone to collect multimedia data such as video data and audio data, and may also be connected to a television to play such multimedia data.
Therefore, in practical applications, a user can trigger the video networking terminal to generate a control service application instruction through operations in a menu (gtml) file, such as dialing the user number of the set-top box of the opposite terminal, and send the instruction to the video networking server.
The video network terminal device may further include external interfaces, such as a USB interface, an HDMI_OUTx2 interface, an HDMI_IN interface, a dongle interface, an RCA (Radio Corporation of America) interface, an AV interface, and the like.
In order to more clearly illustrate the video-network-based multi-party audio and video playing method provided by the embodiment of the present application, an example is now described based on Fig. 6:
Step 601: the conference scheduling terminal sends a first call request message for the target audio and video to the video networking server, and the video networking terminal sends a second call request message for the target audio and video to the video networking server.
Step 602: the video networking server extracts the video address, audio address, and target audio/video type of the conference scheduling terminal from the first call request message, extracts the audio/video address and target audio/video type of the video networking terminal from the second call request message, and generates the target audio/video calling instruction from this information.
Step 603: the video networking server sends the target audio/video calling instruction to the coordination server.
Step 604: the coordination server calls the target audio and video according to the calling instruction.
Step 605: the coordination server parses the called target audio and video to obtain the target audio and video of the video networking terminal, the target audio and video of the conference scheduling terminal, and a parsed target audio.
Step 606: the coordination server sends the parsed target audios and videos and the target audio to the video networking server.
Step 607: the video networking server sends the target audio and video of the video networking terminal to the audio and video address of the video networking terminal, sends the target audio and video of the conference scheduling terminal to the video address of the conference scheduling terminal, and sends the parsed target audio to the audio address of the conference scheduling terminal.
Step 608: the video networking terminal and the conference scheduling terminal play the audio and video.
In step 608, the video networking terminal and the conference scheduling terminal may play the audio and video using existing methods; each may include an image encoder, an image decoder, an audio encoder, an audio decoder, and the like. The embodiments of the present application are not particularly limited in this respect.
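A condensed sketch of the Fig. 6 message flow (steps 601 through 607), with both servers stubbed out. All function names, addresses, and payload fields are assumptions; the sketch only mirrors how the two call requests are merged into one calling instruction and then parsed per terminal type:

```python
def videonet_server_handle(requests):
    """Step 602: merge the two call request messages into one calling instruction."""
    return {
        "sched_video_addr": requests["sched"]["video_addr"],
        "sched_audio_addr": requests["sched"]["audio_addr"],
        "videonet_av_addr": requests["videonet"]["av_addr"],
        "types": {name: req["av_type"] for name, req in requests.items()},
    }

def coordination_server_handle(instruction, source_av):
    """Steps 604-606: call the source audio/video and parse it per terminal type."""
    return {name: {"av": source_av, "type": av_type}
            for name, av_type in instruction["types"].items()}

requests = {
    "sched": {"video_addr": "v:1", "audio_addr": "a:1", "av_type": "512k"},
    "videonet": {"av_addr": "av:2", "av_type": "2m"},
}
instruction = videonet_server_handle(requests)
parsed = coordination_server_handle(instruction, source_av="live-feed")
# Step 607: the video networking server would now deliver parsed["videonet"]["av"]
# to instruction["videonet_av_addr"], and the scheduling streams likewise.
```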
In summary, the video-network-based multi-party audio and video playing method provided by the embodiments of the present application solves the problem that, when a conference scheduling system and a video networking terminal in the video network watch a live/recorded broadcast simultaneously, no sound can be heard on the conference scheduling system side, and thus improves user experience.
It should be noted that, for simplicity of description, the method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the embodiments are not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the embodiments. Further, those skilled in the art will also appreciate that the embodiments described in the specification are presently preferred and that no particular act is required of the embodiments of the application.
Based on the same technical concept, referring to fig. 7, a block diagram of a transit server provided in the embodiment of the present application is shown, where the apparatus may be applied to a video network, and specifically may include the following modules:
the instruction receiving module 701 is used for the transit server to receive an audio/video call instruction, wherein the audio/video call instruction carries a video address and an audio address of the conference scheduling terminal and an audio/video address of the video network terminal.
And the calling module 702 is configured to call the original audio and video by the transit server according to the audio and video calling instruction.
And the analysis module 703 is configured to analyze the acquired original audio and video by the transit server to obtain a target audio and video and a target audio.
A sending module 704, configured for the transfer server to send the target audio and video to the audio and video address of the video networking terminal and to the video address of the conference scheduling terminal, respectively, and to send the target audio to the audio address of the conference scheduling terminal, so that they are played simultaneously on the video networking terminal and the conference scheduling terminal.
In a preferred embodiment, the transit server comprises a coordination server and a video networking server;
the audio and video calling instruction is sent to the video networking server by the conference scheduling terminal and the video networking terminal, and is forwarded to the coordination server by the video networking server;
the target audio and video is called by the coordination server, parsed, and then sent to the video networking server.
The parsing module 703 is specifically configured to: the transfer server extracts a target audio and a target video from the target audio and video; transcoding the target video to obtain target videos with different code rates; and respectively synthesizing the target videos with different code rates and the target audio to obtain target audios and videos with different code rates.
The audio and video calling instruction also carries target audio and video type information of the conference scheduling terminal and target audio and video type information of the video network terminal;
the synthesizing the target videos with different code rates and the target audio respectively to obtain the target audios and videos with different code rates comprises the following steps:
the transfer server synthesizes a target video and a target audio with code rates corresponding to the target audio and video type information of the conference scheduling terminal to obtain a target audio and video of the conference scheduling terminal, and synthesizes the target video and the target audio with the code rates corresponding to the target audio and video type information of the video network terminal to obtain a target audio and video of the video network terminal;
the transfer server respectively sends the target audio and video to the audio and video address of the video networking terminal and the video address of the conference scheduling terminal, and the method comprises the following steps:
and the transfer server sends the target audio and video of the video networking terminal to the audio and video address of the video networking terminal and sends the target audio and video of the conference scheduling terminal to the video address of the conference scheduling terminal.
For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.
The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
As will be appreciated by one of skill in the art, embodiments of the present application may be provided as a method, apparatus, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The video-network-based multi-party audio and video playing method and the transit server provided by the present application have been described in detail above. The principle and implementation of the present application are explained herein through specific examples, and the above description of the embodiments is only intended to help understand the method and its core idea. Meanwhile, for those skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the present application. In summary, the content of this specification should not be construed as a limitation of the present application.

Claims (10)

1. A method for playing audio and video in multiple parties based on video network is characterized in that the method comprises the following steps:
the transfer server receives an audio and video calling instruction, wherein the audio and video calling instruction carries a video address and an audio address of a conference scheduling terminal and an audio and video address of a video networking terminal; the transfer server calls an original audio and video according to the audio and video calling instruction;
the transit server analyzes the acquired original audio and video to obtain a target audio and video and a target audio; the target audio and video comprises a target audio and video of the video networking terminal and a target audio and video of the conference scheduling terminal; the target audio and video of the video networking terminal is obtained by synthesizing the target video and the target audio with code rates corresponding to the target audio and video type information of the video networking terminal, and the target audio and video of the conference scheduling terminal is obtained by synthesizing the target video and the target audio with code rates corresponding to the target audio and video type information of the conference scheduling terminal;
and the transfer server respectively sends the target audio and video to the audio and video address of the video networking terminal and the video address of the conference scheduling terminal, and sends the target audio to the audio address of the conference scheduling terminal so as to be simultaneously played on the video networking terminal and the conference scheduling terminal.
2. The method of claim 1, wherein the transit server comprises a coordination server and a video networking server;
the audio and video calling instruction is sent to the video networking server by the conference scheduling terminal and the video networking terminal, and is forwarded to the coordination server by the video networking server;
the target audio and video is called by the coordination server, parsed, and then sent to the video networking server.
3. The method of claim 1, wherein the step of the transit server parsing the retrieved original audio and video to obtain a target audio and video and a target audio comprises:
the transfer server extracts a target audio and a target video from the target audio and video;
transcoding the target video to obtain target videos with different code rates;
and respectively synthesizing the target videos with different code rates and the target audio to obtain target audios and videos with different code rates.
4. The method of claim 3, wherein the audio/video call instruction further carries target audio/video type information of the conference scheduling terminal and target audio/video type information of the video networking terminal;
the synthesizing the target videos with different code rates and the target audio respectively to obtain the target audios and videos with different code rates comprises the following steps:
the transfer server synthesizes a target video and a target audio with code rates corresponding to the target audio and video type information of the conference scheduling terminal to obtain a target audio and video of the conference scheduling terminal, and synthesizes the target video and the target audio with the code rates corresponding to the target audio and video type information of the video network terminal to obtain a target audio and video of the video network terminal;
the transfer server respectively sends the target audio and video to the audio and video address of the video networking terminal and the video address of the conference scheduling terminal, and the method comprises the following steps:
and the transfer server sends the target audio and video of the video networking terminal to the audio and video address of the video networking terminal and sends the target audio and video of the conference scheduling terminal to the video address of the conference scheduling terminal.
5. A transit server, characterized in that the transit server comprises:
the instruction receiving module is used for receiving an audio and video calling instruction by the transfer server, wherein the audio and video calling instruction carries a video address and an audio address of a conference scheduling terminal and an audio and video address of a video network terminal;
the calling module is used for the transfer server to call the original audio and video according to the audio and video calling instruction;
the analysis module is used for analyzing the acquired original audio and video by the transit server to obtain a target audio and video and a target audio; the target audio and video comprises a target audio and video of the video networking terminal and a target audio and video of the conference scheduling terminal; the target audio and video of the video networking terminal is obtained by synthesizing the target video and the target audio with code rates corresponding to the target audio and video type information of the video networking terminal, and the target audio and video of the conference scheduling terminal is obtained by synthesizing the target video and the target audio with code rates corresponding to the target audio and video type information of the conference scheduling terminal;
and the transmitting module is used for respectively transmitting the target audio and video to the audio and video address of the video networking terminal and the video address of the conference scheduling terminal by the transfer server, and transmitting the target audio to the audio address of the conference scheduling terminal so as to be simultaneously played on the video networking terminal and the conference scheduling terminal.
6. The transit server of claim 5, wherein the transit server comprises a coordination server and a video networking server;
the audio and video calling instruction is sent to the video networking server by the conference scheduling terminal and the video networking terminal, and is forwarded to the coordination server by the video networking server;
the target audio and video is called by the coordination server, parsed, and then sent to the video networking server.
7. The transit server of claim 5, wherein the parsing module is specifically configured to:
the transfer server extracts a target audio and a target video from the target audio and video;
transcoding the target video to obtain target videos with different code rates;
and respectively synthesizing the target videos with different code rates and the target audio to obtain target audios and videos with different code rates.
8. The transfer server according to claim 7, wherein the audio/video call instruction further carries target audio/video type information of the conference scheduling terminal and target audio/video type information of the video networking terminal;
the synthesizing the target videos with different code rates and the target audio respectively to obtain the target audios and videos with different code rates comprises the following steps:
the transfer server synthesizes a target video and a target audio with code rates corresponding to the target audio and video type information of the conference scheduling terminal to obtain a target audio and video of the conference scheduling terminal, and synthesizes the target video and the target audio with the code rates corresponding to the target audio and video type information of the video network terminal to obtain a target audio and video of the video network terminal;
the transfer server respectively sends the target audio and video to the audio and video address of the video networking terminal and the video address of the conference scheduling terminal, and the method comprises the following steps:
and the transfer server sends the target audio and video of the video networking terminal to the audio and video address of the video networking terminal and sends the target audio and video of the conference scheduling terminal to the video address of the conference scheduling terminal.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the method of any one of claims 1 to 4.
CN201910257607.0A 2019-04-01 2019-04-01 Video network-based multi-party audio and video playing method and transfer server Active CN110149305B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910257607.0A CN110149305B (en) 2019-04-01 2019-04-01 Video network-based multi-party audio and video playing method and transfer server

Publications (2)

Publication Number Publication Date
CN110149305A CN110149305A (en) 2019-08-20
CN110149305B true CN110149305B (en) 2021-10-19

Family

ID=67589300

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910257607.0A Active CN110149305B (en) 2019-04-01 2019-04-01 Video network-based multi-party audio and video playing method and transfer server

Country Status (1)

Country Link
CN (1) CN110149305B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111083428A (en) * 2019-12-27 2020-04-28 北京东土科技股份有限公司 Audio and video data processing method and device, computer equipment and storage medium
CN111131760B (en) * 2019-12-31 2022-12-13 视联动力信息技术股份有限公司 Video recording method and device
CN112787830B (en) * 2020-12-24 2023-10-13 世邦通信股份有限公司 High-compatibility broadcasting system based on audio conference architecture

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104754366A (en) * 2015-03-03 2015-07-01 腾讯科技(深圳)有限公司 Audio and video file live broadcasting method, device and system
US9948891B1 (en) * 2017-03-29 2018-04-17 Ziiproow, Inc. Conducting an audio or video conference call
CN108307212A (en) * 2017-01-11 2018-07-20 北京视联动力国际信息技术有限公司 A kind of file order method and device
CN108810444A (en) * 2017-07-31 2018-11-13 北京视联动力国际信息技术有限公司 Processing method, conference dispatching end and the association of video conference turn server
CN108881796A (en) * 2017-12-27 2018-11-23 北京视联动力国际信息技术有限公司 A kind of video data handling procedure and turn server depending on networking association


Also Published As

Publication number Publication date
CN110149305A (en) 2019-08-20

Similar Documents

Publication Publication Date Title
CN108574688B (en) Method and device for displaying participant information
CN110166728B (en) Video networking conference opening method and device
CN109640028B (en) Method and device for joining multiple video networking terminals and multiple Internet terminals into a conference
CN109803111B (en) Method and device for watching video conference after meeting
CN108737768B (en) Monitoring method and monitoring device based on monitoring system
CN108881815B (en) Video data transmission method and device
CN109120879B (en) Video conference processing method and system
CN110049271B (en) Video networking conference information display method and device
CN110049273B (en) Video networking-based conference recording method and transfer server
CN109788235B (en) Video networking-based conference recording information processing method and system
CN109218306B (en) Audio and video data stream processing method and system
CN109040656B (en) Video conference processing method and system
CN110149305B (en) Video network-based multi-party audio and video playing method and transfer server
CN108965930B (en) Video data processing method and device
CN108630215B (en) Echo suppression method and device based on video networking
CN108574816B (en) Video networking terminal and communication method and device based on video networking terminal
CN110049268B (en) Video telephone connection method and device
CN110113564B (en) Data acquisition method and video networking system
CN109743284B (en) Video processing method and system based on video network
CN109286775B (en) Multi-person conference control method and system
CN111327868A (en) Method, terminal, server, device and medium for setting conference speaking party role
CN110769179B (en) Audio and video data stream processing method and system
CN109005378B (en) Video conference processing method and system
CN111131760A (en) Video recording method and device
CN110022286B (en) Method and device for requesting multimedia program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant