WO2012041117A1 - Method, system and related device for centralized monitoring of video conference terminal - Google Patents

Method, system and related device for centralized monitoring of video conference terminal

Info

Publication number
WO2012041117A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
video
monitoring
module
signal
Application number
PCT/CN2011/077639
Other languages
French (fr)
Chinese (zh)
Inventor
吴永明
薛尧舜
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司
Publication of WO2012041117A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/15: Conference systems
    • H04N7/152: Multipoint control units therefor

Definitions

  • The present invention relates to the field of video conferencing services, and in particular to a method and system for centralized monitoring of video conference terminals, and to related devices.
  • Background
  • A video conferencing system is a multimedia communication system that supports audio and video transmission. It enables individuals or groups in two or more different places to transmit voice, video, and file data to each other through a transmission network and multimedia devices, achieving real-time, interactive communication.
  • A video conference system includes video conference terminals, a multipoint control unit (MCU), and a transmission network. A video conference terminal (hereinafter referred to as a terminal) includes a codec that performs compression encoding and decoding of audio, video, and other media. The terminal can be connected to peripherals such as a microphone, camera, display, and speakers to handle audio and video input and output. The user can conveniently monitor and remotely operate the terminal through a PC console, and can input instructions and information to the terminal through the terminal's human-machine interaction input interface; the terminal provides terminal status information to the user through its human-machine interaction output interface.
  • The MCU can provide users with connection services for group conferences and multi-group conferences.
  • A terminal participating in a multipoint conference establishes a connection with the MCU through the transmission network, and the MCU exchanges and mixes the audio and video streams among the terminals. For the audio media stream, the MCU sends each terminal a media stream that has undergone multi-party mixing.
  • For the video media stream, the MCU sends each terminal the single-picture video stream of another terminal. If the MCU supports the multi-picture function, it can also combine the video from multiple terminals into one multi-picture image and send it to the terminal.
  • FIG. 1 is a schematic diagram of a conventional video conference system. FIG. 1 includes two locations, each with a plurality of terminals installed. The PC console monitors the audio and video signals of each terminal separately through the IP network, but it can manage only a single terminal at a time and cannot manage multiple terminals simultaneously.
  • However, as video conferencing applications become more popular, the number of terminals deployed in enterprises or organizations keeps growing, and their physical locations are increasingly dispersed, for example on different floors of the same building or in different buildings. If conference management personnel can monitor the working status of only one terminal at a time, it is difficult to support and maintain multiple terminals, and the operating costs of the enterprise increase.
  • Summary of the invention
  • The main purpose of the present invention is to provide a method and system for centralized monitoring of video conference terminals, and related devices, so that conference management personnel can simultaneously monitor the working status of multiple terminals, control the terminals remotely with ease, and reduce the enterprise's operating costs.
  • A method for centralized monitoring of a video conference terminal comprises: a monitoring center server and a terminal establish an audio and video media transmission channel; the terminal compression-encodes and encapsulates the collected audio and video signals and sends them to the monitoring center server; the monitoring center server de-encapsulates, decodes, and synthesizes the received audio and video signals, and outputs the audio and video signals through peripherals.
  • Establishing the audio and video media transmission channel comprises: determining the media transmission channel parameters by pre-agreement or by dynamic signaling negotiation, and establishing one or more audio and video media transmission channels.
  • The encoding format and encapsulation format used by the terminal for the audio and video signals are determined as follows: the terminal and the monitoring center server determine the encoding format and encapsulation format that both parties wish to use, either by exchanging information in the signaling channel or by pre-configuration.
  • The method further includes transforming the format of the signal during compression encoding. For the video signal, the transformation is: reducing the video frame resolution and reducing the frame rate. For the audio signal, the transformation is: reducing the sampling rate and reducing the quantization precision.
  • The monitoring center server synthesizes the received audio signals by mixing a plurality of audio signals, or by splicing them according to a certain rule, and playing the result through an audio device.
  • The monitoring center server synthesizes the received video signals by combining a plurality of video signals into one multi-picture signal and playing it through a display device, or by outputting each video signal to a display device separately.
  • The present invention also provides a system for centralized monitoring of a video conference terminal, including one or more terminals and a monitoring center server.
  • The terminal is configured to collect audio and video signals, perform compression encoding and encapsulation on the collected signals, and send them to the monitoring center server.
  • The monitoring center server is configured to receive the audio and video signals sent by the terminal, de-encapsulate, decode, and synthesize them, and output the audio and video signals through peripherals.
  • The terminal includes a monitoring front-end module, which includes a monitoring audio encoding module, a monitoring video encoding module, and a monitoring network transceiver module.
  • The monitoring audio encoding module is configured to compression-encode the audio signal of the terminal, encapsulate it, and send it to the monitoring network transceiver module.
  • The monitoring video encoding module is configured to compression-encode the video signal of the terminal, encapsulate it, and send it to the monitoring network transceiver module.
  • The monitoring network transceiver module is configured to send the processed audio signal and video signal, respectively, to the monitoring center server.
  • The monitoring center server includes a network transceiver module, an audio decoding module, an audio output module, a video decoding module, and a video output module.
  • The network transceiver module is configured to receive the audio and video signals sent by the terminal, send the audio signal to the audio decoding module, and send the video signal to the video decoding module.
  • The audio decoding module is configured to de-encapsulate and decode the audio signal and send it to the audio output module.
  • The audio output module is configured to synthesize the decoded audio signals and play them through a peripheral.
  • The video decoding module is configured to de-encapsulate and decode the video signal and send it to the video output module.
  • The video output module is configured to synthesize the decoded video signals and play them through a peripheral.
  • The monitoring front-end module and the monitoring center server each include a monitoring signaling processing module, configured to determine the media transmission channel parameters of both parties by pre-agreement or dynamic signaling negotiation, and to establish one or more audio and video media transmission channels.
  • The monitoring signaling processing module is further configured to determine the encoding format and encapsulation format that the monitoring front-end module and the monitoring center server wish to use, either by exchanging information in the signaling channel or by pre-configuration.
  • The monitoring center server includes one or more monitoring processing boards, each configured to de-encapsulate, decode, and synthesize one channel of audio and/or one channel of video signals, and to play the processed audio and/or video signals through the corresponding peripheral.
  • The monitoring processing board is further configured to determine the media transmission channel parameters of the monitoring front-end module and the monitoring processing board by pre-agreement or dynamic signaling negotiation, and to establish an audio and video media transmission channel; and to determine the encoding format and encapsulation format that the monitoring front-end module and the monitoring processing board wish to use, either by exchanging information in the signaling channel or by pre-configuration.
  • The present invention also provides a video conference terminal including a monitoring front-end module, which includes a monitoring audio encoding module, a monitoring video encoding module, a monitoring network transceiver module, and a monitoring signaling processing module.
  • The monitoring audio encoding module is configured to compression-encode the audio signal of the terminal, encapsulate it, and send it to the monitoring network transceiver module.
  • The monitoring video encoding module is configured to compression-encode the video signal of the terminal, encapsulate it, and send it to the monitoring network transceiver module.
  • The monitoring network transceiver module is configured to send the processed audio signal and video signal, respectively, to the monitoring center server.
  • The monitoring signaling processing module is configured to determine the media transmission channel parameters of the terminal and the monitoring center server by pre-agreement or dynamic signaling negotiation, and to establish one or more audio and video media transmission channels.
  • The present invention also provides a monitoring center server, including a network transceiver module, an audio decoding module, an audio output module, a video decoding module, a video output module, and a monitoring signaling processing module.
  • The network transceiver module is configured to receive the audio and video signals sent by the terminal, send the audio signal to the audio decoding module, and send the video signal to the video decoding module.
  • The audio decoding module is configured to de-encapsulate and decode the audio signal and send it to the audio output module.
  • The audio output module is configured to synthesize the decoded audio signals and play them through a peripheral.
  • The video decoding module is configured to de-encapsulate and decode the video signal and send it to the video output module.
  • The monitoring signaling processing module is configured to determine the media transmission channel parameters of the terminal and the monitoring center server by pre-agreement or dynamic signaling negotiation, and to establish one or more audio and video media transmission channels.
  • The present invention provides a method and system for centralized monitoring of video conference terminals, and related devices.
  • The monitoring center server establishes an audio and video media transmission channel with the terminal; the terminal compression-encodes and encapsulates the collected audio and video signals and sends them to the monitoring center server; the monitoring center server de-encapsulates, decodes, and synthesizes the received audio and video signals, and outputs the audio and video signals through peripherals.
  • The solution of the present invention enables video conference management personnel to monitor the working status of multiple terminals simultaneously, makes remote control of the terminals convenient, and allows support measures to be taken promptly for a faulty terminal, improving the working efficiency of the video conference management personnel and the support capability of the video conference service, and reducing the operating costs of the enterprise.
  • FIG. 1 is a schematic structural diagram of a prior art video conference system.
  • FIG. 2 is a schematic structural diagram of a system for centralized monitoring of a video conference terminal according to the present invention.
  • FIG. 3 is a schematic flowchart of a method for centralized monitoring of a video conference terminal according to the present invention.
  • FIG. 4 is a schematic diagram showing the internal structure of a video conference terminal according to the present invention.
  • FIG. 5 is a schematic diagram showing the internal structure of a monitoring center server according to the present invention.
  • FIG. 6 is a schematic structural diagram of a system for centralized monitoring of a video conference terminal when the monitoring center server uses a chassis-type structure.
  • Detailed description
  • The basic idea of the present invention is: the monitoring center server and the terminal establish an audio and video media transmission channel; the terminal compression-encodes and encapsulates the collected audio and video signals and sends them to the monitoring center server; the monitoring center server de-encapsulates, decodes, and synthesizes the received audio and video signals, and outputs the audio and video signals through peripherals.
  • The present invention provides a system for centralized monitoring of a video conference terminal.
  • The system includes one or more terminals and one monitoring center server; each terminal communicates with the monitoring center server through an IP network.
  • The method for centralized monitoring of a video conference terminal includes the following steps:
  • Step 301: The monitoring center server establishes an audio and video media transmission channel with the terminal.
  • The monitoring center server and the terminal may establish one or more audio and video media transmission channels, with the media transmission channel parameters either agreed in advance or negotiated through dynamic signaling.
  • The signaling may use, but is not limited to, the Real Time Streaming Protocol (RTSP).
  • The media transmission channel may use, but is not limited to, the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP).
  • For example, with RTSP signaling, the channel for the local audio media stream is established as follows:
  • Step 1a: The monitoring center server initiates a TCP connection request to the terminal; this connection is used to carry the RTSP messages.
  • Step 1b: The monitoring center server sends an RTSP SETUP message to inform the terminal of the transport layer protocol and address information for the channel to be established by the monitoring center server.
  • Step 1c: The terminal sends a SETUP RESPONSE message, completing the channel establishment negotiation between the two parties.
  • Step 1d: The monitoring center server sends an RTSP PLAY message, requesting the terminal to start sending the local audio media stream.
  • Step 1e: After receiving the PLAY message, the terminal sends the local audio media stream to the monitoring center server.
  • Other media stream channels can be established in a similar manner, for example for the remote audio media stream, the local video media stream, and the remote video media stream.
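  • The RTSP exchange in the sub-steps above can be illustrated with a minimal sketch of the message framing. This is not the patent's implementation: the stream URL, the port numbers, the session identifier, and the helper names `build_rtsp_request` and `parse_rtsp_response` are assumptions made for the example, and the terminal's reply is hard-coded rather than read from a TCP socket.

```python
def build_rtsp_request(method, url, cseq, headers=None):
    """Return an RTSP/1.0 request as bytes (text framing with CRLF line ends)."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_rtsp_response(raw):
    """Split an RTSP response into (status_code, lower-cased headers dict)."""
    head = raw.decode("ascii").split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


# Hypothetical URL under which the terminal exposes its local audio stream.
URL = "rtsp://terminal.example/local-audio"

# SETUP: the server announces the transport protocol and the ports it listens on.
setup = build_rtsp_request(
    "SETUP", URL, 1, {"Transport": "RTP/AVP;unicast;client_port=5004-5005"}
)

# SETUP response as the terminal might send it (made up for this demo).
reply = b"RTSP/1.0 200 OK\r\nCSeq: 1\r\nSession: 12345678\r\n\r\n"
status, reply_headers = parse_rtsp_response(reply)

# PLAY: the server asks the terminal to start sending, quoting the session id.
play = build_rtsp_request("PLAY", URL, 2, {"Session": reply_headers["session"]})
```

  • In a real deployment the requests would be written to, and the responses read from, the TCP connection opened in the first sub-step; the same pattern would be repeated for each additional media stream.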
  • Step 302: The terminal performs compression encoding and encapsulation on the collected audio and video signals and sends them to the monitoring center server.
  • The terminal first collects the audio and video signals and compression-encodes them.
  • The audio and video signals collected by the terminal include the local audio signal, the far-end audio signal, the local video signal, and the far-end video signal.
  • The terminal and the monitoring center server can determine parameters such as the encoding format and encapsulation format that both parties wish to use, either by exchanging information in the signaling channel or by pre-configuration.
  • When the terminal compresses the audio signal, G.711, G.722, or other encoding algorithms may be used; when the terminal compresses the video signal, H.263, H.264, or other encoding algorithms may be used.
  • If the format of the collected audio and video signals does not meet the requirements of the specified encoding format, the signal format must be transformed during compression encoding. For example, a video signal may need a lower frame resolution or a lower frame rate, and an audio signal may need a lower sampling rate or a lower quantization precision. In general, to reduce bandwidth consumption and computation, the video signal can use the Common Intermediate Format (CIF) frame size and a frame rate of about 15 frames per second.
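  • The format transformations mentioned above (lower frame rate, lower sampling rate, lower quantization precision) can be sketched as follows. This is an illustration under simplifying assumptions, not the terminal's actual processing: frames and samples are plain Python lists, the audio decimation applies no anti-alias filter, and the function names are hypothetical. Resolution scaling to CIF (352x288) is omitted because it needs an image library.

```python
CIF_SIZE = (352, 288)  # width x height of the Common Intermediate Format


def decimate_frames(frames, src_fps, dst_fps):
    """Drop frames so the output rate approximates dst_fps (e.g. 30 -> 15)."""
    if dst_fps >= src_fps:
        return list(frames)
    kept, acc = [], 0
    for frame in frames:
        acc += dst_fps
        if acc >= src_fps:      # enough "credit" accumulated: keep this frame
            acc -= src_fps
            kept.append(frame)
    return kept


def downsample_audio(samples, factor):
    """Reduce the sampling rate by keeping every `factor`-th sample."""
    return samples[::factor]


def requantize_16_to_8(samples):
    """Reduce signed 16-bit samples to signed 8-bit by dropping the low byte."""
    return [s >> 8 for s in samples]


one_second = list(range(30))                      # 30 frames captured at 30 fps
monitor_frames = decimate_frames(one_second, 30, 15)
monitor_audio = requantize_16_to_8(downsample_audio([32767, 0, -32768, 0], 2))
```

  • A production encoder would normally low-pass filter before decimating audio and would scale frames with proper interpolation; the sketch only shows where the reductions in data volume come from.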
  • The terminal encapsulates the compressed audio and video signals and sends them to the monitoring center server through the media transmission channel. The encapsulation format may use, but is not limited to, the Real-time Transport Protocol (RTP).
  • Each media signal may be encapsulated into its own RTP stream, or multiple media signals may be encapsulated into one RTP stream, for example: the local audio and local video signals are encapsulated into one RTP stream, and the far-end audio and far-end video signals are encapsulated into another RTP stream.
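  • As an illustration of RTP encapsulation and the matching de-encapsulation on the server side, the sketch below packs and unpacks the 12-byte fixed RTP header. It assumes version 2 with no padding, header extension, or CSRC entries; the function names, the SSRC value, and the 160-byte payload are made up for the example. Payload type 0 is the static RTP payload type for G.711 mu-law audio.

```python
import struct

RTP_HEADER = struct.Struct("!BBHII")  # 12 bytes, network byte order


def build_rtp_packet(payload, seq, timestamp, ssrc, payload_type, marker=False):
    """Prepend a fixed RTP header (V=2, P=0, X=0, CC=0) to an encoded payload."""
    byte0 = 2 << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = RTP_HEADER.pack(
        byte0, byte1, seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF
    )
    return header + payload


def parse_rtp_packet(packet):
    """De-encapsulate: split a packet into header fields and the payload."""
    byte0, byte1, seq, timestamp, ssrc = RTP_HEADER.unpack(packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[12:],
    }


# 20 ms of G.711 audio at 8 kHz is 160 bytes; the content here is dummy data.
audio_frame = bytes(160)
packet = build_rtp_packet(audio_frame, seq=7, timestamp=1120,
                          ssrc=0x1234ABCD, payload_type=0)
fields = parse_rtp_packet(packet)
```

  • Carrying several media signals in one stream, as the text allows, would additionally require some way to tell them apart, for example distinct payload types; that choice is left open here.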
  • Step 303: The monitoring center server de-encapsulates, decodes, and synthesizes the received audio and video signals, and outputs the audio and video signals through peripherals.
  • The monitoring center server de-encapsulates the received audio and video signals, decodes and synthesizes them, and then outputs the audio and video signals through peripherals such as a display and a speaker. The audio signals may be synthesized by mixing them, or by splicing them according to a certain rule, and then played through an audio device. The video signals may be synthesized into one multi-picture signal and played through a display device, or each video signal may be output to a display device separately.
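  • The two synthesis options named above, audio mixing and multi-picture composition, can be sketched as follows. This is one plausible reading and not the patent's algorithm: mixing is done by summing signed 16-bit samples with clipping, and the multi-picture layout is a near-square grid; the function names and the 704x576 canvas are assumptions.

```python
import math


def mix_pcm16(streams):
    """Mix equal-length lists of signed 16-bit samples by summing and clipping."""
    mixed = []
    for samples in zip(*streams):
        total = sum(samples)
        mixed.append(max(-32768, min(32767, total)))
    return mixed


def grid_layout(n, width, height):
    """Return one (x, y, w, h) tile per picture on a near-square grid."""
    if n <= 0:
        return []
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    tile_w, tile_h = width // cols, height // rows
    return [
        ((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
        for i in range(n)
    ]


# Two decoded audio streams mixed into one output stream.
mixed = mix_pcm16([[1000, 30000, -20000], [2000, 30000, -20000]])

# Four CIF pictures tiled onto a 704x576 canvas (2 x 2 grid).
tiles = grid_layout(4, 704, 576)
```

  • With four terminals each sending CIF video, every tile is exactly 352x288, so the decoded pictures can be copied into the canvas without rescaling.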
  • The system for centralized monitoring of a video conference terminal includes one or more terminals and one monitoring center server.
  • The terminal is configured to collect audio and video signals, perform compression encoding and encapsulation on the collected signals, and send them to the monitoring center server.
  • The monitoring center server is configured to receive the audio and video signals sent by the terminal, de-encapsulate, decode, and synthesize them, and output the audio and video signals through peripherals.
  • The terminal includes an audio input module, an audio output module, an audio encoding module, an audio decoding module, a video input module, a video output module, a video encoding module, a video decoding module, and a signal transceiver module, and further includes a monitoring front-end module.
  • The monitoring front-end module is configured to perform compression encoding and encapsulation on the audio signal and the video signal of the terminal, respectively, and to send the processed audio and video signals to the monitoring center server.
  • The monitoring front-end module includes a monitoring audio encoding module, a monitoring video encoding module, and a monitoring network transceiver module.
  • The monitoring audio encoding module is configured to compression-encode the audio signal of the terminal, encapsulate it, and send it to the monitoring network transceiver module.
  • The monitoring video encoding module is configured to compression-encode the video signal of the terminal, encapsulate it, and send it to the monitoring network transceiver module.
  • The monitoring network transceiver module is configured to send the processed audio signal and video signal, respectively, to the network transceiver module of the monitoring center server.
  • The monitoring center server includes a network transceiver module, an audio decoding module, an audio output module, a video decoding module, and a video output module.
  • The network transceiver module is configured to receive the audio and video signals sent by the monitoring front-end module, send the audio signal to the audio decoding module, and send the video signal to the video decoding module.
  • The audio decoding module is configured to de-encapsulate and decode the audio signal and send it to the audio output module.
  • The audio output module is configured to synthesize the decoded audio signals and play them through a peripheral.
  • The video decoding module is configured to de-encapsulate and decode the video signal and send it to the video output module.
  • The video output module is configured to synthesize the decoded video signals and play them through a peripheral.
  • The monitoring front-end module and the monitoring center server each include a monitoring signaling processing module. The monitoring signaling processing module determines the media transmission channel parameters by pre-agreement or dynamic signaling negotiation between the monitoring front-end module and the monitoring center server, and establishes one or more audio and video media transmission channels. The signaling may use, but is not limited to, RTSP; the media transmission channel may use, but is not limited to, a TCP or UDP channel.
  • The monitoring signaling processing module is further configured to determine parameters such as the encoding format and encapsulation format that the monitoring front-end module and the monitoring center server wish to use, either by exchanging information in the signaling channel or by pre-configuration.
  • When the monitoring front-end module compression-encodes the audio signal, G.711, G.722, or other encoding algorithms may be used; when it compression-encodes the video signal, H.263, H.264, or other encoding algorithms may be used.
  • The monitoring center server includes one or more monitoring processing boards, as shown in FIG. 6.
  • The monitoring processing board is configured to de-encapsulate, decode, and synthesize one channel of audio and/or one channel of video signals, and to play the processed audio and/or video signals through the corresponding peripherals.
  • The monitoring processing board is further configured to determine the media transmission channel parameters of the monitoring front-end module and the monitoring processing board by pre-agreement or dynamic signaling negotiation, and to establish an audio and video media transmission channel; and to determine the encoding format and encapsulation format that the monitoring front-end module and the monitoring processing board wish to use, either by exchanging information in the signaling channel or by pre-configuration.
  • The signal transmission relationship between the modules in the terminal is as follows.
  • In the sending direction: the audio input module 1 collects the local audio, such as the input of a microphone or other peripheral, and duplicates the collected audio signal. One copy is sent to the audio encoding module 2 for compression encoding and then sent to the opposite end by the signal transceiver module 3. The other copy is sent to the monitoring front-end module, where the monitoring audio encoding module 10 compression-encodes the input audio and sends it to the monitoring center server through the monitoring network transceiver module 12.
  • The video input module 6 collects the local video, such as the input of a camera or other peripheral, and duplicates the collected video signal. One copy is sent to the video encoding module 7 for compression encoding and then sent to the opposite end by the signal transceiver module 3. The other copy is sent to the monitoring front-end module, where the monitoring video encoding module 11 compression-encodes and encapsulates the input video and sends it to the monitoring center server through the monitoring network transceiver module 12.
  • In the receiving direction: the signal transceiver module 3 receives the audio signal of the opposite end, removes the network transport layer encapsulation, and sends the valid signal to the audio decoding module 5; the audio decoding module 5 sends the decoded signal to the audio output module 4. The audio output module 4 duplicates the audio signal: one copy is output to a peripheral such as a speaker for playback, and the other is sent to the monitoring front-end module, where the monitoring audio encoding module 10 compression-encodes the input audio and sends it to the monitoring center server through the monitoring network transceiver module 12.
  • The signal transceiver module 3 receives the video signal of the opposite end, removes the network transport layer encapsulation, and sends the valid signal to the video decoding module 9; the video decoding module 9 sends the decoded data to the video output module 8. The video output module 8 duplicates the data: one copy is output to a peripheral such as a display for playback, and the other is sent to the monitoring front-end module, where the monitoring video encoding module 11 compression-encodes and encapsulates the input video and sends it to the monitoring center server through the monitoring network transceiver module 12.
  • In the monitoring center server, the network transceiver module 1 receives the audio and video signals sent by the monitoring front-end module of each terminal, transmits the audio signals to the audio decoding module 2, and transmits the video signals to the video decoding module 3.
  • The audio decoding module 2 de-encapsulates and decodes the audio signals and transmits them to the audio output module 4. The audio output module 4 synthesizes the audio signals from the plurality of monitoring front-end modules and plays them through a peripheral such as a speaker. The audio synthesis may be: mixing the local audio and the far-end audio sent by one monitoring front-end module, or selecting audio signal segments from each monitoring front-end module in round-robin fashion and splicing them.
  • The video decoding module 3 de-encapsulates and decodes the video signals and transmits them to the video output module 5. The video output module 5 synthesizes the video signals from the plurality of monitoring front-end modules. The video synthesis may be: combining the local video and the far-end video sent by the multiple monitoring front-end modules into one multi-picture video and playing it through one display device, or outputting each video signal to a display device separately.
  • The monitoring center server of the present invention can also be implemented as a hardware device with a chassis-type structure, as shown in FIG. 6.
  • In the structure shown in FIG. 6, one or more monitoring processing boards are inserted in the monitoring center server, and each monitoring processing board is connected to one terminal and to one peripheral such as a monitor. Each monitoring processing board is responsible for de-encapsulating, decoding, and synthesizing one channel of audio and/or one channel of video signals, and for playing the processed audio and/or video signals through the corresponding peripheral. Each monitoring processing board can use the structure shown in FIG. 5, and the functions of its modules are not described again. Within the monitoring center server, the power supply status of each monitoring processing board can be configured uniformly or separately.
  • The invention provides a method and system for centralized monitoring of video conference terminals, and related devices. The basic idea is: the monitoring center server and the terminal establish an audio and video media transmission channel; the terminal compression-encodes and encapsulates the collected audio and video signals and sends them to the monitoring center server; the monitoring center server de-encapsulates, decodes, and synthesizes the received audio and video signals, and outputs the audio and video signals through peripherals.
  • The solution of the present invention enables video conference management personnel to monitor the working status of multiple terminals at the same time, makes remote control of the terminals convenient, and allows support measures to be taken promptly for a faulty terminal, thereby improving the working efficiency of the video conference management personnel and the support capability of the video conference service, and reducing the operating costs of the enterprise.

Abstract

Disclosed are a method, system and related device for centralized monitoring of video conference terminals. The method of the present invention comprises: an audio-video media transmission channel is established between a monitoring center server and a terminal; the terminal compresses and encapsulates the collected audio-video signal and sends the signal to the monitoring center server; the monitoring center server de-encapsulates, decompresses and synthesizes the received audio-video signal, and outputs the audio-video signal via peripheral equipment. The embodiment of the present invention enables the video conference administrator to monitor the working status of multiple terminals simultaneously, control the terminals remotely with ease, and promptly take support measures for faulty terminals, thus improving the work efficiency of the video conference administrator and the support capability of video conference operations and reducing operating cost.

Description

Method and system for centralized monitoring of video conference terminals, and related devices

Technical Field

The present invention relates to the field of video conferencing service applications, and in particular to a method and system for centralized monitoring of video conference terminals, and related devices.

Background
A video conferencing system is a multimedia communication system that supports audio and video transmission. It lets individuals or groups in two or more different places pass voice, images, and documents to one another over a transmission network and multimedia devices, achieving real-time, interactive communication.
A video conferencing system includes video conference terminals, a multipoint control unit (MCU), and a transmission network. A video conference terminal (hereinafter, terminal) includes a codec that performs compression encoding and decoding of audio, video, and other media. The terminal can be connected to peripherals such as a microphone, camera, display, and loudspeaker for audio and video input and output. A user can conveniently monitor and remotely operate the terminal from a PC console and can enter instructions and information through the terminal's human-machine interaction input interface; the terminal provides terminal status information to the user through its human-machine interaction output interface.
The MCU provides connection services for group conferences and multi-group conferences. Terminals taking part in a multipoint conference connect to the MCU through the transmission network, and the MCU switches and mixes the audio and video streams among the terminals. For audio, the MCU sends each terminal a media stream produced by multi-party mixing. For video, the MCU sends each terminal the single-picture video stream of another terminal; if the MCU supports the multi-picture function, it can also combine the video from several terminals into one multi-picture image and send that to the terminal.
In the prior art, conference management personnel can monitor only one terminal at a time. FIG. 1 is a schematic diagram of a conventional video conferencing system. It shows two sites, each with several terminals installed. The PC console monitors the audio and video signals of each terminal separately over the IP network, but at any given moment it can manage only a single terminal and cannot manage multiple terminals simultaneously. As video conferencing becomes more widespread, however, enterprises and organizations deploy more and more terminals, and these terminals are increasingly dispersed physically, for example across different floors of one building or across different buildings. If conference management personnel can monitor the working status of only one terminal at a time, supporting and maintaining multiple terminals becomes difficult and the enterprise's operating costs rise.

Summary of the Invention
In view of this, the main purpose of the present invention is to provide a method and system for centralized monitoring of video conference terminals, and related devices, so that conference management personnel can monitor the working status of multiple terminals simultaneously, remote control of the terminals is made easier, and the enterprise's operating costs are reduced.
To solve the above technical problem, the technical solution of the present invention is implemented as follows:
A method for centralized monitoring of video conference terminals includes: a monitoring center server and a terminal establish an audio and video media transmission channel; the terminal compresses, encodes, and encapsulates the collected audio and video signals and sends them to the monitoring center server; the monitoring center server de-encapsulates, decodes, and synthesizes the received audio and video signals and outputs them through peripherals.
In the above solution, the audio and video media transmission channel is established by determining the media transmission channel parameters through prior agreement or dynamic signaling negotiation, and then establishing one or more audio and video media transmission channels.
In the above solution, the encoding format and encapsulation format of the audio and video signals in the terminal are determined as follows: the terminal and the monitoring center server determine the encoding format and encapsulation format that both sides wish to use, either by exchanging information in the signaling channel or by pre-configuration.
In the above solution, when the format of the collected audio and video signals does not meet the requirements of the specified encoding format, the method further includes converting the signal format during compression encoding. For a video signal, the conversion consists of reducing the video frame resolution and reducing the frame rate; for an audio signal, it consists of reducing the sampling rate and reducing the quantization precision.
In the above solution, the monitoring center server synthesizes the received audio signals by mixing multiple audio signals, or by splicing them according to a given rule, and playing the result through audio equipment. The monitoring center server synthesizes the received video signals by combining multiple video signals into one multi-picture signal and playing it on one display device, or by outputting each video signal to its own display device.
The present invention also provides a system for centralized monitoring of video conference terminals, including one or more terminals and a monitoring center server, wherein:
the terminal is configured to collect audio and video signals, compress, encode, and encapsulate the collected signals, and send them to the monitoring center server;
the monitoring center server is configured to receive the audio and video signals sent by the terminal, de-encapsulate, decode, and synthesize them, and output audio and video signals through peripherals.
In the above solution, the terminal includes a monitoring front-end module, which in turn includes a monitoring audio encoding module, a monitoring video encoding module, and a monitoring network transceiver module, wherein:
the monitoring audio encoding module is configured to compress and encode the terminal's audio signal, encapsulate it, and send it to the monitoring network transceiver module;
the monitoring video encoding module is configured to compress and encode the terminal's video signal, encapsulate it, and send it to the monitoring network transceiver module;
the monitoring network transceiver module is configured to send the processed audio signal and video signal, respectively, to the monitoring center server.
In the above solution, the monitoring center server includes a network transceiver module, an audio decoding module, an audio output module, a video decoding module, and a video output module, wherein:
the network transceiver module is configured to receive the audio and video signals sent by the terminal, and to send the audio signal to the audio decoding module and the video signal to the video decoding module;
the audio decoding module is configured to de-encapsulate and decode the audio signal and send it to the audio output module;
the audio output module is configured to synthesize the decoded audio signals and play them through a peripheral;
the video decoding module is configured to de-encapsulate and decode the video signal and send it to the video output module;
the video output module is configured to synthesize the decoded video signals and play them through a peripheral.
In the above solution, the monitoring front-end module and the monitoring center server each further include a monitoring signaling processing module, configured to determine the media transmission channel parameters of both sides through prior agreement or dynamic signaling negotiation and to establish one or more audio and video media transmission channels.
In the above solution, the monitoring signaling processing module is further configured to determine the encoding format and encapsulation format that the monitoring front-end module and the monitoring center server wish to use, either by exchanging information in the signaling channel or by pre-configuration.
In the above solution, the monitoring center server includes one or more monitoring processing boards, each configured to de-encapsulate, decode, and synthesize one audio signal and/or one video signal and to play the processed audio and/or video signal through the corresponding peripheral.
In the above solution, the monitoring processing board is further configured to determine the media transmission channel parameters of the monitoring front-end module and the monitoring processing board through prior agreement or dynamic signaling negotiation and to establish an audio and video media transmission channel; and to determine the encoding format and encapsulation format that the monitoring front-end module and the monitoring processing board wish to use, either by exchanging information in the signaling channel or by pre-configuration.
The present invention also provides a video conference terminal that includes a monitoring front-end module. The monitoring front-end module includes a monitoring audio encoding module, a monitoring video encoding module, a monitoring network transceiver module, and a monitoring signaling processing module, wherein:
the monitoring audio encoding module is configured to compress and encode the terminal's audio signal, encapsulate it, and send it to the monitoring network transceiver module;
the monitoring video encoding module is configured to compress and encode the terminal's video signal, encapsulate it, and send it to the monitoring network transceiver module;
the monitoring network transceiver module is configured to send the processed audio signal and video signal, respectively, to the monitoring center server;
the monitoring signaling processing module is configured to determine the media transmission channel parameters of the terminal and the monitoring center server through prior agreement or dynamic signaling negotiation and to establish one or more audio and video media transmission channels.
The present invention also provides a monitoring center server, including a network transceiver module, an audio decoding module, an audio output module, a video decoding module, a video output module, and a monitoring signaling processing module, wherein:
the network transceiver module is configured to receive the audio and video signals sent by the terminal, and to send the audio signal to the audio decoding module and the video signal to the video decoding module;
the audio decoding module is configured to de-encapsulate and decode the audio signal and send it to the audio output module;
the audio output module is configured to synthesize the decoded audio signals and play them through a peripheral;
the video decoding module is configured to de-encapsulate and decode the video signal and send it to the video output module;
the video output module is configured to synthesize the decoded video signals and play them through a peripheral;
the monitoring signaling processing module is configured to determine the media transmission channel parameters of the terminal and the monitoring center server through prior agreement or dynamic signaling negotiation and to establish one or more audio and video media transmission channels.
In the method and system for centralized monitoring of video conference terminals, and related devices, provided by the present invention, the monitoring center server and the terminal establish an audio and video media transmission channel; the terminal compresses, encodes, and encapsulates the collected audio and video signals and sends them to the monitoring center server; the monitoring center server de-encapsulates, decodes, and synthesizes the received audio and video signals and outputs them through peripherals. With the solution of the present invention, video conference management personnel can monitor the working status of multiple terminals simultaneously, remote control of the terminals is made easier, and support measures can be taken promptly for faulty terminals. This improves the work efficiency of the management personnel and the support capability of the video conferencing service, and reduces the enterprise's operating costs.

Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a prior-art video conferencing system;
FIG. 2 is a schematic structural diagram of the system for centralized monitoring of video conference terminals according to the present invention;
FIG. 3 is a schematic flowchart of the method for centralized monitoring of video conference terminals according to the present invention;
FIG. 4 is a schematic diagram of the internal structure of the video conference terminal according to the present invention;
FIG. 5 is a schematic diagram of the internal structure of the monitoring center server according to the present invention;
FIG. 6 is a schematic structural diagram of the system for centralized monitoring of video conference terminals when the monitoring center server uses a chassis-type structure.

Detailed Description
The basic idea of the present invention is as follows: the monitoring center server and the terminal establish an audio and video media transmission channel; the terminal compresses, encodes, and encapsulates the collected audio and video signals and sends them to the monitoring center server; the monitoring center server de-encapsulates, decodes, and synthesizes the received audio and video signals and outputs them through peripherals.
The present invention provides a system for centralized monitoring of video conference terminals which, as shown in FIG. 2, includes one or more terminals and one monitoring center server; the terminals and the monitoring center server communicate over an IP network.
Based on the above system, the method for centralized monitoring of video conference terminals provided by the present invention, as shown in FIG. 3, includes the following steps:
Step 301: The monitoring center server and the terminal establish an audio and video media transmission channel.
In this step, the monitoring center server and the terminal may establish one or more audio and video media transmission channels using pre-agreed media transmission channel parameters or dynamic signaling negotiation. The signaling may use, but is not limited to, the Real Time Streaming Protocol (RTSP). The media transmission channel may be, but is not limited to, a Transmission Control Protocol (TCP) channel or a User Datagram Protocol (UDP) channel.
When RTSP is used to establish an audio media transmission channel, the following procedure may be followed:
Step 1a: The monitoring center server initiates a TCP connection request to the terminal; this connection carries the RTSP messages.
Step 1b: The monitoring center server sends an RTSP SETUP message, informing the terminal of the transport layer protocol and address information with which the monitoring center server will set up the channel.
Step 1c: The terminal sends a response (SETUP RESPONSE) message, completing the channel setup negotiation between the two sides.
Step 1d: The monitoring center server sends an RTSP PLAY message, requesting the terminal to start sending the local audio media stream.
Step 1e: After receiving the PLAY message, the terminal sends the local audio media stream to the monitoring center server.
Other media streaming channels can subsequently be established in a similar way, for example for the remote audio media stream, the local video media stream, and the remote video media stream.
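The requests sent by the monitoring center server in steps 1b and 1d can be sketched as plain RTSP/1.0 text messages. This is only an illustration under stated assumptions: the URL `rtsp://terminal.example/local-audio`, the port numbers, the session identifier, and the helper names `rtsp_request` and `build_setup_and_play` are hypothetical and do not come from the patent.

```python
def rtsp_request(method, url, cseq, headers=None):
    """Format one RTSP/1.0 request as text with CRLF line endings."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


def build_setup_and_play(url, rtp_port, session_id):
    """Return the SETUP (step 1b) and PLAY (step 1d) requests.

    url, rtp_port and session_id are illustrative inputs. In a real
    exchange the session identifier is taken from the terminal's
    response to SETUP (step 1c).
    """
    transport = f"RTP/AVP;unicast;client_port={rtp_port}-{rtp_port + 1}"
    setup = rtsp_request("SETUP", url, 1, {"Transport": transport})
    play = rtsp_request("PLAY", url, 2, {"Session": session_id})
    return setup, play


# Hypothetical values for the local audio media stream of one terminal.
setup, play = build_setup_and_play(
    "rtsp://terminal.example/local-audio", 5004, "12345678")
print(setup)
print(play)
```

The same two requests, with a different URL and port pair, would be repeated for each further media stream that the server wants to receive.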
Step 302: The terminal compresses, encodes, and encapsulates the collected audio and video signals and sends them to the monitoring center server.
In this step, the terminal first collects the audio and video signals and compresses and encodes them. The audio and video signals collected by the terminal include the local audio signal, the remote audio signal, the local video signal, and the remote video signal. The terminal and the monitoring center server can determine parameters such as the encoding format and encapsulation format that both sides wish to use, either by exchanging information in the signaling channel or by pre-configuration. Preferably, the terminal may use G.711, G.722, or another encoding algorithm to compress and encode the audio signal, and H.263, H.264, or another encoding algorithm for the video signal. If the format of the collected audio and video signals does not meet the requirements of the specified encoding format, the signal format must be converted during compression encoding: for a video signal, for example, by reducing the frame resolution and the frame rate; for an audio signal, by reducing the sampling rate and the quantization precision. In general, to reduce bandwidth consumption and computation, the video signal can use the Common Intermediate Format (CIF) frame size and a frame rate of about 15 frames per second.
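The format conversions named in this step can be illustrated with a few small helpers. This is a simplified sketch, not the terminal's actual processing: the function names are hypothetical, the downsampling is naive (a real implementation would low-pass filter first), and the 30 fps source rate in the usage lines is an assumed example value.

```python
CIF_SIZE = (352, 288)  # CIF frame size in pixels (width, height)


def frames_to_keep(num_frames, src_fps, dst_fps):
    """Indices of the source frames kept when lowering the frame rate."""
    if dst_fps >= src_fps:
        return list(range(num_frames))
    step = src_fps / dst_fps  # source frames per output frame
    kept, next_time = [], 0.0
    for i in range(num_frames):
        if i >= next_time:
            kept.append(i)
            next_time += step
    return kept


def downsample(samples, factor):
    """Lower the sampling rate by keeping every factor-th sample."""
    return samples[::factor]


def requantize_16_to_8(samples):
    """Lower quantization precision from signed 16-bit to signed 8-bit."""
    return [s >> 8 for s in samples]


# Halving an assumed 30 fps capture to 15 fps keeps every second frame.
print(frames_to_keep(6, 30, 15))                  # [0, 2, 4]
print(downsample([1, 2, 3, 4, 5, 6], 2))          # [1, 3, 5]
print(requantize_16_to_8([32767, -32768, 256]))   # [127, -128, 1]
```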
Next, the terminal encapsulates the compressed and encoded audio and video signals and sends them to the monitoring center server through the media transmission channel. The encapsulation format may be, but is not limited to, the Real-time Transport Protocol (RTP). Typically, each media signal is encapsulated as its own RTP stream, but multiple media signals can also be encapsulated into a single RTP stream, for example the local audio and local video signals in one RTP stream and the remote audio and remote video signals in another.
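As an illustration of RTP encapsulation and the matching de-encapsulation on the server side, the sketch below builds and parses the 12-byte RTP fixed header. It assumes no padding, no header extension, and no CSRC entries; the function names and the sample field values are hypothetical. Payload type 0 is the static RTP payload type for G.711 mu-law audio.

```python
import struct

RTP_VERSION = 2
HEADER_LEN = 12  # fixed header size when there are no CSRC entries


def rtp_pack(payload, payload_type, seq, timestamp, ssrc, marker=False):
    """Prefix payload with an RTP fixed header (no padding/extension/CSRC)."""
    byte0 = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def rtp_unpack(packet):
    """Strip the fixed header and return its fields with the payload."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack(
        "!BBHII", packet[:HEADER_LEN])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[HEADER_LEN:],
    }


# Two bytes of dummy audio payload with illustrative header values.
packet = rtp_pack(b"\x01\x02", payload_type=0, seq=7,
                  timestamp=160, ssrc=0x1234)
fields = rtp_unpack(packet)
print(len(packet), fields["seq"], fields["payload"])
```

Giving each media signal its own SSRC value is one way the receiver could tell the four signals apart when they share a transport channel; the patent text does not specify this detail.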
Step 303: The monitoring center server de-encapsulates, decodes, and synthesizes the received audio and video signals and outputs them through peripherals.
In this step, the monitoring center server de-encapsulates the received audio and video signals, decodes and synthesizes them, and outputs audio and video through peripherals such as a display and a loudspeaker. The audio signals may be synthesized by mixing multiple audio signals, or by splicing them according to a given rule, and playing the result through audio equipment. The video signals may be synthesized by combining multiple video signals into one multi-picture signal and playing it on one display device, or by outputting each video signal to its own display device.
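One way to arrange several video signals into a single multi-picture signal is a near-square grid of equal tiles. The sketch below only computes where each stream would be placed; the grid rule and the function name `multi_picture_layout` are assumptions made for illustration, since the patent does not prescribe a layout.

```python
import math


def multi_picture_layout(num_streams, out_width, out_height):
    """Return one (x, y, width, height) tile per stream on a grid."""
    if num_streams < 1:
        raise ValueError("need at least one stream")
    cols = math.ceil(math.sqrt(num_streams))
    rows = math.ceil(num_streams / cols)
    tile_w, tile_h = out_width // cols, out_height // rows
    return [((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
            for i in range(num_streams)]


# Four streams on a 704x576 output give a 2x2 grid of CIF-sized tiles.
for tile in multi_picture_layout(4, 704, 576):
    print(tile)
```

Each decoded frame would then be scaled to its tile size and copied into the output picture at the tile's offset.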
The system for centralized monitoring of video conference terminals provided by the present invention, as shown in FIG. 2, includes one or more terminals and one monitoring center server, wherein:
the terminal is configured to collect audio and video signals, compress, encode, and encapsulate the collected signals, and send them to the monitoring center server;
the monitoring center server is configured to receive the audio and video signals sent by the terminal, de-encapsulate, decode, and synthesize them, and output audio and video signals through peripherals.
As shown in FIG. 4, the terminal includes an audio input module, an audio output module, an audio encoding module, an audio decoding module, a video input module, a video output module, a video encoding module, a video decoding module, and a signal transceiver module, and further includes a monitoring front-end module.
The monitoring front-end module is configured to compress, encode, and encapsulate the terminal's audio signal and video signal, respectively, and to send the processed audio and video signals to the monitoring center server.
The monitoring front-end module includes a monitoring audio encoding module, a monitoring video encoding module, and a monitoring network transceiver module, wherein:
the monitoring audio encoding module is configured to compress and encode the terminal's audio signal, encapsulate it, and send it to the monitoring network transceiver module;
the monitoring video encoding module is configured to compress and encode the terminal's video signal, encapsulate it, and send it to the monitoring network transceiver module;
the monitoring network transceiver module is configured to send the processed audio signal and video signal, respectively, to the network transceiver module of the monitoring center server.
As shown in FIG. 5, the monitoring center server includes a network transceiver module, an audio decoding module, an audio output module, a video decoding module, and a video output module, wherein:
the network transceiver module is configured to receive the audio and video signals sent by the monitoring front-end module, and to send the audio signal to the audio decoding module and the video signal to the video decoding module;
the audio decoding module is configured to de-encapsulate and decode the audio signal and send it to the audio output module;
the audio output module is configured to synthesize the decoded audio signals and play them through a peripheral;
the video decoding module is configured to de-encapsulate and decode the video signal and send it to the video output module;
the video output module is configured to synthesize the decoded video signals and play them through a peripheral.
The monitoring front-end module and the monitoring center server each further include a monitoring signaling processing module. Through these modules, the monitoring front-end module and the monitoring center server determine the media transmission channel parameters by prior agreement or dynamic signaling negotiation and establish one or more audio and video media transmission channels. The signaling may use, but is not limited to, RTSP; the media transmission channel may be, but is not limited to, a TCP or UDP channel.
The monitoring signaling processing module is further configured to determine parameters such as the encoding format and encapsulation format that the monitoring front-end module and the monitoring center server wish to use, either by exchanging information in the signaling channel or by pre-configuration. Preferably, the monitoring front-end module may use G.711, G.722, or another encoding algorithm to compress and encode the audio signal, and H.263, H.264, or another encoding algorithm for the video signal.
The monitoring center server includes one or more monitoring processing boards, as shown in FIG. 6.
Each monitoring processing board is configured to de-encapsulate, decode, and synthesize one audio signal and/or one video signal and to play the processed audio and/or video signal through the corresponding peripheral.
The monitoring processing board is further configured to determine the media transmission channel parameters of the monitoring front-end module and the monitoring processing board through prior agreement or dynamic signaling negotiation and to establish an audio and video media transmission channel; and to determine the encoding format and encapsulation format that the monitoring front-end module and the monitoring processing board wish to use, either by exchanging information in the signaling channel or by pre-configuration.
In the schematic diagram of the terminal's internal structure shown in FIG. 4, signals pass between the terminal's internal modules as follows. In the sending direction: audio input module 1 captures the local audio, such as the input from a microphone or other peripheral, and duplicates the captured audio signal. One copy goes to audio encoding module 2 for compression encoding and is then sent to the peer by signal transceiver module 3. The other copy goes to the monitoring front-end module, where monitoring audio encoding module 10 compresses, encodes, and encapsulates the input audio, and monitoring network transceiver module 12 then sends it to the monitoring center server. Video input module 6 captures the local video, such as the input from a camera or other peripheral, and duplicates the captured video signal. One copy goes to video encoding module 7 for compression encoding and is then sent to the peer by signal transceiver module 3. The other copy goes to the monitoring front-end module, where monitoring video encoding module 11 compresses, encodes, and encapsulates the input video, and monitoring network transceiver module 12 then sends it to the monitoring center server.
In the receiving direction: signal transceiver module 3 receives the audio signal from the remote end, strips the network transport layer encapsulation, and passes the payload to audio decoding module 5. Audio decoding module 5 sends the decoded signal to audio output module 4, which duplicates the audio signal into two copies. One copy is output to a peripheral such as a loudspeaker for playback. The other copy is sent to the monitoring front-end module, where monitoring audio encoding module 10 compresses, encodes, and encapsulates the input audio, which is then sent to the monitoring center server by monitoring network transceiver module 12. Likewise, signal transceiver module 3 receives the video signal from the remote end, strips the network transport layer encapsulation, and passes the payload to video decoding module 9. Video decoding module 9 sends the decoded data to video output module 8, which duplicates the data into two copies. One copy is output to a peripheral such as a display for playback. The other copy is sent to the monitoring front-end module, where monitoring video encoding module 11 compresses, encodes, and encapsulates the input video, which is then sent to the monitoring center server through monitoring network transceiver module 12.
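The receiving-direction steps (strip the transport encapsulation, decode, then hand the decoded signal both to the local peripheral and to the monitoring front end) might look like the following sketch. The six-byte header layout, the function names, and the use of `zlib` as a stand-in decoder are all assumptions made for illustration; the application does not define a packet format.

```python
import struct
import zlib
from typing import Callable, Tuple

# Hypothetical header: 2-byte stream id and 4-byte payload length, network byte order.
HEADER = struct.Struct("!HI")


def encapsulate(stream_id: int, payload: bytes) -> bytes:
    """Prefix the payload with the hypothetical header."""
    return HEADER.pack(stream_id, len(payload)) + payload


def de_encapsulate(packet: bytes) -> Tuple[int, bytes]:
    """Strip the header and return (stream id, payload)."""
    stream_id, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return stream_id, payload


def receive(
    packet: bytes,
    play: Callable[[bytes], None],
    monitor: Callable[[bytes], None],
) -> bytes:
    """De-encapsulate, decode, then deliver the decoded signal to both consumers."""
    _, payload = de_encapsulate(packet)
    decoded = zlib.decompress(payload)  # stand-in for the audio/video decoder
    play(decoded)      # copy for the loudspeaker or display
    monitor(decoded)   # copy for the monitoring front-end module
    return decoded
```

In this arrangement the monitoring front end receives the already-decoded remote signal and re-encodes it itself, which matches the description above.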
In the schematic diagram of the monitoring center server's internal structure shown in FIG. 5, signals pass between the modules inside the monitoring center server as follows. Network transceiver module 1 receives the audio and video signals sent by the monitoring front-end module of each terminal, and passes the audio signals to audio decoding module 2 and the video signals to video decoding module 3. Audio decoding module 2 de-encapsulates and decodes the audio signal and passes it to audio output module 4, which synthesizes the audio signals from multiple monitoring front-end modules and plays the result through a peripheral such as a loudspeaker. The audio synthesis may consist of mixing the local audio and the far-end audio sent by a given monitoring front-end module, or of selecting audio segments from each monitoring front-end module in rotation and splicing them together. Video decoding module 3 de-encapsulates and decodes the video signal and passes it to video output module 5, which synthesizes the video signals from multiple monitoring front-end modules and plays the result through a peripheral such as a display. The video synthesis may consist of combining the local video and the far-end video sent by multiple monitoring front-end modules into one multi-picture video played on a single display device, or of outputting each video signal to its own display device.
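The synthesis options named above can be sketched in a few lines. This is an illustrative sketch, not the application's implementation: it assumes 16-bit PCM samples held in plain integer lists, reads "in rotation" as a simple round-robin over fixed-length segments, and assumes a near-square grid for the multi-picture layout. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def mix(streams: Sequence[Sequence[int]]) -> List[int]:
    """Mix streams by summing samples position by position, clipped to int16."""
    length = min(len(s) for s in streams)
    return [
        max(-32768, min(32767, sum(s[i] for s in streams)))
        for i in range(length)
    ]


def round_robin_splice(streams: Sequence[Sequence[int]], segment: int) -> List[int]:
    """Build one stream by taking consecutive time segments from each source in turn."""
    length = min(len(s) for s in streams)
    out: List[int] = []
    for k, start in enumerate(range(0, length, segment)):
        source = streams[k % len(streams)]
        out.extend(source[start:start + segment])
    return out


def grid_layout(count: int, width: int, height: int) -> List[Tuple[int, int, int, int]]:
    """Return (x, y, w, h) tiles that place `count` pictures on one screen."""
    cols = math.ceil(math.sqrt(count))
    rows = math.ceil(count / cols)
    w, h = width // cols, height // rows
    return [((i % cols) * w, (i // cols) * h, w, h) for i in range(count)]
```

Mixing lets an operator hear every terminal at once, whereas splicing keeps each terminal intelligible at the cost of hearing only one at a time; the grid layout corresponds to the single-display multi-picture option.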
The monitoring center server of the present invention may also be implemented as a chassis-type hardware device, as shown in FIG. 6. In the structure shown in FIG. 6, one or more monitoring processing boards are inserted in the monitoring center server, and each monitoring processing board is connected to one terminal and to one peripheral such as a monitor. Each monitoring processing board is responsible for de-encapsulating, decoding, and synthesizing one audio signal and/or one video signal, and for playing the processed audio and/or video signal through the corresponding peripheral. Each monitoring processing board may use the structure shown in FIG. 5, and the functions of its modules are not repeated here. Inside the monitoring center server, the power supply of the monitoring processing boards may be configured uniformly for all boards or individually for each board.
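The chassis arrangement, with one board per terminal and power configured either for all boards or for a single board, can be modeled roughly as follows. The class names, the slot numbering, and the `set_power` interface are hypothetical and serve only to illustrate the uniform versus individual configuration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MonitoringBoard:
    """One monitoring processing board, tied to one terminal and one peripheral."""
    slot: int
    terminal: str
    peripheral: str
    powered: bool = False


class Chassis:
    """Holds the boards inserted in the monitoring center server."""

    def __init__(self) -> None:
        self._boards: Dict[int, MonitoringBoard] = {}

    def insert(self, board: MonitoringBoard) -> None:
        if board.slot in self._boards:
            raise ValueError(f"slot {board.slot} is already occupied")
        self._boards[board.slot] = board

    def set_power(self, on: bool, slot: Optional[int] = None) -> None:
        """Configure power for every board (slot=None) or for a single slot."""
        targets = list(self._boards.values()) if slot is None else [self._boards[slot]]
        for board in targets:
            board.powered = on

    def powered_slots(self) -> List[int]:
        return sorted(s for s, b in self._boards.items() if b.powered)
```

Calling `set_power(True)` powers every inserted board, while `set_power(False, slot=2)` switches off only the board in slot 2.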
The above are merely preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Industrial Applicability
The present invention provides a method and system for centralized monitoring of video conference terminals, together with related devices. The basic idea is as follows: the monitoring center server and the terminal establish an audio and video media transmission channel; the terminal compresses, encodes, and encapsulates the captured audio and video signals and sends them to the monitoring center server; and the monitoring center server de-encapsulates, decodes, and synthesizes the received audio and video signals and outputs them through peripherals.
With the solution of the present invention, video conference administrators can monitor the working state of multiple terminals at the same time, control the terminals remotely with ease, and take remedial measures promptly when a terminal fails. This improves the administrators' working efficiency and the service assurance capability of the video conference service, and reduces the operating costs of the enterprise.

Claims

1. A method for centralized monitoring of video conference terminals, comprising:
a monitoring center server and a terminal establishing an audio and video media transmission channel;
the terminal compressing, encoding, and encapsulating the captured audio and video signals, and sending them to the monitoring center server; and
the monitoring center server de-encapsulating, decoding, and synthesizing the received audio and video signals, and outputting the audio and video signals through peripherals.
2. The method according to claim 1, wherein the audio and video media transmission channel is established by determining media transmission channel parameters through prior agreement or dynamic signaling negotiation, and establishing one or more audio and video media transmission channels.
3. The method according to claim 1, wherein the encoding format and encapsulation format of the audio and video signals in the terminal are determined as follows: the terminal and the monitoring center server determine the encoding format and encapsulation format that both parties intend to use, either by exchanging information over a signaling channel or by pre-configuration.
4. The method according to claim 3, wherein, when the format of the captured audio and video signals does not meet the requirements of the specified encoding format, the method further comprises: transforming the format of the signal during compression encoding; wherein the transformation for a video signal is reducing the video frame resolution and reducing the frame rate, and the transformation for an audio signal is reducing the sampling rate and reducing the quantization precision.
5. The method according to any one of claims 1 to 4, wherein the monitoring center server synthesizes the received audio signals by mixing multiple audio signals, or by splicing them according to a given rule, and plays the result through an audio device; and the monitoring center server synthesizes the received video signals by combining multiple video signals into one multi-picture signal and playing it through one display device, or by outputting each video signal to a separate display device.
6. A system for centralized monitoring of video conference terminals, comprising: one or more terminals, and a monitoring center server; wherein
the terminal is configured to capture audio and video signals, compress, encode, and encapsulate the captured audio and video signals, and send them to the monitoring center server; and
the monitoring center server is configured to receive the audio and video signals sent by the terminal, de-encapsulate, decode, and synthesize the received audio and video signals, and output the audio and video signals through peripherals.
7. The system according to claim 6, wherein the terminal comprises a monitoring front-end module; the monitoring front-end module comprises: a monitoring audio encoding module, a monitoring video encoding module, and a monitoring network transceiver module; wherein
the monitoring audio encoding module is configured to compress and encode the audio signal of the terminal, encapsulate it, and send it to the monitoring network transceiver module;
the monitoring video encoding module is configured to compress and encode the video signal of the terminal, encapsulate it, and send it to the monitoring network transceiver module; and
the monitoring network transceiver module is configured to send the processed audio signal and the processed video signal to the monitoring center server.
8. The system according to claim 6, wherein the monitoring center server comprises: a network transceiver module, an audio decoding module, an audio output module, a video decoding module, and a video output module; wherein
the network transceiver module is configured to receive the audio and video signals sent by the terminal, send the audio signal to the audio decoding module, and send the video signal to the video decoding module;
the audio decoding module is configured to de-encapsulate the audio signal, decode it, and send it to the audio output module;
the audio output module is configured to synthesize the decoded audio signals and play them through a peripheral;
the video decoding module is configured to de-encapsulate the video signal, decode it, and send it to the video output module; and
the video output module is configured to synthesize the decoded video signals and play them through a peripheral.
9. The system according to claim 7 or 8, wherein the monitoring front-end module and the monitoring center server each further comprise a monitoring signaling processing module, configured to determine the media transmission channel parameters of both parties through prior agreement or dynamic signaling negotiation, and to establish one or more audio and video media transmission channels.
10. The system according to claim 9, wherein the monitoring signaling processing module is further configured to determine, by exchanging information over the signaling channel or by pre-configuration, the encoding format and encapsulation format that the monitoring front-end module and the monitoring center server intend to use.
11. The system according to any one of claims 6 to 10, wherein the monitoring center server comprises: one or more monitoring processing boards, each configured to de-encapsulate, decode, and synthesize one audio signal and/or one video signal, and to play the processed audio and/or video signal through the corresponding peripheral.
12. The system according to claim 11, wherein the monitoring processing board is further configured to determine, through prior agreement or dynamic signaling negotiation, the media transmission channel parameters between the monitoring front-end module and the monitoring processing board, and to establish an audio and video media transmission channel; and to determine, by exchanging information over the signaling channel or by pre-configuration, the encoding format and encapsulation format that the monitoring front-end module and the monitoring processing board intend to use.
13. A video conference terminal, comprising a monitoring front-end module; wherein the monitoring front-end module comprises: a monitoring audio encoding module, a monitoring video encoding module, a monitoring network transceiver module, and a monitoring signaling processing module; wherein
the monitoring audio encoding module is configured to compress and encode the audio signal of the terminal, encapsulate it, and send it to the monitoring network transceiver module;
the monitoring video encoding module is configured to compress and encode the video signal of the terminal, encapsulate it, and send it to the monitoring network transceiver module;
the monitoring network transceiver module is configured to send the processed audio signal and the processed video signal to the monitoring center server; and
the monitoring signaling processing module is configured to determine the media transmission channel parameters between the terminal and the monitoring center server through prior agreement or dynamic signaling negotiation, and to establish one or more audio and video media transmission channels.
14. A monitoring center server, comprising: a network transceiver module, an audio decoding module, an audio output module, a video decoding module, a video output module, and a monitoring signaling processing module; wherein
the network transceiver module is configured to receive the audio and video signals sent by the terminal, send the audio signal to the audio decoding module, and send the video signal to the video decoding module;
the audio decoding module is configured to de-encapsulate the audio signal, decode it, and send it to the audio output module;
the audio output module is configured to synthesize the decoded audio signals and play them through a peripheral;
the video decoding module is configured to de-encapsulate the video signal, decode it, and send it to the video output module;
the video output module is configured to synthesize the decoded video signals and play them through a peripheral; and
the monitoring signaling processing module is configured to determine the media transmission channel parameters between the terminal and the monitoring center server through prior agreement or dynamic signaling negotiation, and to establish one or more audio and video media transmission channels.
PCT/CN2011/077639 2010-09-30 2011-07-26 Method, system and related device for centralized monitoring of video conference terminal WO2012041117A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2010105036069A CN102447875A (en) 2010-09-30 2010-09-30 Method and system for centralized monitoring of video session terminals and relevant devices
CN201010503606.9 2010-09-30

Publications (1)

Publication Number Publication Date
WO2012041117A1 true WO2012041117A1 (en) 2012-04-05

Family

ID=45891894

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/077639 WO2012041117A1 (en) 2010-09-30 2011-07-26 Method, system and related device for centralized monitoring of video conference terminal

Country Status (2)

Country Link
CN (1) CN102447875A (en)
WO (1) WO2012041117A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103840949A (en) * 2012-11-22 2014-06-04 苏州朗捷通智能科技有限公司 Electronic conference management and control system
CN107197186A (en) * 2017-04-14 2017-09-22 武汉鲨鱼网络直播技术有限公司 A kind of audio frequency and video compact system and method

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103428474A (en) * 2012-05-17 2013-12-04 上海闻泰电子科技有限公司 Video and audio monitoring method based on cell phone mobile terminal design
CN103092552A (en) * 2013-01-18 2013-05-08 中兴通讯股份有限公司 Method and system for achieving multi-screen display
CN104410856B (en) * 2014-12-15 2017-04-26 国家电网公司 Automatic inspection method for video conferencing operation system
CN106162040A (en) * 2015-03-30 2016-11-23 北京视联动力国际信息技术有限公司 The method and apparatus that video conference accesses in many ways
CN104918000A (en) * 2015-06-30 2015-09-16 国家电网公司 Video conference remote control device
CN107040748A (en) * 2016-02-03 2017-08-11 北京机电工程研究所 One kind monitoring and video conference application integration platform and method
CN106488066A (en) * 2016-10-27 2017-03-08 合肥浮点信息科技有限公司 A kind of communication transmission system based on system integrating
CN108268324A (en) * 2016-12-30 2018-07-10 航天信息股份有限公司 A kind of long-range multi-service management method and system
CN110636244B (en) * 2018-06-25 2022-03-29 中兴通讯股份有限公司 Video conference server, system, control method and storage medium
CN108924631B (en) * 2018-06-27 2021-07-06 杭州叙简科技股份有限公司 Video generation method based on audio and video shunt storage
CN112118440A (en) * 2020-09-16 2020-12-22 苏州科达科技股份有限公司 Conference polling method, electronic device and storage medium
CN112565644A (en) * 2020-12-01 2021-03-26 云杉(天津)技术有限公司 Transmission method of communication system
CN114143478A (en) * 2021-11-25 2022-03-04 广州林电智能科技有限公司 Multifunctional audio and video processing terminal
CN117596442A (en) * 2024-01-16 2024-02-23 深圳星网信通科技股份有限公司 Converged communication method and platform

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005269225A (en) * 2004-03-18 2005-09-29 Hitachi Information Systems Ltd Method for monitoring number of speech, device for monitoring number of speech, and video conference system
CN1889676A (en) * 2006-06-01 2007-01-03 上海交通大学 Video frequency session system based on P2P and SIP and realizing method thereof
CN1913461A (en) * 2006-08-30 2007-02-14 北京天地互连信息技术有限公司 Remote vedio monitoring system based on next generation interconnection network and its implementing method
CN101583021A (en) * 2009-05-21 2009-11-18 上海华平信息技术股份有限公司 Monitoring device used in video conferencing monitoring system
CN101753961A (en) * 2008-12-08 2010-06-23 北京中星微电子有限公司 Meeting realizing method in video monitoring system and video monitoring meeting system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100373323B1 (en) * 2000-09-19 2003-02-25 한국전자통신연구원 Method of multipoint video conference in video conferencing system
CN100479416C (en) * 2005-07-14 2009-04-15 华为技术有限公司 Audio/video document play-back method and system
CN201436809U (en) * 2009-02-24 2010-04-07 北京建自凯科系统工程有限公司 IP based audio and video monitoring system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005269225A (en) * 2004-03-18 2005-09-29 Hitachi Information Systems Ltd Method for monitoring number of speech, device for monitoring number of speech, and video conference system
CN1889676A (en) * 2006-06-01 2007-01-03 上海交通大学 Video frequency session system based on P2P and SIP and realizing method thereof
CN1913461A (en) * 2006-08-30 2007-02-14 北京天地互连信息技术有限公司 Remote vedio monitoring system based on next generation interconnection network and its implementing method
CN101753961A (en) * 2008-12-08 2010-06-23 北京中星微电子有限公司 Meeting realizing method in video monitoring system and video monitoring meeting system
CN101583021A (en) * 2009-05-21 2009-11-18 上海华平信息技术股份有限公司 Monitoring device used in video conferencing monitoring system

Also Published As

Publication number Publication date
CN102447875A (en) 2012-05-09

Similar Documents

Publication Publication Date Title
WO2012041117A1 (en) Method, system and related device for centralized monitoring of video conference terminal
CN106331581B (en) Method and device for communication between mobile terminal and video network terminal
KR100880150B1 (en) Multi-point video conference system and media processing method thereof
JP5345081B2 (en) Method and system for conducting resident meetings
WO2014161402A2 (en) Distributed video conference method, system, terminal, and audio-video integrated device
US8988486B2 (en) Adaptive video communication channel
EP3197153B1 (en) Method and system for conducting video conferences of diverse participating devices
CN101938626B (en) Video session terminal, system, and method
CN102404547B (en) Method and terminal for realizing video conference cascade
WO2012155660A1 (en) Telepresence method, terminal and system
WO2010034254A1 (en) Video and audio processing method, multi-point control unit and video conference system
WO2012068940A1 (en) Method for monitoring terminal through ip network and mcu
WO2016184001A1 (en) Video monitoring processing method and apparatus
WO2014154065A2 (en) Data transmission method, media acquisition device, video conference terminal and storage medium
CN101931783A (en) Double-flow transmitting system and method for video session
WO2015127799A1 (en) Method and device for negotiating on media capability
CN107070671A (en) The processing method of share desktop in conference system
CN108366044A (en) A kind of VoIP remote audio-videos sharing method
WO2014177082A1 (en) Video conference video processing method and terminal
CN112019792A (en) Conference control method, conference control device, terminal equipment and storage medium
CN102438119B (en) Audio/video communication system of digital television
JP2013042492A (en) Method and system for switching video streams in resident display type video conference
JP2003271530A (en) Communication system, inter-system relevant device, program and recording medium
WO2004059975A1 (en) Multiple-picture output method and system
CN110719435B (en) Method and system for carrying out terminal conference

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11828013

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11828013

Country of ref document: EP

Kind code of ref document: A1