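WO2014153831A1 - Video monitoring method and system for single-channel video and multi-channel audio - Google Patents

Video monitoring method and system for single-channel video and multi-channel audio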

Info

Publication number
WO2014153831A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
channel
video
rtp packet
client
Prior art date
Application number
PCT/CN2013/076501
Other languages
English (en)
French (fr)
Inventor
李奎
蔡瑞青
陈杰
凌在龙
金祥庆
Original Assignee
杭州海康威视数字技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州海康威视数字技术股份有限公司
Priority to EP13880464.6A (patent EP3104597A4)
Priority to US15/121,743 (patent US10477282B2)
Publication of WO2014153831A1 (patent WO2014153831A1, zh)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/631 Multimode Transmission, e.g. transmitting basic layers and enhancement layers of the content over different transmission paths or transmitting with different error corrections, different keys or with different transmission protocols
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/65 Network streaming protocols, e.g. real-time transport protocol [RTP] or real-time control protocol [RTCP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44209 Monitoring of downstream path of the transmission network originating from a server, e.g. bandwidth variations of a wireless network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643 Communication protocols
    • H04N21/6437 Real-time Transport Protocol [RTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/183 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a single remote source
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/70 Media network packetisation

Definitions

  • the invention relates to a video monitoring method and system for single-channel video multi-channel audio.
  • One analog video capture point can correspond to only one audio channel.
  • The embedded device combines audio and video signals into a composite stream through a series of operations such as capture, encoding, and encapsulation; the stream can be used for audio and video applications such as local storage and remote requests from a center.
  • the object of the invention is to provide a video monitoring method and system for single-channel video multi-channel audio, which can realize audio and video collection with multi-channel audio and single-channel video, and provide users with free choice of playing video and/or corresponding channel audio.
  • the present invention provides a video monitoring method for single-channel video multi-channel audio, including:
  • The device end allocates a fixed initial SSRC value to each audio channel.
  • The client establishes RTSP interaction with the device end.
  • The client requests single-channel video and multi-channel audio from the device end; the device end randomly generates, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets, and sends the modified SSRC value corresponding to each audio channel to the client.
  • The device end captures the single-channel video and the multi-channel audio, generates the RTP packets of the single-channel video and sends them to the client, and generates RTP packets containing the initial SSRC value for each audio channel. After the initial SSRC value in the RTP packets of each audio channel is changed to the corresponding modified SSRC value, the RTP packets of each audio channel containing the modified SSRC value are sent to the client. Each RTP packet includes a PT value that distinguishes video from audio.
  • The client receives the RTP packets of the single-channel video and the multi-channel audio, distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the audio of the corresponding channel according to the user's needs.
  • The step of sending the RTP packets of each audio channel containing the modified SSRC value to the client includes:
  • Each video or audio channel is independently encoded and compressed to form a code stream, and the code stream is encapsulated to form RTP packets containing an initial SSRC value; the RTP packets of the single-channel video are sent to the client.
  • The step in which the client distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the audio of the corresponding channel according to the user's needs includes:
  • In the step in which the device end randomly generates, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets and sends the modified SSRC value corresponding to each audio channel to the client:
  • In the DESCRIBE phase of the RTSP interaction, the device end randomly generates, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets, carries the modified SSRC value corresponding to each audio channel in the SDP information, and sends it to the client.
  • A video monitoring system for single-channel video and multi-channel audio, including: a client, configured to interact with the device end via RTSP, request single-channel video and multi-channel audio from the device end, receive the RTP packets of the single-channel video and the multi-channel audio, distinguish video from audio according to the PT value in the RTP packets, distinguish the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and play the video and/or the audio of the corresponding channel according to the user's needs;
  • The device end is configured to interact with the client via RTSP, randomly generate, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets, send the modified SSRC value corresponding to each audio channel to the client, capture the single-channel video and the multi-channel audio, generate the RTP packets of the single-channel video and send them to the client, generate RTP packets containing the initial SSRC value for each audio channel, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value, send the RTP packets of each audio channel containing the modified SSRC value to the client, where each RTP packet includes a PT value that distinguishes video from audio.
  • The device end is configured to independently encode and compress each video or audio channel to form a code stream, and to encapsulate the code stream to form RTP packets containing an initial SSRC value;
  • The RTP packets of the single-channel video are sent to the client; after the initial SSRC value in the RTP packets of each audio channel is changed to the corresponding modified SSRC value, the RTP packets of each audio channel containing the modified SSRC value are sent to the client.
  • The client is configured to unpack the RTP packets, distinguish video from audio according to the PT value in the RTP packets, distinguish the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, decompress the code stream of each video or audio channel, and play the decompressed code stream of the video and/or the audio of the corresponding channel according to the user's needs.
  • In the present invention, the device end allocates a fixed initial SSRC value to each audio channel; the client establishes RTSP interaction with the device end; the client requests single-channel video and multi-channel audio from the device end, and the device end randomly generates, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets and sends it to the client.
  • The device end captures the single-channel video and the multi-channel audio, generates the RTP packets of the single-channel video and sends them to the client, generates RTP packets containing the initial SSRC value for each audio channel, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value, sends the RTP packets of each audio channel containing the modified SSRC value to the client.
  • Each RTP packet includes a PT value that distinguishes video from audio; the client receives the RTP packets of the single-channel video and the multi-channel audio, distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the audio of the corresponding channel according to the user's needs.
  • The invention thus enables audio and video capture with multiple audio channels and a single video channel, and lets the user freely choose to play the video and/or the audio of the corresponding channel.
  • FIG. 1 is a schematic diagram of interaction between a client and a device according to an embodiment of the present invention
  • FIG. 2 is a flow chart of a video surveillance method for single-channel video multi-channel audio according to an embodiment of the present invention
  • FIG. 3 is a block diagram of a video surveillance system for single-channel video multi-channel audio according to an embodiment of the present invention.
  • The present invention provides a video monitoring method for single-channel video and multi-channel audio, which includes RTSP (Real Time Streaming Protocol) interaction between a client and a device end:
  • Step S1: the device end allocates a fixed initial SSRC value to each audio channel.
  • Step S2: the client 1 establishes RTSP interaction with the device end 2.
  • Step S3: the client 1 requests single-channel video and multi-channel audio from the device end 2; the device end 2 randomly generates, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets, and sends the modified SSRC value corresponding to each audio channel to the client 1 through the SDP information. In this way, the client 1 can learn from the number and order of the modified SSRC values how many audio channels there are and which modified SSRC value corresponds to each channel.
  • RTSP is a real-time streaming protocol, an application layer protocol in the TCP/IP protocol system, and is an IETF RFC standard submitted by Columbia University, Netscape, and RealNetworks.
  • the RTSP protocol defines how a one-to-many application can efficiently deliver multimedia data over an IP network.
  • Architecturally, RTSP sits above RTP (Real-time Transport Protocol) and RTCP (Real-time Transport Control Protocol).
  • RTSP uses TCP or RTP for data transmission.
  • the HTTP request is sent by the client, and the device responds.
  • both client 1 and device 2 can make a request, that is, RTSP can be bidirectional.
  • In step S2, in the DESCRIBE phase of the RTSP interaction, the device end 2 randomly generates, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets, carries the modified SSRC value corresponding to each audio channel in the SDP information, and sends it to the client 1.
  • the RTSP interaction can be roughly divided into the following stages: OPTIONS, SET_PARAMETER, DESCRIBE, SETUP, PLAY, PAUSE, HEARTBEAT, TEARDOWN.
  • In the DESCRIBE phase, multiple random SSRC values such as SSRC1, SSRC2, ..., SSRCn are generated for the multiple audio channels and returned to the client 1 in the SDP information, in order: the first value, SSRC1, is the modified SSRC value of the first audio channel, and the nth value, SSRCn, is the modified SSRC value of the nth audio channel.
  • The initial SSRC value in the header of each RTP packet is changed to the corresponding modified SSRC value.
  • When transmitting the code stream, the device end 2 checks the initial SSRC value of each audio RTP packet: if it is the initial SSRC value of the first audio channel, it is changed to SSRC1, the modified SSRC value of the first audio channel; if the initial SSRC value is Sn, it is changed to SSRCn, the modified SSRC value of the nth audio channel.
  • the SDP is a session description protocol, and its purpose is to deliver media stream information in a media session, allowing the recipient of the session description to participate in the session.
  • SDP basically works on the Internet.
  • SDP defines a uniform format for session descriptions, but does not define the allocation of multicast addresses or the transmission of SDP messages, nor does it support the negotiation of media coding schemes. These functions are all handled by the underlying session transport protocol.
  • Typical lower-layer session transport protocols include SAP (Session Announcement Protocol), SIP, RTSP, HTTP, and e-mail using MIME. A SAP message can contain only one session description.
  • Messages of the other session transport protocols can contain multiple session descriptions. The unified SDP session description format covers the following aspects:
  • Media information contained in the session, including: media type (video, audio, etc.), transport protocol (RTP/UDP/IP, H.320, etc.), media format (H.261 video, MPEG video, etc.), and multicast or remote (unicast) address and port;
  • Step S4: the device end 2 captures the single-channel video and the multi-channel audio, generates the RTP packets of the single-channel video and sends them to the client 1, and generates RTP packets containing the initial SSRC value for each audio channel. After the initial SSRC value in the RTP packets of each audio channel is changed to the corresponding modified SSRC value, the RTP packets of each audio channel containing the modified SSRC value are sent to the client 1. Each RTP packet includes a PT value that distinguishes video from audio.
  • the device end 2 can collect the single-channel video through a network camera, which is a new-generation camera generated by combining traditional cameras and network technologies, and can transmit images to the other end of the earth through a network.
  • the network camera has an embedded chip built in, using an embedded real-time operating system.
  • The device end receives the video signal transmitted by the network camera; the signal is digitized, compressed by a high-efficiency compression chip, and transmitted to the client or the management server over the network bus.
  • the user of the client 1 can directly watch the monitoring video by using the browser or the client software.
  • the authorized user can also control the action of the network camera pan/tilt lens or perform system configuration operations on the device end and the network camera.
  • The device end 2 can capture one analog video source and multiple analog audio sources, generate the RTP packets of the single-channel video and send them to the client 1, generate RTP packets containing the initial SSRC value for each audio channel, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value, send the RTP packets of each audio channel containing the modified SSRC value to the client. That is, the RTP packets of the single-channel video and the multi-channel audio are sent to the client 1 over the network.
  • Step S4 specifically includes:
  • The device end 2 independently encodes and compresses each video or audio channel to form a code stream, and encapsulates the code stream to form RTP packets containing an initial SSRC value;
  • The RTP packets of each audio channel containing the modified SSRC value are sent to the client.
  • RTP packets need to be used.
  • The device end 2 can include a capture module, an encoding module, a packing module, and a network sending module, which respectively capture the single-channel video and the multi-channel audio, encode and compress them to form code streams, encapsulate the code streams to form RTP packets, and send the RTP packets to the client. An RTP message consists of two parts: a header and a payload.
  • The RTP header format includes the following fields.
  • V: the version number of the RTP protocol, 2 bits; the current protocol version number is 2.
  • CC: CSRC counter, 4 bits, indicating the number of CSRC identifiers.
  • Synchronization source (SSRC) identifier: 32 bits, used to identify the synchronization source. The identifier is chosen randomly, and two synchronization sources participating in the same video conference cannot have the same SSRC value.
  • Contributing source (CSRC) identifiers: each CSRC identifier is 32 bits, and there can be 0 to 15 of them. The CSRC identifiers identify all the contributing sources contained in the RTP message payload.
  • PT: the payload type, 7 bits, describes the type of payload in the RTP message, such as GSM audio, JPEG images, and so on.
  • Sequence number: 16 bits, identifies the sequence number of the RTP message sent by the sender. Each time a message is sent, the sequence number is incremented by one. The receiver uses the sequence number to detect packet loss, reorder messages, and recover the data.
  • Timestamp: a 32-bit timestamp that reflects the sampling instant of the first octet of the RTP message.
  • The receiver uses the timestamp to calculate delay and delay jitter and to perform synchronization control.
  • Step S5: the client 1 receives the RTP packets of the single-channel video and the multi-channel audio, distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the audio of the corresponding channel according to the user's needs.
  • In step S5, the step of distinguishing video from audio according to the PT value in the RTP packets, distinguishing the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and playing the video and/or the audio of the corresponding channel according to the user's needs specifically includes:
  • The client may include a network receiving module, an unpacking module, a decoding module, and a playing module. The network receiving module receives the RTP packets of the single-channel video and the multi-channel audio. The unpacking module unpacks the RTP packets and separates the video code stream and the code stream of each audio channel according to the PT value and the modified SSRC value in the RTP header.
  • The decoding module decompresses the code streams, and the playing module plays the decompressed code stream of the video or of the audio of the corresponding channel according to the user's needs.
  • The device end captures the single-channel video and the multi-channel audio independently, so that during real-time preview and recorded-video playback the client can select any one of the audio channels on demand.
  • Embodiment 2
  • The present invention also provides a video monitoring system for single-channel video and multi-channel audio, including a client and a device end.
  • The client 1 is configured to interact with the device end via RTSP (Real Time Streaming Protocol), request single-channel video and multi-channel audio from the device end, and receive the RTP packets of the single-channel video and the multi-channel audio.
  • The client distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the audio of the corresponding channel according to the user's needs.
  • The client 1 is configured to unpack the RTP packets, separate the video code stream and the code stream of each audio channel according to the modified SSRC value in the RTP header, decompress the code stream of each video or audio channel, and play the decompressed code stream of the video or of the audio of the corresponding channel according to the user's needs.
  • The client 1 may include a network receiving module 11, an unpacking module 12, a decoding module 13, and a playing module 14. The network receiving module 11 receives the RTP packets of the single-channel video and the multi-channel audio.
  • The unpacking module 12 unpacks the RTP packets and stores the code stream of each video or audio channel separately according to the initial SSRC value or the modified SSRC value in the RTP header; the decoding module 13 decompresses the code streams.
  • The playing module 14 plays the decompressed code stream of the video or of the audio of the corresponding channel according to the user's needs.
  • The device end 2 is configured to interact with the client via RTSP, randomly generate, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets, send the modified SSRC value corresponding to each audio channel to the client 1, capture the single-channel video and the multi-channel audio, generate the RTP packets of the single-channel video and send them to the client 1, generate RTP packets containing the initial SSRC value for each audio channel, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value, send the RTP packets of each audio channel containing the modified SSRC value to the client. Each RTP packet includes a PT value that distinguishes video from audio.
  • The client 1 can learn from the number and order of the modified SSRC values how many audio channels there are and which modified SSRC value corresponds to each channel.
  • The device end 2 randomly generates, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets, and carries the modified SSRC value corresponding to each audio channel in the SDP information.
  • The device end 2 is configured to independently encode and compress each video or audio channel to form a code stream, encapsulate the code stream to form RTP packets containing an initial SSRC value, and send the RTP packets of the single-channel video to the client.
  • After changing the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value, the device end sends the RTP packets of each audio channel containing the modified SSRC value to the client.
  • The device end 2 may include a capture module 21, an encoding module 22, a packing module 23, and a network sending module 24, which respectively capture the single-channel video and the multi-channel audio, encode and compress them to form code streams, encapsulate the code streams to form RTP packets, and send the RTP packets to the client. For other details of this embodiment, refer to the corresponding parts of Embodiment 1; they are not repeated here.
  • In the present invention, the device end allocates a fixed initial SSRC value to each audio channel; the client establishes RTSP interaction with the device end; the client requests single-channel video and multi-channel audio from the device end, and the device end randomly generates, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets and sends it to the client; the device end captures the single-channel video and the multi-channel audio, generates the RTP packets of the single-channel video and sends them to the client, generates RTP packets containing the initial SSRC value for each audio channel, and changes the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value before sending them to the client.
  • Each RTP packet includes a PT value that distinguishes video from audio.
  • The client receives the RTP packets of the single-channel video and the multi-channel audio, distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the audio of the corresponding channel according to the user's needs. This enables audio and video capture with multiple audio channels and a single video channel, and lets the user freely choose to play the video and/or the audio of the corresponding channel.
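The RTP header fields listed in the definitions above (V, CC, PT, sequence number, timestamp, SSRC) can be read from the first 12 bytes of a packet. The following is a minimal sketch in Python, assuming the standard fixed header layout; the names `RtpHeader` and `parse_rtp_header` are illustrative and do not come from the patent.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int          # V, 2 bits
    csrc_count: int       # CC, 4 bits
    payload_type: int     # PT, 7 bits
    sequence_number: int  # 16 bits
    timestamp: int        # 32 bits
    ssrc: int             # 32 bits


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed RTP header (CSRC list and payload are ignored)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    # Network byte order: two single bytes, one 16-bit and two 32-bit integers.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        csrc_count=b0 & 0x0F,
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
```

A client could call `parse_rtp_header` on every received packet and then branch on `payload_type` and `ssrc`, which are the two fields the method relies on.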

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

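The present invention provides a video monitoring method and system for single-channel video and multi-channel audio. The method includes: the device end allocates a fixed initial SSRC value to each audio channel; the client establishes RTSP interaction with the device end; the client requests single-channel video and multi-channel audio from the device end, and the device end randomly generates a corresponding modified SSRC value for each audio channel and sends it to the client; the device end captures the single-channel video and the multi-channel audio, sends the RTP packets of the single-channel video to the client, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value, sends the RTP packets of each audio channel containing the modified SSRC value to the client; the client distinguishes the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the audio of the corresponding channel according to the user's needs. The invention enables audio and video capture with multiple audio channels and a single video channel, and lets the user freely choose to play the video and/or the audio of the corresponding channel.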

Description

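Technical Field
The present invention relates to a video monitoring method and system for single-channel video and multi-channel audio.
Background
At present, in video monitoring, one analog video capture point can usually correspond to only one audio channel. Through a series of operations such as capture, encoding, and encapsulation, an embedded device combines the audio and video signals into a composite code stream, which can be used for audio and video applications such as local storage and remote requests from a center.
However, as the requirements for video monitoring rise, a monitoring scenario has emerged in which the area monitored by a network camera is divided into several different functional areas (such as several counters). In this scenario, the management center performing the video monitoring requires not only remote real-time video capture and playback, but also the ability to select any one of the audio channels for playback at will. For this scenario, the existing approach in which one analog video capture point corresponds to one audio channel clearly cannot meet the application requirement of single-channel video combined with multi-channel audio.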
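Summary of the Invention
The object of the invention is to provide a video monitoring method and system for single-channel video and multi-channel audio that enables audio and video capture with multiple audio channels and a single video channel, and lets the user freely choose to play the video and/or the audio of the corresponding channel.
To solve the above problem, the present invention provides a video monitoring method for single-channel video and multi-channel audio, including:
the device end allocates a fixed initial SSRC value to each audio channel;
the client establishes RTSP interaction with the device end;
the client requests single-channel video and multi-channel audio from the device end, and the device end randomly generates, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets, and sends the modified SSRC value corresponding to each audio channel to the client;
the device end captures the single-channel video and the multi-channel audio, generates the RTP packets of the single-channel video and sends them to the client, generates RTP packets containing the initial SSRC value for each audio channel, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value, sends the RTP packets of each audio channel containing the modified SSRC value to the client, wherein each RTP packet includes a PT value that distinguishes video from audio;
the client receives the RTP packets of the single-channel video and the multi-channel audio, distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the audio of the corresponding channel according to the user's needs.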
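The device-end part of the method, replacing each audio channel's fixed initial SSRC value with a randomly generated modified value, can be sketched as follows. This is only an illustration under stated assumptions: the function names `assign_modified_ssrcs` and `rewrite_ssrc` are hypothetical, the SSRC is assumed to occupy bytes 8 to 11 of the standard RTP fixed header, and the sketch additionally keeps the random values distinct from one another, which the patent text does not spell out.

```python
import secrets
import struct


def assign_modified_ssrcs(initial_ssrcs: list[int]) -> dict[int, int]:
    """Map each fixed initial SSRC to a distinct random 32-bit modified SSRC."""
    mapping: dict[int, int] = {}
    used: set[int] = set()
    for initial in initial_ssrcs:
        value = secrets.randbits(32)
        while value in used:  # assumption: avoid collisions between channels
            value = secrets.randbits(32)
        used.add(value)
        mapping[initial] = value
    return mapping


def rewrite_ssrc(packet: bytes, mapping: dict[int, int]) -> bytes:
    """Return the packet with its initial SSRC replaced by the modified SSRC.

    Packets whose SSRC is not in the mapping (for example video packets)
    are returned unchanged.
    """
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    (initial,) = struct.unpack_from("!I", packet, 8)
    modified = mapping.get(initial)
    if modified is None:
        return packet
    return packet[:8] + struct.pack("!I", modified) + packet[12:]
```

The mapping would be created once per session, when the client requests the streams, and then applied to every outgoing audio packet.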
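Further, in the above method, the step of generating the RTP packets of the single-channel video and sending them to the client, generating RTP packets containing the initial SSRC value for each audio channel, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value, sending the RTP packets of each audio channel containing the modified SSRC value to the client includes:
independently encoding and compressing each video or audio channel to form a code stream, and encapsulating the code stream to form RTP packets containing an initial SSRC value;
sending the RTP packets of the single-channel video to the client; after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value, sending the RTP packets of each audio channel containing the modified SSRC value to the client.
Further, in the above method, the step in which the client distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the audio of the corresponding channel according to the user's needs includes:
unpacking the RTP packets, distinguishing video from audio according to the PT value in the RTP packets, distinguishing the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, decompressing the code stream of each video or audio channel, and playing the decompressed code stream of the video and/or the audio of the corresponding channel according to the user's needs.
Further, in the above method, in the step in which the device end randomly generates, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets and sends the modified SSRC value corresponding to each audio channel to the client:
in the DESCRIBE phase of the RTSP interaction, the device end randomly generates, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets, carries the modified SSRC value corresponding to each audio channel in the SDP information, and sends it to the client.
According to another aspect of the invention, a video monitoring system for single-channel video and multi-channel audio is provided, including:
a client, configured to interact with the device end via RTSP, request single-channel video and multi-channel audio from the device end, receive the RTP packets of the single-channel video and the multi-channel audio, distinguish video from audio according to the PT value in the RTP packets, distinguish the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, and play the video and/or the audio of the corresponding channel according to the user's needs;
a device end, configured to interact with the client via RTSP, randomly generate, for each audio channel, a corresponding modified SSRC value to be written into the RTP packets, send the modified SSRC value corresponding to each audio channel to the client, capture the single-channel video and the multi-channel audio, generate the RTP packets of the single-channel video and send them to the client, generate RTP packets containing the initial SSRC value for each audio channel, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value, send the RTP packets of each audio channel containing the modified SSRC value to the client, wherein each RTP packet includes a PT value that distinguishes video from audio.
Further, in the above system, the device end is configured to independently encode and compress each video or audio channel to form a code stream, and encapsulate the code stream to form RTP packets containing an initial SSRC value; send the RTP packets of the single-channel video to the client; and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding modified SSRC value, send the RTP packets of each audio channel containing the modified SSRC value to the client.
Further, in the above system, the client is configured to unpack the RTP packets, distinguish video from audio according to the PT value in the RTP packets, distinguish the audio channels according to the modified SSRC values in the RTP packets of the multi-channel audio, decompress the code stream of each video or audio channel, and play the decompressed code stream of the video and/or the audio of the corresponding channel according to the user's needs.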
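Compared with the prior art, in the present invention the device assigns a fixed initial SSRC value to each audio channel; the client establishes an RTSP interaction with the device; the client requests the single-channel video and the multi-channel audio from the device, and the device randomly generates, for each audio channel, a corresponding corrected SSRC value to be written into RTP packets and sends the corrected SSRC value of each audio channel to the client; the device captures the single-channel video and the multi-channel audio, generates and sends to the client the RTP packets of the single-channel video, generates RTP packets of each audio channel containing the initial SSRC value, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, sends the RTP packets of each audio channel containing the corrected SSRC value to the client, wherein each RTP packet includes a PT value that distinguishes video from audio; the client receives the RTP packets of the single-channel video and the multi-channel audio, distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the corrected SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the corresponding audio channel according to the user's needs. The invention thereby enables audio/video capture with multiple audio channels and a single video channel, and lets the user freely choose to play the video and/or the corresponding audio channel.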
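Brief Description of the Drawings
Figure 1 is a schematic diagram of the interaction between the client and the device according to an embodiment of the invention;
Figure 2 is a flowchart of the video surveillance method with single-channel video and multi-channel audio according to an embodiment of the invention;
Figure 3 is a block diagram of the video surveillance system with single-channel video and multi-channel audio according to an embodiment of the invention.
Detailed Description
To make the above objects, features, and advantages of the invention clearer and easier to understand, the invention is described in further detail below with reference to the drawings and specific embodiments.
Embodiment 1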
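As shown in Figures 1 and 2, the present invention provides a video surveillance method with single-channel video and multi-channel audio, in which the client and the device interact via RTSP (Real Time Streaming Protocol), comprising:
Step S1: the device assigns a fixed initial SSRC value to each audio channel;
Step S2: client 1 establishes an RTSP interaction with device 2;
Step S3: client 1 requests the single-channel video and the multi-channel audio from device 2; device 2 randomly generates, for each audio channel, a corresponding corrected SSRC value to be written into RTP packets, and sends the corrected SSRC value of each audio channel to client 1 in the SDP information. Client 1 can then tell from the number and order of the corrected SSRC values how many audio channels there are and which corrected SSRC value corresponds to each channel.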
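Steps S1 and S3 can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `generate_corrected_ssrcs`, the `reserved` parameter, and the sample initial SSRC values are assumptions made for illustration. The sketch draws one random 32-bit value per audio channel, re-draws on collision so the values stay distinct, and keeps them in channel order so that position in the list identifies the channel.

```python
import secrets


def generate_corrected_ssrcs(num_channels, reserved=()):
    """Return one distinct random 32-bit SSRC per audio channel, in channel order.

    `reserved` lists values that must not be reused (hypothetical parameter).
    """
    used = set(reserved)
    corrected = []
    while len(corrected) < num_channels:
        candidate = secrets.randbits(32)
        if candidate in used:
            continue  # re-draw on the (unlikely) collision
        used.add(candidate)
        corrected.append(candidate)
    return corrected


# Step S1: fixed initial SSRC values, one per audio channel (illustrative values).
INITIAL_SSRCS = [0x00000001, 0x00000002, 0x00000003]

# Step S3: random corrected SSRC values, generated in the same channel order.
corrected = generate_corrected_ssrcs(len(INITIAL_SSRCS), reserved=INITIAL_SSRCS)

# Device-side lookup used later when rewriting packets: initial -> corrected.
ssrc_map = dict(zip(INITIAL_SSRCS, corrected))
```

Because the list preserves channel order, its length tells the client how many audio channels exist and its i-th entry is the corrected SSRC value of the i-th channel.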
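Specifically, RTSP is the Real Time Streaming Protocol, an application-layer protocol in the TCP/IP protocol suite; it is an IETF RFC standard submitted by Columbia University, Netscape, and RealNetworks. The RTSP protocol defines how one-to-many applications can efficiently deliver multimedia data over IP networks. Architecturally, RTSP sits above RTP (Real-time Transport Protocol) and RTCP (Real-time Transport Control Protocol), and RTSP uses TCP or RTP to carry out data transmission. Compared with HTTP, HTTP delivers HTML, whereas RTSP delivers multimedia data. HTTP requests are issued by the client and answered by the device side; with RTSP, both client 1 and device 2 can issue requests, that is, RTSP can be bidirectional.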
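Preferably, in step S3, during the DESCRIBE stage of the RTSP interaction, device 2 randomly generates, for each audio channel, the corresponding corrected SSRC value to be written into RTP packets, and sends the corrected SSRC value of each audio channel to client 1 carried in the SDP information. Specifically, the RTSP interaction can be roughly divided into the following stages: OPTIONS, SET_PARAMETER, DESCRIBE, SETUP, PLAY, PAUSE, HEARTBEAT, and TEARDOWN. In this embodiment, to distinguish the multiple audio channels, several random SSRC values, such as SSRC1, SSRC2, ..., SSRCn, are generated for the audio channels during the DESCRIBE stage and returned to client 1 in the SDP information; in order, the first value SSRC1 is the corrected SSRC value of the first audio channel, and the n-th value SSRCn is the corrected SSRC value of the n-th audio channel. Afterwards, when RTP packets are sent to client 1, the initial SSRC value in the RTP header is changed to the corresponding corrected SSRC value. For example, when sending the bitstream, device 2 checks the initial SSRC value of each audio RTP packet: if the initial SSRC value is S1, it is changed to SSRC1, the corrected SSRC value of the first audio channel; if the initial SSRC value is Sn, it is changed to SSRCn, the corrected SSRC value of the n-th audio channel.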
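The patent states only that the corrected SSRC values travel in the SDP information in channel order; it does not say which SDP attribute carries them. The following Python sketch is therefore an assumption: it uses the source-level `a=ssrc:<id> cname:<name>` attribute defined in RFC 5576, and the function names, the cname labels, and the static payload types 26 (JPEG) and 3 (GSM) are illustrative choices. It shows the device building a minimal DESCRIBE answer and the client recovering the ordered list.

```python
def build_sdp(video_pt, audio_pt, corrected_ssrcs):
    """Build a minimal SDP body listing one a=ssrc line per audio channel, in order."""
    lines = [
        "v=0",
        "o=- 0 0 IN IP4 0.0.0.0",
        "s=single-video multi-audio session",
        "t=0 0",
        f"m=video 0 RTP/AVP {video_pt}",
        f"m=audio 0 RTP/AVP {audio_pt}",
    ]
    for index, ssrc in enumerate(corrected_ssrcs, start=1):
        # Assumed carrier: RFC 5576 source attribute; the cname label is hypothetical.
        lines.append(f"a=ssrc:{ssrc} cname:audio{index}")
    return "\r\n".join(lines) + "\r\n"


def parse_audio_ssrcs(sdp_text):
    """Client side: return the corrected SSRC values in the order they were listed."""
    ssrcs = []
    for line in sdp_text.splitlines():
        if line.startswith("a=ssrc:"):
            ssrcs.append(int(line[len("a=ssrc:"):].split()[0]))
    return ssrcs


sdp = build_sdp(video_pt=26, audio_pt=3, corrected_ssrcs=[111, 222, 333])
channel_ssrcs = parse_audio_ssrcs(sdp)  # position i-1 holds the SSRC of channel i
```

Relying on list order mirrors the patent's rule that the first value belongs to the first audio channel and the n-th value to the n-th channel.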
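In detail, SDP is the Session Description Protocol. Its purpose is to convey media stream information in a media session so that the recipients of a session description can take part in the session. SDP essentially works over the Internet. SDP defines a uniform format for session descriptions, but it does not define the allocation of multicast addresses or the transport of SDP messages, nor does it support the negotiation of media encoding schemes; these functions are handled by the underlying session transport protocol. Typical underlying session transport protocols include SAP (Session Announcement Protocol), SIP, RTSP, HTTP, and e-mail using MIME. SAP can carry only one session description, whereas the SDP carried by the other session transport protocols may contain multiple session descriptions. The uniform format of an SDP session description covers the following aspects:
1) the name and purpose of the session;
2) the time the session is active;
3) the media information contained in the session, including the media type (video, audio, etc.), the transport protocol (RTP/UDP/IP, H.320, etc.), the media format (H.261 video, MPEG video, etc.), and the multicast or remote (unicast) address and port;
4) the information needed to receive the media (addresses, ports, formats, and so on);
5) the bandwidth used;
6) contact information for the person responsible for the session.
Step S4: device 2 captures the single-channel video and the multi-channel audio, generates and sends to client 1 the RTP packets of the single-channel video, generates RTP packets of each audio channel containing the initial SSRC value, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, sends the RTP packets of each audio channel containing the corrected SSRC value to client 1, wherein each RTP packet includes a PT value that distinguishes video from audio. Specifically, device 2 may capture the single-channel video through a network camera. A network camera is a new generation of camera that combines a traditional camera with network technology; it can transmit images over the network to the other side of the globe, and a remote viewer needs no specialized software: a standard web browser (such as Microsoft IE or Netscape) or the accompanying client software is enough to monitor the images. A network camera has a built-in embedded chip and runs an embedded real-time operating system. The device receives the video signal transmitted from the network camera, digitizes it, compresses it with a high-efficiency compression chip, and transmits it over the network bus to the client or a management server. Users of client 1 can watch the surveillance video directly in a browser or in the client software; in addition, authorized users can control the pan-tilt and lens movements of the network camera or perform system configuration on the device and the network camera. Device 2 may capture one analog video source and multiple analog audio sources, generate and send to client 1 the RTP packets of the single-channel video, generate RTP packets of each audio channel containing the initial SSRC value, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, send the RTP packets of each audio channel containing the corrected SSRC value to the client; that is, the RTP packets of the single-channel video and the multi-channel audio are sent to client 1 over the network.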
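Preferably, step S4 may specifically include:
device 2 independently encoding and compressing each video or audio channel into a bitstream, and encapsulating the bitstream into RTP packets containing the initial SSRC value;
sending the RTP packets of the single-channel video to the client; and
after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, sending the RTP packets of each audio channel containing the corrected SSRC value to the client.
Specifically, so that client 1 can correctly and effectively distinguish each audio channel when it receives the multi-channel audio, the SSRC in the RTP header needs to be modified here: the SSRC of the first audio channel is set to a corrected SSRC value such as SSRC1, the SSRC of the second audio channel is set to a corrected SSRC value such as SSRC2, and the SSRC of the n-th audio channel is set to a corrected SSRC value such as SSRCn. When client 1 receives the multi-channel audio, it can store and play each audio channel separately according to the corrected SSRC values. Device 2 may include a capture module, an encoding module, a packetization module, and a network sending module, which respectively capture the single-channel video and the multi-channel audio, encode and compress them into bitstreams, encapsulate the bitstreams into RTP packets, and send the RTP packets to the client. An RTP packet consists of two parts: a header and a payload. The RTP header format is shown in the following table:
[Table image imgf000010_0001 (RTP header format) is not reproduced here.]
where: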
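V: version number of the RTP protocol, 2 bits; the current protocol version is 2.
P: padding flag, 1 bit; if P=1, one or more additional octets that are not part of the payload are appended to the end of the packet.
X: extension flag, 1 bit; if X=1, the RTP header is followed by one extension header.
CC: CSRC count, 4 bits; indicates the number of CSRC identifiers.
M: marker, 1 bit; its meaning depends on the payload: for video it marks the end of a frame, and for audio it marks the start of a talkspurt.
Synchronization source (SSRC) identifier: 32 bits; identifies the synchronization source. The identifier is chosen randomly, and two synchronization sources taking part in the same video conference must not have the same SSRC value.
Contributing source (CSRC) identifiers: each CSRC identifier is 32 bits, and there may be 0 to 15 of them. Together they identify all contributing sources for the payload contained in the RTP packet.
PT: payload type, 7 bits; describes the type of payload in the RTP packet, such as GSM audio or JPEG images.
Sequence number: 16 bits; identifies the sequence number of the RTP packet sent by the sender and increases by 1 for each packet sent. The receiver uses the sequence number to detect packet loss, reorder packets, and restore the data.
Timestamp: 32 bits; reflects the sampling instant of the first octet of the RTP packet. The receiver uses the timestamp to compute delay and delay jitter and to perform synchronization control.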
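The field layout described above can be made concrete with a short Python sketch that decodes the 12-byte fixed RTP header followed by the CSRC list. The bit positions follow the standard RTP layout (RFC 3550), which is assumed here because the header table image is not available; the function name and the dictionary keys are illustrative, and the sample packet uses PT 3 (GSM) only as an example.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed RTP header and CSRC list from the start of `packet`."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header needs at least 12 bytes")
    # Network byte order: two single bytes, 16-bit sequence, 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F
    if len(packet) < 12 + 4 * cc:
        raise ValueError("packet shorter than its CSRC list")
    return {
        "version": b0 >> 6,            # V, 2 bits
        "padding": (b0 >> 5) & 1,      # P, 1 bit
        "extension": (b0 >> 4) & 1,    # X, 1 bit
        "csrc_count": cc,              # CC, 4 bits
        "marker": b1 >> 7,             # M, 1 bit
        "payload_type": b1 & 0x7F,     # PT, 7 bits
        "sequence_number": seq,        # 16 bits
        "timestamp": timestamp,        # 32 bits
        "ssrc": ssrc,                  # 32 bits
        "csrc": list(struct.unpack(f"!{cc}I", packet[12:12 + 4 * cc])),
    }


# Example packet: V=2, no padding/extension/CSRC, M=1, PT=3, seq=7, ts=160.
sample = struct.pack("!BBHII", 0x80, 0x83, 7, 160, 0xDEADBEEF) + b"payload"
header = parse_rtp_header(sample)
```

Only `payload_type` and `ssrc` are needed for the channel separation in this method; the other fields are decoded to match the list above.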
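Step S5: client 1 receives the RTP packets of the single-channel video and the multi-channel audio, distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the corrected SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the corresponding audio channel according to the user's needs.
Preferably, in step S5, the step of distinguishing video from audio according to the PT value in the RTP packets, distinguishing the audio channels according to the corrected SSRC values in the RTP packets of the multi-channel audio, and playing the video and/or the corresponding audio channel according to the user's needs specifically includes:
unpacking the RTP packets, distinguishing video from audio according to the PT value in the RTP packets, distinguishing the audio channels according to the corrected SSRC values in the RTP packets of the multi-channel audio, decompressing the bitstream of each video or audio channel, and playing the decompressed bitstream of the video and/or the corresponding audio channel according to the user's needs. Specifically, the client may include a network receiving module, an unpacking module, a decoding module, and a playback module. The network receiving module receives the RTP packets of the single-channel video and the multi-channel audio; the unpacking module unpacks the RTP packets and separates the video bitstream and each audio bitstream according to the PT value and the corrected SSRC value in the RTP header; the decoding module decompresses the bitstreams; and the playback module plays the decompressed bitstream of the video or audio channel that the user requests.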
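The unpacking module's two-level separation (PT first, then corrected SSRC) can be sketched in Python as follows. This is an illustrative sketch, not the patent's implementation: the class name `Demultiplexer`, its methods, and the payload types 26 and 3 are assumptions, and for brevity it ignores header extensions and padding and skips decoding.

```python
import struct
from collections import defaultdict

VIDEO_PT = 26  # illustrative payload type for the video stream
AUDIO_PT = 3   # illustrative payload type shared by all audio channels


class Demultiplexer:
    """Sort incoming RTP packets into one video queue and one queue per audio channel."""

    def __init__(self, audio_ssrcs):
        # audio_ssrcs: corrected SSRC values in channel order, as received via SDP.
        self.channel_of = {ssrc: i for i, ssrc in enumerate(audio_ssrcs, start=1)}
        self.video = []
        self.audio = defaultdict(list)

    def feed(self, packet: bytes) -> None:
        pt = packet[1] & 0x7F
        csrc_count = packet[0] & 0x0F
        ssrc = struct.unpack("!I", packet[8:12])[0]
        payload = packet[12 + 4 * csrc_count:]
        if pt == VIDEO_PT:
            self.video.append(payload)                        # PT separates video from audio
        elif pt == AUDIO_PT and ssrc in self.channel_of:
            self.audio[self.channel_of[ssrc]].append(payload)  # SSRC separates audio channels
        # Packets with an unknown PT or SSRC are dropped.

    def select(self, channel: int) -> bytes:
        """Return the buffered payload of the audio channel the user chose."""
        return b"".join(self.audio[channel])
```

A player built on this would pass `select(n)` to the audio decoder for whichever channel n the user picks, while the video queue is always decoded.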
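In summary, in this embodiment the device independently captures the single-channel video and the multi-channel audio, and the client can select any one of the channels on demand in real time whenever it needs live preview or recorded playback of the audio and video.
Embodiment 2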
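As shown in Figures 1 and 3, the present invention also provides a video surveillance system with single-channel video and multi-channel audio, comprising a client and a device.
Client 1 is configured to interact with the device via RTSP (Real Time Streaming Protocol), request the single-channel video and the multi-channel audio from the device, receive the RTP packets of the single-channel video and the multi-channel audio, distinguish video from audio according to the PT value in the RTP packets, distinguish the audio channels according to the corrected SSRC values in the RTP packets of the multi-channel audio, and play the video and/or the corresponding audio channel according to the user's needs.
Preferably, client 1 is configured to unpack the RTP packets, separate the video bitstream and each audio bitstream according to the corrected SSRC value in the RTP header, decompress the bitstream of each video or audio channel, and play the decompressed bitstream of the video or audio channel that the user requests. Specifically, client 1 may include a network receiving module 11, an unpacking module 12, a decoding module 13, and a playback module 14. The network receiving module 11 receives the RTP packets of the single-channel video and the multi-channel audio; the unpacking module 12 unpacks the RTP packets and stores the bitstream of each video or audio channel according to the initial SSRC value or the corrected SSRC value in the RTP header; the decoding module 13 decompresses the bitstreams; and the playback module 14 plays the decompressed bitstream of the video or audio channel that the user requests.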
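Device 2 is configured to interact with the client via RTSP, randomly generate, for each audio channel, a corresponding corrected SSRC value to be written into RTP packets, send the corrected SSRC value of each audio channel to client 1, capture the single-channel video and the multi-channel audio, generate and send to client 1 the RTP packets of the single-channel video, generate RTP packets of each audio channel containing the initial SSRC value, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, send the RTP packets of each audio channel containing the corrected SSRC value to the client, wherein each RTP packet includes a PT value that distinguishes video from audio. Specifically, once device 2 has sent client 1 the corrected SSRC value of the RTP packets of each audio channel, client 1 can tell from the number and order of the corrected SSRC values how many audio channels there are and which corrected SSRC value corresponds to each channel.
More preferably, during the DESCRIBE stage of the RTSP interaction, device 2 randomly generates, for each audio channel, the corrected SSRC value of the corresponding RTP packets, and sends the corrected SSRC value of each audio channel to the client carried in the SDP information. Preferably, device 2 is configured to independently encode and compress each video or audio channel into a bitstream and encapsulate the bitstream into RTP packets containing the initial SSRC value; send the RTP packets of the single-channel video to the client; and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, send the RTP packets of each audio channel containing the corrected SSRC value to the client. Specifically, device 2 may include a capture module 21, an encoding module 22, a packetization module 23, and a network sending module 24, which respectively capture the single-channel video and the multi-channel audio, encode and compress them into bitstreams, encapsulate the bitstreams into RTP packets, and send the RTP packets to the client. For details of this embodiment, see the corresponding parts of Embodiment 1, which are not repeated here.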
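The SSRC replacement that device 2 performs before sending each audio packet can be sketched in Python as below. The function name `rewrite_ssrc` and the mapping argument are assumptions for illustration; the only fact relied on is that the SSRC occupies bytes 8 to 11 of the fixed RTP header in the standard layout (RFC 3550).

```python
import struct


def rewrite_ssrc(packet: bytes, ssrc_map: dict) -> bytes:
    """Return a copy of `packet` whose initial SSRC is replaced by its corrected SSRC.

    `ssrc_map` maps each fixed initial SSRC to the corrected SSRC generated for
    the same audio channel. A KeyError signals a packet from an unknown channel.
    """
    if len(packet) < 12:
        raise ValueError("RTP fixed header needs at least 12 bytes")
    initial = struct.unpack("!I", packet[8:12])[0]
    corrected = ssrc_map[initial]
    # Everything except the 4 SSRC bytes is copied unchanged.
    return packet[:8] + struct.pack("!I", corrected) + packet[12:]
```

Video packets are sent as they are; only packets produced by the audio channels pass through this rewrite, so the lookup table needs one entry per audio channel.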
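In summary, in the present invention the device assigns a fixed initial SSRC value to each audio channel; the client establishes an RTSP interaction with the device; the client requests the single-channel video and the multi-channel audio from the device, and the device randomly generates, for each audio channel, a corresponding corrected SSRC value to be written into RTP packets and sends the corrected SSRC value of each audio channel to the client; the device captures the single-channel video and the multi-channel audio, generates and sends to the client the RTP packets of the single-channel video, generates RTP packets of each audio channel containing the initial SSRC value, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, sends the RTP packets of each audio channel containing the corrected SSRC value to the client, wherein each RTP packet includes a PT value that distinguishes video from audio; the client receives the RTP packets of the single-channel video and the multi-channel audio, distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the corrected SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the corresponding audio channel according to the user's needs. The invention thereby enables audio/video capture with multiple audio channels and a single video channel, and lets the user freely choose to play the video and/or the corresponding audio channel.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and for identical or similar parts the embodiments may be cross-referenced. Because the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, it is described relatively briefly; for the relevant details, see the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are carried out in hardware or in software depends on the particular application and the design constraints of the technical solution. Skilled practitioners may implement the described functions in different ways for each particular application, but such implementations should not be considered beyond the scope of the invention.
Obviously, those skilled in the art can make various changes and variations to the invention without departing from its spirit and scope. Thus, provided that these modifications and variations fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to include them as well.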

Claims

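1. A video surveillance method with single-channel video and multi-channel audio, characterized by comprising:
a device assigning a fixed initial SSRC value to each audio channel;
a client establishing an RTSP interaction with the device;
the client requesting the single-channel video and the multi-channel audio from the device, the device randomly generating, for each audio channel, a corresponding corrected SSRC value to be written into RTP packets, and sending the corrected SSRC value of each audio channel to the client;
the device capturing the single-channel video and the multi-channel audio, generating and sending to the client the RTP packets of the single-channel video, generating RTP packets of each audio channel containing the initial SSRC value, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, sending the RTP packets of each audio channel containing the corrected SSRC value to the client, wherein each RTP packet includes a PT value that distinguishes video from audio;
the client receiving the RTP packets of the single-channel video and the multi-channel audio, distinguishing video from audio according to the PT value in the RTP packets, distinguishing the audio channels according to the corrected SSRC values in the RTP packets of the multi-channel audio, and playing the video and/or the corresponding audio channel according to the user's needs.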
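2. The video surveillance method with single-channel video and multi-channel audio according to claim 1, characterized in that the step of generating and sending to the client the RTP packets of the single-channel video, generating RTP packets of each audio channel containing the initial SSRC value, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, sending the RTP packets of each audio channel containing the corrected SSRC value to the client comprises:
independently encoding and compressing each video or audio channel into a bitstream, and encapsulating the bitstream into RTP packets containing the initial SSRC value;
sending the RTP packets of the single-channel video to the client; and
after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, sending the RTP packets of each audio channel containing the corrected SSRC value to the client.
3. The video surveillance method with single-channel video and multi-channel audio according to claim 2, characterized in that the step in which the client distinguishes video from audio according to the PT value in the RTP packets, distinguishes the audio channels according to the corrected SSRC values in the RTP packets of the multi-channel audio, and plays the video and/or the corresponding audio channel according to the user's needs comprises:
unpacking the RTP packets, distinguishing video from audio according to the PT value in the RTP packets, distinguishing the audio channels according to the corrected SSRC values in the RTP packets of the multi-channel audio, decompressing the bitstream of the video or of each audio channel, and playing the decompressed bitstream of the video and/or the corresponding audio channel according to the user's needs.
4. The video surveillance method with single-channel video and multi-channel audio according to claim 1, characterized in that, in the step in which the device randomly generates, for each audio channel, a corresponding corrected SSRC value to be written into RTP packets and sends the corrected SSRC value of each audio channel to the client:
during the DESCRIBE stage of the RTSP interaction, the device randomly generates, for each audio channel, the corresponding corrected SSRC value to be written into RTP packets, and sends the corrected SSRC value of each audio channel to the client carried in the SDP information.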
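5. A video surveillance system with single-channel video and multi-channel audio, characterized by comprising:
a client, configured to interact with the device via RTSP, request the single-channel video and the multi-channel audio from the device, receive the RTP packets of the single-channel video and the multi-channel audio, distinguish video from audio according to the PT value in the RTP packets, distinguish the audio channels according to the corrected SSRC values in the RTP packets of the multi-channel audio, and play the video and/or the corresponding audio channel according to the user's needs;
a device, configured to interact with the client via RTSP, randomly generate, for each audio channel, a corresponding corrected SSRC value to be written into RTP packets, send the corrected SSRC value of each audio channel to the client, capture the single-channel video and the multi-channel audio, generate and send to the client the RTP packets of the single-channel video, generate RTP packets of each audio channel containing the initial SSRC value, and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, send the RTP packets of each audio channel containing the corrected SSRC value to the client, wherein each RTP packet includes a PT value that distinguishes video from audio.
6. The video surveillance system with single-channel video and multi-channel audio according to claim 5, characterized in that the device is configured to independently encode and compress each video or audio channel into a bitstream and encapsulate the bitstream into RTP packets containing the initial SSRC value; send the RTP packets of the single-channel video to the client; and, after changing the initial SSRC value in the RTP packets of each audio channel to the corresponding corrected SSRC value, send the RTP packets of each audio channel containing the corrected SSRC value to the client.
7. The video surveillance system with single-channel video and multi-channel audio according to claim 6, characterized in that the client is configured to unpack the RTP packets, distinguish video from audio according to the PT value in the RTP packets, distinguish the audio channels according to the corrected SSRC values in the RTP packets of the multi-channel audio, decompress the bitstream of the video or of each audio channel, and play the decompressed bitstream of the video and/or the corresponding audio channel according to the user's needs.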
PCT/CN2013/076501 2013-03-29 2013-05-30 Method and system for monitoring video with single path of video and multiple paths of audio WO2014153831A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP13880464.6A EP3104597A4 (en) 2013-03-29 2013-05-30 Method and system for monitoring video with single path of video and multiple paths of audio
US15/121,743 US10477282B2 (en) 2013-03-29 2013-05-30 Method and system for monitoring video with single path of video and multiple paths of audio

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310109433.6A CN104079870B (zh) 2013-03-29 2013-03-29 Method and system for monitoring video with single path of video and multiple paths of audio
CN201310109433.6 2013-03-29

Publications (1)

Publication Number Publication Date
WO2014153831A1 true WO2014153831A1 (zh) 2014-10-02

Family

ID=51600883

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/076501 WO2014153831A1 (zh) 2013-03-29 2013-05-30 Method and system for monitoring video with single path of video and multiple paths of audio

Country Status (4)

Country Link
US (1) US10477282B2 (zh)
EP (1) EP3104597A4 (zh)
CN (1) CN104079870B (zh)
WO (1) WO2014153831A1 (zh)

Also Published As

Publication number Publication date
US20170099524A1 (en) 2017-04-06
US10477282B2 (en) 2019-11-12
CN104079870B (zh) 2017-07-11
CN104079870A (zh) 2014-10-01
EP3104597A1 (en) 2016-12-14
EP3104597A4 (en) 2017-11-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13880464

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13880464

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 11/04/2016)

REEP Request for entry into the european phase

Ref document number: 2013880464

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2013880464

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 15121743

Country of ref document: US