WO2020199304A1 - Method, device and system for synchronously playing a message stream and an audio and video stream - Google Patents


Info

Publication number
WO2020199304A1
Authority
WO
WIPO (PCT)
Prior art keywords
message
audio
video
stream
server
Prior art date
Application number
PCT/CN2019/086061
Other languages
English (en)
French (fr)
Inventor
张俊
黄剑武
Original Assignee
网宿科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 网宿科技股份有限公司 filed Critical 网宿科技股份有限公司
Priority to EP19752632.0A priority Critical patent/EP3742742A4/en
Priority to US16/544,991 priority patent/US11102540B2/en
Publication of WO2020199304A1 publication Critical patent/WO2020199304A1/zh

Classifications

    • H: ELECTRICITY
      • H04: ELECTRIC COMMUNICATION TECHNIQUE
        • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
            • H04N 21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
              • H04N 21/23: Processing of content or additional data; Elementary server operations; Server middleware
                • H04N 21/235: Processing of additional data, e.g. scrambling of additional data or processing content descriptors
            • H04N 21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
              • H04N 21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
                • H04N 21/4302: Content synchronisation processes, e.g. decoder synchronisation
                  • H04N 21/4307: Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
                • H04N 21/435: Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
              • H04N 21/47: End-user applications
                • H04N 21/488: Data services, e.g. news ticker
                  • H04N 21/4886: Data services, e.g. news ticker for displaying a ticker, e.g. scrolling banner for news, stock exchange, weather data
            • H04N 21/80: Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
              • H04N 21/85: Assembly of content; Generation of multimedia applications
                • H04N 21/854: Content authoring
                  • H04N 21/8547: Content authoring involving timestamps for synchronizing content

Definitions

  • This application relates to the field of streaming media live broadcast technology, and in particular to a method, device and system for synchronously playing message streams and audio and video streams.
  • the teacher terminal where the teacher is located can collect the audio and video streams generated during the live broadcast through live broadcast equipment such as personal computers, cameras, and headsets, and then push the audio and video streams to the corresponding audio and video servers.
  • the student terminal where the student is located can pull the audio and video stream from the audio and video server through the live broadcast device for viewing.
  • the message streams generated during the live broadcast, such as whiteboard, text chat, and roll-call messages, can be pushed to the corresponding message server by the live streaming device at the push end (teacher or student side); the live streaming device at the pull end (student or teacher side) then pulls the message stream from the message server and presents it to the viewer (student or teacher).
  • some embodiments of the present application provide a method, device and system for synchronously playing a message stream and an audio and video stream.
  • the technical solution is as follows:
  • a method for synchronously playing a message stream and an audio and video stream is provided.
  • the method is executed at the streaming end, and includes:
  • the audio and video stream is pulled from the audio and video server and played, and the message stream is pulled from the message server and cached.
  • An audio and video timestamp is added to each audio and video frame in the audio and video stream, and a message timestamp is added to each message in the message stream, where the audio and video timestamps and the message timestamps are taken from the same synchronized time source;
  • the message to be played synchronously with the audio and video frame to be played is determined in the buffered message stream and played.
  • a method for synchronously playing a message stream and an audio and video stream is provided.
  • the method is executed at the push end and includes:
  • adding timestamps taken from the local time of the push end to each collected audio and video frame and each collected message, and pushing the audio and video stream to the audio and video server and the message stream to the message server, so that the pull end pulls the message stream and caches it, and, based on the audio and video timestamp of the audio and video frame and the message timestamp of the message, determines in the cached message stream the message to be played synchronously with the audio and video frame to be played, and plays it.
  • a method for synchronously playing a message stream and an audio and video stream includes:
  • the audio and video server receives the audio and video stream, and adds an audio and video time stamp to each audio and video frame in the received audio and video stream;
  • the message server receives the message stream and adds a message timestamp to each message in the received message stream, wherein the server time of the message server and the server time of the audio and video server are synchronized.
  • a streaming end is provided, and the streaming end includes:
  • the streaming module is used to pull the audio and video stream from the audio and video server and play it, and to pull the message stream from the message server and cache it, wherein an audio and video timestamp is added to each audio and video frame in the audio and video stream, a message timestamp is added to each message in the message stream, and the audio and video timestamps and the message timestamps are taken from the same synchronized time source;
  • the synchronous playback module is used to determine, in the buffered message stream, the message to be played synchronously with the audio and video frame to be played, based on the audio and video timestamp of the audio and video frame and the message timestamp of the message, and to play it.
  • a push end is provided, and the push end includes:
  • the time stamp module is used to add an audio and video timestamp to each audio and video frame in the collected audio and video stream, and to add a message timestamp to each message in the collected message stream, where the audio and video timestamp and the message timestamp are taken from the local time of the push end;
  • the push module is used to push the audio and video stream to the audio and video server, and push the message stream to the message stream server.
  • a system for synchronously playing message streams and audio and video streams includes an audio and video server and a message server, wherein:
  • the audio and video server is configured to receive audio and video streams, and add an audio and video time stamp to each audio and video frame in the received audio and video stream;
  • the message server is configured to receive a message stream and add a message time stamp to each message in the received message stream, wherein the server time of the message server and the server time of the audio and video server are synchronized.
  • a system for synchronously playing message streams and audio and video streams is provided. The system includes the streaming terminal (pull end) provided in the fourth aspect, the streaming terminal (push end) provided in the fifth aspect, and the audio and video server and message server provided in the sixth aspect.
  • the streaming terminal can determine, based on the audio and video timestamps of the audio and video stream and the message timestamps of the message stream, both taken from the same time source, a message whose message timestamp is less than or equal to the audio and video timestamp of an audio and video frame as a message to be played synchronously with that frame.
  • In this way, when the streaming end plays an audio and video frame, if there is a message to be played synchronously, the streaming end can play the frame and the corresponding message together; if there is no such message, the streaming end plays the audio and video frame alone.
  • FIG. 1 is a schematic diagram of a system structure for synchronously playing a message stream and an audio and video stream according to an embodiment of the present application
  • FIG. 2 is a flowchart of a method for synchronously playing a message stream and an audio and video stream according to an embodiment of the present application
  • FIG. 3 is a flowchart of a method for synchronously playing a message stream and an audio and video stream according to an embodiment of the present application
  • FIG. 4 is a flowchart of a method for synchronously playing a message stream and an audio and video stream according to an embodiment of the present application
  • FIG. 5 is a process flow chart of establishing a connection provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of a method for synchronously playing a message stream and an audio and video stream according to an embodiment of the present application
  • FIG. 7 is a sequence diagram provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a structure of a pulling end provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the structure of a streaming end provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a system for synchronously playing a message stream and an audio and video stream according to an embodiment of the present application.
  • the embodiment of the present application provides a method for synchronously playing a message stream and an audio and video stream.
  • the method can be implemented jointly by a push end, an audio and video server, a message server, and a pull end.
  • When the push end pushes the audio and video stream, the push end can be a live broadcast device deployed on the teacher's side.
  • the push end can convert the images and sounds of the teacher during the teaching process into audio and video streams and push them to the audio and video server;
  • the streaming terminal can be a live broadcast device located on the student terminal, and the streaming terminal can pull audio and video streams from the audio and video server and play them.
  • When the push end pushes the message stream, the push end can be a live broadcast device deployed on the teacher or student side, and can push the generated message stream to the message server; the pull end can be a live streaming device deployed on the student or teacher side, and can pull the message stream from the message server and play it (here, playing refers to rendering and displaying the message stream).
  • the foregoing audio and video server and message server can be any CDN node server in a CDN (Content Delivery Network) system, which can distribute the buffered audio and video streams or message streams to each streaming terminal.
  • the above-mentioned push end, audio and video server, message server, and pull end may all include a processor, a memory, and a transceiver.
  • the processor can be used to perform the process of synchronously playing the message stream and the audio and video stream, the memory can be used to store data required and generated during the process, and the transceiver can be used to receive and send related data during the process.
  • Step 201 The streaming terminal pulls the audio and video streams from the audio and video server and plays them, and pulls the message streams from the message server and caches them.
  • When students want to watch a live course, they can open video playback software that supports online live education on a live broadcast device such as a smartphone or computer, find the live course in the software, and click the play button.
  • the streaming end can pull and play the audio and video streams from the audio and video server that caches the audio and video streams of the live course.
  • the streaming end can pull the message stream from the message server where the message stream is cached, and cache it at the streaming end; only when a message in the message stream meets the playback condition does the streaming end play it.
  • the audio and video stream and the message stream pulled by the streaming end each carry a timestamp taken from a synchronized time source.
  • timestamps taken from the same time source can accurately mark audio and video frames and messages generated at the same moment, and serve as the basis for the subsequent decision to play the audio and video stream and the message stream synchronously.
  • both the push end and the servers can perform the processing of adding timestamps. If the timestamps are added by the push end, the push end only needs to stamp the recorded audio and video stream and the generated message stream from one and the same time source, such as the local time of the push end, to meet the synchronized-time-source requirement above. If the timestamps are added by the servers, the server time of the audio and video server and the server time of the message server can be synchronized and corrected, which likewise meets the requirement.
  • Step 202 Based on the audio and video timestamps of the audio and video frames and the message timestamps of the messages, the streaming terminal determines the messages to be played synchronously with the audio and video frames to be played in the buffered message stream and plays them.
  • after the streaming end has buffered the message stream, it can control the playback timing of each message in the message stream, so that the message is played synchronously with the corresponding audio and video frames.
  • Specifically, the streaming terminal may compare the audio and video timestamp of each audio and video frame in the audio and video stream with the message timestamp of each message in the message stream, so as to determine, in the buffered message stream, the messages to be played synchronously with the corresponding audio and video frames. In this way, when the streaming end plays an audio and video frame, if there is a message to be played synchronously, the streaming end can play the frame and the corresponding message together; if there is no such message, the streaming end plays the audio and video frame alone.
  • after the streaming terminal finishes playing a message, it can discard the message, for example by deleting it from the cache or marking it as discarded data, to avoid playing the same message multiple times. By controlling the playback timing of messages in this way, the out-of-sync problem caused by playing pulled messages immediately is avoided, thereby improving the interactivity and experience of online live education.
  • the specific processing of the above step 202 may be as follows: the message in the buffered message stream whose message timestamp is less than or equal to the audio and video timestamp of the audio and video frame to be played is determined to be the message to be played synchronously with that frame.
  • in order to realize the synchronized playback of the message stream and the audio and video stream, the streaming end does not play a message immediately after pulling it, but caches it and waits until the streaming end has pulled the audio and video frame corresponding to the message.
  • After the streaming terminal has pulled the audio and video frame corresponding to the message, the streaming terminal plays the message and the frame synchronously.
  • Specifically, the streaming terminal can compare the message timestamps with the audio and video timestamp to find out whether any unplayed message should be played synchronously with the current audio and video frame to be played.
  • the streaming end can traverse the message timestamp of each cached message according to the FIFO (First In First Out) principle. If there is a message whose message timestamp is less than or equal to the audio and video timestamp of the frame to be played, the streaming terminal determines that message as a message to be played synchronously with the frame. If all message timestamps are greater than the audio and video timestamp of the frame to be played, the streaming terminal has not yet obtained the audio and video frames corresponding to the cached messages; in that case it performs no playback processing on the message stream until a new frame is pulled whose audio and video timestamp is greater than or equal to the timestamp of some cached message, and then plays that message synchronously.
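The FIFO traversal described above can be sketched as follows. This is a minimal illustration in Python, not the application's actual implementation; the dictionary layout of messages and the `messages_to_play` name are assumptions for the example.

```python
from collections import deque

def messages_to_play(cache, frame_ts):
    """Pop, in FIFO order, every cached message whose message timestamp is
    less than or equal to the timestamp of the frame about to be played."""
    due = []
    while cache and int(cache[0]["timestamp"]) <= frame_ts:
        due.append(cache.popleft())  # played messages are discarded, never replayed
    return due

# Messages stamped 1000 and 2000 are cached; a frame stamped 1500 arrives.
cache = deque([{"msg": "A", "timestamp": 1000},
               {"msg": "B", "timestamp": 2000}])
print([m["msg"] for m in messages_to_play(cache, 1500)])  # ['A']
# "B" stays cached until a frame with timestamp >= 2000 is pulled.
```

Using a double-ended queue keeps the FIFO pop cheap, and popping a message off the cache doubles as the discard step described above, so a message can never be played twice.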
  • the above processing can be as shown in Figure 3.
  • the processing flow of the method for synchronously playing the message stream and the audio and video stream can be as shown in Figure 4, which can be specifically as follows:
  • Step 401 The push end adds an audio and video timestamp to each audio and video frame in the collected audio and video stream, and adds a message timestamp to each message in the collected message stream.
  • the teacher can broadcast the narrated course through the corresponding live broadcast equipment, and can use interactive methods such as whiteboard, text chat, and roll call to teach during the live broadcast.
  • during this process, the push end can collect the audio and video stream and the message stream, and, based on the local time of the push end, add an audio and video timestamp to each audio and video frame in the collected audio and video stream and a message timestamp to each message in the collected message stream.
  • the specific processing of the above step 401 may be as follows: the push end writes the local acquisition time of each audio and video frame in the audio and video stream into the SEI field of the frame, and writes the local collection time of each message in the message stream into the timestamp field of the message.
  • for an audio and video frame, the local acquisition time can be written into the SEI (Supplemental Enhancement Information) field of the frame as the audio and video timestamp of the frame.
  • for a message, the push end can write the local collection time of the message into the timestamp field of the message.
  • for example, the push end can convert the local collection time of the message to a Unix timestamp in milliseconds and write it into the timestamp field of the message, that is, {"msg":"A","timestamp":"1554261147000"}.
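As a sketch of this push-end stamping step, the following Python snippet converts the local collection time to a millisecond Unix timestamp and writes it into the message's timestamp field, matching the {"msg":"A","timestamp":"1554261147000"} example above. The function name is illustrative, not from the application.

```python
import json
import time

def stamp_message(msg, now_ms=None):
    """Write the local collection time of the push end, as a millisecond
    Unix timestamp, into the timestamp field of the message."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # local time of the push end
    return json.dumps({"msg": msg, "timestamp": str(now_ms)})

print(stamp_message("A", now_ms=1554261147000))
# {"msg": "A", "timestamp": "1554261147000"}
```

Storing the timestamp as a decimal string, as in the example above, keeps the field safe from precision loss in JSON parsers that treat all numbers as doubles.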
  • Step 402 The push end pushes the audio and video stream to the audio and video server, and pushes the message stream to the message stream server.
  • the push end can perform authentication and connection establishment processing with the audio and video server and the message stream server before pushing the audio and video stream and the message stream to the audio and video server and the message stream server respectively.
  • the processing procedure can be as shown in FIG. 5.
  • Step 403 The streaming terminal pulls the audio and video stream from the audio and video server and plays it, pulls the message stream from the message stream server and caches it, and, based on the audio and video timestamps of the audio and video frames and the message timestamps of the messages, determines in the cached message stream the message to be played synchronously with the audio and video frame to be played, and plays it.
  • the processing flow of the method for synchronously playing the message stream and the audio and video stream can be as shown in Figure 6, and the details can be as follows:
  • Step 601 The push end pushes the collected audio and video stream to the audio and video server, and pushes the collected message stream to the message server.
  • Step 602 The audio and video server receives the audio and video stream, and adds an audio and video time stamp to each audio and video frame in the received audio and video stream.
  • Specifically, the audio and video server can write the time at which it obtained each audio and video frame, that is, the audio and video frame acquisition time, into the SEI field of the frame as the audio and video timestamp of the frame.
  • Step 603 The message server receives the message stream, and adds a message timestamp to each message in the received message stream.
  • Specifically, the message server can write the time at which it obtained each message, that is, the message acquisition time, into the timestamp field of the message as the message timestamp of the message. It should be noted that the server time of the message server and the server time of the audio and video server need to be kept synchronized so that audio and video frames and messages generated at the same time can be marked accurately.
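The server-side stamping variant of steps 602 and 603 can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, the real system would write the frame time into an SEI NAL unit rather than a dictionary key, and clock synchronization between the two servers (the application requires it but does not name a mechanism; NTP is one common choice) is represented only by a shared clock function.

```python
import time

def server_now_ms():
    # Both servers must read clocks that are kept synchronized (for example,
    # disciplined by a common NTP source) so that frames and messages
    # generated at the same moment receive comparable timestamps.
    return int(time.time() * 1000)

def message_server_stamp(message):
    """Message server: write the message acquisition time into the
    timestamp field of the received message."""
    stamped = dict(message)
    stamped["timestamp"] = str(server_now_ms())
    return stamped

def av_server_stamp(frame):
    """Audio/video server: record the frame acquisition time; in the real
    system this value would be written into the frame's SEI field."""
    stamped = dict(frame)
    stamped["sei_timestamp"] = server_now_ms()
    return stamped
```

Because both stamps come from the same synchronized clock, the pull end can later compare a message timestamp directly against a frame's SEI timestamp, exactly as in step 604.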
  • Step 604 The streaming terminal pulls the audio and video stream from the audio and video server and plays it, pulls the message stream from the message stream server and caches it, and, based on the audio and video timestamp of the audio and video frame and the message timestamp of the message, determines in the cached message stream the message to be played synchronously with the audio and video frame to be played, and plays it.
  • Fig. 7 is a sequence diagram of the processing of whiteboard messages between devices.
  • among them, the websocket server is one type of the above message server, and can perform data processing on whiteboard messages, such as adding message timestamps; the whiteboard server, as the whiteboard platform, can provide corresponding background technical support, such as configuring whiteboard permissions, and can store the whiteboard data generated during the live broadcast.
  • the push end can first establish a connection channel with the websocket server.
  • the push end can obtain the initialization data from the whiteboard server, perform the initialization operation of the whiteboard, and return the initialized whiteboard data to the whiteboard server after the initialization is completed.
  • the push end can execute the whiteboard drawing commands issued by the teacher or students, draw the whiteboard, and send the whiteboard data to the websocket server through the above-mentioned connection channel.
  • the websocket server can process the whiteboard data, such as adding a message time stamp, and after the processing is completed, push the processed whiteboard data to the whiteboard server and the streaming terminal respectively.
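The websocket server's stamp-and-fan-out behavior in the sequence above can be sketched as follows. This is a schematic Python illustration, not the application's implementation: the class and attribute names are invented, and the whiteboard server and pull ends are modeled as plain callables rather than network connections.

```python
import time

class WebsocketServerSketch:
    """Fan-out sketch: stamp incoming whiteboard data with a message
    timestamp, then push it both to the whiteboard server (for storage
    and playback) and to every connected pull end (for synchronized play)."""

    def __init__(self, store, pull_ends):
        self.store = store          # callable persisting data on the whiteboard server
        self.pull_ends = pull_ends  # callables delivering data to pull ends

    def on_whiteboard_data(self, data):
        stamped = dict(data)
        stamped["timestamp"] = str(int(time.time() * 1000))  # message timestamp
        self.store(stamped)              # whiteboard server keeps a copy
        for deliver in self.pull_ends:   # each pull end caches it until the
            deliver(stamped)             # matching audio/video frame arrives
        return stamped
```

Stamping once at the fan-out point means the stored copy and every delivered copy carry the identical timestamp, so replay from the whiteboard server and live viewing stay mutually consistent.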
  • the whiteboard server can store the received whiteboard data for easy playback and other processing.
  • the streaming terminal can cache the received whiteboard data locally, and then determine when to play the whiteboard data based on the message timestamp in the whiteboard data and the audio and video timestamps of the audio and video frames.
  • In this way, the streaming terminal can determine, based on the audio and video timestamps of the audio and video stream and the message timestamps of the message stream, both taken from the same time source, a message whose message timestamp is less than or equal to the audio and video timestamp of an audio and video frame as a message to be played synchronously with that frame.
  • when the streaming end plays an audio and video frame, if there is a message to be played synchronously, the streaming end can play the frame and the corresponding message together; if there is no such message, the streaming end plays the audio and video frame alone.
  • an embodiment of the present application also provides a streaming end.
  • the streaming end includes:
  • the streaming module 801 is used to pull the audio and video stream from the audio and video server and play it, and to pull the message stream from the message server and cache it, wherein an audio and video timestamp is added to each audio and video frame in the audio and video stream, a message timestamp is added to each message in the message stream, and the audio and video timestamps and the message timestamps are taken from the same synchronized time source;
  • the synchronous playback module 802 is configured to determine, in the buffered message stream, the message to be played synchronously with the audio and video frame to be played, based on the audio and video timestamp of the audio and video frame and the message timestamp of the message, and to play it.
  • each audio and video frame in the audio and video stream is added with an audio and video time stamp
  • each message in the message stream is added with a message time stamp, including:
  • Each audio and video frame in the audio and video stream is added with the local acquisition time written by the push end into the SEI field, and each message in the message stream is added with the local collection time written by the push end into the timestamp field.
  • each audio and video frame in the audio and video stream is added with an audio and video time stamp
  • each message in the message stream is added with a message time stamp, including:
  • Each audio and video frame in the audio and video stream is added with the audio and video frame acquisition time written by the audio and video server into the SEI field, and each message in the message stream is added with the message acquisition time written by the message server into the timestamp field; the server time of the audio and video server and the server time of the message server are kept synchronized.
  • the synchronous playing module 802 is used to:
  • the message in the message stream whose message time stamp is less than or equal to the audio and video time stamp of the to-be-played audio and video frame is determined to be a message that is played synchronously with the to-be-played audio and video frame.
  • an embodiment of the present application also provides a streaming end.
  • the streaming end includes:
  • the time stamp module 901 is configured to add an audio and video time stamp to each audio and video frame in the collected audio and video stream, and to add a message time stamp to each message in the collected message stream, where the audio and video time stamp and the The message timestamp is the local time of the push end;
  • the pushing module 902 is configured to push the audio and video stream to the audio and video server, and push the message stream to the message stream server.
  • time stamp module 901 is used to: write the local acquisition time of each audio and video frame in the collected audio and video stream into the SEI field of the frame, and write the local collection time of each message in the collected message stream into the timestamp field of the message.
  • the determining of the message to be played synchronously with the audio and video frame in the message stream, based on the audio and video timestamp of the audio and video frame and the message timestamp of the message, includes:
  • the message in the message stream whose message time stamp is less than or equal to the audio and video time stamp of the to-be-played audio and video frame is determined to be a message that is played synchronously with the to-be-played audio and video frame.
  • an embodiment of the present application also provides a system for synchronously playing message streams and audio and video streams.
  • the system includes an audio and video server 1011 and a message server 1012, wherein:
  • the audio and video server 1011 is configured to receive audio and video streams, and add an audio and video time stamp to each audio and video frame in the received audio and video stream;
  • the message server 1012 is configured to receive a message stream and add a message timestamp to each message in the received message stream; wherein the server time of the message server and the server time of the audio and video server are synchronized.
  • the audio and video server 1011 is used to: write the audio and video frame acquisition time into the SEI field of each audio and video frame in the received audio and video stream as the audio and video timestamp of the frame.
  • the message server 1012 is used to: write the message acquisition time into the timestamp field of each message in the received message stream, so that the streaming end, based on the audio and video timestamp of the audio and video frame and the message timestamp of the message, determines in the buffered message stream the message to be played synchronously with the audio and video frame to be played, and plays it.
  • the determining of the message to be played synchronously with the audio and video frame in the message stream, based on the audio and video timestamp of the audio and video frame and the message timestamp of the message, includes:
  • the message in the message stream whose message time stamp is less than or equal to the audio and video time stamp of the to-be-played audio and video frame is determined to be a message that is played synchronously with the to-be-played audio and video frame.
  • an embodiment of the present application also provides a system for synchronously playing message streams and audio and video streams.
  • the system includes the aforementioned push end, audio and video server, message server, and pull end, wherein:
  • the push end is used to push the collected audio and video streams to the audio and video server, and push the collected message streams to the message server;
  • the audio and video server is used to add an audio and video time stamp to each audio and video frame in the received audio and video stream.
  • the message server is used to add a message timestamp to each message in the received message stream; wherein the server time of the message server and the server time of the audio and video server are synchronized.
  • the pull end is used to pull the audio and video streams from the audio and video server and play them, pull the message stream from the message server and buffer it, and, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determine in the buffered message stream the message to be played synchronously with the audio/video frame to be played, and play it.
  • an embodiment of the present application also provides a system for synchronously playing message streams and audio and video streams.
  • the system includes the aforementioned push end, audio and video server, message server, and pull end, wherein:
  • the push end is used to add an audio and video time stamp to each audio and video frame in the collected audio and video stream, and to add a message time stamp to each message in the collected message stream.
  • the push end is also used to push audio and video streams to the audio and video server, and push the message stream to the message stream server.
  • the pull end is used to pull audio and video streams from the audio and video server and play them, pull the message stream from the message server and buffer it, and, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determine in the buffered message stream the message to be played synchronously with the audio/video frame to be played, and play it.

Abstract

This application discloses a method, device and system for synchronously playing a message stream and an audio/video stream, belonging to the technical field of live streaming media. The method includes: a pull end pulls an audio/video stream from an audio/video server and plays it, and pulls a message stream from a message server and buffers it (201), wherein each audio/video frame in the audio/video stream carries an audio/video timestamp, each message in the message stream carries a message timestamp, and the audio/video timestamps and the message timestamps are taken from a synchronized time source; based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, the pull end determines, in the buffered message stream, the message to be played synchronously with the audio/video frame to be played, and plays it (202). With this application, the audio/video stream and the message stream can be played synchronously, improving the interactivity and experience of live online education.

Description

Method, device and system for synchronously playing a message stream and an audio/video stream
Cross-Reference
This application claims priority to Chinese Patent Application No. 201910272614.8, entitled "Method, device and system for synchronously playing a message stream and an audio/video stream", filed on April 4, 2019, which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the technical field of live streaming media, and in particular to a method, device and system for synchronously playing a message stream and an audio/video stream.
Background
With the continuous development of electronic devices and the Internet, the education industry has evolved from traditional classroom teaching to Internet-based live online education. Live online education is no longer merely a teacher lecturing over a live stream; diverse forms of interaction, such as whiteboards, text chat and roll call, are continually added on top of it, through which teachers can convey knowledge more vividly.
During live online education, on the one hand, the teacher end can capture the audio/video stream produced during the live session through live-streaming equipment such as a personal computer, a camera and a headset, and then push the audio/video stream to a corresponding audio/video server. After processing by the audio/video server, the student end can pull the audio/video stream from the audio/video server through its live-streaming equipment for viewing. On the other hand, message streams produced during the live session, such as whiteboard, text chat and roll-call messages, can be pushed by the live-streaming equipment of the push end (the teacher end or the student end) to a corresponding message server, and the live-streaming equipment of the pull end (the student end or the teacher end) then pulls the message stream from the message server and presents it to the viewer (the student or the teacher).
In the course of implementing this application, the inventors found at least the following problems in the prior art: even with RTMP (Real Time Messaging Protocol), currently the protocol with the shortest delay, it takes 2-3 seconds from the capture of the audio/video stream until it can finally be pulled and viewed. Message streams such as whiteboard, chat and roll call, however, are merely text transmissions that require no complex stream processing and are delivered essentially in real time, so the message stream reaches the viewing end before the audio/video stream. Since the viewing end plays an audio/video stream or text stream immediately upon receipt, messages such as whiteboard strokes and text chat cannot be played synchronously with the corresponding audio/video, degrading the interactivity and experience of live online education.
Summary
To solve the problems in the prior art, some embodiments of this application provide a method, device and system for synchronously playing a message stream and an audio/video stream. The technical solutions are as follows:
In a first aspect, a method for synchronously playing a message stream and an audio/video stream is provided. The method is performed at a pull end and includes:
pulling an audio/video stream from an audio/video server and playing it, and pulling a message stream from a message server and buffering it, wherein each audio/video frame in the audio/video stream carries an audio/video timestamp, each message in the message stream carries a message timestamp, and the audio/video timestamps and the message timestamps are taken from a synchronized time source;
based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determining, in the buffered message stream, the message to be played synchronously with the audio/video frame to be played, and playing it.
In a second aspect, a method for synchronously playing a message stream and an audio/video stream is provided. The method is performed at a push end and includes:
adding an audio/video timestamp to each audio/video frame in a captured audio/video stream, and adding a message timestamp to each message in a captured message stream, wherein the audio/video timestamps and the message timestamps are taken from the local time of the push end;
pushing the audio/video stream to an audio/video server and pushing the message stream to a message server, so that a pull end pulls the audio/video stream from the audio/video server and plays it, pulls the message stream from the message server and buffers it, and, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determines in the buffered message stream the message to be played synchronously with the audio/video frame to be played, and plays it.
In a third aspect, a method for synchronously playing a message stream and an audio/video stream is provided. The method includes:
an audio/video server receiving an audio/video stream and adding an audio/video timestamp to each audio/video frame in the received audio/video stream;
a message server receiving a message stream and adding a message timestamp to each message in the received message stream, wherein the server time of the message server and the server time of the audio/video server are kept synchronized.
In a fourth aspect, a pull end is provided. The pull end includes:
a pull module, configured to pull an audio/video stream from an audio/video server and play it, and pull a message stream from a message server and buffer it, wherein each audio/video frame in the audio/video stream carries an audio/video timestamp, each message in the message stream carries a message timestamp, and the audio/video timestamps and the message timestamps are taken from a synchronized time source;
a synchronous playback module, configured to determine, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, the message in the buffered message stream to be played synchronously with the audio/video frame to be played, and play it.
In a fifth aspect, a push end is provided. The push end includes:
a timestamp module, configured to add an audio/video timestamp to each audio/video frame in a captured audio/video stream and add a message timestamp to each message in a captured message stream, wherein the audio/video timestamps and the message timestamps are taken from the local time of the push end;
a push module, configured to push the audio/video stream to an audio/video server and push the message stream to a message server.
In a sixth aspect, a system for synchronously playing a message stream and an audio/video stream is provided. The system includes an audio/video server and a message server, wherein:
the audio/video server is configured to receive an audio/video stream and add an audio/video timestamp to each audio/video frame in the received audio/video stream;
the message server is configured to receive a message stream and add a message timestamp to each message in the received message stream, wherein the server time of the message server and the server time of the audio/video server are kept synchronized.
In a seventh aspect, a system for synchronously playing a message stream and an audio/video stream is provided. The system includes the pull end provided in the fourth aspect, the push end provided in the fifth aspect, and the audio/video server and message server provided in the sixth aspect.
The technical solutions provided by the embodiments of this application bring the following beneficial effects:
In this embodiment, after receiving the message stream, the pull end does not play it immediately but buffers it. Meanwhile, based on the audio/video timestamps of the audio/video stream and the message timestamps of the message stream, both taken from the same time source, the pull end determines a message whose message timestamp is less than or equal to the audio/video timestamp of an audio/video frame to be a message to be played synchronously with that frame. Thus, when the pull end plays an audio/video frame, if there is a message to be played synchronously, the pull end plays the frame and the corresponding message together; if there is none, it plays only the frame. By controlling the playback timing of messages, the desynchronization caused by playing pulled messages in real time can be avoided, thereby improving the interactivity and experience of live online education.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application more clearly, the accompanying drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are merely some embodiments of this application; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a system for synchronously playing a message stream and an audio/video stream according to an embodiment of this application;
Fig. 2 is a flowchart of a method for synchronously playing a message stream and an audio/video stream according to an embodiment of this application;
Fig. 3 is a flowchart of a method for synchronously playing a message stream and an audio/video stream according to an embodiment of this application;
Fig. 4 is a flowchart of a method for synchronously playing a message stream and an audio/video stream according to an embodiment of this application;
Fig. 5 is a flowchart of connection establishment according to an embodiment of this application;
Fig. 6 is a flowchart of a method for synchronously playing a message stream and an audio/video stream according to an embodiment of this application;
Fig. 7 is a sequence diagram according to an embodiment of this application;
Fig. 8 is a schematic structural diagram of a pull end according to an embodiment of this application;
Fig. 9 is a schematic structural diagram of a push end according to an embodiment of this application;
Fig. 10 is a schematic structural diagram of a system for synchronously playing a message stream and an audio/video stream according to an embodiment of this application.
Detailed Description
To make the objectives, technical solutions and advantages of this application clearer, the embodiments of this application are described in further detail below with reference to the accompanying drawings.
An embodiment of this application provides a method for synchronously playing a message stream and an audio/video stream, which can be implemented jointly by a push end, an audio/video server, a message server and a pull end. When the push end pushes the audio/video stream, the push end can be the live-streaming equipment deployed at the teacher end, which converts the teacher's image and voice during teaching into an audio/video stream and pushes it to the audio/video server; the pull end can be the live-streaming equipment at the student end, which pulls the audio/video stream from the audio/video server and plays it. When the push end pushes the message stream, the push end can be the live-streaming equipment deployed at the teacher end or the student end, which pushes the generated message stream to the message server; the pull end can be the live-streaming equipment deployed at the student end or the teacher end, which pulls the message stream from the message server and plays it (here, playing means rendering and displaying the message stream). The above audio/video server and message server can each be any CDN (Content Delivery Network) node server, which distributes the buffered audio/video stream or message stream to the pull ends. The specific system framework is shown in Fig. 1. The push end, the audio/video server, the message server and the pull end can each include a processor, a memory and a transceiver: the processor performs the processing for synchronously playing the message stream and the audio/video stream, the memory stores the data needed in and produced by that processing, and the transceiver receives and sends the related data.
The processing flow of the method for synchronously playing a message stream and an audio/video stream shown in Fig. 2 is described in detail below with reference to specific embodiments, as follows:
Step 201: the pull end pulls the audio/video stream from the audio/video server and plays it, and pulls the message stream from the message server and buffers it.
In implementation, taking the pull end being the student end as an example, when a student wants to watch a live course, the student can open video playback software that supports live online education on live-streaming equipment such as a smartphone or computer, search for the course through the software, and click the play button. The pull end then pulls the audio/video stream from the audio/video server that buffers the course's audio/video stream and plays it. Meanwhile, the pull end pulls the message stream from the message server that buffers the message stream and buffers it at the pull end; only when the message stream meets the playback condition does the pull end play it. It should be noted that both the audio/video stream and the message stream pulled by the pull end carry timestamps taken from a synchronized time source. Timestamps from the same time source accurately mark the audio/video frames and messages generated at the same moment, providing the basis for subsequently judging the synchronous playback of the audio/video stream and the message stream. Specifically, either the push end or the servers can add the timestamps. If the push end adds the timestamps, it only needs to timestamp the recorded audio/video stream and the generated message stream from the same time source, such as the push end's local time, to satisfy the synchronized-time-source requirement. If the servers add the timestamps, synchronously correcting the server time of the audio/video server and the server time of the message server satisfies the requirement.
Step 202: based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, the pull end determines, in the buffered message stream, the message to be played synchronously with the audio/video frame to be played, and plays it.
In implementation, after buffering the message stream, the pull end can control the playback timing of the messages in the stream so that each message is played synchronously with its corresponding audio/video frame. Specifically, based on the audio/video timestamp of each frame in the audio/video stream and the message timestamp of each message in the message stream, the pull end determines in the buffered message stream the messages to be played synchronously with each frame to be played. Thus, when the pull end plays a frame, if there is a message to be played synchronously, the pull end plays the frame and the corresponding message together; if there is none, it plays only the frame. It should be noted that after playing a message, the pull end can discard it, for example by deleting it from the buffer or marking it as discarded, to avoid playing the same message repeatedly. By controlling playback timing in this way, the desynchronization caused by playing pulled messages in real time is avoided, improving the interactivity and experience of live online education.
Optionally, the specific processing of step 202 above can be as follows: a message in the buffered message stream whose message timestamp is less than or equal to the audio/video timestamp of the frame to be played is determined to be a message to be played synchronously with that frame.
In implementation, to play the message stream and the audio/video stream synchronously, the pull end does not play a pulled message immediately; it buffers the message and waits until it has pulled the corresponding audio/video frame, and only then plays the message and the frame synchronously. Specifically, before playing each frame, the pull end can compare message timestamps against audio/video timestamps to find, among the unplayed messages, any message to be played synchronously with the frame currently about to play. Following the FIFO (First In First Out) principle, the pull end traverses the message timestamp of every buffered message. If there is a message whose message timestamp is less than or equal to the audio/video timestamp of the frame to be played, the pull end determines it to be a message to be played synchronously with that frame. If the message timestamps of all buffered messages are greater than the audio/video timestamp of the frame to be played, the pull end has not yet obtained the frames corresponding to the buffered messages; in that case it does not play any message in the message stream until a new frame is pulled and there exists a message whose timestamp is less than or equal to that frame's timestamp, whereupon the message is played synchronously. This processing is shown in Fig. 3.
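The per-frame comparison above can be sketched as follows (a minimal Python illustration, not code from the patent; the dict shape with a string `timestamp` field follows the JSON message example given in this description, and messages are assumed to have been buffered in FIFO timestamp order):

```python
from collections import deque


def due_messages(buffer: deque, frame_ts: int) -> list:
    """Pop and return every buffered message whose message timestamp is
    less than or equal to the timestamp of the frame about to be played.
    Popped messages are thereby discarded, so none is played twice;
    messages with larger timestamps stay buffered until their frame arrives.
    Assumes messages were buffered in timestamp (FIFO) order."""
    due = []
    while buffer and int(buffer[0]["timestamp"]) <= frame_ts:
        due.append(buffer.popleft())
    return due


# Messages reach the pull end ahead of the 2-3 s delayed A/V stream and wait.
buf = deque([{"msg": "A", "timestamp": "1000"},
             {"msg": "B", "timestamp": "2500"}])
print([m["msg"] for m in due_messages(buf, 2000)])  # ['A']; "B" keeps waiting
```

When no buffered message qualifies, the function simply returns an empty list and the frame plays alone, mirroring the behavior described above.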
Optionally, if the push end adds the timestamps, the processing flow of the method for synchronously playing a message stream and an audio/video stream can be as shown in Fig. 4, specifically as follows:
Step 401: the push end adds an audio/video timestamp to each audio/video frame in the captured audio/video stream, and adds a message timestamp to each message in the captured message stream.
In implementation, taking the push end being the teacher end as an example, the teacher can live-stream the lecture through the corresponding live-streaming equipment and interact during the stream through whiteboard, text chat, roll call and the like. The push end thus captures an audio/video stream and a message stream, and then, based on the push end's local time, adds an audio/video timestamp to each captured audio/video frame and a message timestamp to each captured message.
Optionally, the specific processing of step 401 can be as follows: the push end writes the local capture time of each audio/video frame in the audio/video stream into the frame's SEI field, and writes the local capture time of each message in the message stream into the message's timestamp field.
In implementation, after capturing an audio/video frame, the push end can write the frame's local capture time into the frame's SEI (Supplemental Enhancement Information) field as the frame's audio/video timestamp. Correspondingly, after capturing a message, the push end can write the message's local capture time into the message's timestamp field. For example, if the push end captures the message {"msg":"A"} at local time 2019/4/3 11:12:27, the push end can convert the local capture time into a unix timestamp and write it into the message's timestamp field, yielding {"msg":"A","timestamp":"1554261147000"}.
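The conversion in this example can be sketched as follows (`stamp_message` is a hypothetical helper, not part of the patent; it reproduces the field name and the millisecond unix-timestamp format of the example above):

```python
import json


def stamp_message(msg: dict, capture_time: float) -> dict:
    """Write the local capture time, converted to a unix timestamp in
    milliseconds and stored as a string, into the message's timestamp
    field, matching the format of the example in the description."""
    msg["timestamp"] = str(int(capture_time * 1000))
    return msg


# 2019/4/3 11:12:27 (UTC+8) corresponds to unix time 1554261147 seconds.
stamped = stamp_message({"msg": "A"}, capture_time=1554261147)
print(json.dumps(stamped))  # {"msg": "A", "timestamp": "1554261147000"}
```

In a live push end, `capture_time` would come from the local clock at the moment of capture; it is passed explicitly here so the example is deterministic.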
Step 402: the push end pushes the audio/video stream to the audio/video server and pushes the message stream to the message server.
In implementation, before pushing the audio/video stream and the message stream to the audio/video server and the message server respectively, the push end can first perform authentication, connection establishment and similar processing with the two servers; the process can be as shown in Fig. 5.
Step 403: the pull end pulls the audio/video stream from the audio/video server and plays it, pulls the message stream from the message server and buffers it, and, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determines in the buffered message stream the message to be played synchronously with the audio/video frame to be played, and plays it.
Optionally, if the servers add the timestamps, the processing flow of the method for synchronously playing a message stream and an audio/video stream can be as shown in Fig. 6, specifically as follows:
Step 601: the push end pushes the captured audio/video stream to the audio/video server, and pushes the captured message stream to the message server.
Step 602: the audio/video server receives the audio/video stream and adds an audio/video timestamp to each audio/video frame in the received stream.
In implementation, similarly to the push end adding audio/video timestamps above, after receiving an audio/video frame the audio/video server can write the time at which it obtained the frame, i.e. the frame acquisition time, into the frame's SEI field as the frame's audio/video timestamp.
Step 603: the message server receives the message stream and adds a message timestamp to each message in the received stream.
In implementation, similarly to the push end adding message timestamps above, after receiving a message the message server can write the time at which it obtained the message, i.e. the message acquisition time, into the message's timestamp field as the message's timestamp. It should be noted that the server time of the message server and the server time of the audio/video server must be kept synchronized, so that audio/video frames and messages generated at the same moment are marked accurately.
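The server-side stamping against a shared time base can be modeled as follows (a sketch only; `clock_offset_ms` stands in for whatever time-synchronization mechanism, such as NTP, keeps the two servers aligned — the patent requires the synchronization but does not prescribe a mechanism):

```python
import time


class StampingServer:
    """Minimal model of a timestamping server (a sketch, not patent code).
    clock_offset_ms models the correction obtained from time sync, so that
    the audio/video server and the message server effectively share one
    clock despite their raw local clocks disagreeing."""

    def __init__(self, clock_offset_ms: int = 0, now_fn=time.time):
        self.clock_offset_ms = clock_offset_ms
        self.now_fn = now_fn  # injectable clock, for deterministic testing

    def acquisition_time_ms(self) -> int:
        # Corrected local clock reading, in milliseconds.
        return round(self.now_fn() * 1000) + self.clock_offset_ms

    def stamp(self, item: dict) -> dict:
        # Write the acquisition time into the item's timestamp field.
        item["timestamp"] = str(self.acquisition_time_ms())
        return item


# Two servers whose raw clocks disagree stamp identically once corrected.
av_server = StampingServer(clock_offset_ms=-30, now_fn=lambda: 100.030)
msg_server = StampingServer(clock_offset_ms=20, now_fn=lambda: 99.980)
print(av_server.stamp({})["timestamp"], msg_server.stamp({})["timestamp"])  # 100000 100000
```

A frame and a message produced at the same moment thus receive equal timestamps, which is exactly what the pull end's comparison in step 202 relies on.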
Step 604: the pull end pulls the audio/video stream from the audio/video server and plays it, pulls the message stream from the message server and buffers it, and, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determines in the buffered message stream the message to be played synchronously with the audio/video frame to be played, and plays it.
Taking whiteboard messages as an example, the synchronous playback processing of the message stream and the audio/video stream is further described below. Fig. 7 is a sequence diagram of whiteboard message processing among the devices. The websocket server is one type of the above message server; it can process the whiteboard message data, for example by adding message timestamps. The whiteboard server, serving as the whiteboard platform, provides the corresponding backend support, such as configuring whiteboard permissions and whiteboard background images, and stores the whiteboard data generated during the live session. First, the push end establishes a connection channel with the websocket server. Then, the push end obtains initialization data from the whiteboard server, performs the whiteboard initialization, and, after initialization completes, returns the initialized whiteboard data to the whiteboard server. Next, the push end executes the whiteboard drawing commands issued by the teacher end or the student end, draws on the whiteboard, and sends the whiteboard data to the websocket server through the above connection channel. The websocket server then processes the whiteboard data, for example by adding message timestamps, and, after processing, pushes the processed whiteboard data to the whiteboard server and to the pull ends respectively. The whiteboard server stores the received whiteboard data, which facilitates replay and other processing. The pull end buffers the received whiteboard data locally and then, based on the message timestamps in the whiteboard data and the audio/video timestamps of the frames, determines when to play the whiteboard data.
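The websocket server's step in this sequence can be sketched as follows (a minimal illustration; `store` and `broadcast` are hypothetical callbacks standing in for the real whiteboard-server and pull-end transports, which the patent does not specify at this level of detail):

```python
import time


def process_whiteboard_data(data: dict, store, broadcast, now_ms=None):
    """Sketch of the websocket server's role in Fig. 7: enrich the
    whiteboard message with a message timestamp, then push it both to
    the whiteboard server (persisted for replay) and to the pull ends
    (buffered until the matching audio/video frame arrives)."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    data["timestamp"] = str(now_ms)
    store(data)       # whiteboard server: storage for replay
    broadcast(data)   # pull ends: buffer, do not render immediately
    return data


stored, delivered = [], []
process_whiteboard_data({"msg": "draw line"}, stored.append, delivered.append,
                        now_ms=1554261147000)
print(stored == delivered, stored[0]["timestamp"])  # True 1554261147000
```

The key design point mirrored here is that the same timestamped copy goes to storage and to the viewers, so a replayed session and a live session synchronize against identical timestamps.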
In this embodiment, after receiving the message stream, the pull end does not play it immediately but buffers it. Meanwhile, based on the audio/video timestamps of the audio/video stream and the message timestamps of the message stream, both taken from the same time source, the pull end determines a message whose message timestamp is less than or equal to the audio/video timestamp of an audio/video frame to be a message to be played synchronously with that frame. Thus, when the pull end plays an audio/video frame, if there is a message to be played synchronously, the pull end plays the frame and the corresponding message together; if there is none, it plays only the frame. By controlling the playback timing of messages, the desynchronization caused by playing pulled messages in real time can be avoided, thereby improving the interactivity and experience of live online education.
Based on the same technical concept, an embodiment of this application also provides a pull end. As shown in Fig. 8, the pull end includes:
a pull module 801, configured to pull an audio/video stream from an audio/video server and play it, and pull a message stream from a message server and buffer it, wherein each audio/video frame in the audio/video stream carries an audio/video timestamp, each message in the message stream carries a message timestamp, and the audio/video timestamps and the message timestamps are taken from a synchronized time source;
a synchronous playback module 802, configured to determine, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, the message in the buffered message stream to be played synchronously with the audio/video frame to be played, and play it.
Optionally, each audio/video frame in the audio/video stream carrying an audio/video timestamp and each message in the message stream carrying a message timestamp includes:
each audio/video frame in the audio/video stream carrying a local capture time written by the push end into an SEI field, and each message in the message stream carrying a local capture time written by the push end into a timestamp field.
Optionally, each audio/video frame in the audio/video stream carrying an audio/video timestamp and each message in the message stream carrying a message timestamp includes:
each audio/video frame in the audio/video stream carrying a frame acquisition time written by the audio/video server into an SEI field, and each message in the message stream carrying a message acquisition time written by the message server into a timestamp field, with the server time of the audio/video server and the server time of the message server kept synchronized.
Optionally, the synchronous playback module 802 is configured to:
determine a message in the message stream whose message timestamp is less than or equal to the audio/video timestamp of the audio/video frame to be played to be a message to be played synchronously with the audio/video frame to be played.
Based on the same technical concept, an embodiment of this application also provides a push end. As shown in Fig. 9, the push end includes:
a timestamp module 901, configured to add an audio/video timestamp to each audio/video frame in the captured audio/video stream and add a message timestamp to each message in the captured message stream, wherein the audio/video timestamps and the message timestamps are taken from the local time of the push end;
a push module 902, configured to push the audio/video stream to an audio/video server and push the message stream to a message server.
Optionally, the timestamp module 901 is configured to:
write the local capture time of each audio/video frame in the audio/video stream into the frame's SEI field;
write the local capture time of each message in the message stream into the message's timestamp field, so that the pull end, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determines in the buffered message stream the message to be played synchronously with the audio/video frame to be played, and plays it.
Optionally, determining, in the message stream, the message to be played synchronously with an audio/video frame based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages includes:
determining a message in the message stream whose message timestamp is less than or equal to the audio/video timestamp of the audio/video frame to be played to be a message to be played synchronously with the audio/video frame to be played.
Based on the same technical concept, an embodiment of this application also provides a system for synchronously playing a message stream and an audio/video stream. As shown in Fig. 10, the system includes an audio/video server 1011 and a message server 1012, wherein:
the audio/video server 1011 is configured to receive an audio/video stream and add an audio/video timestamp to each audio/video frame in the received audio/video stream;
the message server 1012 is configured to receive a message stream and add a message timestamp to each message in the received message stream, wherein the server time of the message server and the server time of the audio/video server are kept synchronized.
Optionally, the audio/video server 1011 is configured to:
write the frame acquisition time into the SEI field of each audio/video frame in the received audio/video stream;
and the message server 1012 is configured to:
write the message acquisition time into the timestamp field of each message in the received message stream, so that the pull end, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determines in the buffered message stream the message to be played synchronously with the audio/video frame to be played, and plays it.
Optionally, determining, in the message stream, the message to be played synchronously with an audio/video frame based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages includes:
determining a message in the message stream whose message timestamp is less than or equal to the audio/video timestamp of the audio/video frame to be played to be a message to be played synchronously with the audio/video frame to be played.
Based on the same technical concept, an embodiment of this application also provides a system for synchronously playing a message stream and an audio/video stream. As shown in Fig. 1, the system includes the aforementioned push end, audio/video server, message server and pull end, wherein:
the push end is configured to push the captured audio/video stream to the audio/video server, and push the captured message stream to the message server;
the audio/video server is configured to add an audio/video timestamp to each audio/video frame in the received audio/video stream;
the message server is configured to add a message timestamp to each message in the received message stream, wherein the server time of the message server and the server time of the audio/video server are kept synchronized;
the pull end is configured to pull the audio/video stream from the audio/video server and play it, pull the message stream from the message server and buffer it, and determine, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, the message in the buffered message stream to be played synchronously with the audio/video frame to be played, and play it.
Based on the same technical concept, an embodiment of this application also provides a system for synchronously playing a message stream and an audio/video stream. As shown in Fig. 1, the system includes the aforementioned push end, audio/video server, message server and pull end, wherein:
the push end is configured to add an audio/video timestamp to each audio/video frame in the captured audio/video stream, and add a message timestamp to each message in the captured message stream;
the push end is further configured to push the audio/video stream to the audio/video server, and push the message stream to the message server;
the pull end is configured to pull the audio/video stream from the audio/video server and play it, pull the message stream from the message server and buffer it, and determine, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, the message in the buffered message stream to be played synchronously with the audio/video frame to be played, and play it.
Those of ordinary skill in the art will understand that all or some of the steps of the above embodiments can be implemented by hardware, or by a program instructing the relevant hardware; the program can be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely preferred embodiments of this application and are not intended to limit it; any modification, equivalent replacement, improvement and the like made within the spirit and principles of this application shall fall within the protection scope of this application.

Claims (21)

  1. A method for synchronously playing a message stream and an audio/video stream, wherein the method is performed at a pull end and comprises:
    pulling an audio/video stream from an audio/video server and playing it, and pulling a message stream from a message server and buffering it, wherein each audio/video frame in the audio/video stream carries an audio/video timestamp, each message in the message stream carries a message timestamp, and the audio/video timestamps and the message timestamps are taken from a synchronized time source;
    based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determining, in the buffered message stream, the message to be played synchronously with the audio/video frame to be played, and playing it.
  2. The method according to claim 1, wherein each audio/video frame in the audio/video stream carrying an audio/video timestamp and each message in the message stream carrying a message timestamp comprises:
    each audio/video frame in the audio/video stream carrying a local capture time written by a push end into an SEI field, and each message in the message stream carrying a local capture time written by the push end into a timestamp field.
  3. The method according to claim 1, wherein each audio/video frame in the audio/video stream carrying an audio/video timestamp and each message in the message stream carrying a message timestamp comprises:
    each audio/video frame in the audio/video stream carrying a frame acquisition time written by the audio/video server into an SEI field, and each message in the message stream carrying a message acquisition time written by the message server into a timestamp field, with the server time of the audio/video server and the server time of the message server kept synchronized.
  4. The method according to claim 2 or 3, wherein determining, in the buffered message stream, the message to be played synchronously with the audio/video frame to be played based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages comprises:
    determining a message in the message stream whose message timestamp is less than or equal to the audio/video timestamp of the audio/video frame to be played to be a message to be played synchronously with the audio/video frame to be played.
  5. A method for synchronously playing a message stream and an audio/video stream, wherein the method is performed at a push end and comprises:
    adding an audio/video timestamp to each audio/video frame in a captured audio/video stream, and adding a message timestamp to each message in a captured message stream, wherein the audio/video timestamps and the message timestamps are taken from the local time of the push end;
    pushing the audio/video stream to an audio/video server, and pushing the message stream to a message server.
  6. The method according to claim 5, wherein adding an audio/video timestamp to each audio/video frame in the captured audio/video stream and adding a message timestamp to each message in the captured message stream comprises:
    writing the local capture time of each audio/video frame in the audio/video stream into the frame's SEI field;
    writing the local capture time of each message in the message stream into the message's timestamp field, so that a pull end, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determines in the buffered message stream the message to be played synchronously with the audio/video frame to be played, and plays it.
  7. The method according to claim 6, wherein determining, in the buffered message stream, the message to be played synchronously with the audio/video frame to be played based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages comprises:
    determining a message in the message stream whose message timestamp is less than or equal to the audio/video timestamp of the audio/video frame to be played to be a message to be played synchronously with the audio/video frame to be played.
  8. A method for synchronously playing a message stream and an audio/video stream, wherein the method comprises:
    an audio/video server receiving an audio/video stream and adding an audio/video timestamp to each audio/video frame in the received audio/video stream;
    a message server receiving a message stream and adding a message timestamp to each message in the received message stream, wherein the server time of the message server and the server time of the audio/video server are kept synchronized.
  9. The method according to claim 8, wherein adding an audio/video timestamp to each audio/video frame in the received audio/video stream comprises:
    the audio/video server writing the frame acquisition time into the SEI field of each audio/video frame in the received audio/video stream;
    and adding a message timestamp to each message in the received message stream comprises:
    the message server writing the message acquisition time into the timestamp field of each message in the received message stream, so that a pull end, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determines in the buffered message stream the message to be played synchronously with the audio/video frame to be played, and plays it.
  10. The method according to claim 9, wherein determining, in the buffered message stream, the message to be played synchronously with the audio/video frame to be played based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages comprises:
    determining a message in the message stream whose message timestamp is less than or equal to the audio/video timestamp of the audio/video frame to be played to be a message to be played synchronously with the audio/video frame to be played.
  11. A pull end, wherein the pull end comprises:
    a pull module, configured to pull an audio/video stream from an audio/video server and play it, and pull a message stream from a message server and buffer it, wherein each audio/video frame in the audio/video stream carries an audio/video timestamp, each message in the message stream carries a message timestamp, and the audio/video timestamps and the message timestamps are taken from a synchronized time source;
    a synchronous playback module, configured to determine, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, the message in the buffered message stream to be played synchronously with the audio/video frame to be played, and play it.
  12. The pull end according to claim 11, wherein each audio/video frame in the audio/video stream carrying an audio/video timestamp and each message in the message stream carrying a message timestamp comprises:
    each audio/video frame in the audio/video stream carrying a local capture time written by a push end into an SEI field, and each message in the message stream carrying a local capture time written by the push end into a timestamp field.
  13. The pull end according to claim 11, wherein each audio/video frame in the audio/video stream carrying an audio/video timestamp and each message in the message stream carrying a message timestamp comprises:
    each audio/video frame in the audio/video stream carrying a frame acquisition time written by the audio/video server into an SEI field, and each message in the message stream carrying a message acquisition time written by the message server into a timestamp field, with the server time of the audio/video server and the server time of the message server kept synchronized.
  14. The pull end according to claim 12 or 13, wherein the synchronous playback module is configured to:
    determine a message in the message stream whose message timestamp is less than or equal to the audio/video timestamp of the audio/video frame to be played to be a message to be played synchronously with the audio/video frame to be played.
  15. A push end, wherein the push end comprises:
    a timestamp module, configured to add an audio/video timestamp to each audio/video frame in a captured audio/video stream and add a message timestamp to each message in a captured message stream, wherein the audio/video timestamps and the message timestamps are taken from the local time of the push end;
    a push module, configured to push the audio/video stream to an audio/video server and push the message stream to a message server.
  16. The push end according to claim 15, wherein the timestamp module is configured to:
    write the local capture time of each audio/video frame in the audio/video stream into the frame's SEI field;
    write the local capture time of each message in the message stream into the message's timestamp field, so that a pull end, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determines in the buffered message stream the message to be played synchronously with the audio/video frame to be played, and plays it.
  17. The push end according to claim 16, wherein determining, in the buffered message stream, the message to be played synchronously with the audio/video frame based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages comprises:
    determining a message in the message stream whose message timestamp is less than or equal to the audio/video timestamp of the audio/video frame to be played to be a message to be played synchronously with the audio/video frame to be played.
  18. A system for synchronously playing a message stream and an audio/video stream, wherein the system comprises an audio/video server and a message server, wherein:
    the audio/video server is configured to receive an audio/video stream and add an audio/video timestamp to each audio/video frame in the received audio/video stream;
    the message server is configured to receive a message stream and add a message timestamp to each message in the received message stream, wherein the server time of the message server and the server time of the audio/video server are kept synchronized.
  19. The system according to claim 18, wherein the audio/video server is configured to:
    write the frame acquisition time into the SEI field of each audio/video frame in the received audio/video stream;
    and the message server is configured to:
    write the message acquisition time into the timestamp field of each message in the received message stream, so that a pull end, based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages, determines in the buffered message stream the message to be played synchronously with the audio/video frame to be played, and plays it.
  20. The system according to claim 19, wherein determining, in the buffered message stream, the message to be played synchronously with the audio/video frame to be played based on the audio/video timestamps of the audio/video frames and the message timestamps of the messages comprises:
    determining a message in the message stream whose message timestamp is less than or equal to the audio/video timestamp of the audio/video frame to be played to be a message to be played synchronously with the audio/video frame to be played.
  21. A system for synchronously playing a message stream and an audio/video stream, wherein the system comprises the pull end according to any one of claims 11-14, the push end according to any one of claims 15-17, and the audio/video server and the message server according to any one of claims 18-20.
PCT/CN2019/086061 2019-04-04 2019-05-08 Method, device and system for synchronously playing message stream and audio/video stream WO2020199304A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19752632.0A EP3742742A4 (en) 2019-04-04 2019-05-08 METHOD, APPARATUS AND SYSTEM FOR SYNCHRONOUS PLAYBACK OF MESSAGE STREAMS AND AUDIO / VIDEO STREAMS
US16/544,991 US11102540B2 (en) 2019-04-04 2019-08-20 Method, device and system for synchronously playing message stream and audio-video stream

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910272614.8A 2019-04-04 Method, device and system for synchronously playing message stream and audio/video stream
CN201910272614.8 2019-04-04

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/544,991 Continuation US11102540B2 (en) 2019-04-04 2019-08-20 Method, device and system for synchronously playing message stream and audio-video stream

Publications (1)

Publication Number Publication Date
WO2020199304A1 true WO2020199304A1 (zh) 2020-10-08

Family

ID=67237398

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/086061 WO2020199304A1 (zh) 2019-04-04 2019-05-08 Method, device and system for synchronously playing message stream and audio/video stream

Country Status (3)

Country Link
EP (1) EP3742742A4 (zh)
CN (1) CN110035311A (zh)
WO (1) WO2020199304A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112770122A (zh) * 2020-12-31 2021-05-07 上海网达软件股份有限公司 Method and system for video synchronization in a cloud broadcast-directing station
CN113766261A (zh) * 2021-09-06 2021-12-07 百果园技术(新加坡)有限公司 Method and device for determining pre-pull duration, electronic device and storage medium
CN114827664A (zh) * 2022-04-27 2022-07-29 咪咕文化科技有限公司 Multi-channel live stream mixing method, server, terminal device, system and storage medium

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110958466A (zh) * 2019-12-17 2020-04-03 杭州当虹科技股份有限公司 SDI signal synchronous return method based on RTMP transmission
DE102020117371A1 (de) * 2020-07-01 2022-01-05 SPORTTOTAL TECHNOLOGY GmbH Codec for transmitting sports events
CN112929713B (zh) * 2021-02-07 2024-04-02 Oppo广东移动通信有限公司 Data synchronization method, device, terminal and storage medium
CN113259738B (zh) * 2021-05-08 2022-07-29 广州市奥威亚电子科技有限公司 Audio/video synchronization method and device, electronic device and storage medium
CN113473163B (zh) * 2021-05-24 2023-04-07 康键信息技术(深圳)有限公司 Data transmission method, device, equipment and storage medium during live webcasting
CN113542815B (zh) * 2021-09-17 2022-03-08 广州易方信息科技股份有限公司 Live video synchronization method and system
CN114143584B (zh) * 2021-09-29 2024-03-26 杭州当虹科技股份有限公司 Broadcast system and method for synchronized playback across multiple terminals
CN115942021B (zh) * 2023-02-17 2023-06-27 央广新媒体文化传媒(北京)有限公司 Audio/video stream synchronous playback method and device, electronic device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2840803A1 (en) * 2013-08-16 2015-02-25 Samsung Electronics Co., Ltd. Display apparatus and control method thereof
US20150296228A1 (en) * 2014-04-14 2015-10-15 David Mo Chen Systems and Methods for Performing Multi-Modal Video Datastream Segmentation
CN105323655A (zh) * 2015-10-10 2016-02-10 上海慧体网络科技有限公司 Method for synchronizing video/score on a mobile terminal according to timestamps
CN106326343A (zh) * 2016-08-05 2017-01-11 重庆锐畅科技有限公司 Electronic whiteboard data sharing system based on associated synchronization of audio/video data
CN108156480A (zh) * 2017-12-27 2018-06-12 腾讯科技(深圳)有限公司 Video subtitle generation method, related device and system

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6173317B1 (en) * 1997-03-14 2001-01-09 Microsoft Corporation Streaming and displaying a video stream with synchronized annotations over a computer network
EP1290894A2 (en) * 2000-05-30 2003-03-12 Nokia Corporation Video message sending
CN102655606A (zh) * 2012-03-30 2012-09-05 浙江大学 Method and system for adding real-time subtitles and sign-language services to live programs based on P2P networks
CN105959772B (zh) * 2015-12-22 2019-04-23 合一网络技术(北京)有限公司 Method, device and system for instant synchronized display and matching of streaming media and subtitles
CN106993239B (zh) * 2017-03-29 2019-12-10 广州酷狗计算机科技有限公司 Information display method during live streaming
CN106816055B (zh) * 2017-04-05 2019-02-01 杭州恒生数字设备科技有限公司 Interactive low-power teaching live-streaming and recording system and method
CN107666619B (zh) * 2017-06-15 2019-11-08 北京金山云网络技术有限公司 Live data transmission method, device, electronic device, server and storage medium
CN107743247A (zh) * 2017-09-27 2018-02-27 福建天泉教育科技有限公司 PPT online presentation method and system
CN108600773B (zh) * 2018-04-25 2021-08-10 腾讯科技(深圳)有限公司 Subtitle data pushing method, subtitle display method, device, equipment and medium
CN109547831B (zh) * 2018-11-19 2021-06-01 网宿科技股份有限公司 Method, device, computing device and storage medium for synchronizing a whiteboard with video


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3742742A4 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112770122A (zh) * 2020-12-31 2021-05-07 上海网达软件股份有限公司 Method and system for video synchronization in a cloud broadcast-directing station
CN112770122B (zh) * 2020-12-31 2022-10-14 上海网达软件股份有限公司 Method and system for video synchronization in a cloud broadcast-directing station
CN113766261A (zh) * 2021-09-06 2021-12-07 百果园技术(新加坡)有限公司 Method and device for determining pre-pull duration, electronic device and storage medium
CN114827664A (zh) * 2022-04-27 2022-07-29 咪咕文化科技有限公司 Multi-channel live stream mixing method, server, terminal device, system and storage medium
CN114827664B (zh) * 2022-04-27 2023-10-20 咪咕文化科技有限公司 Multi-channel live stream mixing method, server, terminal device, system and storage medium

Also Published As

Publication number Publication date
CN110035311A (zh) 2019-07-19
EP3742742A1 (en) 2020-11-25
EP3742742A4 (en) 2020-11-25

Similar Documents

Publication Publication Date Title
WO2020199304A1 (zh) Method, device and system for synchronously playing message stream and audio/video stream
WO2020103203A1 (zh) Method, device, computing device and storage medium for synchronizing a whiteboard with video
US11252444B2 (en) Video stream processing method, computer device, and storage medium
CA2438194C (en) Live navigation web-conferencing system and method
CN104539436B (zh) Real-time classroom content live-streaming method and system
CN112616062B (zh) Subtitle display method and device, electronic device and storage medium
CN107820115A (zh) Method, device, client and storage medium for previewing video information
CN112291498B (zh) Audio/video data transmission method, device and storage medium
CN106327929A (zh) Visualized data control method and system for informatization
CN112601101A (zh) Subtitle display method and device, electronic device and storage medium
CN110460864A (zh) Method for improving picture quality of online live teaching
CN112351295A (zh) Method for synchronizing live streaming and replay in online education
WO2019196577A1 (zh) Streaming media playback method, server, client and computer device
WO2017016266A1 (zh) Method and device for achieving synchronized playback
US11102540B2 (en) Method, device and system for synchronously playing message stream and audio-video stream
CN110111614A (zh) Method and system for achieving audio-screen synchronization in audio/video teaching
CN113542906A (zh) Plugin-free web playback method based on RTSP video
CN112995720B (zh) Audio/video synchronization method and device
KR20150112113A (ko) Method for managing online lecture content based on event processing
KR20010067612A (ko) Virtual-reality-based Internet remote lecture system and method
CN114938443A (zh) Real-time scoring method for practical experiment examinations based on streaming media
CN114339284A (zh) Live-streaming delay monitoring method, equipment, storage medium and program product
CN206993282U (zh) File transfer management system
JP2020174378A (ja) Synchronization of media rendering in heterogeneous networking environments
WO2014169634A1 (zh) Media playback processing method, device, system and media server

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2019752632

Country of ref document: EP

Effective date: 20190821

NENP Non-entry into the national phase

Ref country code: DE