WO2017107578A1 - Method, device, and system for real-time synchronous display and matching processing of streaming media and subtitles - Google Patents

Method, device, and system for real-time synchronous display and matching processing of streaming media and subtitles

Info

Publication number
WO2017107578A1
WO2017107578A1 (PCT/CN2016/098659)
Authority
WO
WIPO (PCT)
Prior art keywords
subtitle
video
audio data
layer
streaming media
Prior art date
Application number
PCT/CN2016/098659
Other languages
English (en)
French (fr)
Inventor
徐晶
李萌
孙俊
顾思斌
潘柏宇
王冀
Original Assignee
合一网络技术(北京)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 合一网络技术(北京)有限公司
Priority to EP16877389.3A (published as EP3334175A4)
Priority to US 15/757,775 (published as US20190387263A1)
Publication of WO2017107578A1

Classifications

    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD], including:
    • H04N 21/83 - Generation or processing of protective or descriptive data associated with content; content structuring
    • H04N 21/2187 - Live feed
    • H04N 21/23406 - Processing of video elementary streams involving management of server-side video buffer
    • H04N 21/235 - Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H04N 21/23614 - Multiplexing of additional data and video streams
    • H04N 21/242 - Synchronization processes, e.g. processing of PCR [Program Clock References]
    • H04N 21/43072 - Synchronising the rendering of multiple content streams or additional data on the same device
    • H04N 21/44004 - Processing of video elementary streams involving video buffer management, e.g. video decoder buffer or video display buffer
    • H04N 21/4884 - Data services for displaying subtitles
    • H04N 21/6437 - Real-time Transport Protocol [RTP]
    • H04N 21/8547 - Content authoring involving timestamps for synchronizing content

Definitions

  • The present invention relates to the field of live streaming media technology, and in particular to a method and device for real-time synchronous display of streaming media and subtitles, a method and device for synchronous matching processing of streaming media and subtitles, and a system for real-time synchronous display of streaming media and subtitles.
  • Compared with simultaneous interpretation, subtitle translation greatly reduces visual interference and improves synchronization.
  • However, most existing systems play the video on its own and translate the subtitles separately.
  • As a result, subtitles and video cannot achieve true real-time audio-picture-subtitle synchronization; a transparent layer is simply placed over the video for subtitle display, and adaptation to mobile terminals is impossible.
  • Overall, the means of achieving subtitle translation are relatively backward and the operation is complicated.
  • For example, patent CN102655606A discloses a method and system for adding real-time subtitle and sign-language services to a live program based on a P2P network, which includes the following steps:
  • The production of the real-time subtitles described in step 1) is as follows:
  • stenographers enter the subtitle content in real time;
  • the stenographers review the subtitle content that has been entered.
  • In step 2) the real-time sign language is produced; the specific steps are as follows:
  • sign-language interpreters translate the program content in real time.
  • In step 3) the live webcast program is synchronized with the subtitles and the sign language; the specific steps are as follows:
  • according to the timestamp of a live program video frame, the subtitle buffer and the sign-language video buffer are searched for the corresponding subtitle and sign-language video frame; if found, the subtitle and sign-language video are displayed together with the live video; if not, only the live video is displayed.
  • The existing live web subtitles evolved from subtitle insertion in the broadcasting and television field.
  • There, subtitles are added at the signal terminal by hardware subtitle equipment; as a result, Internet subtitles cannot achieve true real-time synchronization of subtitles with video and audio.
  • The present invention provides a method for real-time subtitle display based on live streaming media to solve the above technical problems.
  • The invention provides a method for real-time synchronous display of streaming media and subtitles, comprising: encoding the video and audio data in the collected streaming media and sending it to a live broadcast server; acquiring subtitle data corresponding to the video and audio data and sending it to the live broadcast server; the live broadcast server buffering the encoded video and audio data according to a preset delay time, forming a subtitle layer from the subtitle data and buffering it, and establishing a synchronous matching relationship between the subtitle layer and the video and audio data before sending both; and mixing the received subtitle layer and video and audio data having the synchronous matching relationship to form streaming media information, which is distributed to network nodes for output.
  • Establishing the synchronous matching relationship between the cached subtitle layer and the video and audio data includes:
  • marking the cached video and audio data according to its play time points to form a play time axis;
  • establishing for the subtitle layer a subtitle time axis matching the play time axis of the video and audio data, or establishing display start and end timestamps for the subtitle layer according to the play time axis;
  • the display start timestamp and end timestamp of the subtitle layer are collectively referred to as the subtitle timestamp.
  • Mixing the subtitle layer having the synchronous matching relationship with the video and audio data includes: embedding the subtitle time axis of the subtitle layer into the play time axis of the video and audio data, or embedding the start and end timestamps into the play time axis of the video and audio data;
  • the subtitle layer is then synthesized with the video and audio data.
  • Establishing the synchronous matching relationship between the subtitle layer and the video and audio data further includes: correcting the subtitle layer having the synchronous matching relationship to form a new subtitle layer that overlays the original subtitle layer, and adjusting the play time axis, the subtitle time axis, or the subtitle timestamp corresponding to the corrected content so that the new subtitle layer is synchronously matched with the video and audio data.
  • Modification of the subtitle layer includes: inserting a preset subtitle, skipping, correcting a subtitle, or a one-click subtitle operation.
  • the length of the play time axis is the sum of the length of the video and audio data and the preset delay time.
  • Acquiring the subtitle data corresponding to the video and audio data and sending it to the live broadcast server includes: correcting the acquired subtitle data corresponding to the video and audio data.
  • The live broadcast server buffering the encoded video and audio data according to a preset delay time includes: delay-buffering each frame of the video and audio data, or delay-buffering the beginning portion of the video and audio data, or delay-buffering the end portion of the video and audio data, or delaying the video and audio frames corresponding to a position according to a pre-modified subtitle position or a pre-adjusted position of the video and audio data.
  • The invention also provides a device for real-time synchronous display of streaming media and subtitles, comprising:
  • a video and audio collection and encoding unit, configured to encode the video and audio data in the collected streaming media and send it to the live broadcast server;
  • a subtitle obtaining unit, configured to acquire the subtitle data of the video and audio data, form a subtitle layer, and send it to the live broadcast server;
  • a processing unit, configured to buffer the encoded video and audio data according to a preset delay time, buffer the subtitle layer, establish a synchronization matching relationship between the cached subtitle layer and the video and audio data, and then send both;
  • a hybrid encoding unit, configured to receive the subtitle layer and the video and audio data having the synchronous matching relationship, mix the two, and distribute the output to network nodes according to a predetermined transmission protocol.
  • the processing unit includes:
  • a play time axis forming unit, configured to mark the cached video and audio data according to its play time points to form a play time axis;
  • a subtitle time axis forming unit, configured to establish for the subtitle layer a subtitle time axis matching the play time axis of the video and audio data;
  • a subtitle timestamp forming unit, configured to establish display start and end timestamps for the subtitle layer according to the play time axis;
  • the display start timestamp and end timestamp of the subtitle layer are collectively referred to as the subtitle timestamp.
  • The hybrid encoding unit includes:
  • a synthesis embedding unit, configured to embed the subtitle time axis of the subtitle layer into the play time axis of the video and audio data, or to embed the start and end timestamps into the play time axis of the video and audio data, and to synthesize the subtitle layer with the video and audio data.
  • the processing unit includes:
  • a subtitle layer correcting unit, configured to correct the subtitle layer having the synchronous matching relationship, form a new subtitle layer, and overlay it on the original subtitle layer;
  • an adjusting unit, configured to adjust the play time axis, the subtitle time axis, or the subtitle timestamp corresponding to the corrected content so that the new subtitle layer is synchronously matched with the video and audio data.
  • The subtitle layer correcting unit is configured to perform operations of inserting a preset subtitle, skipping, correcting a subtitle, or one-click subtitling on the subtitle layer.
  • The caption acquiring unit includes: a caption data correcting unit, configured to correct the acquired caption data corresponding to the video and audio data.
  • The processing unit includes: a delay buffer unit, configured to delay-buffer each frame of the video and audio data, or delay-buffer the beginning portion of the video and audio data, or delay-buffer the end portion of the video and audio data, or delay the video and audio data frames corresponding to a position according to the pre-modified subtitle position or the pre-adjusted position of the video and audio data.
  • The invention also provides a processing method for synchronous matching of streaming media and subtitles, comprising:
  • buffering the received encoded video and audio data according to a preset delay time; forming a subtitle layer from the subtitle data corresponding to the received video and audio data and caching it; and establishing a synchronization matching relationship between the video and audio data and the subtitle layer before sending them.
  • Establishing the synchronization matching relationship between the video and audio data and the subtitle layer includes:
  • marking the cached video and audio data according to its play time points to form a play time axis;
  • establishing for the subtitle layer a subtitle time axis matching the play time axis of the video and audio data, or establishing display start and end timestamps for the subtitle layer according to the play time axis;
  • the display start timestamp and end timestamp of the subtitle layer are collectively referred to as the subtitle timestamp.
  • Buffering the received encoded video and audio data according to a preset delay time includes:
  • delay-buffering each frame of the video and audio data, or delay-buffering the beginning portion of the video and audio data, or delay-buffering the end portion of the video and audio data, or delaying the video and audio data frames corresponding to a position according to the position of the pre-modified subtitle or the pre-adjusted position of the video and audio data.
  • The present invention further provides a processing device for synchronous matching of streaming media and subtitles, which includes:
  • a delay buffer unit, configured to buffer the received encoded video and audio data according to a preset delay time;
  • a subtitle layer forming unit, configured to form a subtitle layer from the subtitle data corresponding to the received video and audio data and cache it;
  • a synchronization matching relationship establishing unit, configured to establish a synchronization matching relationship between the video and audio data and the subtitle layer, and then send them.
  • the synchronization matching relationship establishing unit includes:
  • a play time axis forming unit, configured to mark the cached video and audio data according to its play time points to form a play time axis;
  • a subtitle time axis forming unit, configured to establish for the subtitle layer a subtitle time axis matching the play time axis of the video and audio data;
  • a subtitle timestamp establishing unit, configured to establish display start and end timestamps for the subtitle layer according to the play time axis;
  • the display start timestamp and end timestamp of the subtitle layer are collectively referred to as the subtitle timestamp.
  • the synchronization matching relationship establishing unit includes:
  • a subtitle layer correcting unit, configured to correct the subtitle layer having the synchronous matching relationship, form a new subtitle layer, and overlay it on the original subtitle layer;
  • an adjusting unit, configured to adjust the play time axis, the subtitle time axis, or the subtitle timestamp corresponding to the corrected content so that the new subtitle layer is synchronously matched with the video and audio data.
  • The delay buffer unit is configured to delay-buffer each frame of the video and audio data, or delay-buffer the beginning portion of the video and audio data, or delay-buffer the end portion of the video and audio data, or delay the video and audio frames corresponding to a position according to the pre-modified subtitle position or the pre-adjusted position of the video and audio data.
  • The invention also provides a system for real-time synchronous display of streaming media and subtitles, comprising:
  • a caption acquisition device, configured to input caption data matching the video and audio data and send it to the live broadcast server according to a predetermined caption transmission protocol;
  • a live broadcast service device, configured to cache the encoded video and audio data according to a preset delay time, form a subtitle layer from the subtitle data and cache it, establish a synchronous matching relationship between the subtitle layer and the video and audio data, and then send both;
  • a hybrid encoding device, configured to mix the received subtitle layer having the synchronous matching relationship with the video and audio data to form streaming media information, and distribute the streaming media information to network nodes according to a predetermined transmission protocol.
  • The hybrid encoding device includes:
  • a synthesis processor, configured to embed the subtitle time axis of the subtitle layer into the play time axis of the video and audio data, or to embed the start and end timestamps into the play time axis of the video and audio data, and to synthesize the subtitle layer with the video and audio data.
  • the live broadcast service device includes:
  • a subtitle layer modifier, configured to modify the subtitle layer having the synchronous matching relationship to form a new subtitle layer and overlay it on the original subtitle layer, and to adjust the play time axis, the subtitle time axis, or the subtitle timestamp corresponding to the modified content so that the new subtitle layer is synchronously matched with the video and audio data.
  • The caption acquisition device includes: a caption data modifier, configured to correct the acquired caption data corresponding to the video and audio data.
  • The above provides methods, devices, and a system for synchronous display and matching processing of streaming media and subtitles.
  • In the method for synchronous display of streaming media and subtitles, the collected video and audio data is encoded and sent to the live broadcast server, and the subtitle data related to the video and audio data is acquired and sent to the live broadcast server.
  • The live broadcast server buffers the encoded video and audio data according to a preset delay time, and forms a subtitle layer from the subtitle data and caches it.
  • Because a delay time is set, the caption data and/or the caption layer can be corrected, so that the match between captions and video and audio data is more accurate, the subtitle error rate is reduced, and accurate synchronous display with the video and audio is guaranteed.
  • Moreover, the synchronized display of subtitles is not geographically restricted.
  • FIG. 1 is a flow chart of the method for real-time synchronous display of streaming media and subtitles provided by the present invention;
  • FIG. 2 is a schematic structural diagram of the device for real-time synchronous display of streaming media and subtitles provided by the present invention;
  • FIG. 3 is a flow chart of the processing method for synchronous matching of streaming media and subtitles provided by the present invention;
  • FIG. 4 is a schematic structural diagram of the processing apparatus for synchronous matching of streaming media and subtitles provided by the present invention;
  • FIG. 5 is a schematic diagram of the system for real-time synchronous display of streaming media and subtitles provided by the present invention;
  • FIG. 6 is a structural block diagram of a device for real-time synchronous display of streaming media and subtitles according to another embodiment of the present invention;
  • FIG. 7 is a structural block diagram of a processing device for synchronous matching of streaming media and subtitles according to another embodiment of the present invention.
  • FIG. 1 is a flowchart of the method for real-time synchronous display of streaming media and subtitles provided by the present invention.
  • The invention mainly aims to synchronize the video and audio files of a live scene, collected in real time, with the subtitle file during playback, so that subtitles and the video and audio files are displayed synchronously in real time on the display device. Specifically, the following steps are taken:
  • Step S100: encode the video and audio data in the collected streaming media and send it to the live broadcast server.
  • The video and audio data in the streaming media may be satellite or digital high-definition signals generated by recording a live program or live event; these signals are collected by an encoder, encoded, and sent to the live broadcast server.
  • Encoding the video and audio data may be implemented by third-party software, such as Windows Media Encoder.
  • The encoded video and audio data may be sent to the live broadcast server according to a predetermined transmission protocol, which may be RTMP (Real Time Messaging Protocol); the transport protocol may include the basic RTMP protocol and its many variants such as RTMPT/RTMPS/RTMPE.
  • The live broadcast program or live event site described herein is not subject to geographical restrictions, and the collected live program or event signal is not restricted by the input of the signal source.
  • Step S110: acquire subtitle data corresponding to the video and audio data and send it to the live broadcast server.
  • The subtitle data may be produced by simultaneous interpretation at the live program or event: the audio is translated as it is spoken, and a stenographer enters the translated content into a subtitle management system, which sends it to the live broadcast server.
  • The caption data can be transmitted using the same transmission protocol as the video and audio data.
  • The acquired caption data corresponding to the video and audio data may be corrected, so that typos caused by human error can be fixed and the accuracy of the caption data improved.
  • Step S120: the live broadcast server buffers the encoded video and audio data according to a preset delay time, forms a subtitle layer from the subtitle data and caches it, establishes a synchronization matching relationship between the subtitle layer and the video and audio data, and then sends both.
  • The live broadcast server caches the encoded video and audio data according to the preset delay time.
  • The video and audio data may be cached in a storage space of the live broadcast server, and the preset delay time may be set as required, for example between 30 and 90 seconds; this time can be determined according to the size of the storage space.
  • The video and audio data may be stored by delay-processing each frame, or by delay-processing the beginning portion of the video and audio data, or by delay-processing the end portion of the video and audio data, among other methods.
  • For example, a delay of 30 seconds may be applied in the server to each frame of the video and audio data; if the video displays 25 frames per second, this corresponds to 25 frames/second × 30 seconds of buffered frames, where 30 seconds is the delay time. This facilitates processing of the caption data after it is received and establishing a synchronous matching relationship between the caption data and the video and audio data, the relationship being that the subtitle layer is presented at the video and audio position where the subtitle needs to be displayed (a minimal sketch of such a delay buffer follows below).
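To make the per-frame delay buffering concrete, here is a minimal Python sketch. It assumes an in-memory FIFO queue and the 25 fps / 30 s figures from the example above; all names (DelayBuffer, push) are illustrative, not taken from the patent.

```python
from collections import deque

FPS = 25                 # frames per second, per the example above
DELAY_SECONDS = 30       # preset delay time
DELAY_FRAMES = FPS * DELAY_SECONDS  # 750 frames held in the buffer

class DelayBuffer:
    """Holds every incoming frame for the preset delay before releasing it."""

    def __init__(self, delay_frames: int = DELAY_FRAMES):
        self.delay_frames = delay_frames
        self.queue = deque()

    def push(self, frame):
        """Buffer a new frame; return the frame that has aged past the delay, if any."""
        self.queue.append(frame)
        if len(self.queue) > self.delay_frames:
            return self.queue.popleft()  # this frame is now 30 s old
        return None                      # still filling the delay window
```

During the first 30 seconds nothing is emitted; after that, each pushed frame releases one delayed frame, giving the subtitle pipeline a constant head start over playout.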
  • The preset delay time can be set to 30 to 90 seconds, depending on the storage capacity of the live streaming server; this range is a preferred implementation and is not intended to limit the delay-time setting of the present invention. Delaying the video and audio data helps improve the accuracy of subtitle and video/audio synchronization.
  • The live broadcast server may also perform delay processing after receiving the subtitle data, which further aids synchronous matching between the subtitle layer and the video and audio data.
  • The establishment of the relationship may be performed by the live broadcast server.
  • A synchronization matching relationship is established between the subtitle layer and the video and audio data; the specific implementation can take many forms.
  • The present invention describes the establishment of the synchronization matching relationship in the following two manners.
  • In a first embodiment, the cached video and audio data is marked according to its play time points to form a play time axis, and a subtitle time axis matching the play time axis of the video and audio data is established for the subtitle layer.
  • In a second embodiment, the cached video and audio data is likewise marked according to its play time points to form a play time axis, and timestamps that trigger display of the subtitle layer are established on the play time axis.
  • The above two embodiments describe the relationship between the video and audio data and the subtitle layer.
  • Both implementations establish the subtitle layer's display time based on the video and audio play time, that is, a synchronous matching relationship between the video and audio data and the subtitle layer.
  • The establishment of the synchronization matching relationship is not limited to these two methods; synchronous matching can also be achieved by marking video and audio frames, for example by adding an identifier to the frame picture at which the subtitle layer should be displayed and setting a corresponding display mark on the subtitle layer, the synchronous matching relationship then being realized through the video/audio identifier and the subtitle layer identifier.
  • The manner of establishing a synchronous matching relationship between the video and audio data and the subtitle layer is thus not limited to the above; these are merely examples of realizing such a relationship (a minimal data-model sketch follows below).
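The following Python sketch illustrates one possible data model for the play time axis and subtitle timestamps described above. The class and field names (SubtitleCue, PlayTimeAxis, active_cue) are hypothetical and assume frame-indexed video at a fixed frame rate.

```python
from dataclasses import dataclass, field

@dataclass
class SubtitleCue:
    text: str
    start_ms: int  # display start timestamp on the play time axis
    end_ms: int    # display end timestamp

@dataclass
class PlayTimeAxis:
    fps: int = 25
    cues: list = field(default_factory=list)  # subtitle layer entries

    def frame_to_ms(self, frame_index: int) -> int:
        """Mark cached frames by play time point, e.g. frame 251 -> ~10 s at 25 fps."""
        return frame_index * 1000 // self.fps

    def active_cue(self, frame_index: int):
        """Timestamps on the play axis trigger the subtitle layer (second embodiment)."""
        now = self.frame_to_ms(frame_index)
        for cue in self.cues:
            if cue.start_ms <= now < cue.end_ms:
                return cue
        return None
```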
  • the length of the play time axis may be the sum of the length of the video and audio data and the preset delay time.
  • The subtitle layer having the synchronous matching relationship may be corrected to form a new subtitle layer, which overlays the original subtitle layer; the play time axis, the subtitle time axis, or the subtitle timestamp corresponding to the corrected content is then adjusted so that the new subtitle layer is synchronously matched with the video and audio data.
  • Adjusting the subtitle time axis can be done by covering the position of the modified subtitle with a black transparent layer. For example, when the subtitle layer is corrected and a subtitle lasting 3 seconds is deleted, corresponding to 75 frames on the video and audio play time axis, a black transparent overlay layer can be established to cover those 75 frames of video and audio data, after which the play time axis is adjusted (see the sketch below).
  • Modification of the subtitle layer includes inserting a preset subtitle, skipping, correcting a subtitle, or one-click subtitling; for example, for a specific title or a specific word, the skip or correction work can be completed by manually adjusting the subtitle time code.
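Continuing the hypothetical data model above (same module and imports), this sketch shows the 75-frame example: deleting a 3-second cue and covering its span with a black transparent overlay patch. OverlayPatch and delete_cue are illustrative names, not the patent's.

```python
@dataclass
class OverlayPatch:
    """Black transparent layer covering the frame span of a deleted subtitle."""
    start_frame: int
    n_frames: int

def delete_cue(axis: PlayTimeAxis, cue: SubtitleCue) -> OverlayPatch:
    """Remove a cue and return the overlay covering its span.
    A 3 s cue at 25 fps covers 3000 ms * 25 / 1000 = 75 frames."""
    axis.cues.remove(cue)
    start_frame = cue.start_ms * axis.fps // 1000
    n_frames = (cue.end_ms - cue.start_ms) * axis.fps // 1000
    return OverlayPatch(start_frame, n_frames)
```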
  • The one-click subtitle function can be applied to politically sensitive vocabulary: by controlling the video and audio play time axis, the sensitive vocabulary is skipped and the replacement is directly updated and put on screen, making the content displayed by the subtitle layer more accurate, avoiding the appearance of sensitive vocabulary, and improving the security of the live video.
  • The modification of the subtitle layer may be implemented in the live broadcast server, or the live broadcast server may first send out the matched subtitle layer; after the layer is modified externally, it is returned to the live broadcast server, which adjusts the received subtitle layer so that the modified subtitle layer is synchronized with the video and audio data before it is sent for the mixing process. The modification of the subtitle layer in the present invention can therefore be completed either inside or outside the live broadcast server.
  • Step S130: mix the received subtitle layer having the synchronous matching relationship with the video and audio data to form streaming media information, and distribute the streaming media information to network nodes for output.
  • The two can be mixed in the following manners.
  • In the first manner, based on the synchronization matching relationship established by the play time axis and the subtitle time axis, the subtitle time axis of the subtitle layer is embedded into the play time axis of the video and audio data; concretely, the time scale of the subtitle time axis is merged with the time scale of the video and audio play time axis to achieve mixing. For example, on a play time axis established from the video and audio play time, suppose there is a 2-second subtitle starting from the 10th second of the video; a 2-second subtitle timeline is then established at the 11th second of video playback.
  • The video and audio play at 25 frames per second; at the 251st frame, that is, in the 11th second, the subtitle timeline is added to the play timeline, and when the video and audio data has played to frame 300,
  • the subtitle timeline stops and the subtitle layer disappears, and so on, so that the video and audio data and the subtitle layer are synchronously mixed; the mixed video and audio data is then distributed to each network node for output (a sketch of such a mixing loop follows below).
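As a sketch of this frame-by-frame mixing, again using the hypothetical PlayTimeAxis model from above: the generator walks the play axis and attaches the subtitle text to each frame whose embedded subtitle timeline is active (roughly frames 251-300 in the example).

```python
def mix(axis: PlayTimeAxis, frames):
    """Yield (frame, subtitle_text_or_None) pairs along the play time axis."""
    for i, frame in enumerate(frames, start=1):
        cue = axis.active_cue(i)  # e.g. active between frames ~251 and ~300
        yield (frame, cue.text if cue else None)

# Example: a 2 s subtitle embedded at the 10-12 s mark of the play axis.
axis = PlayTimeAxis(fps=25, cues=[SubtitleCue("hello", 10_000, 12_000)])
```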
  • The second manner is mainly based on the video and audio data play time axis, on which the time points for displaying the subtitle layer are marked.
  • A display timestamp is set for the subtitle layer, and when the video and audio data plays to that time point, the timestamp is triggered and the subtitle layer is displayed. For example, suppose there is a 2-second subtitle starting from the 10th second of the video: a timestamp for displaying the subtitle layer is set at the 11th second of video playback, and a timestamp for stopping the subtitle at the 13th second.
  • During mixing, the video and audio play at 25 frames per second; at the 251st frame, that is, in the 11th second, the play time axis automatically triggers the display timestamp of the subtitle layer, so that the subtitle layer is displayed on the video. When the video and audio data has played to frame 300, that is, at the 13th second, the play time axis automatically triggers the stop timestamp of the subtitle layer and the subtitle layer disappears, and so on, thereby achieving the mixing of the video and audio data with the subtitle layer (a trigger-loop sketch follows below).
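A minimal sketch of this timestamp-trigger behaviour, reusing the hypothetical model above; on_show and on_hide stand in for whatever actually draws and removes the subtitle layer.

```python
def playback_with_triggers(axis: PlayTimeAxis, frames, on_show, on_hide):
    """Fire the display timestamp when a cue becomes active and the stop
    timestamp when it ends, while passing frames through unchanged."""
    visible = None
    for i, frame in enumerate(frames, start=1):
        cue = axis.active_cue(i)
        if cue is not visible:       # crossed a start or stop timestamp
            if visible is not None:
                on_hide(visible)     # stop timestamp triggered
            if cue is not None:
                on_show(cue)         # display timestamp triggered
            visible = cue
        yield frame
```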
  • The caption layer can thus be displayed at the position in the video and audio data where it is required, realizing real-time synchronous display of the two.
  • The matching of the subtitle layer and the video and audio data may be achieved through automatic matching by the system or through manual intervention; in the manual mode, the subtitle layer is added by hand at the position where it needs to be displayed.
  • The above mixing process can be implemented by an encoder: the live broadcast server sends the video and audio data and the subtitle layer with the established synchronous matching relationship to a hybrid encoder, which mixes the two and finally transmits the result.
  • The mixed video and audio data and subtitle layer can be transmitted through a network transmission protocol (for example, the HTTP protocol) and displayed on the display device.
  • The present invention thus provides a method for real-time synchronous display of streaming media and subtitles: the encoded video and audio data is sent to a live broadcast server, which caches it according to a preset delay time.
  • The subtitle data related to the video and audio data is acquired to form a subtitle layer; the live broadcast server establishes a synchronous matching relationship and sends the data, and after the video and audio data and the subtitle layer with the synchronous matching relationship are mixed, the result is distributed through network nodes, so that the video and audio data and the subtitle layer are finally displayed in real time on the display device.
  • Because the acquired video and audio data and the subtitle data undergo delay processing, the matching of subtitles and video and audio data can be effectively adjusted, so that subtitles can be displayed on the video and audio picture in real time.
  • With the delay time set, the match between subtitles and video and audio data is more accurate, the subtitle error rate is reduced, synchronous display with the video and audio is guaranteed, and subtitle display is not geographically restricted.
  • The method can further improve subtitle display by modifying the subtitle layer: adjusting the subtitle time axis or timestamps after correcting the subtitle layer achieves more accurate matching between subtitles and the video and audio picture, further improving synchronization accuracy, while manual intervention further improves matching accuracy and synchronous output, ensuring the accuracy and real-time performance of the subtitle layer.
  • FIG. 2 is a schematic diagram of the device for real-time synchronous display of streaming media and subtitles. Since the device embodiment is substantially similar to the method embodiment, it is described relatively simply; for relevant details, refer to the description of the method embodiment. The device embodiment described below is merely illustrative.
  • the device specifically includes:
  • The video and audio collection and encoding unit 200 is configured to encode the video and audio data in the collected streaming media and send it to the live broadcast server.
  • The subtitle obtaining unit 210 is configured to acquire the subtitle data of the video and audio data, form a subtitle layer, and send it to the live broadcast server.
  • The caption acquisition unit 210 includes a caption data correction unit configured to correct the subtitle data corresponding to the video and audio data.
  • The processing unit 220 is configured to cache the encoded video and audio data according to a preset delay time, buffer the subtitle layer, establish a synchronization matching relationship between the cached subtitle layer and the video and audio data, and then send both.
  • the processing unit 220 includes:
  • a delay buffer unit, configured to delay-buffer each frame of the video and audio data, or delay-buffer the beginning portion of the video and audio data, or delay-buffer the end portion of the video and audio data, or delay the video and audio data frames corresponding to a position according to the pre-modified subtitle position or the pre-adjusted position of the video and audio data;
  • a play time axis forming unit, configured to mark the cached video and audio data according to its play time points to form a play time axis;
  • a subtitle time axis forming unit, configured to establish for the subtitle layer a subtitle time axis matching the play time axis of the video and audio data;
  • a subtitle timestamp forming unit, configured to establish display start and end timestamps for the subtitle layer according to the play time axis;
  • the display start timestamp and end timestamp of the subtitle layer are collectively referred to as the subtitle timestamp.
  • a subtitle layer correction unit, configured to correct the subtitle layer having the synchronous matching relationship, form a new subtitle layer, and overlay it on the original subtitle layer;
  • the subtitle layer correction unit is configured to perform operations of inserting a preset subtitle, skipping, correcting a subtitle, or one-click subtitling on the subtitle layer;
  • an adjusting unit, configured to adjust the play time axis, the subtitle time axis, or the subtitle timestamp corresponding to the corrected content so that the new subtitle layer is synchronously matched with the video and audio data.
  • The hybrid encoding unit 230 is configured to receive the subtitle layer and the video and audio data having the synchronous matching relationship, mix the two, and distribute the output to network nodes according to a predetermined transmission protocol.
  • The hybrid encoding unit 230 includes: a synthesis embedding unit, configured to embed the subtitle time axis of the subtitle layer into the play time axis of the video and audio data, or to embed the start and end timestamps into the play time axis of the video and audio data, and to synthesize the subtitle layer with the video and audio data.
  • FIG. 3 is a flowchart of the processing method for synchronous matching of streaming media and subtitles provided by the present invention.
  • This processing method has been described in detail within the method for real-time synchronous display of streaming media and subtitles provided above, so the description here is brief; for details, refer to FIG. 1 and the related description.
  • the method includes:
  • Step S300: buffer the received encoded video and audio data according to a preset delay time.
  • Step S300 includes: delay-buffering each frame of the video and audio data, or delay-buffering the beginning portion of the video and audio data, or delay-buffering the end portion of the video and audio data, or delaying the video and audio data frames corresponding to a position according to the pre-modified subtitle position or the pre-adjusted position of the video and audio data.
  • Step S310: form a subtitle layer from the received subtitle data corresponding to the video and audio data and buffer it.
  • Step S320: establish a synchronous matching relationship between the video and audio data and the subtitle layer, and then send them.
  • Establishing the synchronous matching relationship includes:
  • marking the cached video and audio data according to its play time points to form a play time axis;
  • establishing for the subtitle layer a subtitle time axis matching the play time axis of the video and audio data, or establishing display start and end timestamps for the subtitle layer according to the play time axis;
  • the display start timestamp and end timestamp of the subtitle layer are collectively referred to as the subtitle timestamp.
  • The present invention further provides a processing device for synchronous matching of streaming media and subtitles. Since the device embodiment is substantially similar to the method embodiment, it is described simply; refer to the corresponding parts of the method embodiment. The device embodiment described below is merely illustrative.
  • FIG. 4 is a schematic structural diagram of a processing apparatus for streaming media and subtitle synchronization matching provided by the present invention.
  • the device includes:
  • The delay buffer unit 400 is configured to buffer the received encoded video and audio data according to a preset delay time.
  • The delay buffer unit 400 is configured to delay-buffer each frame of the video and audio data, or delay-buffer the beginning portion of the video and audio data, or delay-buffer the end portion of the video and audio data, or delay the video and audio frames corresponding to a position according to the pre-modified subtitle position or the pre-adjusted position of the video and audio data.
  • The subtitle layer forming unit 410 is configured to form a subtitle layer from the subtitle data corresponding to the video and audio data and cache it.
  • The synchronization matching relationship establishing unit 420 is configured to establish a synchronization matching relationship between the video and audio data and the subtitle layer, and then send them.
  • The synchronization matching relationship establishing unit 420 includes: a play time axis forming unit, configured to mark the cached video and audio data according to its play time points to form a play time axis;
  • a subtitle time axis forming unit, configured to establish for the subtitle layer a subtitle time axis matching the play time axis of the video and audio data;
  • a subtitle timestamp establishing unit, configured to establish display start and end timestamps for the subtitle layer according to the play time axis;
  • the display start timestamp and end timestamp of the subtitle layer are collectively referred to as the subtitle timestamp;
  • a subtitle layer correcting unit, configured to correct the subtitle layer having the synchronous matching relationship, form a new subtitle layer, and overlay it on the original subtitle layer;
  • an adjusting unit, configured to adjust the play time axis, the subtitle time axis, or the subtitle timestamp corresponding to the corrected content so that the new subtitle layer is synchronously matched with the video and audio data.
  • The present invention further provides a system for displaying subtitles based on live streaming media.
  • FIG. 5 is a schematic diagram of the system for real-time synchronous display of streaming media and subtitles provided by the present invention. Since the system embodiment is substantially similar to the method embodiment, the description is relatively simple; for relevant parts, refer to the description of the method embodiments. The system embodiment described below is merely illustrative.
  • the system specifically includes:
  • The collecting and encoding device 500 is configured to collect and encode the video and audio data in the streaming media and send it to the live broadcast server; this device mainly collects the video and audio data of a live event, or other live video and audio data.
  • The caption acquisition device 510 is configured to acquire caption data corresponding to the video and audio data and send it to the live broadcast server.
  • The caption acquisition device 510 includes: a caption data corrector, configured to correct the acquired caption data corresponding to the video and audio data.
  • The live broadcast service device 520 is configured to cache the encoded video and audio data according to a preset delay time, form a subtitle layer from the subtitle data and cache it, establish a synchronization matching relationship between the subtitle layer and the video and audio data, and then send both.
  • The live broadcast service device 520 includes:
  • a data information processor, configured to mark the cached video and audio data according to its play time points to form a play time axis, and to establish for the subtitle layer a subtitle time axis matching the play time axis of the video and audio data, or to establish display start and end timestamps for the subtitle layer according to the play time axis;
  • a subtitle layer modifier, configured to modify the subtitle layer having the synchronous matching relationship to form a new subtitle layer and overlay it on the original subtitle layer, and to adjust the play time axis, the subtitle time axis, or the subtitle timestamp corresponding to the modified content so that the new subtitle layer is synchronously matched with the video and audio data.
  • The hybrid encoding device 530 is configured to mix the received subtitle layer having the synchronous matching relationship with the video and audio data to form streaming media information, transmit the streaming media information according to a predetermined transmission protocol, and finally display it on the terminal device.
  • The hybrid encoding device 530 includes: a synthesizing processor, configured to embed the subtitle time axis of the subtitle layer into the play time axis of the video and audio data, or to embed the start and end timestamps into the play time axis of the video and audio data, and to synthesize the subtitle layer with the video and audio data.
  • The above describes the method and device for real-time synchronous display of streaming media and subtitles, the processing method and device for synchronous matching of streaming media and subtitles, and the system for real-time synchronous display of streaming media and subtitles provided by the present invention.
  • The method provided by the present invention enables the acquired video and audio data and caption data, once a synchronous matching relationship has been established, to be synthesized into a single file and sent to the display device, so that the video and audio data and the subtitle layer are displayed synchronously in real time, improving the synchronization accuracy of the two.
  • FIG. 6 is a structural block diagram of a device for real-time synchronous display of streaming media and subtitles according to another embodiment of the present invention.
  • The streaming media and subtitle real-time synchronous display device 1100 may be a host server with computing capability, a personal computer (PC), or a portable computer or terminal.
  • the specific embodiments of the present invention do not limit the specific implementation of the computing node.
  • the streaming media and subtitle instant synchronization display device 1100 includes a processor 1110, a communication interface 1120, a memory 1130, and a bus 1140.
  • the processor 1110, the communication interface 1120, and the memory 1130 complete communication with each other through the bus 1140.
  • Communication interface 1120 is for communicating with network devices, including, for example, a virtual machine management center, shared storage, and the like.
  • the processor 1110 is configured to execute a program.
  • The processor 1110 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • the memory 1130 is used to store files.
  • the memory 1130 may include a high speed RAM memory and may also include a non-volatile memory such as at least one disk memory.
  • Memory 1130 can also be a memory array.
  • The memory 1130 may also be partitioned into blocks, and the blocks may be combined into a virtual volume according to certain rules.
  • The foregoing program may be program code including computer operating instructions.
  • The program can be specifically used to implement the operations of each step in the method for real-time synchronous display of streaming media and subtitles.
  • FIG. 7 is a structural block diagram of a processing device for synchronous matching of streaming media and subtitles according to another embodiment of the present invention.
  • The processing device 1200 for synchronous matching of streaming media and subtitles may be a host server with computing capability, a personal computer (PC), or a portable computer or terminal.
  • the specific embodiments of the present invention do not limit the specific implementation of the computing node.
  • The processing device 1200 for synchronous matching of streaming media and subtitles includes a processor 1110, a communication interface 1120, a memory 1130, and a bus 1140.
  • the processor 1110, the communication interface 1120, and the memory 1130 complete communication with each other through the bus 1140.
  • Communication interface 1120 is for communicating with network devices, including, for example, a virtual machine management center, shared storage, and the like.
  • the processor 1110 is configured to execute a program.
  • the processor 1110 may be a central processing unit CPU, or an application specific integrated circuit ASIC, or one or more integrated circuits configured to implement embodiments of the present invention.
  • the memory 1130 is used to store files.
  • Memory 1130 may include high speed RAM memory and may also include non-volatile memory, such as at least one disk memory.
  • Memory 1130 can also be a memory array.
  • The memory 1130 may also be partitioned into blocks, and the blocks may be combined into a virtual volume according to certain rules.
  • The above program may be program code including computer operating instructions.
  • The program is specifically applicable to implementing the operations of each step in the processing method for synchronous matching of streaming media and subtitles.
  • If the functions are implemented in the form of computer software and sold or used as a stand-alone product, then to that extent all or part of the technical solution of the present invention (for example, the part contributing to the prior art) may be considered to be embodied in the form of a computer software product.
  • The computer software product is typically stored in a computer-readable non-volatile storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present invention.
  • the foregoing storage medium includes various media that can store program codes, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
  • According to the method, device and system for synchronous display and matching processing of streaming media and subtitles, the video/audio data acquired at a live program or live event at home or abroad is delay-processed, and a synchronous matching relationship is established between the video/audio data and the subtitle layer, so that the matching of subtitles to the video/audio data can be adjusted effectively and subtitles can be displayed on the video/audio picture in real time, synchronized with the video and audio; because a delay time is set for the video and audio, the subtitle data and/or the subtitle layer can be corrected, making the match between subtitles and the video/audio data more accurate, reducing the subtitle error rate, ensuring the accuracy of synchronous display of video/audio and subtitles, and freeing the synchronous display of subtitles and video/audio from geographic restrictions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Databases & Information Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present invention discloses a method, device and system for instant synchronous display and matching processing of streaming media and subtitles. The synchronous display method includes: encoding video/audio data collected from streaming media and sending it to a live-broadcast server; acquiring subtitle data corresponding to the video/audio data and sending it to the live-broadcast server; the live-broadcast server buffering the encoded video/audio data according to a preset delay time, forming a subtitle layer from the subtitle data and buffering it, establishing a synchronous matching relationship between the subtitle layer and the video/audio data, and then sending both; and mixing the received subtitle layer and video/audio data having the synchronous matching relationship to form streaming media information, and distributing the streaming media information to network nodes for output, thereby ensuring synchronous, instant display of the video/audio data and the subtitle layer and improving the accuracy with which the subtitle layer matches the video/audio data.

Description

Method, device and system for instant synchronous display and matching processing of streaming media and subtitles
CROSS-REFERENCE
This application claims priority to Chinese Patent Application No. 201510970843.9, filed on December 22, 2015, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present invention relates to the technical field of streaming media live broadcasting, and in particular to a method and device for instant synchronous display of streaming media and subtitles, a processing method and device for synchronous matching of streaming media and subtitles, and a system for instant synchronous display of streaming media and subtitles.
BACKGROUND
With the rapid spread of the Internet+ model and the development of live streaming, subtitle translation greatly reduces visual interference and improves synchronization compared with simultaneous interpretation. At present, in the global Internet live-streaming field, video is mostly played on its own while subtitles are translated separately; subtitles and video cannot achieve true real-time synchronization of sound, picture and subtitles, and a transparent layer placed over the video for subtitle display cannot be adapted to mobile terminals. Overall, the means of realizing subtitle translation are relatively backward and complicated to operate.
For example, patent CN102655606A discloses a method and system for adding real-time subtitle and sign-language services to live programs based on a P2P network, which includes the following steps:
1) Producing real-time subtitles according to the television broadcast or live scene of the program.
2) Producing real-time sign language according to the television broadcast or live scene of the program.
3) Acquiring the live network video stream, the real-time subtitle stream and the real-time sign-language stream, and saving them to their respective buffers.
The production of real-time subtitles in step 1) specifically comprises:
1) Stenographers entering subtitle content in real time according to the television broadcast or live scene of the program.
2) Stenographers reviewing the subtitle content already entered.
3) Adding synchronization information to the subtitle content, mainly including a timestamp, a sequence number and an error offset.
4) Pushing the processed subtitles to a subtitle server.
The production of real-time sign language in step 2) specifically comprises:
1) Sign-language interpreters translating the program content in real time according to the television broadcast or live scene of the program;
2) Recording the interpreters' translation in real time, and adding synchronization information, mainly including a timestamp and an error offset, to the sign-language video;
3) Pushing the processed sign-language video to a sign-language streaming media server.
The synchronization and playback of the live network program with the subtitles and sign language in step 3) specifically comprises:
1) Acquiring the live network video stream, the real-time subtitle stream and the real-time sign-language stream, and saving them to their respective buffers;
2) Parsing the timestamps of the live program video frames, the sign-language video frames and the subtitles in the buffers;
3) According to the timestamp of a live program video frame, matching the corresponding sign-language video frame and subtitle in the subtitle buffer and the sign-language video buffer respectively; if found, displaying the subtitle and sign-language video together with the live video; if not, displaying only the live video.
The technical solution described in this prior art cannot achieve real-time synchronization of sound, picture and subtitles; the subtitles and sign language produced by this solution cannot be added synchronously to the correct time axis of the live video even with the error offset.
In addition, existing live network subtitles have all evolved from subtitle insertion in the broadcasting field, where subtitles are added at the signal terminal by hardware subtitle equipment, so Internet subtitles cannot achieve true time synchronization between subtitles and video/audio.
How to provide a method, device and system for instantly displaying subtitles in live streaming, capable of achieving instant synchronous display of live streaming video/audio and subtitles, has become a technical problem to be solved by those skilled in the art.
SUMMARY OF THE INVENTION
The present invention provides a method for instantly displaying subtitles in live streaming to solve the above technical problems.
The present invention provides a method for instant synchronous display of streaming media and subtitles, comprising: encoding video/audio data in collected streaming media and sending it to a live-broadcast server; acquiring subtitle data corresponding to the video/audio data and sending it to the live-broadcast server; the live-broadcast server buffering the encoded video/audio data according to a preset delay time, forming a subtitle layer from the subtitle data and buffering it, establishing a synchronous matching relationship between the subtitle layer and the video/audio data, and then sending both; and mixing the received subtitle layer and video/audio data having the synchronous matching relationship to form streaming media information, and distributing the streaming media information to network nodes for output.
Optionally, establishing the synchronous matching relationship for the buffered subtitle layer and video/audio data comprises:
marking the buffered video/audio data according to its playback time points to form a playback time axis;
establishing for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, or establishing, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer; the display start timestamp and the end timestamp of the subtitle layer are collectively referred to as subtitle timestamps.
Optionally, mixing the subtitle layer and the video/audio data having the synchronous matching relationship comprises:
embedding the subtitle time axis of the subtitle layer onto the playback time axis of the video/audio data, or embedding the start timestamp and the end timestamp onto the playback time axis of the video/audio data; and compositing the subtitle layer with the video/audio data.
Optionally, establishing the synchronous matching relationship between the subtitle layer and the video/audio data comprises:
correcting the subtitle layer having the synchronous matching relationship to form a new subtitle layer, and overlaying it on the original subtitle layer;
adjusting the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
Optionally, correction of the subtitle layer includes operations of inserting a preset subtitle, skipping, correcting a subtitle, or one-click subtitling.
Optionally, the length of the playback time axis is the sum of the time length of the video/audio data and the preset delay time.
Optionally, acquiring the subtitle data corresponding to the video/audio data and sending it to the live-broadcast server includes: correcting the acquired subtitle data corresponding to the video/audio data.
Optionally, the live-broadcast server buffering the encoded video/audio data according to the preset delay time includes: delay-buffering every frame of the video/audio data, or delay-buffering the beginning portion of the video/audio data, or delay-buffering the ending portion of the video/audio data, or delaying the video/audio frames corresponding to a position according to a subtitle position to be modified or a video/audio data position to be adjusted.
The present invention also provides a device for instant synchronous display of streaming media and subtitles, comprising:
a video/audio collection and encoding unit, configured to encode video/audio data in collected streaming media and send it to a live-broadcast server;
a subtitle acquisition unit, configured to acquire subtitle data of the video/audio data, form a subtitle layer, and send it to the live-broadcast server;
a processing unit, by which the live-broadcast server buffers the encoded video/audio data according to a preset delay time, buffers the subtitle layer, establishes a synchronous matching relationship between the buffered subtitle layer and the video/audio data, and then sends both;
a mixing and encoding unit, configured to receive the subtitle layer and the video/audio data having the synchronous matching relationship, mix the two, and then distribute the result to network nodes for output according to a predetermined transmission protocol.
Optionally, the processing unit comprises:
a playback time axis forming unit, configured to mark the buffered video/audio data according to its playback time points to form a playback time axis;
a subtitle time axis forming unit or a subtitle timestamp forming unit, wherein the subtitle time axis forming unit is configured to establish for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, and the subtitle timestamp forming unit is configured to establish, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer; the display start timestamp and the end timestamp of the subtitle layer are collectively referred to as subtitle timestamps.
Optionally, the mixing and encoding unit comprises:
a compositing and embedding unit, configured to embed the subtitle time axis of the subtitle layer onto the playback time axis of the video/audio data, or to embed the start timestamp and the end timestamp onto the playback time axis of the video/audio data, and composite the subtitle layer with the video/audio data.
Optionally, the processing unit comprises:
a subtitle layer correction unit, configured to correct the subtitle layer having the synchronous matching relationship to form a new subtitle layer, and overlay it on the original subtitle layer;
an adjusting unit, configured to adjust the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
Optionally, the subtitle layer correction unit is configured to perform, on the subtitle layer, operations of inserting a preset subtitle, skipping, correcting a subtitle, or one-click subtitling.
Optionally, the subtitle acquisition unit comprises: a subtitle data correction unit, configured to correct the acquired subtitle data corresponding to the video/audio data.
Optionally, the processing unit comprises: a delay buffering unit, configured to delay-buffer every frame of the video/audio data, or delay-buffer the beginning portion of the video/audio data, or delay-buffer the ending portion of the video/audio data, or delay the video/audio data frames corresponding to a position according to a subtitle position to be modified or a video data position to be adjusted.
The present invention also provides a processing method for synchronous matching of streaming media and subtitles, comprising:
buffering received encoded video/audio data according to a preset delay time;
forming a subtitle layer from received subtitle data corresponding to the video/audio data, and buffering it;
establishing a synchronous matching relationship between the video/audio data and the subtitle layer, and then sending.
Optionally, establishing the synchronous matching relationship between the video/audio data and the subtitle layer comprises:
marking the buffered video/audio data according to its playback time points to form a playback time axis;
establishing for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, or establishing, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer; the display start timestamp and the end timestamp of the subtitle layer are collectively referred to as subtitle timestamps.
Optionally, establishing the synchronous matching relationship between the subtitle layer and the video/audio data comprises:
correcting the subtitle layer having the synchronous matching relationship to form a new subtitle layer, and overlaying it on the original subtitle layer;
adjusting the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
Optionally, buffering the received encoded video/audio data according to the preset delay time comprises:
delay-buffering every frame of the video/audio data, or delay-buffering the beginning portion of the video/audio data, or delay-buffering the ending portion of the video/audio data, or delaying the video/audio data frames corresponding to a position according to a subtitle position to be modified or a video/audio data position to be adjusted.
The present invention further provides a processing device for synchronous matching of streaming media and subtitles, characterized by comprising:
a delay buffering unit, configured to buffer received encoded video/audio data according to a preset delay time;
a subtitle layer forming unit, configured to form a subtitle layer from received subtitle data corresponding to the video/audio data, and buffer it;
a synchronous matching relationship establishing unit, configured to establish a synchronous matching relationship between the video/audio data and the subtitle layer, and then send.
Optionally, the synchronous matching relationship establishing unit comprises:
a playback time axis forming unit, configured to mark the buffered video/audio data according to its playback time points to form a playback time axis;
a subtitle time axis forming unit or a subtitle timestamp establishing unit, wherein the subtitle time axis forming unit is configured to establish for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, and the subtitle timestamp establishing unit is configured to establish, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer; the display start timestamp and the end timestamp of the subtitle layer are collectively referred to as subtitle timestamps.
Optionally, the synchronous matching relationship establishing unit comprises:
a subtitle layer correction unit, configured to correct the subtitle layer having the synchronous matching relationship to form a new subtitle layer, and overlay it on the original subtitle layer;
an adjusting unit, configured to adjust the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
Optionally, the delay buffering unit is configured to delay-buffer every frame of the video/audio data, or delay-buffer the beginning portion of the video/audio data, or delay-buffer the ending portion of the video/audio data, or delay the video/audio frames corresponding to a position according to a subtitle position to be modified or a video/audio data position to be adjusted.
The present invention also provides a system for instant synchronous display of streaming media and subtitles, comprising:
a collection and encoding device, configured to collect video/audio data in streaming media for encoding, and send it to a live-broadcast server according to a predetermined video/audio transmission protocol;
a subtitle acquisition device, configured to input subtitle data matching the video/audio data, and send it to the live-broadcast server according to a predetermined subtitle transmission protocol;
a live service device, configured to buffer the encoded video/audio data according to a preset delay time, form a subtitle layer from the subtitle data and buffer it, establish a synchronous matching relationship between the subtitle layer and the video/audio data, and then send both;
a mixing encoding device, configured to mix the received subtitle layer and video/audio data having the synchronous matching relationship to form streaming media information, and distribute the streaming media information to network nodes for output according to a predetermined transmission protocol.
Optionally, the mixing encoding device comprises:
a compositing processor, configured to embed the subtitle time axis of the subtitle layer onto the playback time axis of the video/audio data, or to embed the start timestamp and the end timestamp onto the playback time axis of the video/audio data, and composite the subtitle layer with the video/audio data.
Optionally, the live service device comprises:
a subtitle layer corrector, configured to correct the subtitle layer having the synchronous matching relationship to form a new subtitle layer and overlay it on the original subtitle layer, and to adjust the playback time axis or the subtitle time axis corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
Optionally, the subtitle acquisition device comprises: a subtitle data corrector, configured to correct the acquired subtitle data corresponding to the video/audio data.
The above provides the method, device and system for instant synchronous display and matching processing of streaming media and subtitles of the present invention. In the method for instant synchronous display of streaming media and subtitles, the collected and encoded video/audio data is sent to the live-broadcast server, which buffers it according to the preset delay time; meanwhile, subtitle data related to the video/audio data is acquired and sent to the live-broadcast server, which forms a subtitle layer from the subtitle data and buffers it, establishes a synchronous matching relationship between the subtitle layer and the video/audio data, and then sends both; the received subtitle layer and video/audio data having the synchronous matching relationship are mixed to form streaming media information, which is distributed to network nodes for output. Thus, at live programs or live events at home or abroad, the acquired video/audio data is delay-processed, and a synchronous matching relationship is established between the video/audio data and the subtitle layer, so that the matching of subtitles to the video/audio data can be adjusted effectively and subtitles can be displayed on the video/audio picture in real time, synchronized with the video and audio. Because a delay time is set for the video and audio, the subtitle data and/or the subtitle layer can be corrected, making the match between subtitles and the video/audio data more accurate, reducing the subtitle error rate, ensuring the accuracy of synchronous display of video/audio and subtitles, and freeing the synchronous display of subtitles and video/audio from geographic restrictions.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flowchart of a method for instant synchronous display of streaming media and subtitles provided by the present invention;
FIG. 2 is a schematic structural diagram of a device for instant synchronous display of streaming media and subtitles provided by the present invention;
FIG. 3 is a flowchart of a processing method for synchronous matching of streaming media and subtitles provided by the present invention;
FIG. 4 is a schematic structural diagram of a processing device for synchronous matching of streaming media and subtitles provided by the present invention;
FIG. 5 is a schematic diagram of a system for instant synchronous display of streaming media and subtitles provided by the present invention;
FIG. 6 is a structural block diagram of a device for instant synchronous display of streaming media and subtitles according to another embodiment of the present invention;
FIG. 7 is a structural block diagram of a processing device for synchronous matching of streaming media and subtitles according to another embodiment of the present invention.
DETAILED DESCRIPTION
Many specific details are set forth in the following description to facilitate a full understanding of the present invention. However, the present invention can be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from the essence of the present invention; therefore, the present invention is not limited by the specific implementations disclosed below.
Please refer to FIG. 1, which is a flowchart of a method for instant synchronous display of streaming media and subtitles provided by the present invention.
The present invention mainly displays, in real time during playback of the collected live-scene video/audio file, a subtitle file synchronized with the video/audio file, so that the subtitles and the video/audio file are presented on the display device in instant synchronization. Specifically, the following steps are used:
Step S100: encoding the video/audio data in the collected streaming media, and sending it to the live-broadcast server.
In the above step, the video/audio data in the streaming media may be obtained by recording video and audio at a live program or live event, producing satellite and digital high-definition signals, etc.; an encoder collects the satellite and digital HD signals, encodes the collected signals, and sends them to the live-broadcast server after encoding.
In this step, encoding the video/audio data may be implemented by third-party software, for example Windows Media Encoder.
The encoded video/audio data may be sent to the live-broadcast server according to a predetermined transmission protocol. The predetermined transmission protocol may be RTMP (Real Time Messaging Protocol); the transmission protocol may include the basic RTMP protocol as well as variants such as RTMPT/RTMPS/RTMPE.
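To make the sending step concrete, the following is a minimal Python sketch of pushing an encoded stream to a live-broadcast server over RTMP. It assumes an installed ffmpeg binary, and the capture input and stream URL are hypothetical; it merely stands in for the third-party encoders the description names.

import subprocess

def push_to_live_server(capture_input: str, rtmp_url: str) -> subprocess.Popen:
    # Encode the capture source and push it to the live-broadcast server over
    # RTMP; the URL (e.g. rtmp://live.example.com/app/stream) is illustrative.
    cmd = [
        "ffmpeg",
        "-i", capture_input,   # e.g. the collected satellite/HD signal source
        "-c:v", "libx264",     # video encoding
        "-c:a", "aac",         # audio encoding
        "-f", "flv",           # RTMP carries an FLV-framed stream
        rtmp_url,
    ]
    return subprocess.Popen(cmd)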
It should be noted that the live program or live event described here is not geographically restricted, and the collected live program signal or live event signal is not restricted by the input of the signal source.
Step S110: acquiring the subtitle data corresponding to the video/audio data, and sending it to the live-broadcast server.
In this step, the subtitle data of the video/audio data may be produced through simultaneous interpretation at the live program or live event, with voiced translation synchronized with the video/audio; stenographers enter the translated content into a subtitle management system and send it to the live-broadcast server.
The subtitle data here may also be transmitted using the same transmission protocol as the video/audio data.
To improve the accuracy of subtitle entry, in this embodiment the acquired subtitle data corresponding to the video/audio data may also be corrected, fixing typos and other problems caused by human error and improving the accuracy of the subtitle data.
Step S120: the live-broadcast server buffers the encoded video/audio data according to the preset delay time, forms a subtitle layer from the subtitle data and buffers it, establishes a synchronous matching relationship between the subtitle layer and the video/audio data, and then sends both.
In this step, the live-broadcast server buffers the encoded video/audio data according to the preset delay time; specifically, the video/audio data may be buffered in the storage space of the live-broadcast server. The preset delay time may be set between 30 and 90 seconds as required, and may be determined by the size of the storage space. In this embodiment, the video/audio data may be stored by delaying every frame, by delaying the beginning portion of the video/audio data, or by delaying the ending portion of the video/audio data, among other ways. For example: every frame of the video/audio data is delay-buffered in the server for 30 seconds; or, if the video/audio data displays 25 frames per second, those 25 frames may be delayed by 30 seconds, i.e. 25 frames/second x 30 seconds, where 30 seconds is the delay time. This facilitates processing the subtitle data after it is received and establishing the synchronous matching relationship between the subtitle data and the video/audio data; the synchronous matching relationship means that, when the video/audio data is displayed, the subtitle layer is presented at the video/audio positions where subtitles need to be displayed.
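As an illustration of the per-frame delay buffering just described, the following is a minimal Python sketch assuming 25 frames per second and a 30-second preset delay (so 750 frames are held before release); the frame objects and class name are illustrative assumptions, not taken from the patent.

from collections import deque

FPS = 25
DELAY_SECONDS = 30
DELAY_FRAMES = FPS * DELAY_SECONDS  # 25 frames/s x 30 s = 750 buffered frames

class DelayBuffer:
    def __init__(self, delay_frames: int = DELAY_FRAMES):
        self.delay_frames = delay_frames
        self.queue = deque()

    def push(self, frame):
        # Buffer an incoming frame and release the frame that was captured
        # delay_frames earlier; return None while the buffer is still filling.
        self.queue.append(frame)
        if len(self.queue) > self.delay_frames:
            return self.queue.popleft()
        return None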
It can be understood that, in this embodiment, the preset delay time may be set to 30 to 90 seconds, and the length of the delay may be set according to the amount of storage in the live streaming server; the above is merely a preferred implementation and is not intended to limit the setting of the delay time of the present invention. Delaying the video/audio data helps improve the accuracy of synchronization between subtitles and video/audio data.
It should be noted that, in this embodiment, corresponding to the delay of the video/audio data, the live-broadcast server may also delay the subtitle data after receiving it, which further facilitates establishing the synchronous matching relationship between the subtitle layer and the video/audio data.
In this step, there may be multiple specific implementations for establishing the synchronous matching relationship between the subtitle layer and the video/audio data; the present invention describes establishing the synchronous matching relationship in the following two ways.
First implementation: marking the buffered video/audio data according to its playback time points to form a playback time axis, and establishing for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data;
Second implementation: marking the buffered video/audio data according to its playback time points to form a playback time axis, and establishing on the playback time axis timestamps that trigger display of the subtitle layer.
The above two implementations describe establishing a synchronous matching relationship between the video/audio data and the subtitle layer; both are in fact based on the video/audio playback time, establishing the display time of the subtitle layer and thereby realizing the synchronous matching relationship between the video/audio data and the subtitle layer. It can be understood that establishing this relationship is not limited to the above two ways; it can also be achieved by marking video/audio data frames, for example: adding a marker at the frame position where the video/audio data displays the subtitle layer, setting on the subtitle layer a display marker identical to the video/audio marker, and realizing the synchronous matching relationship between the two through the video/audio marker and the subtitle layer marker.
The manner of establishing the synchronous matching relationship between the video/audio data and the subtitle layer is not limited to the above; these are merely examples of realizing a synchronous matching relationship between the two.
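As a concrete reading of the two implementations above, the following is a minimal Python sketch of the data involved: a playback time axis carrying either a per-layer subtitle time axis or display/stop timestamps. The class and field names are illustrative assumptions, not taken from the patent.

from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class SubtitleLayer:
    text: str
    # First implementation: a subtitle time axis (start, end in seconds)
    # matched to the playback time axis.
    time_axis: Optional[Tuple[float, float]] = None

@dataclass
class PlaybackTimeline:
    data_length_s: float   # time length of the video/audio data
    delay_s: float         # preset delay time
    # Second implementation: timestamps on the playback time axis that
    # trigger display and disappearance of a subtitle layer.
    subtitle_stamps: List[Tuple[float, float, SubtitleLayer]] = field(default_factory=list)

    @property
    def length_s(self) -> float:
        # As noted below, the playback time axis length is the sum of the
        # video/audio data length and the preset delay time.
        return self.data_length_s + self.delay_s

    def add_stamp(self, start_s: float, end_s: float, layer: SubtitleLayer):
        layer.time_axis = (start_s, end_s)
        self.subtitle_stamps.append((start_s, end_s, layer))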
It should be noted that, in the above two ways, the length of the playback time axis may be the sum of the time length of the video/audio data and the preset delay time.
In this step, to ensure the accuracy of the subtitle layer, after the synchronous matching relationship is established between the subtitle layer and the video/audio data, the subtitle layer having the synchronous matching relationship may be corrected to form a new subtitle layer overlaid on the original subtitle layer; the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content are then adjusted so that the new subtitle layer matches the video/audio data synchronously.
It can be understood that adjusting the subtitle time axis here can be done by covering the corrected subtitle position with a black transparent layer. For example: when correcting the subtitle layer, a subtitle lasting 3 seconds is deleted, which corresponds to 75 fewer frames on the video/audio playback time axis; a black transparent covering layer can then be created and overlaid at the position of those 75 frames of video/audio data, thereby adjusting the playback time axis.
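The 75-frame figure in this example follows directly from the frame rate; a small worked sketch in Python, assuming 25 frames per second:

FPS = 25

def frames_to_cover(deleted_subtitle_seconds: float, fps: int = FPS) -> int:
    # A deleted 3-second subtitle corresponds to 3 s x 25 frames/s = 75
    # playback-axis frames to cover with the black transparent layer.
    return int(deleted_subtitle_seconds * fps)

assert frames_to_cover(3) == 75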
Correction of the subtitle layer includes operations such as inserting a preset subtitle, skipping, correcting a subtitle, or one-click subtitling. For example: for particular titles or specific terms, skip-correction can be completed by manually adjusting the time code at which the subtitle appears. The one-click subtitling function can be applied to politically sensitive terms: by controlling the video/audio playback time axis to skip those sensitive terms and directly performing the update and on-screen operations, the content displayed by the subtitle layer becomes more accurate, the appearance of sensitive terms is avoided, and the safety of the live video is improved.
It should be noted here that, after the synchronous matching relationship between the video/audio data and the subtitle layer is established, modification of the subtitle layer may be carried out in the live-broadcast server; alternatively, the live-broadcast server may first send the matched subtitle layer, which is returned to the live-broadcast server after modification, and the live-broadcast server then adjusts the received subtitle layer so that the modified subtitle layer matches the video/audio data synchronously, after which it is sent for mixing. Therefore, in the present invention, modification of the subtitle layer can be completed not only in the live-broadcast server but also outside it.
Step S130: mixing the received subtitle layer and video/audio data having the synchronous matching relationship to form streaming media information, and distributing the streaming media information to network nodes for output.
In this step, based on the synchronous matching relationships established in the first and second implementations in step S120, the two can be mixed in the following ways.
Based on the synchronous matching relationship established through the playback time axis and the subtitle time axis, the subtitle time axis of the subtitle layer can be embedded onto the playback time axis of the video/audio data; a specific implementation may composite the time scale of the subtitle time axis with the time scale of the video/audio playback time axis, thereby achieving the mixing. For example: a playback time axis is established according to the video/audio playback time; suppose a subtitle lasting 2 seconds begins at the 10th second of the video, and a 2-second subtitle time axis is established at the 11th second of playback. The mixing then proceeds as follows: the video/audio starts playing at 25 frames per second; at frame 251, that is, at the 11th second, the subtitle time axis is added onto the playback time axis; then, when the video/audio data plays to frame 300, the subtitle time axis stops and the subtitle layer disappears, and so on, thereby achieving synchronous mixing of the video/audio data and the subtitle layer; the mixed video/audio data is distributed to the various network nodes for output.
Based on the above manner of establishing, via the playback time axis, a matching display start timestamp and end timestamp for the subtitle layer: this manner is mainly based on the video/audio playback time axis, on which a subtitle-layer display timestamp is stamped at the time point where the subtitle layer is to be displayed; when the video/audio data plays to that time point, the timestamp is triggered, causing the subtitle layer to be displayed. For example: suppose a subtitle lasting 2 seconds begins at the 10th second of the video; a subtitle-layer display timestamp is stamped at the 11th second of playback, and a subtitle-stop timestamp at the 13th second. Mixing then proceeds as follows: the video/audio starts playing at 25 frames per second; at frame 251, that is, at the 11th second, the playback time axis automatically triggers the subtitle layer's display timestamp, causing the subtitle layer to be displayed on the video; then, when the video/audio data plays to frame 300, that is, reaches the 13th second, the playback time axis automatically triggers the subtitle layer's stop timestamp and the subtitle layer disappears, and so on, thereby mixing the video/audio data with the subtitle layer.
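The following is a minimal Python sketch of the timestamp-triggered mixing loop in this example, assuming 25 frames per second and the frame numbers given above (display from frame 251 through frame 300); the compositing function is a stub and all names are illustrative assumptions.

FPS = 25

def composite(frame, layer):
    # Stub: draw the subtitle layer over the video frame (illustrative only).
    return frame

def mix(frames, stamps):
    # frames: iterable of (frame_index, frame); stamps: list of
    # (start_frame, stop_frame, subtitle_layer) derived from the display
    # and stop timestamps on the playback time axis.
    for index, frame in frames:
        for start, stop, layer in stamps:
            if start <= index <= stop:
                frame = composite(frame, layer)
        yield frame

# The 2-second subtitle of the example: displayed from frame 251 (the start
# of the 11th second at 25 fps) through frame 300, after which it disappears.
stamps = [(251, 300, "subtitle layer")]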
When a marker is added at the frame position where the video/audio data displays the subtitle layer and a display marker identical to the video/audio marker is set on the subtitle layer, so that the synchronous matching relationship between the two is realized through the video/audio marker and the subtitle layer marker, mixing the two means overlapping their markers: when the video/audio data is played on the display device and the marker is shown, the subtitle layer can be displayed at the position where the video/audio data displays it, achieving instant synchronous display of the two.
It should be noted that, in the above-described ways of mixing the video/audio data with the subtitle layer, matching and mixing can be achieved by automatic system matching or by manual intervention; manual intervention may, for example, consist of manually adding the subtitle layer at the position where it needs to be displayed.
The above mixing process may be implemented by an encoder: the live-broadcast server sends the video/audio data and the subtitle layer, for which the synchronous matching relationship has been established, to a mixing encoder, which mixes the two and finally sends the result.
It can be understood that, in this step, the mixed video/audio data and subtitle layer can be transmitted via a network transmission protocol (for example, HTTP) and displayed on the display device.
From the above it can be seen that, in the method for instant synchronous display of streaming media and subtitles provided by the present invention, the collected and encoded video/audio data is sent to the live-broadcast server, which buffers it according to the preset delay time; the subtitle data related to the video/audio data is acquired and formed into a subtitle layer; the live-broadcast server establishes a synchronous matching relationship between the two and sends them; and after the video/audio data and the subtitle layer having the synchronous matching relationship are mixed, they are distributed through network nodes and finally displayed on the display device in instant synchronization. Thus, at live programs or live events at home or abroad, after delay processing of the acquired video/audio data and subtitle data, the matching of subtitles to the video/audio data can be adjusted effectively, so that subtitles can be displayed on the video/audio picture in real time; and because the delay time is set, the match between subtitles and video/audio data becomes more accurate, reducing the subtitle error rate, ensuring synchronous display of video/audio and subtitles, and the display of subtitles is not geographically restricted.
In addition, the method for instant synchronous display of streaming media and subtitles provided by the present invention can also make the display of the subtitle layer more accurate by correcting the subtitle layer; after the subtitle layer is corrected, adjusting the subtitle layer's time axis or timestamps achieves a more precise match between subtitles and the video/audio picture, further improving synchronization accuracy; and manual intervention further improves the matching precision and the precision of synchronized output, thereby ensuring the accuracy and real-time performance of the subtitle layer.
The above describes the method for instant synchronous display of streaming media and subtitles provided by the present invention. The present invention also provides a device for instant synchronous display of streaming media and subtitles; see FIG. 2, which is a schematic structural diagram of the device. Since the device embodiment is substantially similar to the method embodiment, it is described relatively simply; for relevant points, refer to the description of the method embodiment. The device embodiment described below is merely illustrative.
As shown in FIG. 2, the device specifically includes:
a video/audio collection and encoding unit 200, configured to encode video/audio data in collected streaming media and send it to the live-broadcast server;
a subtitle acquisition unit 210, configured to acquire subtitle data of the video/audio data, form a subtitle layer, and send it to the live-broadcast server; the subtitle acquisition unit 210 includes a subtitle data correction unit, configured to correct the acquired subtitle data corresponding to the video/audio data;
a processing unit 220, by which the live-broadcast server buffers the encoded video/audio data according to the preset delay time, buffers the subtitle layer, establishes a synchronous matching relationship between the buffered subtitle layer and the video/audio data, and then sends both.
The processing unit 220 includes:
a delay buffering unit, configured to delay-buffer every frame of the video/audio data, or delay-buffer the beginning portion of the video/audio data, or delay-buffer the ending portion of the video/audio data, or delay the video/audio data frames corresponding to a position according to a subtitle position to be modified or a video data position to be adjusted;
a playback time axis forming unit, configured to mark the buffered video/audio data according to its playback time points to form a playback time axis;
a subtitle time axis forming unit or a subtitle timestamp forming unit, wherein the subtitle time axis forming unit is configured to establish for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, and the subtitle timestamp forming unit is configured to establish, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer; the display start timestamp and the end timestamp of the subtitle layer are collectively referred to as subtitle timestamps;
a subtitle layer correction unit, configured to correct the subtitle layer having the synchronous matching relationship to form a new subtitle layer and overlay it on the original subtitle layer; the subtitle layer correction unit is configured to perform, on the subtitle layer, operations of inserting a preset subtitle, skipping, correcting a subtitle, or one-click subtitling;
an adjusting unit, configured to adjust the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
A mixing and encoding unit 230 is configured to receive the subtitle layer and the video/audio data having the synchronous matching relationship, mix the two, and then distribute the result to network nodes for output according to the predetermined transmission protocol.
The mixing and encoding unit 230 includes: a compositing and embedding unit, configured to embed the subtitle time axis of the subtitle layer onto the playback time axis of the video/audio data, or to embed the start timestamp and the end timestamp onto the playback time axis of the video/audio data, and composite the subtitle layer with the video/audio data.
The above describes the device for instant synchronous display of streaming media and subtitles provided by the present invention; since the device embodiment is substantially similar to the method embodiment, the description is merely illustrative and is not repeated here.
Based on the above, the present invention also provides a processing method for synchronous matching of streaming media and subtitles, as shown in FIG. 3, which is a flowchart of this processing method. Since the processing method for synchronous matching of streaming media and subtitles is described in detail in the method for instant synchronous display of streaming media and subtitles provided by the present invention, the description here is illustrative; for details, refer to FIG. 1 and the related description.
The method includes:
Step S300: buffering received encoded video/audio data according to a preset delay time.
Step S300 includes: delay-buffering every frame of the video/audio data, or delay-buffering the beginning portion of the video/audio data, or delay-buffering the ending portion of the video/audio data, or delaying the video/audio data frames corresponding to a position according to a subtitle position to be modified or a video/audio data position to be adjusted.
Step S310: forming a subtitle layer from received subtitle data corresponding to the video/audio data, and buffering it.
Step S320: establishing a synchronous matching relationship between the video/audio data and the subtitle layer, and then sending. Step S320 includes:
marking the buffered video/audio data according to its playback time points to form a playback time axis;
establishing for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, or establishing, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer, the display start timestamp and the end timestamp of the subtitle layer being collectively referred to as subtitle timestamps;
correcting the subtitle layer having the synchronous matching relationship to form a new subtitle layer, and overlaying it on the original subtitle layer;
adjusting the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
Based on the processing method for synchronous matching of streaming media and subtitles provided above, the present invention also provides a processing device for synchronous matching of streaming media and subtitles. Since the device embodiment is substantially similar to the method embodiment, it is described relatively simply; for relevant points, refer to the description of the method embodiment. The device embodiment described below is merely illustrative.
Please refer to FIG. 4, which is a schematic structural diagram of the processing device for synchronous matching of streaming media and subtitles provided by the present invention.
The device includes:
a delay buffering unit 400, configured to buffer received encoded video/audio data according to a preset delay time; the delay buffering unit 400 is configured to delay-buffer every frame of the video/audio data, or delay-buffer the beginning portion of the video/audio data, or delay-buffer the ending portion of the video/audio data, or delay the video/audio frames corresponding to a position according to a subtitle position to be modified or a video/audio data position to be adjusted;
a subtitle layer forming unit 410, configured to form a subtitle layer from received subtitle data corresponding to the video/audio data, and buffer it;
a synchronous matching relationship establishing unit 420, configured to establish a synchronous matching relationship between the video/audio data and the subtitle layer, and then send.
The synchronous matching relationship establishing unit 420 includes: a playback time axis forming unit, configured to mark the buffered video/audio data according to its playback time points to form a playback time axis;
a subtitle time axis forming unit or a subtitle timestamp establishing unit, wherein the subtitle time axis forming unit is configured to establish for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, and the subtitle timestamp establishing unit is configured to establish, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer; the display start timestamp and the end timestamp of the subtitle layer are collectively referred to as subtitle timestamps;
a subtitle layer correction unit, configured to correct the subtitle layer having the synchronous matching relationship to form a new subtitle layer, and overlay it on the original subtitle layer;
an adjusting unit, configured to adjust the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
Based on FIGS. 1 to 4 above, the present invention also provides a system for instantly displaying subtitles in live streaming; please refer to FIG. 5, which is a schematic diagram of the system for instant synchronous display of streaming media and subtitles provided by the present invention. Since the system embodiment is substantially similar to the method embodiment, it is described relatively simply; for relevant points, refer to the description of the method embodiment. The system embodiment described below is merely illustrative.
The system specifically includes:
a collection and encoding device 500, configured to collect video/audio data in streaming media for encoding, and send it to the live-broadcast server; this device mainly collects the video/audio data of a live event, or other live video/audio data, etc.;
a subtitle acquisition device 510, configured to acquire subtitle data corresponding to the video/audio data and send it to the live-broadcast server; the subtitle acquisition device 510 includes a subtitle data corrector, configured to correct the acquired subtitle data corresponding to the video/audio data;
a live service device 520, configured to buffer the encoded video/audio data according to the preset delay time, form a subtitle layer from the subtitle data and buffer it, establish a synchronous matching relationship between the subtitle layer and the video/audio data, and then send both.
The live service device 520 includes:
a data information processor, configured to mark the buffered video/audio data according to its playback time points to form a playback time axis, and to establish for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, or to establish, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer;
a subtitle layer corrector, configured to correct the subtitle layer having the synchronous matching relationship to form a new subtitle layer and overlay it on the original subtitle layer, and to adjust the subtitle time axis or the playback time axis corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
A mixing encoding device 530 is configured to mix the received subtitle layer and video/audio data having the synchronous matching relationship to form streaming media information, and to transmit the streaming media information according to the predetermined transmission protocol for final display on terminal devices.
The mixing encoding device 530 includes: a compositing processor, configured to embed the subtitle time axis of the subtitle layer onto the playback time axis of the video/audio data, or to embed the start timestamp and the end timestamp onto the playback time axis of the video/audio data, and composite the subtitle layer with the video/audio data.
The above describes the method and device for instant synchronous display of streaming media and subtitles, the processing method and device for synchronous matching of streaming media and subtitles, and the system for instant synchronous display of streaming media and subtitles provided by the present invention. Through the method provided by the present invention, after the synchronous matching relationship is established, the obtained video/audio data and subtitle data are composited into a single file and sent to the display device, so that the video/audio data and the subtitle layer can be displayed in instant synchronization, improving the synchronization accuracy of the two.
FIG. 6 shows a structural block diagram of a device for instant synchronous display of streaming media and subtitles according to another embodiment of the present invention. The device 1100 for instant synchronous display of streaming media and subtitles may be a host server with computing power, a personal computer (PC), or a portable computer or terminal. The specific embodiments of the present invention do not limit the specific implementation of the computing node.
The device 1100 for instant synchronous display of streaming media and subtitles includes a processor 1110, a communications interface 1120, a memory 1130, and a bus 1140. The processor 1110, the communication interface 1120 and the memory 1130 communicate with each other through the bus 1140.
The communication interface 1120 is used to communicate with network devices, including, for example, a virtual machine management center, shared storage, and the like.
The processor 1110 is configured to execute a program. The processor 1110 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.
The memory 1130 is used to store files. The memory 1130 may include high-speed RAM memory and may also include non-volatile memory, such as at least one disk memory. The memory 1130 may also be a memory array. The memory 1130 may also be partitioned, and the blocks may be combined into virtual volumes according to certain rules.
In one possible implementation, the above program may be program code including computer operation instructions. The program may specifically be used to implement the operations of each step in the method for instant synchronous display of streaming media and subtitles.
FIG. 7 shows a structural block diagram of a processing device for synchronous matching of streaming media and subtitles according to another embodiment of the present invention. The processing device 1200 for synchronous matching of streaming media and subtitles may be a host server with computing power, a personal computer (PC), or a portable computer or terminal. The specific embodiments of the present invention do not limit the specific implementation of the computing node.
The processing device 1200 for synchronous matching of streaming media and subtitles includes a processor 1110, a communication interface 1120, a memory 1130, and a bus 1140. The processor 1110, the communication interface 1120 and the memory 1130 communicate with each other through the bus 1140.
The communication interface 1120 is used to communicate with network devices, including, for example, a virtual machine management center, shared storage, and the like.
The processor 1110 is configured to execute a program. The processor 1110 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.
The memory 1130 is used to store files. The memory 1130 may include high-speed RAM memory and may also include non-volatile memory, such as at least one disk memory. The memory 1130 may also be a memory array. The memory 1130 may also be partitioned, and the blocks may be combined into virtual volumes according to certain rules.
In one possible implementation, the above program may be program code including computer operation instructions. The program may specifically be used to implement the operations of each step in the processing method for synchronous matching of streaming media and subtitles.
Those of ordinary skill in the art will appreciate that the exemplary units and algorithm steps in the embodiments described herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may choose different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present invention.
If the functions are implemented in the form of computer software and sold or used as a stand-alone product, then to some extent all or part of the technical solution of the present invention (for example, the part contributing over the prior art) can be considered to be embodied in the form of a computer software product. The computer software product is typically stored in a computer-readable non-volatile storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The above are merely specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto; any person skilled in the art can easily conceive of changes or substitutions within the technical scope disclosed by the present invention, and all such changes or substitutions shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
INDUSTRIAL APPLICABILITY
According to the method, device and system for instant synchronous display and matching processing of streaming media and subtitles provided by the embodiments of the present invention, at live programs or live events at home or abroad, the acquired video/audio data is delay-processed, and a synchronous matching relationship is established between the video/audio data and the subtitle layer, so that the matching of subtitles to the video/audio data can be adjusted effectively and subtitles can be displayed on the video/audio picture in real time, synchronized with the video and audio; because a delay time is set for the video and audio, the subtitle data and/or the subtitle layer can be corrected, making the match between subtitles and the video/audio data more accurate, reducing the subtitle error rate, ensuring the accuracy of synchronous display of video/audio and subtitles, and freeing the synchronous display of subtitles and video/audio from geographic restrictions.

Claims (28)

  1. A method for instant synchronous display of streaming media and subtitles, characterized by:
    encoding video/audio data in collected streaming media, and sending it to a live-broadcast server;
    acquiring subtitle data corresponding to the video/audio data, and sending it to the live-broadcast server;
    the live-broadcast server buffering the encoded video/audio data according to a preset delay time, forming a subtitle layer from the subtitle data and buffering it, establishing a synchronous matching relationship between the subtitle layer and the video/audio data, and then sending both;
    mixing the received subtitle layer and video/audio data having the synchronous matching relationship to form streaming media information, and distributing the streaming media information to network nodes for output.
  2. The method for instant synchronous display of streaming media and subtitles according to claim 1, characterized in that establishing the synchronous matching relationship between the subtitle layer and the video/audio data comprises:
    marking the buffered video/audio data according to its playback time points to form a playback time axis;
    establishing for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, or establishing, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer; the display start timestamp and the end timestamp of the subtitle layer are collectively referred to as subtitle timestamps.
  3. The method for instant synchronous display of streaming media and subtitles according to claim 2, characterized in that mixing the received subtitle layer and video/audio data having the synchronous matching relationship comprises:
    embedding the subtitle time axis of the subtitle layer onto the playback time axis of the video/audio data, or embedding the start timestamp and the end timestamp onto the playback time axis of the video/audio data; and compositing the subtitle layer with the video/audio data.
  4. The method for instant synchronous display of streaming media and subtitles according to claim 2, characterized in that establishing the synchronous matching relationship between the subtitle layer and the video/audio data comprises:
    correcting the subtitle layer having the synchronous matching relationship to form a new subtitle layer, and overlaying it on the original subtitle layer;
    adjusting the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
  5. The method for instant synchronous display of streaming media and subtitles according to claim 4, characterized in that correction of the subtitle layer comprises operations of inserting a preset subtitle, skipping, correcting a subtitle, or one-click subtitling.
  6. The method for instant synchronous display of streaming media and subtitles according to claim 2, characterized in that the length of the playback time axis is the sum of the time length of the video/audio data and the preset delay time.
  7. The method for instant synchronous display of streaming media and subtitles according to claim 1, characterized in that acquiring the subtitle data corresponding to the video/audio data and sending it to the live-broadcast server comprises:
    correcting the acquired subtitle data corresponding to the video/audio data.
  8. The method for instant synchronous display of streaming media and subtitles according to claim 1, characterized in that the live-broadcast server buffering the encoded video/audio data according to the preset delay time comprises:
    delay-buffering every frame of the video/audio data, or delay-buffering the beginning portion of the video/audio data, or delay-buffering the ending portion of the video/audio data, or delaying the video/audio frames corresponding to a position according to a subtitle position to be modified or a video/audio data position to be adjusted.
  9. A device for instant synchronous display of streaming media and subtitles, characterized by comprising:
    a video/audio collection and encoding unit, configured to encode video/audio data in collected streaming media and send it to a live-broadcast server;
    a subtitle acquisition unit, configured to acquire subtitle data of the video/audio data, form a subtitle layer, and send it to the live-broadcast server;
    a processing unit, by which the live-broadcast server buffers the encoded video/audio data according to a preset delay time, buffers the subtitle layer, establishes a synchronous matching relationship between the buffered subtitle layer and the video/audio data, and then sends both;
    a mixing and encoding unit, configured to receive the subtitle layer and the video/audio data having the synchronous matching relationship, mix the two, and then distribute the result to network nodes for output according to a predetermined transmission protocol.
  10. The device for instant synchronous display of streaming media and subtitles according to claim 9, characterized in that the processing unit comprises:
    a playback time axis forming unit, configured to mark the buffered video/audio data according to its playback time points to form a playback time axis;
    a subtitle time axis forming unit or a subtitle timestamp forming unit, wherein the subtitle time axis forming unit is configured to establish for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, and the subtitle timestamp forming unit is configured to establish, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer; the display start timestamp and the end timestamp of the subtitle layer are collectively referred to as subtitle timestamps.
  11. The device for instant synchronous display of streaming media and subtitles according to claim 10, characterized in that the mixing and encoding unit comprises:
    a compositing and embedding unit, configured to embed the subtitle time axis of the subtitle layer onto the playback time axis of the video/audio data, or to embed the start timestamp and the end timestamp onto the playback time axis of the video/audio data, and composite the subtitle layer with the video/audio data.
  12. The device for instant synchronous display of streaming media and subtitles according to claim 10, characterized in that the processing unit comprises:
    a subtitle layer correction unit, configured to correct the subtitle layer having the synchronous matching relationship to form a new subtitle layer, and overlay it on the original subtitle layer;
    an adjusting unit, configured to adjust the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
  13. The device for instant synchronous display of streaming media and subtitles according to claim 12, characterized in that the subtitle layer correction unit is configured to perform, on the subtitle layer, operations of inserting a preset subtitle, skipping, correcting a subtitle, or one-click subtitling.
  14. The device for instant synchronous display of streaming media and subtitles according to claim 9, characterized in that the subtitle acquisition unit comprises: a subtitle data correction unit, configured to correct the acquired subtitle data corresponding to the video/audio data.
  15. The device for instant synchronous display of streaming media and subtitles according to claim 9, characterized in that the processing unit comprises: a delay buffering unit, configured to delay-buffer every frame of the video/audio data, or delay-buffer the beginning portion of the video/audio data, or delay-buffer the ending portion of the video/audio data, or delay the video/audio data frames corresponding to a position according to a subtitle position to be modified or a video data position to be adjusted.
  16. A processing method for synchronous matching of streaming media and subtitles, characterized by comprising:
    buffering received encoded video/audio data according to a preset delay time;
    forming a subtitle layer from received subtitle data corresponding to the video/audio data, and buffering it;
    establishing a synchronous matching relationship between the video/audio data and the subtitle layer, and then sending.
  17. The processing method for synchronous matching of streaming media and subtitles according to claim 16, characterized in that establishing the synchronous matching relationship between the video/audio data and the subtitle layer comprises:
    marking the buffered video/audio data according to its playback time points to form a playback time axis;
    establishing for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, or establishing, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer; the display start timestamp and the end timestamp of the subtitle layer are collectively referred to as subtitle timestamps.
  18. The processing method for synchronous matching of streaming media and subtitles according to claim 17, characterized in that establishing the synchronous matching relationship between the subtitle layer and the video/audio data comprises:
    correcting the subtitle layer having the synchronous matching relationship to form a new subtitle layer, and overlaying it on the original subtitle layer;
    adjusting the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
  19. The processing method for synchronous matching of streaming media and subtitles according to claim 16, characterized in that buffering the received encoded video/audio data according to the preset delay time comprises:
    delay-buffering every frame of the video/audio data, or delay-buffering the beginning portion of the video/audio data, or delay-buffering the ending portion of the video/audio data, or delaying the video/audio data frames corresponding to a position according to a subtitle position to be modified or a video/audio data position to be adjusted.
  20. A processing device for synchronous matching of streaming media and subtitles, characterized by comprising:
    a delay buffering unit, configured to buffer received encoded video/audio data according to a preset delay time;
    a subtitle layer forming unit, configured to form a subtitle layer from received subtitle data corresponding to the video/audio data, and buffer it;
    a synchronous matching relationship establishing unit, configured to establish a synchronous matching relationship between the video/audio data and the subtitle layer, and then send.
  21. The processing device for synchronous matching of streaming media and subtitles according to claim 20, characterized in that the synchronous matching relationship establishing unit comprises:
    a playback time axis forming unit, configured to mark the buffered video/audio data according to its playback time points to form a playback time axis;
    a subtitle time axis forming unit or a subtitle timestamp establishing unit, wherein the subtitle time axis forming unit is configured to establish for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, and the subtitle timestamp establishing unit is configured to establish, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer; the display start timestamp and the end timestamp of the subtitle layer are collectively referred to as subtitle timestamps.
  22. The processing device for synchronous matching of streaming media and subtitles according to claim 21, characterized in that the synchronous matching relationship establishing unit comprises:
    a subtitle layer correction unit, configured to correct the subtitle layer having the synchronous matching relationship to form a new subtitle layer, and overlay it on the original subtitle layer;
    an adjusting unit, configured to adjust the playback time axis, the subtitle time axis or the subtitle timestamps corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
  23. The processing device for synchronous matching of streaming media and subtitles according to claim 20, characterized in that the delay buffering unit is configured to delay-buffer every frame of the video/audio data, or delay-buffer the beginning portion of the video/audio data, or delay-buffer the ending portion of the video/audio data, or delay the video/audio frames corresponding to a position according to a subtitle position to be modified or a video/audio data position to be adjusted.
  24. A system for instant synchronous display of streaming media and subtitles, characterized by:
    a collection and encoding device, configured to collect video/audio data in streaming media for encoding, and send it to a live-broadcast server according to a predetermined video/audio transmission protocol;
    a subtitle acquisition device, configured to input subtitle data matching the video/audio data, and send it to the live-broadcast server according to a predetermined subtitle transmission protocol;
    a live service device, configured to buffer the encoded video/audio data according to a preset delay time, form a subtitle layer from the subtitle data and buffer it, establish a synchronous matching relationship between the subtitle layer and the video/audio data, and then send both;
    a mixing encoding device, configured to mix the received subtitle layer and video/audio data having the synchronous matching relationship to form streaming media information, and distribute the streaming media information to network nodes for output according to a predetermined transmission protocol.
  25. The system for instant synchronous display of streaming media and subtitles according to claim 24, characterized in that the live service device comprises: a data information processor, configured to mark the buffered video/audio data according to its playback time points to form a playback time axis, and to establish for the subtitle layer a subtitle time axis matching the playback time axis of the video/audio data, or to establish, according to the playback time axis, a display start timestamp and an end timestamp for the subtitle layer.
  26. The system for instant synchronous display of streaming media and subtitles according to claim 25, characterized in that the mixing encoding device comprises:
    a compositing processor, configured to embed the subtitle time axis of the subtitle layer onto the playback time axis of the video/audio data, or to embed the start timestamp and the end timestamp onto the playback time axis of the video/audio data, and composite the subtitle layer with the video/audio data.
  27. The system for instant synchronous display of streaming media and subtitles according to claim 25, characterized in that the live service device comprises:
    a subtitle layer corrector, configured to correct the subtitle layer having the synchronous matching relationship to form a new subtitle layer and overlay it on the original subtitle layer, and to adjust the subtitle time axis or the playback time axis corresponding to the corrected content, so that the new subtitle layer matches the video/audio data synchronously.
  28. The system for instant synchronous display of streaming media and subtitles according to claim 25, characterized in that the subtitle acquisition device comprises: a subtitle data corrector, configured to correct the acquired subtitle data corresponding to the video/audio data.
PCT/CN2016/098659 2015-12-22 2016-09-12 Method, device and system for instant synchronous display and matching processing of streaming media and subtitles WO2017107578A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP16877389.3A EP3334175A4 (en) 2015-12-22 2016-09-12 Streaming media and caption instant synchronization displaying and matching processing method, device and system
US15/757,775 US20190387263A1 (en) 2015-12-22 2016-09-12 Synchronously displaying and matching streaming media and subtitles

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510970843.9 2015-12-22
CN201510970843.9A CN105959772B (zh) 2015-12-22 Method, device and system for instant synchronous display and matching processing of streaming media and subtitles

Publications (1)

Publication Number Publication Date
WO2017107578A1 true WO2017107578A1 (zh) 2017-06-29

Family

ID=56917057

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/098659 WO2017107578A1 (zh) 2015-12-22 2016-09-12 Method, device and system for instant synchronous display and matching processing of streaming media and subtitles

Country Status (4)

Country Link
US (1) US20190387263A1 (zh)
EP (1) EP3334175A4 (zh)
CN (1) CN105959772B (zh)
WO (1) WO2017107578A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN111586437A (zh) * 2020-04-08 2020-08-25 天津车之家数据信息技术有限公司 Bullet-screen message processing method, system, computing device and storage medium
  • CN111601154A (zh) * 2020-05-08 2020-08-28 北京金山安全软件有限公司 Video processing method and related device
  • EP3787300A4 (en) * 2018-04-25 2021-03-03 Tencent Technology (Shenzhen) Company Limited VIDEO STREAM PROCESSING METHOD AND DEVICE, COMPUTER DEVICE AND STORAGE MEDIUM
  • CN113766342A (zh) * 2021-08-10 2021-12-07 安徽听见科技有限公司 Subtitle synthesis method and related device, electronic device, and storage medium
  • CN113873306A (zh) * 2021-09-23 2021-12-31 深圳市多狗乐智能研发有限公司 Method for projecting a picture overlaid with real-time translated subtitles into a live broadcast room via hardware
  • CN116471436A (zh) * 2023-04-12 2023-07-21 央视国际网络有限公司 Information processing method and device, storage medium, and electronic device

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN107872678B (zh) * 2016-09-26 2019-08-27 腾讯科技(深圳)有限公司 Live-broadcast-based text display method and device, and live broadcast method and device
  • CN106993239B (zh) * 2017-03-29 2019-12-10 广州酷狗计算机科技有限公司 Information display method during live broadcasting
  • CN109413475A (zh) * 2017-05-09 2019-03-01 北京嘀嘀无限科技发展有限公司 Method, device and server for adjusting subtitles in a video
  • CN107295307A (zh) * 2017-07-13 2017-10-24 安徽声讯信息技术有限公司 Text and video synchronization control system based on remote control
  • CN107527618A (zh) * 2017-07-13 2017-12-29 安徽声讯信息技术有限公司 Audio and text synchronized playback system
  • CN108040282A (zh) * 2017-12-21 2018-05-15 山东亿海兰特通信科技有限公司 Video playback method and device
  • CN108111872B (zh) * 2018-01-09 2021-01-01 武汉斗鱼网络科技有限公司 Audio live broadcast system
  • CN108111896B (zh) * 2018-01-16 2020-05-05 北京三体云联科技有限公司 Subtitle synchronization method and device
  • CN108039175B (zh) 2018-01-29 2021-03-26 北京百度网讯科技有限公司 Speech recognition method, device and server
  • WO2019194742A1 (en) * 2018-04-04 2019-10-10 Nooggi Pte Ltd A method and system for promoting interaction during live streaming events
  • CN108833403A (zh) * 2018-06-11 2018-11-16 颜彦 Converged-media information publishing and generation method with embedded code porting
  • CN108924664B (zh) * 2018-07-26 2021-06-08 海信视像科技股份有限公司 Synchronous display method and terminal for program subtitles
  • US11102540B2 (en) 2019-04-04 2021-08-24 Wangsu Science & Technology Co., Ltd. Method, device and system for synchronously playing message stream and audio-video stream
  • CN110035311A (zh) * 2019-04-04 2019-07-19 网宿科技股份有限公司 Method, device and system for synchronously playing a message stream and an audio/video stream
  • US11211073B2 (en) * 2019-04-22 2021-12-28 Sony Corporation Display control of different verbatim text of vocal deliverance of performer-of-interest in a live event
  • CN111835988B (zh) * 2019-04-23 2023-03-07 阿里巴巴集团控股有限公司 Subtitle generation method, server, terminal device and system
  • CN111835697B (zh) * 2019-04-23 2021-10-01 华为技术有限公司 Media stream sending method, device, equipment and system
  • CN110234028A (zh) * 2019-06-13 2019-09-13 北京大米科技有限公司 Audio/video data synchronous playback method, device, system, electronic device and medium
  • CN112584078B (zh) * 2019-09-27 2022-03-18 深圳市万普拉斯科技有限公司 Video call method, device, computer equipment and storage medium
  • CN110740283A (zh) * 2019-10-29 2020-01-31 杭州当虹科技股份有限公司 Speech-to-text method based on video communication
  • US11134317B1 (en) 2020-03-25 2021-09-28 Capital One Services, Llc Live caption feedback systems and methods
  • CN111654658B (zh) * 2020-06-17 2022-04-15 平安科技(深圳)有限公司 Audio/video call processing method, system, codec and storage device
  • CN111726686B (zh) * 2020-08-24 2020-11-24 上海英立视电子有限公司 Television-based virtual karaoke system and method
  • CN111988654B (zh) * 2020-08-31 2022-10-18 维沃移动通信有限公司 Video data alignment method, device and electronic device
  • CN112135155B (zh) * 2020-09-11 2022-07-19 上海七牛信息技术有限公司 Audio/video co-streaming and merging method, device, electronic device and storage medium
  • CN112511910A (zh) * 2020-11-23 2021-03-16 浪潮天元通信信息系统有限公司 Real-time subtitle processing method and device
  • CN112616062B (zh) * 2020-12-11 2023-03-10 北京有竹居网络技术有限公司 Subtitle display method, device, electronic device and storage medium
  • CN114979788A (zh) * 2021-02-24 2022-08-30 上海哔哩哔哩科技有限公司 Bullet-screen display method and device
  • CN113301428A (zh) * 2021-05-14 2021-08-24 上海樱帆望文化传媒有限公司 Subtitle device for e-sports event live broadcasting
  • CN115474066A (zh) * 2021-06-11 2022-12-13 北京有竹居网络技术有限公司 Subtitle processing method, device, electronic device and storage medium
  • CN114679618B (zh) * 2022-05-27 2022-08-02 成都有为财商教育科技有限公司 Streaming media data receiving method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN101197946A (zh) * 2006-12-06 2008-06-11 中兴通讯股份有限公司 Video and text synchronization device
  • CN101540847A (zh) * 2008-03-21 2009-09-23 株式会社康巴思 Subtitle production system and subtitle production method
  • US20090307267A1 (en) * 2008-06-10 2009-12-10 International Business Machines Corporation. Real-time dynamic and synchronized captioning system and method for use in the streaming of multimedia data
  • CN103686450A (zh) * 2013-12-31 2014-03-26 广州华多网络科技有限公司 Video processing method and system
  • CN103986940A (zh) * 2014-06-03 2014-08-13 王军明 Method for streaming video subtitles
  • CN104795083A (zh) * 2015-04-30 2015-07-22 联想(北京)有限公司 Information processing method and electronic device

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7561178B2 (en) * 2005-09-13 2009-07-14 International Business Machines Corporation Method, apparatus and computer program product for synchronizing separate compressed video and text streams to provide closed captioning and instant messaging integration with video conferencing
US8843368B2 (en) * 2009-08-17 2014-09-23 At&T Intellectual Property I, L.P. Systems, computer-implemented methods, and tangible computer-readable storage media for transcription alignment
  • CN101692693B (zh) * 2009-09-29 2011-09-28 北京中科大洋科技发展股份有限公司 Multifunctional integrated studio system and broadcasting method
  • CN102196319A (zh) * 2010-03-17 2011-09-21 中兴通讯股份有限公司 Streaming media live broadcast service system and implementation method
  • ES2370218B1 (es) * 2010-05-20 2012-10-18 Universidad Carlos Iii De Madrid Method and device for synchronizing subtitles with audio in live subtitling.
  • US9749504B2 (en) * 2011-09-27 2017-08-29 Cisco Technology, Inc. Optimizing timed text generation for live closed captions and subtitles
  • CN102655606A (zh) * 2012-03-30 2012-09-05 浙江大学 Method and system for adding real-time subtitle and sign-language services to live programs based on a P2P network
US10582268B2 (en) * 2015-04-03 2020-03-03 Philip T. McLaughlin System and method for synchronization of audio and closed captioning


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3334175A4 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3787300A4 (en) * 2018-04-25 2021-03-03 Tencent Technology (Shenzhen) Company Limited VIDEO STREAM PROCESSING METHOD AND DEVICE, COMPUTER DEVICE AND STORAGE MEDIUM
US11463779B2 (en) 2018-04-25 2022-10-04 Tencent Technology (Shenzhen) Company Limited Video stream processing method and apparatus, computer device, and storage medium
  • CN111586437A (zh) * 2020-04-08 2020-08-25 天津车之家数据信息技术有限公司 Bullet-screen message processing method, system, computing device and storage medium
  • CN111586437B (zh) * 2020-04-08 2022-09-06 天津车之家数据信息技术有限公司 Bullet-screen message processing method, system, computing device and storage medium
  • CN111601154A (zh) * 2020-05-08 2020-08-28 北京金山安全软件有限公司 Video processing method and related device
  • CN111601154B (zh) * 2020-05-08 2022-04-29 北京金山安全软件有限公司 Video processing method and related device
  • CN113766342A (zh) * 2021-08-10 2021-12-07 安徽听见科技有限公司 Subtitle synthesis method and related device, electronic device, and storage medium
  • CN113766342B (zh) * 2021-08-10 2023-07-18 安徽听见科技有限公司 Subtitle synthesis method and related device, electronic device, and storage medium
  • CN113873306A (zh) * 2021-09-23 2021-12-31 深圳市多狗乐智能研发有限公司 Method for projecting a picture overlaid with real-time translated subtitles into a live broadcast room via hardware
  • CN116471436A (zh) * 2023-04-12 2023-07-21 央视国际网络有限公司 Information processing method and device, storage medium, and electronic device
  • CN116471436B (zh) * 2023-04-12 2024-05-31 央视国际网络有限公司 Information processing method and device, storage medium, and electronic device

Also Published As

Publication number Publication date
CN105959772A (zh) 2016-09-21
US20190387263A1 (en) 2019-12-19
EP3334175A4 (en) 2018-10-10
CN105959772B (zh) 2019-04-23
EP3334175A1 (en) 2018-06-13

Similar Documents

Publication Publication Date Title
WO2017107578A1 (zh) Method, device and system for instant synchronous display and matching processing of streaming media and subtitles
WO2019205872A1 (zh) Video stream processing method and device, computer equipment and storage medium
US10477262B2 (en) Broadcast management system
JP6610555B2 (ja) Receiving device, transmitting device, and data processing method
TWI544791B (zh) Decoding method and decoder device for second program content received via a broadband network
JP5903924B2 (ja) Receiving device and subtitle processing method
KR102469142B1 (ko) Dynamic playback of transition frames while transitioning between media stream playbacks
CN105323655A (zh) Method for synchronizing video/score according to timestamps on a mobile terminal
KR101841313B1 (ko) Multimedia flow processing method and corresponding device
US20140281011A1 (en) System and method for replicating a media stream
JP2021061596A (ja) Broadcast service retransmission system and portable viewing terminal
JP5707642B2 (ja) Method and device for correcting synchronization errors between audio and video signals
JP7208531B2 (ja) Synchronization control device, synchronization control method and synchronization control program
TWI788722B (zh) Method for use in conjunction with a content presentation device, non-transitory computer-readable storage medium, and computing system
WO2018224839A2 (en) Methods and systems for generating a reaction video
WO2017195668A1 (ja) Receiving device and data processing method
JP7125692B2 (ja) Broadcast service communication network distribution device and method
JP2005167668A (ja) Multi-video time-synchronized display terminal, multi-video time-synchronized display method, program, and recording medium
JP4762340B2 (ja) Signal processing device and signal processing method
JP2024096838A (ja) Broadcast service communication network distribution device and method
JP2011223603A (ja) Signal processing device and signal processing method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16877389

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2016877389

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE