WO2022253079A1 - Subtitle display method and device based on HLS stream - Google Patents

Subtitle display method and device based on HLS stream

Info

Publication number
WO2022253079A1
WO2022253079A1 (PCT/CN2022/095045)
Authority
WO
WIPO (PCT)
Prior art keywords
subtitle
stream
display
index file
file
Prior art date
Application number
PCT/CN2022/095045
Other languages
English (en)
French (fr)
Inventor
江平
洪冲
朱兴昌
Original Assignee
ZTE Corporation (中兴通讯股份有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corporation (中兴通讯股份有限公司)
Publication of WO2022253079A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/43074Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of additional data with content streams on the same device, e.g. of EPG data or interactive icon with a TV program
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N21/4355Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream involving reformatting operations of additional data, e.g. HTML pages on a television screen
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • H04N21/4884Data services, e.g. news ticker for displaying subtitles

Definitions

  • Embodiments of the present invention relate to the field of multimedia and, in particular, to a method and device for displaying subtitles based on HLS (HTTP Live Streaming) streams.
  • Today's multimedia live broadcast service is an important application in the audio and video field. The main live broadcast protocols are DASH (Dynamic Adaptive Streaming over HTTP), HLS (HTTP Live Streaming) and MSS (Microsoft Smooth Streaming). The MPEG-DASH standard is an HTTP (Hypertext Transfer Protocol)-based dynamic adaptive streaming protocol launched by MPEG (Moving Picture Experts Group) to standardize the various adaptive streaming technologies existing in the industry; it supports DRM (Digital Rights Management), HTTP transfer, low-latency streaming and many other features. HLS is an HTTP-based streaming communication protocol implemented by Apple.
  • The DASH protocol is widely used as a live broadcast standard, and many terminal players are compatible with it; however, the native players of iOS terminals (including but not limited to iPhone, iPad, Apple TV, etc.) mainly support the HLS format and common video file container formats.
  • Common picture-format subtitles include DVB-subtitle, smpte-tt and other formats.
  • In the related art, the media server mainly adopts the DASH protocol. If subtitle information is included, it is usually picture subtitles based on the DVB-subtitle standard, and text subtitles cannot be provided at the same time.
  • The native player of an iOS terminal does not support media playback over the DASH protocol, nor does it support the parsing and display of DVB-subtitle picture subtitles.
  • Embodiments of the present invention provide a method and device for displaying subtitles based on HLS streams, so as to at least solve the problem in the related art that iOS terminals do not support the display of picture subtitles in DASH streams.
  • a method for displaying subtitles based on an HLS stream is provided, including: transcoding a DASH media stream into an HLS stream, and transcoding the subtitle stream in the DASH media stream into a subtitle file in a picture encoding format; downloading and playing the video and audio files in the HLS stream through a player; downloading and parsing the subtitle file through a subtitle parser to obtain subtitle display information; and obtaining the current playing time of the player through the subtitle parser, selecting corresponding subtitles according to the subtitle display information, and displaying them synchronously.
  • transcoding the DASH media stream into an HLS stream, and transcoding the subtitle stream in the DASH media stream into a subtitle file in a picture encoding format, includes: slicing, transcoding and re-encapsulating the DASH media stream according to the HLS protocol, wherein the video stream and the audio stream are transcoded into media files in their original encoding format, the subtitle stream is transcoded into a subtitle file in a picture subtitle encoding format, and the index file is modified.
  • the index file includes a main index file and sub-index files, wherein the sub-index files include a video index file, an audio index file, and a subtitle index file; modifying the index file includes: adding a custom extension field used to identify the subtitle information.
  • the subtitle display information includes at least one of the following: display time of the subtitle, display picture content, display style, display position and size, and display picture encoding format.
  • downloading and playing the video and audio files in the HLS stream through the player includes: downloading the index file through the player, and downloading and parsing each media segment according to the index file; and decoding and playing the downloaded and parsed video and audio files through the player.
  • downloading and parsing the subtitle file through a subtitle parser to obtain subtitle display information includes: downloading the index file through the subtitle parser, and downloading subtitle segments according to the index file; decapsulating the downloaded subtitle file to obtain the decoding reference time and subtitle information; and decoding the subtitle information according to the corresponding picture coding format to obtain the subtitle display information.
  • the method further includes: periodically updating the sub-index file through the subtitle parser.
  • the DASH media stream includes a plurality of subtitle streams, and after selecting corresponding subtitles according to the subtitle display information and displaying them synchronously, the method further includes: judging whether to switch from the current first subtitle stream to the second subtitle stream, and if so, clearing the subtitle information of the first subtitle stream in the buffer; updating the sub-index file of the second subtitle stream according to the second subtitle stream information parsed from the main index file; and downloading and decoding the segments of the second subtitle stream through the subtitle parser, and synchronously displaying the subtitles corresponding to the second subtitle stream according to the current playing time of the player.
  • a subtitle display device based on an HLS stream is provided, including: a transcoding module configured to transcode a DASH media stream into an HLS stream and transcode the subtitle stream in the DASH media stream into a subtitle file in a picture encoding format; a player configured to download and play the video and audio files in the HLS stream; and a subtitle parser configured to download and parse the subtitle file to obtain subtitle display information, obtain the current playing time of the player, select corresponding subtitles according to the subtitle display information, and display them synchronously.
  • the transcoding module is further configured to slice and transcode the DASH media stream according to the HLS protocol, wherein the video stream and audio stream are transcoded into media files according to the original encoding format, Transcode the subtitle stream into a subtitle file in the picture subtitle encoding format, and modify the index file.
  • the subtitle display information includes at least one of the following: display time of the subtitle, display picture content, display style, display position and size, and display picture encoding format.
  • the player is further configured to download the index file, download and parse each media segment according to the index file, and decode and play the downloaded and parsed video and audio files.
  • the subtitle parser includes: a download module configured to download the index file and download subtitle segments according to the index file; a parsing module configured to decapsulate the downloaded subtitle file and obtain the decoding reference time and subtitle information; a decoding module configured to decode the subtitle information according to the corresponding picture coding format and obtain the subtitle display information; a synchronization module configured to obtain the current playing time of the player; and a display module configured to select corresponding subtitles according to the subtitle display information and display them synchronously.
  • the transcoding module is located at the server side, and the player and subtitle parser are located at the terminal side.
  • a computer-readable storage medium is also provided, in which a computer program is stored, wherein the computer program is configured to perform the steps in any one of the above method embodiments when run.
  • an electronic device is also provided, including a memory and a processor, wherein a computer program is stored in the memory, and the processor is configured to run the computer program to perform the steps in any one of the above method embodiments.
  • Fig. 1 is a flowchart of a subtitle display method based on an HLS stream according to an embodiment of the present invention;
  • FIG. 2 is a block diagram of a subtitle display device based on an HLS stream according to an embodiment of the present invention;
  • FIG. 3 is a structural block diagram of a subtitle display device based on an HLS stream according to another embodiment of the present invention;
  • Fig. 4 shows the basic structure and main workflow of the HLS stream-based picture subtitle display scheme according to an embodiment of the present invention;
  • FIG. 5 is a live-broadcast flowchart of a method for displaying picture subtitles based on HLS streams according to an embodiment of the present invention;
  • FIG. 6 is an on-demand flowchart of a method for displaying picture subtitles based on HLS streams according to an embodiment of the present invention;
  • Fig. 7 is a flowchart of HLS stream-based multi-subtitle switching according to an embodiment of the present invention.
  • this method comprises the following steps:
  • Step S102: transcoding the DASH media stream into an HLS stream, and transcoding the subtitle stream in the DASH media stream into a subtitle file in a picture encoding format;
  • Step S104: downloading and playing the video and audio files in the HLS stream through the player;
  • Step S106: downloading and parsing the subtitle file through a subtitle parser to obtain subtitle display information;
  • Step S108: obtaining the current playing time of the player through the subtitle parser, selecting corresponding subtitles according to the subtitle display information, and displaying them synchronously.
  • step S102 may further include: slicing and transcoding the DASH media stream according to the HLS protocol, wherein the video stream and audio stream are transcoded into media files in their original encoding format, the subtitle stream is transcoded into a subtitle file in a picture subtitle encoding format, and the index file is modified.
  • the index file includes a main index file and sub-index files, wherein the sub-index files include a video index file, an audio index file, and a subtitle index file; modifying the index file includes: adding a custom extension field used to identify the subtitle information.
  • the subtitle display information includes at least one of the following: display time of the subtitle, display picture content, display style, display position and size, and display picture encoding format.
  • step S104 may further include: downloading the index file through the player, and downloading and parsing each media segment according to the index file; and decoding and playing the downloaded and parsed video and audio files through the player.
  • step S106 may further include: downloading the index file through the subtitle parser, and downloading subtitle segments according to the index file; decapsulating the downloaded subtitle file to obtain the decoding reference time and subtitle information; and decoding the subtitle information according to the corresponding picture coding format to obtain the subtitle display information.
  • After step S108, the method may further include: periodically updating the sub-index file through the subtitle parser.
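  • The patent does not describe how the periodic sub-index update is performed. As an illustrative sketch only (function names are invented here), a live HLS client conventionally re-downloads the media playlist on a timer and fetches only the segments that have newly appeared in the sliding window:

```python
# Sketch: periodic sub-index (media playlist) refresh for live subtitles.
# The patent does not specify an implementation; names here are illustrative.

def parse_segment_uris(m3u8_text):
    """Return the segment URIs listed in a media playlist (non-tag lines)."""
    return [line.strip() for line in m3u8_text.splitlines()
            if line.strip() and not line.startswith("#")]

def new_segments(old_playlist, refreshed_playlist):
    """Segments present in the refreshed playlist but not yet downloaded.

    In live HLS the server appends segments and may drop old ones from the
    sliding window, so we diff on URI sets while preserving order.
    """
    seen = set(parse_segment_uris(old_playlist))
    return [uri for uri in parse_segment_uris(refreshed_playlist)
            if uri not in seen]
```

A refresh timer (typically on the order of the playlist's EXT-X-TARGETDURATION) would download the playlist, call `new_segments`, and hand the new URIs to the segment downloader.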
  • the DASH media stream includes a plurality of subtitle streams, and after selecting the corresponding subtitles according to the subtitle display information and displaying them synchronously, the method may further include: judging whether to switch from the current first subtitle stream to the second subtitle stream, and if so, clearing the subtitle information of the first subtitle stream in the buffer; updating the sub-index file of the second subtitle stream according to the second subtitle stream information parsed from the main index file; and downloading and decoding the segments of the second subtitle stream through the subtitle parser, and synchronously displaying the subtitles corresponding to the second subtitle stream according to the current playing time of the player.
  • a subtitle display device based on an HLS stream is also provided. The device is used to implement the above embodiments and preferred implementations; what has already been explained will not be repeated.
  • the term "module” may be a combination of software and/or hardware that realizes a predetermined function.
  • the devices described in the following embodiments are preferably implemented in software, but implementations in hardware, or in a combination of software and hardware, are also possible and contemplated.
  • FIG. 2 is a structural block diagram of an HLS stream-based subtitle display device according to an embodiment of the present invention. As shown in FIG. 2 , the device includes: a transcoding module 10 , a player 20 and a subtitle parser 30 .
  • the transcoding module 10 is configured to transcode the DASH media stream into an HLS stream, and transcode the subtitle stream in the DASH media stream into a subtitle file in a picture encoding format.
  • the player 20 is configured to download and play the video and audio files in the HLS stream.
  • the subtitle parser 30 is configured to download and parse the subtitle file to obtain subtitle display information, obtain the current playing time of the player, select corresponding subtitles according to the subtitle display information, and display them synchronously.
  • the transcoding module 10 is further configured to slice and transcode the DASH media stream according to the HLS protocol, wherein the video stream and the audio stream are transcoded into media files in the original encoding format , transcode the subtitle stream into a subtitle file in the picture subtitle encoding format, and modify the index file.
  • the subtitle display information includes at least one of the following: display time of the subtitle, display picture content, display style, display position and size, and display picture encoding format.
  • the player 20 is further configured to download the index file, download and parse each media segment according to the index file, and decode and play the downloaded and parsed video and audio files.
  • the subtitle parser 30 includes a download module 31 , a parsing module 32 , a decoding module 33 , a synchronization module 34 and a display module 35 .
  • the download module 31 is configured to download the index file, and download subtitle segments according to the index file.
  • the parsing module 32 is configured to decapsulate the downloaded subtitle file, and obtain decoding reference time and subtitle information.
  • the decoding module 33 is configured to decode the subtitle information according to the corresponding picture coding format, and acquire the subtitle display information.
  • the synchronization module 34 is configured to obtain the current playing time of the player.
  • the display module 35 is configured to select corresponding subtitles according to the subtitle display information and display them synchronously.
  • the transcoding module is located at the server side, and the player and subtitle parser are located at the terminal side.
  • the above-mentioned modules may be implemented by software or hardware. In the latter case, this may be achieved in, but is not limited to, the following ways: the above modules are all located in the same processor; or the above modules are distributed among different processors in any combination.
  • the involved hardware architecture mainly includes a source end, an encoder, a server, and a terminal.
  • the live media stream is output after the original picture information is collected, or the produced media file is output as an on-demand media stream.
  • the service side transcodes and encapsulates the video stream and audio stream in the media stream, and performs transcoding and distribution according to the HLS protocol.
  • the original subtitle stream in the media stream is transcoded into a picture subtitle format and encapsulated.
  • the terminal side consists of two parts: the player and the subtitle parser.
  • the player decodes and plays the media stream information by parsing the index file. Since the player does not support the subtitle format, the subtitle stream is not parsed or displayed by the player.
  • the subtitle parser downloads and parses the subtitle file by analyzing the subtitle stream information in the index file. After parsing the subtitle file, the subtitle parser obtains information such as display time, display style, and display image of the subtitle, and stores them in the corresponding subtitle file information mapping library.
  • the subtitle parser selects and synchronously displays subtitle information by obtaining the current playing time of the player.
  • the workflow of this embodiment mainly includes the following steps:
  • the live video is encoded by the video capture device and then the live RTSP stream is output;
  • the live video output device pushes the stream to the server side for DASH packaging and distribution;
  • the live DASH stream is transcoded, encapsulated and sliced according to the HLS protocol, in which the video stream and audio stream are transcoded into ts media files in the original encoding format, the subtitle stream is transcoded into a subtitle file in smpte-tt or another picture subtitle encoding format, and the index file is modified;
  • the terminal simultaneously creates a player and a subtitle parser, and configures an HLS live channel
  • the player downloads the m3u8 index file for parsing, and then selects the video stream, audio stream, and subtitle stream for downloading;
  • the player internally decodes the downloaded video and audio files; since the player does not support picture subtitles, the subtitle stream cannot be decoded and is therefore not displayed, while the video stream and audio stream are decoded and played normally;
  • the subtitle parser downloads the m3u8 index file for analysis, and extracts relevant information of the subtitle stream, such as subtitle language, segment duration, etc.;
  • the subtitle parser internally downloads the subtitle file in the HLS live video source, decapsulates and decodes it, and obtains information such as the display time, display picture content, display style, and display position size of the subtitle;
  • the subtitle parser selects appropriate subtitles for synchronous display based on the real-time playback time of the player.
  • if the player itself supports the subtitle format, it will conflict with the subtitle display of the subtitle parser; this needs to be arbitrated by the terminal control layer.
  • the decoding format can be extended inside the subtitle parser, and more subtitle formats can be supported after extension.
  • a live stream is taken as an example to describe the display of picture subtitles in an HLS live stream.
  • the process includes the following steps:
  • the service side creates a main index file based on the media information in the media stream according to the HLS protocol, and the main index file can contain information on multiple streams. If the original index file is used directly, the native player cannot play it, so the index file needs to be modified, for example by adding a custom extension field to identify the subtitle information, to solve the playback problem. An example is as follows:
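  • The patent does not reproduce the modified index file itself. As a rough sketch, a main (master) m3u8 index declaring a picture-subtitle rendition might look as follows; the URIs, group names, and in particular the custom `#EXT-X-IMAGE-SUBTITLE` tag are invented here for illustration (the patent only states that a custom extension field identifying the subtitle information is added):

```
#EXTM3U
#EXT-X-VERSION:6
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="Chinese",LANGUAGE="zh",DEFAULT=YES,AUTOSELECT=YES,URI="subtitle_zh/index.m3u8"
# Hypothetical custom extension field identifying picture-subtitle information
# (name and syntax are illustrative, not taken from the patent):
#EXT-X-IMAGE-SUBTITLE:GROUP-ID="subs",CODEC="smpte-tt"
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2",SUBTITLES="subs"
video_720p/index.m3u8
```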
  • the server slices and transcodes the video and audio in the media stream into TS fragment files according to the HLS protocol
  • the service side transcodes and encapsulates the subtitles in the media stream according to the m4s encapsulation format and the smpte-tt encoding format;
  • the player on the terminal side downloads the main index file, and downloads and parses each media segment according to the main index file;
  • the player on the terminal side decodes and displays the downloaded and parsed audio and video information;
  • the subtitle parser on the terminal side downloads the main index file, and downloads the subtitle fragments according to the main index file;
  • the subtitle parser on the terminal side decapsulates the downloaded subtitle file to obtain the decoding reference time and subtitle information
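  • The patent does not detail how the decoding reference time is recovered during decapsulation. In fMP4 (m4s) segments this time is conventionally carried as `baseMediaDecodeTime` in the `tfdt` box; the following is a minimal, illustrative ISO BMFF walk, not the patent's implementation:

```python
import struct

# Minimal ISO BMFF box walk extracting baseMediaDecodeTime from the tfdt
# box of an fMP4 (m4s) subtitle segment. Sketch only; real segments also
# carry moof/trun timing and an mdat payload.

CONTAINER_BOXES = {b"moof", b"traf"}

def find_base_media_decode_time(data):
    pos = 0
    while pos + 8 <= len(data):
        size, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        if size < 8:  # malformed or unsupported (e.g. 64-bit largesize)
            return None
        payload = data[pos + 8:pos + size]
        if box_type in CONTAINER_BOXES:
            found = find_base_media_decode_time(payload)  # recurse
            if found is not None:
                return found
        elif box_type == b"tfdt":
            version = payload[0]  # 1 byte version, 3 bytes flags
            if version == 1:
                return struct.unpack(">Q", payload[4:12])[0]  # 64-bit time
            return struct.unpack(">I", payload[4:8])[0]       # 32-bit time
        pos += size
    return None
```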
  • the subtitle parser on the terminal side decodes the subtitle information in smpte-tt format to obtain information such as subtitle display time, display style, display picture content, and display picture encoding format;
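  • As an illustration of this decoding step: smpte-tt picture subtitles are TTML documents in which images are embedded as Base64 `smpte:image` elements and referenced via `smpte:backgroundImage`. The sketch below is simplified (it handles only clock-format times and ignores style and position) and is not taken from the patent:

```python
import base64
import xml.etree.ElementTree as ET

# Sketch: extract display time and picture content from an smpte-tt (TTML)
# document with Base64-embedded images (namespaces per SMPTE ST 2052-1).
TT_NS = "http://www.w3.org/ns/ttml"
SMPTE_NS = "http://www.smpte-ra.org/schemas/2052-1/2010/smpte-tt"
XML_NS = "http://www.w3.org/XML/1998/namespace"

def clock_to_seconds(value):
    """'HH:MM:SS.mmm' -> seconds (other TTML time syntaxes not handled)."""
    hours, minutes, seconds = value.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)

def parse_smpte_tt(doc):
    """Return a list of (begin_s, end_s, image_bytes) cues."""
    root = ET.fromstring(doc)
    images = {
        img.get(f"{{{XML_NS}}}id"): base64.b64decode(img.text.strip())
        for img in root.iter(f"{{{SMPTE_NS}}}image")
    }
    cues = []
    for div in root.iter(f"{{{TT_NS}}}div"):
        ref = div.get(f"{{{SMPTE_NS}}}backgroundImage")
        if ref:
            cues.append((clock_to_seconds(div.get("begin")),
                         clock_to_seconds(div.get("end")),
                         images[ref.lstrip("#")]))
    return cues
```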
  • the subtitle parser on the terminal side decodes the display picture according to the display picture encoding format information acquired by smpte-tt decoding;
  • the subtitle parser on the terminal side obtains the real-time playing time of the player, searches for subtitles that meet the display time conditions, and displays them synchronously;
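  • How the parser searches for subtitles that meet the display time conditions is not specified. One straightforward realization, assuming decoded cues are kept as (begin, end, picture) tuples sorted by begin time, is a binary search against the player's current playing time:

```python
import bisect

# Sketch: select the cue whose [begin, end) interval contains the player's
# current playing time; cues are (begin, end, picture) tuples sorted by begin.

def select_cue(cues, current_time):
    """Return the cue to display at current_time, or None if no cue matches."""
    begins = [cue[0] for cue in cues]
    i = bisect.bisect_right(begins, current_time) - 1  # last begin <= time
    if i >= 0 and cues[i][0] <= current_time < cues[i][1]:
        return cues[i]
    return None
```

The display loop would poll the player's current time on a timer tick and call `select_cue`, showing the returned picture or clearing the subtitle area when `None`.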
  • the service side judges whether the sub-index file needs to be updated, and updates the sub-index file if necessary;
  • the terminal-side player and subtitle parser regularly update the sub-index file, and repeat steps 6)-11).
  • video-on-demand is taken as an example to describe the display of picture subtitles in an HLS video-on-demand media stream.
  • the process includes the following steps:
  • the service side downloads the video-on-demand source for packaging and distribution;
  • the service side creates a main index file based on the media information in the media stream based on the HLS protocol.
  • the main index file contains multi-channel stream information. For an example, refer to step 2) of the live broadcast embodiment;
  • the server slices and transcodes the video and audio in the media stream into TS fragment files according to the HLS protocol
  • the service side transcodes and encapsulates the subtitles in the media stream according to the m4s encapsulation format and the smpte-tt encoding format;
  • the player on the terminal side downloads the main index file, and downloads and parses each media segment according to the main index file;
  • the player on the terminal side decodes and displays the downloaded and parsed audio and video information;
  • the subtitle parser on the terminal side downloads the main index file, and downloads the subtitle fragments according to the main index file;
  • the subtitle parser on the terminal side decapsulates the downloaded subtitle file to obtain the decoding reference time and subtitle information
  • the subtitle parser on the terminal side decodes the subtitle information in smpte-tt format to obtain information such as subtitle display time, display style, display picture content, and display picture encoding format;
  • the subtitle parser on the terminal side decodes the display picture according to the display picture encoding format information acquired by smpte-tt decoding;
  • the subtitle parser on the terminal side obtains the real-time playing time of the player, searches for subtitles that meet the display time conditions, and displays them synchronously.
  • the method for displaying picture subtitles of multiple subtitle streams is described.
  • two subtitle streams are taken as an example, the first subtitle stream is subtitle stream a, and the second subtitle stream is subtitle stream b.
  • the process includes the following steps:
  • the service side creates a main index file based on the media information in the media stream based on the HLS protocol.
  • the main index file contains multi-channel stream information, including multiple subtitle media stream information. Examples are as follows:
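  • The patent does not reproduce this multi-subtitle index file. As an illustrative sketch (all names, URIs, and languages invented here), a main index with two subtitle renditions in one group might look like:

```
#EXTM3U
#EXT-X-VERSION:6
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="subtitle a",LANGUAGE="zh",DEFAULT=YES,AUTOSELECT=YES,URI="subtitle_a/index.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs",NAME="subtitle b",LANGUAGE="en",DEFAULT=NO,AUTOSELECT=YES,URI="subtitle_b/index.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1280x720,CODECS="avc1.64001f,mp4a.40.2",SUBTITLES="subs"
video_720p/index.m3u8
```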
  • the server slices and transcodes the video and audio in the media stream into TS fragment files according to the HLS protocol;
  • the service side transcodes and encapsulates the subtitles in the media stream according to the m4s encapsulation format and the smpte-tt encoding format;
  • the subtitle parser on the terminal side downloads the main index file, and parses information such as multi-subtitle languages and sub-index files;
  • the subtitle parser on the terminal side downloads the first subtitle stream a in the multi-subtitle by default;
  • the subtitle parser on the terminal side downloads the sub-index file for parsing, and downloads subtitle fragments according to the file names in the index file;
  • the subtitle parser on the terminal side decapsulates the downloaded subtitle file to obtain the decoding reference time and subtitle information
  • the subtitle parser on the terminal side decodes the subtitle information in smpte-tt format to obtain information such as subtitle display time, display style, display picture content, and display picture encoding format;
  • the subtitle parser on the terminal side decodes the display picture according to the display picture encoding format information obtained by smpte-tt decoding;
  • the subtitle parser on the terminal side stores the parsed subtitle information into a buffer
  • the subtitle parser on the terminal side acquires the real-time time of the player, searches for subtitles that meet the display time conditions in the subtitle information buffer, and displays them synchronously;
  • the subtitle parser on the terminal side regularly updates the sub-index file
  • the subtitle parser on the terminal side clears the subtitle information buffer
  • the terminal side subtitle parser updates the sub-index file of the subtitle stream b according to the subtitle stream b information parsed in the main index file;
  • the subtitle parser at the terminal side selects the fragments to start downloading according to the current time of the player and downloads them sequentially;
  • Embodiments of the present invention also provide a computer-readable storage medium in which a computer program is stored, where the computer program is configured to perform the steps in any one of the above method embodiments when run.
  • the above computer-readable storage medium may include, but is not limited to, various media that can store a computer program, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
  • An embodiment of the present invention also provides an electronic device including a memory and a processor, where a computer program is stored in the memory and the processor is configured to run the computer program to perform the steps in any one of the above method embodiments.
  • the electronic device may further include a transmission device and an input/output device, both connected to the processor.
  • each module or step of the present invention described above may be implemented by a general-purpose computing device; they may be concentrated on a single computing device or distributed over a network formed by multiple computing devices; they may be implemented in program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device; in some cases the steps shown or described may be performed in an order different from that given here, or they may be made into individual integrated circuit modules, or multiple modules or steps among them may be made into a single integrated circuit module. As such, the present invention is not limited to any specific combination of hardware and software.
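The time-based lookup described in the steps above (find the subtitle whose display interval contains the player's current time) can be sketched as follows. This is an illustrative sketch under assumed names, not the patented implementation; the `(start, end, payload)` cue structure is an assumption:

```python
from bisect import bisect_right

def select_subtitle(cues, player_time):
    """Return the cue payload whose [start, end) interval contains player_time.

    `cues` is a list of (start, end, payload) tuples sorted by start time,
    as a subtitle parser's buffer might hold them. Returns None when no
    subtitle should currently be displayed.
    """
    starts = [c[0] for c in cues]
    # Index of the last cue starting at or before player_time.
    i = bisect_right(starts, player_time) - 1
    if i >= 0:
        start, end, payload = cues[i]
        if start <= player_time < end:
            return payload
    return None
```

A real parser would also evict expired cues from the buffer and re-run this lookup as the player clock advances.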


Abstract

The present invention provides a subtitle display method and apparatus based on an HLS stream. The method includes: transcoding a DASH media stream into an HLS stream, and transcoding the subtitle stream in the DASH media stream into subtitle files in a picture encoding format (S102); downloading and playing the video and audio files in the HLS stream through a player (S104); downloading and parsing the subtitle files through a subtitle parser to obtain subtitle display information (S106); and obtaining the current playback time of the player through the subtitle parser, selecting the corresponding subtitles according to the subtitle display information, and displaying them synchronously (S108). In the present invention, by transcoding the subtitle stream in the DASH media stream into subtitle files in a picture subtitle encoding format, and downloading, parsing, and synchronously displaying the subtitle files through a subtitle parser, the problem that iOS terminals do not support picture subtitle display for DASH streams is solved.

Description

Subtitle Display Method and Apparatus Based on HLS Stream — Technical Field
Embodiments of the present invention relate to the field of multimedia, and in particular to a subtitle display method and apparatus based on HLS (HTTP Live Streaming).
Background
Multimedia live streaming is now a major application in the audio/video field. The main live streaming protocols are DASH (Dynamic Adaptive Streaming over HTTP), HLS (HTTP Live Streaming), and MSS (Microsoft Smooth Streaming). The MPEG-DASH standard is an HTTP (Hypertext Transfer Protocol) based dynamic adaptive streaming protocol introduced by MPEG (Moving Picture Experts Group) to unify the various adaptive streaming technologies in the industry; it supports DRM (Digital Rights Management), HTTP delivery, low-latency streaming, and many other features. HLS is an HTTP-based streaming media protocol implemented by Apple. DASH is widely used as a live streaming standard and many terminal players are compatible with it, but the native players of iOS terminals (including but not limited to iPhone, iPad, and Apple TV) mainly support the HLS format and common video container formats.
At the same time, many players support only a limited set of subtitle formats, mainly the common text subtitle formats, and do not support the richer picture subtitles, which greatly reduces the user experience and the completeness of terminal feature scenarios. Common picture subtitle formats include DVB-Subtitle and SMPTE-TT.
For common live/VOD channels, the media server mainly uses the DASH protocol. When subtitle information is included, it is usually graphic subtitles based on the DVB-Subtitle standard, and text subtitles cannot be provided at the same time. The native iOS player supports neither DASH media playback nor the parsing and display of DVB-Subtitle graphic subtitles.
Summary
Embodiments of the present invention provide a subtitle display method and apparatus based on an HLS stream, to at least solve the problem in the related art that iOS terminals do not support picture subtitle display for DASH streams.
According to one embodiment of the present invention, a subtitle display method based on an HLS stream is provided, including: transcoding a DASH media stream into an HLS stream, and transcoding the subtitle stream in the DASH media stream into subtitle files in a picture encoding format; downloading and playing the video and audio files in the HLS stream through a player; downloading and parsing the subtitle files through a subtitle parser to obtain subtitle display information; and obtaining the current playback time of the player through the subtitle parser, selecting the corresponding subtitles according to the subtitle display information, and displaying them synchronously.
In an exemplary embodiment, transcoding the DASH media stream into an HLS stream and transcoding the subtitle stream in the DASH media stream into subtitle files in a picture encoding format includes: slicing and transcoding the DASH media stream according to the HLS protocol, where the video and audio streams are transcoded into media files keeping their original encoding formats, the subtitle stream is transcoded into subtitle files in a picture subtitle encoding format, and the index files are modified.
In an exemplary embodiment, the index files include a main index file and sub-index files, where the sub-index files include a video index file, an audio index file, and a subtitle index file; modifying the index files includes: defining a custom extension field to identify the subtitle information.
In an exemplary embodiment, the subtitle display information includes at least one of the following: the display time, display picture content, display style, display position and size, and display picture encoding format of the subtitles.
In an exemplary embodiment, downloading and playing the video and audio files in the HLS stream through a player includes: downloading the index files through the player, and downloading and parsing each media segment according to the index files; and decoding and playing the downloaded and parsed video and audio files through the player.
In an exemplary embodiment, downloading and parsing the subtitle files through a subtitle parser to obtain subtitle display information includes: downloading the index files through the subtitle parser, and downloading the subtitle segments according to the index files; decapsulating the downloaded subtitle files and obtaining the decoding reference time and the subtitle information; and decoding the subtitle information according to the corresponding picture encoding format to obtain the subtitle display information.
In an exemplary embodiment, after selecting the corresponding subtitles according to the subtitle display information and displaying them synchronously, the method further includes: periodically updating the sub-index files through the subtitle parser.
In an exemplary embodiment, the DASH media stream contains multiple subtitle streams, and after selecting the corresponding subtitles according to the subtitle display information and displaying them synchronously, the method further includes: determining whether it is necessary to switch from the current first subtitle stream to a second subtitle stream, and if so, clearing the subtitle information of the first subtitle stream from the buffer; updating the sub-index file of the second subtitle stream according to the second subtitle stream information parsed from the main index file; and downloading and decoding the segments of the second subtitle stream through the subtitle parser, and synchronously displaying the subtitles corresponding to the second subtitle stream according to the current playback time of the player.
According to another embodiment of the present invention, a subtitle display apparatus based on an HLS stream is provided, including: a transcoding module configured to transcode a DASH media stream into an HLS stream and transcode the subtitle stream in the DASH media stream into subtitle files in a picture encoding format; a player configured to download and play the video and audio files in the HLS stream; and a subtitle parser configured to download and parse the subtitle files to obtain subtitle display information, obtain the current playback time of the player, select the corresponding subtitles according to the subtitle display information, and display them synchronously.
In an exemplary embodiment, the transcoding module is further configured to slice and transcode the DASH media stream according to the HLS protocol, where the video and audio streams are transcoded into media files keeping their original encoding formats, the subtitle stream is transcoded into subtitle files in a picture subtitle encoding format, and the index files are modified.
In an exemplary embodiment, the subtitle display information includes at least one of the following: the display time, display picture content, display style, display position and size, and display picture encoding format of the subtitles.
In an exemplary embodiment, the player is further configured to download the index files, download and parse each media segment according to the index files, and decode and play the downloaded and parsed video and audio files.
In an exemplary embodiment, the subtitle parser includes: a download module configured to download the index files and download the subtitle segments according to the index files; a parsing module configured to decapsulate the downloaded subtitle files and obtain the decoding reference time and the subtitle information; a decoding module configured to decode the subtitle information according to the corresponding picture encoding format to obtain the subtitle display information; a synchronization module configured to obtain the current playback time of the player; and a display module configured to select the corresponding subtitles according to the subtitle display information and display them synchronously.
In an exemplary embodiment, the transcoding module is located on the server side, and the player and the subtitle parser are located on the terminal side.
According to yet another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, where the computer program is configured to perform the steps in any one of the above method embodiments when run.
According to yet another embodiment of the present invention, an electronic apparatus is further provided, including a memory and a processor, where a computer program is stored in the memory and the processor is configured to run the computer program to perform the steps in any one of the above method embodiments.
In the above embodiments of the present invention, by transcoding the subtitle stream in the DASH media stream into subtitle files in a picture subtitle encoding format, downloading and parsing the subtitle files through a subtitle parser, and displaying them synchronously, the problem that iOS terminals do not support picture subtitle display for DASH streams is solved, rich picture subtitle effects can be brought to users, and compatible operation across multiple platforms and players can be supported through extension.
Brief Description of the Drawings
Fig. 1 is a flowchart of a subtitle display method based on an HLS stream according to an embodiment of the present invention;
Fig. 2 is a structural block diagram of a subtitle display apparatus based on an HLS stream according to an embodiment of the present invention;
Fig. 3 is a structural block diagram of a subtitle display apparatus based on an HLS stream according to another embodiment of the present invention;
Fig. 4 shows the basic architecture and main workflow of a picture subtitle display scheme based on an HLS stream according to an embodiment of the present invention;
Fig. 5 is a live-streaming flowchart of a picture subtitle display method based on an HLS stream according to an embodiment of the present invention;
Fig. 6 is a VOD flowchart of a picture subtitle display method based on an HLS stream according to an embodiment of the present invention;
Fig. 7 is a flowchart of multi-subtitle switching based on an HLS stream according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below with reference to the accompanying drawings and in combination with the embodiments.
This embodiment provides a subtitle display method based on an HLS stream. As shown in Fig. 1, the method includes the following steps:
Step S102: transcode a DASH media stream into an HLS stream, and transcode the subtitle stream in the DASH media stream into subtitle files in a picture encoding format;
Step S104: download and play the video and audio files in the HLS stream through a player;
Step S106: download and parse the subtitle files through a subtitle parser to obtain subtitle display information;
Step S108: obtain the current playback time of the player through the subtitle parser, select the corresponding subtitles according to the subtitle display information, and display them synchronously.
In an exemplary embodiment, step S102 may further include: slicing and transcoding the DASH media stream according to the HLS protocol, where the video and audio streams are transcoded into media files keeping their original encoding formats, the subtitle stream is transcoded into subtitle files in a picture subtitle encoding format, and the index files are modified.
In an exemplary embodiment, the index files include a main index file and sub-index files, where the sub-index files include a video index file, an audio index file, and a subtitle index file; modifying the index files includes: defining a custom extension field to identify the subtitle information.
In an exemplary embodiment, the subtitle display information includes at least one of the following: the display time, display picture content, display style, display position and size, and display picture encoding format of the subtitles.
In an exemplary embodiment, step S104 may further include: downloading the index files through the player, and downloading and parsing each media segment according to the index files; and decoding and playing the downloaded and parsed video and audio files through the player.
In an exemplary embodiment, step S106 may further include: downloading the index files through the subtitle parser, and downloading the subtitle segments according to the index files; decapsulating the downloaded subtitle files and obtaining the decoding reference time and the subtitle information; and decoding the subtitle information according to the corresponding picture encoding format to obtain the subtitle display information.
In an exemplary embodiment, after step S108, the method may further include: periodically updating the sub-index files through the subtitle parser.
In an exemplary embodiment, the DASH media stream contains multiple subtitle streams, and after selecting the corresponding subtitles according to the subtitle display information and displaying them synchronously, the method may further include: determining whether it is necessary to switch from the current first subtitle stream to a second subtitle stream, and if so, clearing the subtitle information of the first subtitle stream from the buffer; updating the sub-index file of the second subtitle stream according to the second subtitle stream information parsed from the main index file; and downloading and decoding the segments of the second subtitle stream through the subtitle parser, and synchronously displaying the subtitles corresponding to the second subtitle stream according to the current playback time of the player.
Through the description of the above implementations, those skilled in the art can clearly understand that the method according to the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc), including several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention.
This embodiment further provides a subtitle display apparatus based on an HLS stream, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and conceivable.
Fig. 2 is a structural block diagram of a subtitle display apparatus based on an HLS stream according to an embodiment of the present invention. As shown in Fig. 2, the apparatus includes a transcoding module 10, a player 20, and a subtitle parser 30.
The transcoding module 10 is configured to transcode a DASH media stream into an HLS stream and transcode the subtitle stream in the DASH media stream into subtitle files in a picture encoding format.
The player 20 is configured to download and play the video and audio files in the HLS stream.
The subtitle parser 30 is configured to download and parse the subtitle files to obtain subtitle display information, obtain the current playback time of the player, select the corresponding subtitles according to the subtitle display information, and display them synchronously.
In an exemplary embodiment, the transcoding module 10 is further configured to slice and transcode the DASH media stream according to the HLS protocol, where the video and audio streams are transcoded into media files keeping their original encoding formats, the subtitle stream is transcoded into subtitle files in a picture subtitle encoding format, and the index files are modified.
In an exemplary embodiment, the subtitle display information includes at least one of the following: the display time, display picture content, display style, display position and size, and display picture encoding format of the subtitles.
In an exemplary embodiment, the player 20 is further configured to download the index files, download and parse each media segment according to the index files, and decode and play the downloaded and parsed video and audio files.
As shown in Fig. 3, in an exemplary embodiment, the subtitle parser 30 includes a download module 31, a parsing module 32, a decoding module 33, a synchronization module 34, and a display module 35.
The download module 31 is configured to download the index files and download the subtitle segments according to the index files.
The parsing module 32 is configured to decapsulate the downloaded subtitle files and obtain the decoding reference time and the subtitle information.
The decoding module 33 is configured to decode the subtitle information according to the corresponding picture encoding format to obtain the subtitle display information.
The synchronization module 34 is configured to obtain the current playback time of the player.
The display module 35 is configured to select the corresponding subtitles according to the subtitle display information and display them synchronously.
In this embodiment, the transcoding module is located on the server side, and the player and the subtitle parser are located on the terminal side.
It should be noted that each of the above modules may be implemented by software or hardware; in the latter case this may be achieved in the following ways, without being limited thereto: the above modules are all located in the same processor; or the above modules are located in different processors in any combination.
To facilitate understanding of the technical solution provided by the invention, a detailed description is given below in combination with specific application scenarios.
First, the basic composition and main workflow of the HLS-stream picture subtitle display scheme are described in detail. As shown in Fig. 3, in this embodiment, the hardware architecture mainly includes a source end, an encoder, a server, and terminals.
In this embodiment, the original picture information is captured and output as a live media stream, or a produced media file is output as a VOD media stream. After the media stream is pushed to the service side, the service side transcodes and encapsulates the video and audio streams in the media stream and distributes them according to the HLS protocol. The original subtitle stream in the media stream is transcoded into a picture subtitle format and encapsulated. Meanwhile, the index files are extended with custom fields to add the subtitle information.
The terminal side consists of two parts: the player and the subtitle parser. The player decodes and plays by parsing the media stream information in the index files; since the player does not support the subtitle format, the subtitle stream is not parsed or displayed by the player. The subtitle parser downloads and parses the subtitle files by parsing the subtitle stream information in the index files. After parsing a subtitle file, the subtitle parser obtains the display time, display style, display image, and other information of the subtitles, and stores them in the corresponding subtitle file information map.
The subtitle parser obtains the current playback time of the player to select subtitle information and display it synchronously.
As shown in Fig. 4, the workflow of this embodiment mainly includes the following steps:
1) the live picture is encoded by a video capture device and output as a live RTSP stream;
2) the live video output device pushes the stream to the server side for DASH encapsulation and distribution;
3) the server side transcodes, encapsulates, and slices the live DASH stream according to the HLS protocol, where the video and audio streams are transcoded into TS media files keeping their original encoding formats, the subtitle stream is transcoded into subtitle files in SMPTE-TT or another picture subtitle encoding format, and the index files are modified;
4) the terminal creates the player and the subtitle parser at the same time and configures the HLS live channel;
5) the player downloads and parses the m3u8 index file, and then selects the video, audio, and subtitle streams for download;
6) the player internally decodes the downloaded video, audio, and subtitle files; since the player does not support picture subtitles, the subtitle stream cannot be decoded and is therefore not displayed, while the video and audio streams are decoded and played normally;
7) the subtitle parser downloads and parses the m3u8 index file and extracts information about the subtitle streams, such as the subtitle language and segment duration;
8) the subtitle parser internally downloads the subtitle files of the HLS live source, decapsulates and decodes them, and obtains the display time, display picture content, display style, display position and size, and other information of the subtitles;
9) the subtitle parser selects suitable subtitles for synchronous display based on the real-time playback time of the player.
In this embodiment, if the player supports the subtitle format, it will conflict with the subtitle display of the subtitle parser, which needs to be resolved by the terminal control layer. In addition, the decoding formats inside the subtitle parser can be extended so that more subtitle formats are supported.
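Step 7) of the workflow above has the subtitle parser extract subtitle stream information from the m3u8 index. A minimal sketch of such manifest parsing is shown below; the function name and the attribute-splitting logic are illustrative assumptions, not the actual parser, and the tag name is configurable because the scheme also uses custom tags for picture subtitles:

```python
import re

def parse_subtitle_media(m3u8_text, tag="#EXT-X-MEDIA"):
    """Collect TYPE=SUBTITLES entries from a master playlist.

    Returns a list of attribute dicts (TYPE, NAME, LANGUAGE, URI, ...)
    with surrounding quotes stripped from the values.
    """
    attr_re = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')
    streams = []
    for line in m3u8_text.splitlines():
        if line.startswith(tag + ":"):
            attrs = {k: v.strip('"') for k, v in attr_re.findall(line)}
            if attrs.get("TYPE") == "SUBTITLES":
                streams.append(attrs)
    return streams
```

Called on a master playlist like the samples in the embodiments below, this yields one dict per subtitle stream, from which the parser can pick the language and the sub-index file URI.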
Embodiment 1
In this embodiment, taking a live stream as an example, the display of picture subtitles in an HLS live stream is described. As shown in Fig. 5, the process includes the following steps:
1) the original live video information is captured and output as a live media stream, which is pushed to the CDN service side;
2) the service side creates a main index file based on the HLS protocol according to the media information in the media stream; the main index file may contain information about multiple streams. If the original index file is used directly, the native player cannot play it, so the index file needs to be modified, for example by adding a custom extension field to identify the subtitle information and thereby solve the playback problem. An example is as follows:
#EXTM3U
#EXT-X-ZTE-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs1",LANGUAGE="tr",NAME="chi",AUTOSELECT=YES,DEFAULT=YES,URI="dvbsub_sec-t1.m3u8"
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=3650086,RESOLUTION=1280x720,FRAME-RATE=59.960,CODECS="avc1.640020,mp4a.40.2",ZTE_SUBTITLES="subs1"
sec-v1-a1.m3u8
3) the service side slices and transcodes the video and audio in the media stream into TS segment files according to the HLS protocol;
4) the service side transcodes and encapsulates the subtitles in the media stream in the m4s container format with the SMPTE-TT encoding format;
5) the terminal-side player downloads the main index file, and downloads and parses each media segment according to the main index file;
6) the terminal-side player decodes and displays the downloaded and parsed audio and video information;
7) the terminal-side subtitle parser downloads the main index file and downloads the subtitle segments according to the main index file;
8) the terminal-side subtitle parser decapsulates the downloaded subtitle files and obtains the decoding reference time and the subtitle information;
9) the terminal-side subtitle parser decodes the subtitle information in the SMPTE-TT format and obtains the subtitle display time, display style, display picture content, display picture encoding format, and other information;
10) the terminal-side subtitle parser decodes the display picture according to the picture encoding format obtained from SMPTE-TT decoding;
11) the terminal-side subtitle parser obtains the real-time playback time of the player, finds the subtitles whose display time matches, and displays them synchronously;
12) the service side determines whether the sub-index files need to be updated, and updates them if necessary;
13) the terminal-side player and subtitle parser periodically update the sub-index files and repeat steps 6)-11).
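The periodic sub-index refresh in step 13) only needs to act on fragments that are new since the last refresh. A minimal sketch of that merge, assuming fragments are identified by their file names as listed in the sub-index file:

```python
def merge_playlist(known_segments, refreshed_segments):
    """Append only the segments not yet seen after a periodic live refresh.

    `known_segments` and `refreshed_segments` are lists of fragment file
    names in playlist order; identifying fragments by name lets the parser
    avoid re-downloading fragments it already holds.
    """
    seen = set(known_segments)
    return known_segments + [s for s in refreshed_segments if s not in seen]
```

In a live channel the refreshed playlist typically overlaps the tail of the previous one, so only the trailing new fragments are queued for download.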
Embodiment 2
In this embodiment, taking VOD as an example, the display of picture subtitles in an HLS VOD media stream is described. As shown in Fig. 6, the process includes the following steps:
1) the service side downloads the VOD source for encapsulation and distribution;
2) the service side creates a main index file based on the HLS protocol according to the media information in the media stream; the main index file contains information about multiple streams; for a sample, refer to step 2) of the live-streaming embodiment;
3) the service side slices and transcodes the video and audio in the media stream into TS segment files according to the HLS protocol;
4) the service side transcodes and encapsulates the subtitles in the media stream in the m4s container format with the SMPTE-TT encoding format;
5) the terminal-side player downloads the main index file, and downloads and parses each media segment according to the main index file;
6) the terminal-side player decodes and displays the downloaded and parsed audio and video information;
7) the terminal-side subtitle parser downloads the main index file and downloads the subtitle segments according to the main index file;
8) the terminal-side subtitle parser decapsulates the downloaded subtitle files and obtains the decoding reference time and the subtitle information;
9) the terminal-side subtitle parser decodes the subtitle information in the SMPTE-TT format and obtains the subtitle display time, display style, display picture content, display picture encoding format, and other information;
10) the terminal-side subtitle parser decodes the display picture according to the picture encoding format obtained from SMPTE-TT decoding;
11) the terminal-side subtitle parser obtains the real-time playback time of the player, finds the subtitles whose display time matches, and displays them synchronously.
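Step 9) yields subtitle display times. In SMPTE-TT (which is TTML-based), cue begin/end attributes commonly use clock values such as "00:01:02.500"; a minimal conversion to seconds, so the value can be compared against the player clock, might look as follows. This is a simplified sketch: frame-based clock values and offset-time forms (e.g. "1.5s") are deliberately not handled:

```python
def ttml_clock_to_seconds(value):
    """Convert a TTML clock value like "00:01:02.500" to seconds.

    Only the hours:minutes:seconds[.fraction] form is supported here;
    a full SMPTE-TT parser must also handle frames and offset times.
    """
    h, m, s = value.split(":")
    return int(h) * 3600 + int(m) * 60 + float(s)
```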
Embodiment 3
In this embodiment, taking multi-subtitle-stream switching as an example, the picture subtitle display method for multiple subtitle streams is described. Two subtitle streams are used as an example: the first subtitle stream is subtitle stream a and the second is subtitle stream b. As shown in Fig. 7, the process includes the following steps:
1) the service side creates a main index file based on the HLS protocol according to the media information in the media stream; the main index file contains information about multiple streams, including multiple subtitle media streams. An example is as follows:
#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="Audio",NAME="chi",DEFAULT=NO,AUTOSELECT=YES,LANGUAGE="chi",URI="audio_ch.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="Audio",NAME="eng",DEFAULT=YES,AUTOSELECT=YES,LANGUAGE="eng",URI="audio_en.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="Subtitles",NAME="chi",DEFAULT=YES,AUTOSELECT=YES,FORCED=NO,LANGUAGE="chi",URI="smpte_ch.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="Subtitles",NAME="eng",DEFAULT=NO,AUTOSELECT=YES,FORCED=NO,LANGUAGE="eng",URI="smpte_en.m3u8"
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1129000,AUDIO="Audio",SUBTITLES="Subtitles"
video_1000kbps.m3u8
2) the service side slices and transcodes the video and audio in the media stream into TS segment files according to the HLS protocol;
3) the service side transcodes and encapsulates the subtitles in the media stream in the m4s container format with the SMPTE-TT encoding format;
4) the terminal-side subtitle parser downloads the main index file and parses information such as the subtitle languages and the sub-index files;
5) the terminal-side subtitle parser downloads the first subtitle stream a by default;
6) the terminal-side subtitle parser downloads and parses the sub-index file, and downloads the subtitle segments according to the file names in the index file;
7) the terminal-side subtitle parser decapsulates the downloaded subtitle files and obtains the decoding reference time and the subtitle information;
8) the terminal-side subtitle parser decodes the subtitle information in the SMPTE-TT format and obtains the subtitle display time, display style, display picture content, display picture encoding format, and other information;
9) the terminal-side subtitle parser decodes the display picture according to the picture encoding format obtained from SMPTE-TT decoding;
10) the terminal-side subtitle parser stores the parsed subtitle information in a buffer;
11) the terminal-side subtitle parser obtains the real-time playback time of the player, finds the subtitles in the subtitle information buffer whose display time matches, and displays them synchronously;
12) for a live stream, the terminal-side subtitle parser periodically updates the sub-index file;
13) the user selects to switch to subtitle stream b;
14) the terminal-side subtitle parser clears the subtitle information buffer;
15) the terminal-side subtitle parser updates the sub-index file of subtitle stream b according to the subtitle stream b information parsed from the main index file;
16) the terminal-side subtitle parser selects the starting segment according to the current playback time of the player and downloads the segments in sequence;
17) after downloading the segments, the terminal-side subtitle parser repeats steps 7)-12).
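Step 16) selects the first fragment of the new stream from the player's current time, so the download starts near the playback position rather than at the beginning of the playlist. Given the per-fragment durations parsed from the sub-index file (the #EXTINF values), an illustrative selection, with assumed names:

```python
def first_segment_for(durations, player_time):
    """Index of the fragment whose time span covers player_time.

    `durations` is the list of fragment durations (seconds) in playlist
    order. Falls back to the last fragment when player_time is past the
    end of the playlist.
    """
    elapsed = 0.0
    for i, d in enumerate(durations):
        if player_time < elapsed + d:
            return i
        elapsed += d
    return max(len(durations) - 1, 0)
```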
Embodiments of the present invention further provide a computer-readable storage medium in which a computer program is stored, where the computer program is configured to perform the steps in any one of the above method embodiments when run.
In an exemplary embodiment, the above computer-readable storage medium may include, but is not limited to, various media that can store a computer program, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
Embodiments of the present invention further provide an electronic apparatus including a memory and a processor, where a computer program is stored in the memory and the processor is configured to run the computer program to perform the steps in any one of the above method embodiments.
In an exemplary embodiment, the above electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor and the input/output device is connected to the processor.
For specific examples in this embodiment, reference may be made to the examples described in the above embodiments and exemplary implementations, which are not repeated here.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above may be implemented by a general-purpose computing device; they may be concentrated on a single computing device or distributed over a network formed by multiple computing devices; they may be implemented in program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device; in some cases the steps shown or described may be performed in an order different from that given here, or they may be made into individual integrated circuit modules, or multiple modules or steps among them may be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it; for those skilled in the art, the present invention may have various modifications and changes. Any modification, equivalent replacement, improvement, or the like made within the principles of the present invention shall be included in the protection scope of the present invention.

Claims (16)

  1. A subtitle display method based on an HLS stream, comprising:
    transcoding a DASH media stream into an HLS stream, and transcoding a subtitle stream in the DASH media stream into subtitle files in a picture encoding format;
    downloading and playing video and audio files in the HLS stream through a player;
    downloading and parsing the subtitle files through a subtitle parser to obtain subtitle display information; and
    obtaining a current playback time of the player through the subtitle parser, selecting corresponding subtitles according to the subtitle display information, and displaying them synchronously.
  2. The method according to claim 1, wherein transcoding the DASH media stream into an HLS stream and transcoding the subtitle stream in the DASH media stream into subtitle files in a picture encoding format comprises:
    slicing and transcoding the DASH media stream according to the HLS protocol, wherein the video stream and the audio stream are transcoded into media files keeping their original encoding formats, the subtitle stream is transcoded into subtitle files in a picture subtitle encoding format, and index files are modified.
  3. The method according to claim 1, wherein the index files comprise a main index file and sub-index files, the sub-index files comprising a video index file, an audio index file, and a subtitle index file; and modifying the index files comprises: defining a custom extension field to identify the subtitle information.
  4. The method according to claim 1, wherein the subtitle display information comprises at least one of the following: a display time, display picture content, a display style, a display position and size, and a display picture encoding format of the subtitles.
  5. The method according to claim 1, wherein downloading and playing the video and audio files in the HLS stream through the player comprises:
    downloading the index files through the player, and downloading and parsing each media segment according to the index files; and
    decoding and playing the downloaded and parsed video and audio files through the player.
  6. The method according to claim 1, wherein downloading and parsing the subtitle files through the subtitle parser to obtain the subtitle display information comprises:
    downloading the index files through the subtitle parser, and downloading subtitle segments according to the index files;
    decapsulating the downloaded subtitle files and obtaining a decoding reference time and subtitle information; and
    decoding the subtitle information according to the corresponding picture encoding format to obtain the subtitle display information.
  7. The method according to claim 2, after selecting the corresponding subtitles according to the subtitle display information and displaying them synchronously, further comprising:
    periodically updating the sub-index files through the subtitle parser.
  8. The method according to claim 2, wherein the DASH media stream contains multiple subtitle streams, and after selecting the corresponding subtitles according to the subtitle display information and displaying them synchronously, the method further comprises:
    determining whether it is necessary to switch from a current first subtitle stream to a second subtitle stream, and if so, clearing subtitle information of the first subtitle stream from a buffer;
    updating a sub-index file of the second subtitle stream according to second subtitle stream information parsed from the main index file; and
    downloading and decoding segments of the second subtitle stream through the subtitle parser, and synchronously displaying subtitles corresponding to the second subtitle stream according to the current playback time of the player.
  9. A subtitle display apparatus based on an HLS stream, comprising:
    a transcoding module configured to transcode a DASH media stream into an HLS stream and transcode a subtitle stream in the DASH media stream into subtitle files in a picture encoding format;
    a player configured to download and play video and audio files in the HLS stream; and
    a subtitle parser configured to download and parse the subtitle files to obtain subtitle display information, obtain a current playback time of the player, select corresponding subtitles according to the subtitle display information, and display them synchronously.
  10. The apparatus according to claim 9, wherein
    the transcoding module is further configured to slice and transcode the DASH media stream according to the HLS protocol, wherein the video stream and the audio stream are transcoded into media files keeping their original encoding formats, the subtitle stream is transcoded into subtitle files in a picture subtitle encoding format, and index files are modified.
  11. The apparatus according to claim 9, wherein the subtitle display information comprises at least one of the following: a display time, display picture content, a display style, a display position and size, and a display picture encoding format of the subtitles.
  12. The apparatus according to claim 9, wherein
    the player is further configured to download the index files, download and parse each media segment according to the index files, and decode and play the downloaded and parsed video and audio files.
  13. The apparatus according to claim 9, wherein the subtitle parser comprises:
    a download module configured to download the index files and download subtitle segments according to the index files;
    a parsing module configured to decapsulate the downloaded subtitle files and obtain a decoding reference time and subtitle information;
    a decoding module configured to decode the subtitle information according to the corresponding picture encoding format to obtain the subtitle display information;
    a synchronization module configured to obtain the current playback time of the player; and
    a display module configured to select the corresponding subtitles according to the subtitle display information and display them synchronously.
  14. The apparatus according to claim 9, wherein the transcoding module is located on a server side, and the player and the subtitle parser are located on a terminal side.
  15. A computer-readable storage medium in which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
  16. An electronic apparatus comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the steps of the method according to any one of claims 1 to 8 when executing the computer program.
PCT/CN2022/095045 2021-06-01 2022-05-25 基于hls流的字幕显示方法及装置 WO2022253079A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110611176.0A CN115442662A (zh) 2021-06-01 2021-06-01 基于hls流的字幕显示方法及装置
CN202110611176.0 2021-06-01

Publications (1)

Publication Number Publication Date
WO2022253079A1 true WO2022253079A1 (zh) 2022-12-08

Family

ID=84271857

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/095045 WO2022253079A1 (zh) 2021-06-01 2022-05-25 基于hls流的字幕显示方法及装置

Country Status (2)

Country Link
CN (1) CN115442662A (zh)
WO (1) WO2022253079A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114598915A (zh) * 2020-12-03 2022-06-07 南京中兴软件有限责任公司 一种媒体服务方法、装置、设备及计算机存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140281009A1 (en) * 2013-03-14 2014-09-18 General Instrument Corporation Devices, systems, and methods for converting or translating dynamic adaptive streaming over http (dash) to http live streaming (hls)
CN106162377A (zh) * 2015-04-08 2016-11-23 中国移动通信集团公司 自适应流媒体技术的转换方法、装置、bm-sc及终端
US20170019445A1 (en) * 2015-07-16 2017-01-19 Arris Enterprises, Inc. Systems and methods for providing dlna streaming to client devices
US20170302900A1 (en) * 2014-09-17 2017-10-19 Harmonic, Inc. Controlling modes of sub-title presentation
CN108055574A (zh) * 2017-11-29 2018-05-18 上海网达软件股份有限公司 媒体文件转码生成多音轨多字幕点播内容的方法及系统
CN111147896A (zh) * 2018-11-05 2020-05-12 中兴通讯股份有限公司 一种字幕数据处理方法、装置、设备和计算机存储介质


Also Published As

Publication number Publication date
CN115442662A (zh) 2022-12-06

Similar Documents

Publication Publication Date Title
US20230179837A1 (en) Network Video Streaming with Trick Play Based on Separate Trick Play Files
US9247317B2 (en) Content streaming with client device trick play index
WO2017063399A1 (zh) 一种视频播放方法和装置
CN107634930B (zh) 一种媒体数据的获取方法和装置
KR100928998B1 (ko) 사용자 단말기에 멀티미디어 컨텐츠와 코덱을 제공하는적응적 멀티미디어 시스템 및 그 방법
US8645562B2 (en) Apparatus and method for providing streaming content
US10149020B2 (en) Method for playing a media stream in a browser application
US10887645B2 (en) Processing media data using file tracks for web content
WO2016145913A1 (zh) 自适应流媒体处理方法及装置
CN108513143A (zh) 提供串流内容的装置及方法
US20180324241A1 (en) Apparatus and method for providing streaming content
WO2014193996A2 (en) Network video streaming with trick play based on separate trick play files
CN105828096B (zh) 媒体流文件的处理方法和装置
KR102499231B1 (ko) 수신 장치, 송신 장치 및 데이터 처리 방법
CN106791988B (zh) 多媒体数据轮播方法和终端
WO2015192683A1 (zh) 一种基于码流自适应技术的内容分发方法、装置及系统
KR102085192B1 (ko) 렌더링 시간 제어
WO2022253079A1 (zh) 基于hls流的字幕显示方法及装置
CN103945260B (zh) 一种流媒体点播编辑系统及点播方法
CN110769326B (zh) 视频切片文件的加载、视频文件的播放方法和装置
KR20120139514A (ko) Dash 규격의 미디어 데이터와 mmt 전송 시스템과의 연동 방법 및 그 장치
KR101568317B1 (ko) Ip 카메라에서 hls 프로토콜을 지원하는 시스템 및 그 방법
KR20240107164A (ko) 미디어 컨테이너 파일 및 스트리밍 매니페스트에서의 픽처인픽처에 대한 시그널링
KR20240070610A (ko) Cmaf 및 dash 멀티미디어 스트리밍을 위한 주소 지정 가능한 리소스 인덱스 이벤트
CN113497952A (zh) Mp4文件实时流化网关控制系统及控制流程

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22815125

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE