CN115442662A - Subtitle display method and device based on HLS (HTTP live streaming) - Google Patents

Subtitle display method and device based on HLS (HTTP live streaming)

Info

Publication number
CN115442662A
Authority
CN
China
Prior art keywords
subtitle
stream
display
file
index file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110611176.0A
Other languages
Chinese (zh)
Inventor
江平
洪冲
朱兴昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp filed Critical ZTE Corp
Priority to CN202110611176.0A priority Critical patent/CN115442662A/en
Priority to PCT/CN2022/095045 priority patent/WO2022253079A1/en
Publication of CN115442662A publication Critical patent/CN115442662A/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302: Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307: Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/43074: Synchronising the rendering of additional data with content streams on the same device, e.g. of EPG data or interactive icon with a TV program
    • H04N21/435: Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N21/4355: Processing of additional data involving reformatting operations of additional data, e.g. HTML pages on a television screen
    • H04N21/44: Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402: Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218: Reformatting operations of video signals by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N21/47: End-user applications
    • H04N21/488: Data services, e.g. news ticker
    • H04N21/4884: Data services for displaying subtitles

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The embodiment of the invention provides a subtitle display method and device based on HLS (HTTP live streaming), wherein the method comprises the following steps: transcoding a DASH media stream into an HLS stream, and transcoding the subtitle stream in the DASH media stream into a subtitle file in a picture coding format; downloading and playing the video and audio files in the HLS stream through a player; downloading and parsing the subtitle file through a subtitle parser to obtain subtitle display information; and acquiring the current playing time of the player through the subtitle parser, and selecting and synchronously displaying the corresponding subtitle according to the subtitle display information. In the invention, the subtitle stream in the DASH media stream is transcoded into a subtitle file in a picture subtitle coding format, and the subtitle file is downloaded, parsed and displayed synchronously by the subtitle parser, which solves the problem that iOS terminals do not support picture subtitle display for DASH streams.

Description

Subtitle display method and device based on HLS (HTTP live streaming)
Technical Field
The embodiment of the invention relates to the field of multimedia, in particular to a subtitle display method and device based on HLS (HTTP Live Streaming).
Background
Nowadays, the multimedia live broadcast service is an important application in the audio and video field, and the main multimedia live broadcast protocols are DASH (Dynamic Adaptive Streaming over HTTP), HLS (HTTP Live Streaming), MSS (Microsoft Smooth Streaming), and the like. The MPEG-DASH standard is an HTTP (Hypertext Transfer Protocol) based dynamic adaptive streaming protocol introduced by MPEG (Moving Picture Experts Group) to unify the various adaptive streaming technologies existing in the industry, and supports DRM (Digital Rights Management), HTTP delivery, low-latency streaming and many other functions, while HLS is an HTTP-based streaming media communication protocol implemented by Apple Inc. The DASH protocol, as a live broadcast standard, is widely applied and many terminal players are compatible with it; however, the native players of iOS terminals (including but not limited to iPhone, iPad, Apple TV, etc.) are mainly compatible with the HLS format and common video file encapsulation formats.
Meanwhile, the subtitle formats supported by many players are limited: common text subtitle formats are generally supported, but picture subtitles with richer content are not, which greatly reduces the user experience and the completeness of terminal function scenarios. Common picture-format subtitles include DVB Subtitle, SMPTE-TT and similar formats.
For common live/on-demand channels, the media server mainly adopts the DASH protocol; if subtitle information is included, graphic subtitles based on the DVB Subtitle standard are usually provided, and text subtitles cannot be provided at the same time. Moreover, the iOS terminal native player supports neither media playback over the DASH protocol nor parsing and display of DVB Subtitle graphic subtitles.
Disclosure of Invention
The embodiment of the invention provides a subtitle display method and device based on HLS (HTTP Live Streaming) streams, which are used for at least solving the problem that an iOS terminal in the related art does not support picture subtitle display for DASH (Dynamic Adaptive Streaming over HTTP) streams.
According to an embodiment of the present invention, there is provided a subtitle display method based on an HLS stream, including: transcoding DASH media stream into HLS stream, and transcoding subtitle stream in the DASH media stream into subtitle file in picture coding format; downloading and playing the video and audio files in the HLS stream through a player; downloading and analyzing the subtitle file through a subtitle analyzer to obtain subtitle display information; and acquiring the current playing time of the player through the subtitle parser, and selecting and synchronously displaying the corresponding subtitle according to the subtitle display information.
In an exemplary embodiment, transcoding a DASH media stream into an HLS stream, and transcoding subtitle streams in the DASH media stream into subtitle files in a picture coding format includes: and slicing and transcoding and packaging the DASH media stream according to an HLS protocol, wherein the video stream and the audio stream are transcoded into a media file according to an original coding format, the subtitle stream is transcoded into a subtitle file according to a picture subtitle coding format, and the index file is modified.
In one exemplary embodiment, the index file includes a main index file and a sub index file, wherein the sub index file includes a video index file, an audio index file, and a subtitle index file; modifying the index file includes: the custom extension field is used to identify the subtitle information.
In one exemplary embodiment, the subtitle display information includes at least one of: the display time of the caption, the content of the display picture, the display style, the size of the display position and the coding format of the display picture.
In an exemplary embodiment, downloading and playing the video and audio files in the HLS stream by a player includes: downloading the index file through the player, and downloading and analyzing each media fragment according to the index file; and decoding and playing the video and audio files after downloading and analyzing through the player.
In an exemplary embodiment, downloading and parsing the subtitle file through a subtitle parser to obtain subtitle display information includes: downloading the index file through a subtitle parser, and downloading subtitle fragments according to the index file; decapsulating the downloaded subtitle file, and acquiring decoding reference time and subtitle information; and decoding the subtitle information according to the corresponding picture coding format to obtain the subtitle display information.
In an exemplary embodiment, after selecting the corresponding subtitle according to the subtitle display information and performing synchronous display, the method further includes: and updating the sub-index file at regular time through the subtitle parser.
In an exemplary embodiment, after the DASH media stream includes a plurality of subtitle streams and corresponding subtitles are selected according to the subtitle display information and are displayed synchronously, the method further includes: judging whether switching from a current first subtitle stream to a second subtitle stream is needed, and if so, emptying subtitle information of the first subtitle stream in a buffer area; updating a sub-index file of the second subtitle stream according to the second subtitle stream information analyzed in the main index file; and downloading and decoding the fragments of the second subtitle stream through the subtitle parser, and synchronously displaying the subtitles corresponding to the second subtitle stream according to the current playing time of the player.
According to another embodiment of the present invention, there is provided an HLS stream-based subtitle display apparatus including: the transcoding module is used for transcoding the DASH media stream into an HLS stream and transcoding the subtitle stream in the DASH media stream into a subtitle file in a picture coding format; the player is used for downloading and playing the video and audio files in the HLS stream; and the subtitle parser is used for downloading and parsing the subtitle file to obtain subtitle display information, obtaining the current playing time of the player, and selecting and synchronously displaying the corresponding subtitle according to the subtitle display information.
In an exemplary embodiment, the transcoding module is further configured to slice and transcode and encapsulate the DASH media stream according to an HLS protocol, where the video stream and the audio stream are transcoded into a media file according to an original encoding format, and the subtitle stream is transcoded into a subtitle file according to a picture subtitle encoding format, and the index file is modified.
In one exemplary embodiment, the subtitle display information includes at least one of: the display time of the caption, the content of the display picture, the display style, the size of the display position and the coding format of the display picture.
In an exemplary embodiment, the player is further configured to download the index file, download and parse each media segment according to the index file, and decode and play the video and audio files after downloading and parsing.
In one exemplary embodiment, the subtitle parser includes: the downloading module is used for downloading the index file and downloading the subtitle fragments according to the index file; the analysis module is used for de-encapsulating the downloaded subtitle file and acquiring decoding reference time and subtitle information; the decoding module is used for decoding the subtitle information according to a corresponding picture coding format to acquire the subtitle display information; the synchronous module is used for acquiring the current playing time of the player; and the display module is used for selecting the corresponding subtitle according to the subtitle display information and carrying out synchronous display.
In an exemplary embodiment, the transcoding module is located at the server side, and the player and the subtitle parser are located at the terminal side.
According to a further embodiment of the present invention, there is also provided a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.
In the above embodiment of the present invention, the subtitle stream in the DASH media stream is transcoded into a subtitle file in a picture subtitle coding format, and the subtitle file is downloaded, parsed and displayed synchronously by the subtitle parser. This solves the problem that iOS terminals do not support picture subtitle display for DASH streams, brings the rich display effect of picture subtitles to the user, and can be extended to support compatible operation across multiple platforms and players.
Drawings
Fig. 1 is a flowchart of a subtitle display method based on an HLS stream according to an embodiment of the present invention;
fig. 2 is a block diagram of a subtitle display apparatus based on an HLS stream according to an embodiment of the present invention;
fig. 3 is a block diagram of a subtitle display apparatus based on an HLS stream according to another embodiment of the present invention;
fig. 4 is a diagram of the basic architecture and main workflow of a picture subtitle display scheme based on HLS streams according to an embodiment of the present invention;
fig. 5 is a live view flowchart of a picture subtitle display method based on HLS streams according to an embodiment of the present invention;
fig. 6 is an on-demand flowchart of a picture subtitle display method based on HLS streams according to an embodiment of the present invention;
Fig. 7 is a flowchart for switching between multiple subtitles based on HLS streams according to an embodiment of the present invention.
Detailed Description
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings in conjunction with the embodiments.
In this embodiment, a subtitle display method based on HLS streams is provided, and as shown in fig. 1, the method includes the following steps:
step S102, transcoding DASH media stream into HLS stream, and transcoding subtitle stream in DASH media stream into subtitle file in picture coding format;
step S104, downloading and playing the video and audio files in the HLS stream through a player;
step S106, downloading and analyzing the subtitle file through a subtitle analyzer to obtain subtitle display information;
and S108, acquiring the current playing time of the player through the subtitle parser, and selecting and synchronously displaying the corresponding subtitles according to the subtitle display information.
In an exemplary embodiment, step S102 may further include: and slicing and transcoding and packaging the DASH media stream according to an HLS protocol, wherein the video stream and the audio stream are transcoded into a media file according to an original coding format, the subtitle stream is transcoded into a subtitle file according to a picture subtitle coding format, and the index file is modified.
In one exemplary embodiment, the index file includes a main index file and a sub index file, wherein the sub index file includes a video index file, an audio index file, and a subtitle index file; modifying the index file includes: the custom extension field is used to identify the subtitle information.
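By way of illustration only, the subtitle sub-index file (media playlist) produced by such transcoding could look roughly like the following, assuming 6-second fragmented-MP4 (m4s) subtitle segments as used in the examples later in this description; the file and segment names, durations and version number are assumptions rather than values mandated by the method, and the #EXT-X-MAP entry declares the initialization section that m4s segments require:
#EXTM3U
#EXT-X-VERSION:7
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-MAP:URI="sub_init.mp4"
#EXTINF:6.000,
sub_seg_0.m4s
#EXTINF:6.000,
sub_seg_1.m4s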
In one exemplary embodiment, the subtitle display information includes at least one of: the display time of the caption, the content of the display picture, the display style, the size of the display position and the coding format of the display picture.
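For illustration, on the terminal side this subtitle display information could be held in a structure along the following lines (a sketch only; the type and field names are assumptions and not part of the claimed method):
struct SubtitleDisplayInfo {
    var startTime: Double      // display start time of the subtitle, in seconds
    var endTime: Double        // display end time, in seconds
    var pictureData: [UInt8]   // content of the display picture (encoded image bytes)
    var pictureCodec: String   // coding format of the display picture, e.g. "png"
    var style: String          // display style hints
    var x: Double              // display position
    var y: Double
    var width: Double          // display size
    var height: Double
}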
In an exemplary embodiment, the step S104 may further include: downloading the index file through the player, and downloading and analyzing each media fragment according to the index file; and decoding and playing the downloaded and analyzed video and audio files through the player.
In an exemplary embodiment, the step S106 may further include: downloading the index file through a subtitle parser, and downloading subtitle fragments according to the index file; decapsulating the downloaded subtitle file, and acquiring decoding reference time and subtitle information; and decoding the subtitle information according to the corresponding picture coding format to obtain the subtitle display information.
In an exemplary embodiment, after step S108, the method may further include: and updating the sub-index file at regular time through the subtitle parser.
In an exemplary embodiment, after the DASH media stream includes a plurality of subtitle streams and selects corresponding subtitles according to the subtitle display information and performs synchronous display, the method may further include: judging whether switching from a current first subtitle stream to a second subtitle stream is needed, and if so, emptying subtitle information of the first subtitle stream in a buffer area; updating a sub-index file of the second subtitle stream according to the second subtitle stream information analyzed in the main index file; and downloading and decoding the fragments of the second subtitle stream through the subtitle parser, and synchronously displaying the subtitles corresponding to the second subtitle stream according to the current playing time of the player.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, a subtitle display apparatus based on HLS stream is also provided, and the apparatus is used to implement the foregoing embodiments and preferred embodiments, and the description already made is omitted for brevity. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware or a combination of software and hardware is also possible and contemplated.
Fig. 2 is a block diagram of a subtitle display apparatus based on an HLS stream according to an embodiment of the present invention, as shown in fig. 2, the apparatus including: a transcoding module 10, a player 20 and a subtitle parser 30.
A transcoding module 10, configured to transcode a DASH media stream into an HLS stream, and transcode a subtitle stream in the DASH media stream into a subtitle file in a picture coding format.
A player 20 for downloading and playing the video and audio files in the HLS stream.
And the subtitle parser 30 is configured to download and parse the subtitle file to obtain subtitle display information, obtain current playing time of the player, and select and synchronously display a corresponding subtitle according to the subtitle display information.
In an exemplary embodiment, the transcoding module 10 is further configured to slice and transcode and encapsulate the DASH media stream according to the HLS protocol, where the video stream and the audio stream are transcoded into a media file according to an original encoding format, and the subtitle stream is transcoded into a subtitle file according to a picture subtitle encoding format, and the index file is modified.
In one exemplary embodiment, the subtitle display information includes at least one of: the display time of the caption, the content of the display picture, the display style, the size of the display position and the coding format of the display picture.
In an exemplary embodiment, the player 20 is further configured to download the index file, download and parse the media segments according to the index file, and decode and play the video and audio files after downloading and parsing.
As shown in fig. 3, in an exemplary embodiment, the subtitle parser 30 includes a download module 31, a parsing module 32, a decoding module 33, a synchronization module 34, and a display module 35.
And the downloading module 31 is configured to download the index file, and download the subtitle fragments according to the index file.
And the parsing module 32 is configured to decapsulate the downloaded subtitle file, and obtain decoding reference time and subtitle information.
And the decoding module 33 is configured to decode the subtitle information according to the corresponding picture coding format, and acquire the subtitle display information.
And the synchronization module 34 is configured to obtain a current playing time of the player.
And the display module 35 is configured to select a corresponding subtitle according to the subtitle display information and perform synchronous display.
In this embodiment, the transcoding module is located on a server side, and the player and the subtitle parser are located on a terminal side.
It should be noted that the above modules may be implemented by software or hardware; for the latter, the following implementations are possible but not limiting: all the modules are located in the same processor, or the modules are distributed over different processors in any combination.
In order to facilitate understanding of the technical solutions provided by the present invention, the following detailed description is made with reference to specific application scenarios.
The basic composition and main workflow of the picture subtitle display scheme based on HLS streams will be described in detail first. As shown in fig. 4, in the present embodiment, the hardware architecture mainly includes a source terminal, an encoder, a server, and a terminal.
In this embodiment, the original picture information is collected and a live media stream is output, or a produced media file is output as an on-demand media stream. After receiving the pushed stream, the service side transcodes and encapsulates the video stream and the audio stream in the media stream and distributes them according to the HLS protocol, and transcodes the original subtitle stream in the media stream into a picture subtitle format and encapsulates it. At the same time, a custom extension is defined in the index file and the subtitle information is added.
The terminal side consists of a player and a subtitle parser. The player decodes and plays by parsing the media stream information in the index file; the subtitle stream is not parsed or displayed by the player because its format is not supported. The subtitle parser downloads and parses the subtitle stream information in the index file. After the subtitle file is parsed, the display time, display style, display image and other information of each subtitle are obtained and stored in a corresponding subtitle information mapping library.
The subtitle parser selects and synchronously displays the subtitle information by acquiring the current playing time of the player.
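A minimal sketch of such a mapping library and synchronization loop on the terminal side is given below; the generic store, the 0.5-second polling interval and all names are illustrative assumptions rather than a definitive implementation:
import Foundation

// Hypothetical mapping library: decoded subtitles keyed by their display window.
final class SubtitleStore<Cue> {
    private var cues: [(start: Double, end: Double, cue: Cue)] = []

    func add(_ cue: Cue, from start: Double, to end: Double) {
        cues.append((start: start, end: end, cue: cue))
    }

    func removeAll() { cues.removeAll() }   // used when switching subtitle streams

    // Select the cue whose display window covers the player's current time.
    func cue(at time: Double) -> Cue? {
        cues.first { time >= $0.start && time < $0.end }?.cue
    }
}

// Sketch of the synchronization loop: poll the player clock and overlay the matching cue.
func startSubtitleSync<Cue>(store: SubtitleStore<Cue>,
                            currentTime: @escaping () -> Double,
                            display: @escaping (Cue?) -> Void) -> Timer {
    Timer.scheduledTimer(withTimeInterval: 0.5, repeats: true) { _ in
        display(store.cue(at: currentTime()))
    }
}
In practice the display callback would overlay the decoded picture on the video view, and the timer would be invalidated when playback stops.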
As shown in fig. 4, the workflow of this embodiment mainly includes the following steps:
1) The live broadcast picture is encoded by the video acquisition equipment, which outputs a live RTSP stream;
2) The live video output equipment pushes the stream to the service side for DASH encapsulation and distribution;
3) The service side transcodes, encapsulates and slices the live DASH stream according to the HLS protocol, wherein the video stream and the audio stream are transcoded into ts media files in their original coding format, the subtitle stream is transcoded into subtitle files in smpte-tt or another picture subtitle coding format, and the index file is modified;
4) The terminal creates a player and a subtitle parser at the same time and configures the HLS live broadcast channel;
5) The player downloads and parses the m3u8 index file and then selects and downloads the video stream, the audio stream and the subtitle stream;
6) The player decodes the downloaded video, audio and subtitle files; the subtitle stream cannot be decoded and is not displayed because the picture subtitle format is not supported, while the video stream and the audio stream are decoded and played normally;
7) The subtitle parser downloads and parses the m3u8 index file, and extracts information related to the subtitle stream, such as the subtitle language and fragment duration;
8) The subtitle parser downloads the subtitle files in the HLS live broadcast source, and decapsulates and decodes them to obtain the display time, display picture content, display style, display position and size of each subtitle;
9) The subtitle parser selects the appropriate subtitle for synchronous display based on the real-time playing time of the player.
In this embodiment, if the player itself supports the subtitle format, its subtitle display may conflict with that of the subtitle parser, and the terminal control layer needs to decide which one to use. In addition, the decoding formats inside the subtitle parser can be extended, so that more subtitle formats can be supported.
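One possible shape for such an extensible decoder interface is sketched below; the protocol, the placeholder cue type and the registry are illustrative assumptions, not definitions taken from the patent text:
import Foundation

// Minimal decoded-cue placeholder used only by this sketch.
struct DecodedSubtitle {
    let startTime: Double
    let endTime: Double
    let image: Data        // decoded display picture
}

// A decoder for one picture subtitle coding format (e.g. smpte-tt, DVB Subtitle).
protocol PictureSubtitleDecoder {
    var format: String { get }                              // e.g. "smpte-tt"
    func decode(segment: Data) throws -> [DecodedSubtitle]
}

// Registry the subtitle parser could consult to pick a decoder for each subtitle stream.
final class DecoderRegistry {
    private var decoders: [String: PictureSubtitleDecoder] = [:]

    func register(_ decoder: PictureSubtitleDecoder) {
        decoders[decoder.format] = decoder
    }

    func decoder(for format: String) -> PictureSubtitleDecoder? {
        decoders[format]
    }
}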
Example one
In this embodiment, display of picture subtitles in an HLS live stream is taken as an example. As shown in fig. 5, the process includes the following steps:
1) Original live video information is collected, a live media stream is output and pushed to the CDN service side;
2) The service side creates a main index file according to the media information in the media stream based on the HLS protocol; the main index file may contain information on multiple streams. If the original index file were used directly, the native player could not play the subtitles, so the index file is modified and a custom extension field is added to identify the subtitle information, which solves the problem that the subtitles cannot be played. An example is as follows:
#EXTM3U
#EXT-X-ZTE-MEDIA:TYPE=SUBTITLES,GROUP-ID="subs1",LANGUAGE="tr",NAME="chi",AUTOSELECT=YES,DEFAULT=YES,URI="dvbsub_sec-t1.m3u8"
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=3650086,RESOLUTION=1280x720,FRAME-RATE=59.960,CODECS="avc1.640020,mp4a.40.2",ZTE_SUBTITLES="subs1"
sec-v1-a1.m3u8
3) The service side slices and transcodes the video and audio in the media stream according to an HLS protocol and encapsulates the video and audio into a TS fragment file;
4) The server side carries out transcoding encapsulation on the subtitles in the media stream according to an m4s encapsulation format and an smpte-tt encoding format;
5) A terminal side player downloads a main index file and downloads and analyzes each media fragment;
6) The terminal side player decodes and displays the downloaded and analyzed audio and video information;
7) The terminal side subtitle parser downloads the main index file and downloads the subtitle fragments;
8) The terminal side subtitle parser decapsulates the downloaded subtitle file to obtain the decoding reference time and the subtitle information;
9) The terminal side subtitle parser decodes the subtitle information in smpte-tt format to obtain the subtitle display time, display style, display picture content, display picture coding format and other information (a small timing and image decoding sketch follows this list);
10) The terminal side subtitle parser decodes the display picture according to the display picture coding format information obtained from smpte-tt decoding;
11) The terminal side subtitle parser obtains the real-time playing time of the player, searches for the subtitle meeting the display time condition, and displays it synchronously;
12) The service side judges whether the sub-index file needs to be updated, and updates it if so;
13) The terminal side player and subtitle parser periodically update the sub-index file and repeat steps 6)-11).
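To make the timing part of steps 8)-10) concrete, the sketch below converts a TTML clock-time attribute to seconds and decodes a Base64-encoded display picture, assuming the smpte-tt document carries begin/end attributes in hh:mm:ss.fff form and Base64 image payloads; the function names are hypothetical:
import Foundation

// Convert a TTML clock-time string such as "00:01:23.500" into seconds.
func ttmlClockTimeToSeconds(_ clock: String) -> Double? {
    let parts = clock.split(separator: ":").map(String.init)
    guard parts.count == 3,
          let hours = Double(parts[0]),
          let minutes = Double(parts[1]),
          let seconds = Double(parts[2]) else { return nil }
    return hours * 3600 + minutes * 60 + seconds
}

// Decode a Base64-encoded display picture carried in the subtitle document.
func decodeSubtitleImage(base64 payload: String) -> Data? {
    Data(base64Encoded: payload, options: .ignoreUnknownCharacters)
}

// Example: a cue with begin="00:00:12.500" starts 12.5 seconds into the stream.
let begin = ttmlClockTimeToSeconds("00:00:12.500")   // Optional(12.5)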
Example two
In this embodiment, display of picture subtitles in an HLS on-demand media stream is taken as an example. As shown in fig. 6, the process includes the following steps:
1) The service side downloads the video-on-demand source for encapsulation and distribution;
2) The service side creates a main index file according to the media information in the media stream based on the HLS protocol; the main index file contains information on multiple streams, and an example can be found in step 2) of Example one;
3) The service side slices and transcodes the video and audio in the media stream according to an HLS protocol and encapsulates the video and audio into a TS fragment file;
4) The server side carries out transcoding encapsulation on the subtitles in the media stream according to an m4s encapsulation format and a smpte-tt coding format;
5) A terminal side player downloads a main index file and downloads and analyzes each media fragment;
6) The terminal side player decodes and displays the downloaded and analyzed audio and video information;
7) The terminal side subtitle parser downloads the main index file and downloads the subtitle fragments;
8) The terminal side subtitle parser decapsulates the downloaded subtitle file to obtain the decoding reference time and the subtitle information;
9) The terminal side subtitle parser decodes the subtitle information in smpte-tt format to obtain the subtitle display time, display style, display picture content, display picture coding format and other information;
10) The terminal side subtitle parser decodes the display picture according to the display picture coding format information obtained from smpte-tt decoding;
11) The terminal side subtitle parser obtains the real-time playing time of the player, searches for the subtitle meeting the display time condition, and displays it synchronously.
EXAMPLE III
In the present embodiment, switching between multiple subtitle streams is taken as an example to describe picture subtitle display with multiple subtitle streams. In this embodiment, 2 subtitle streams are used as an example: the first subtitle stream is subtitle stream a, and the second subtitle stream is subtitle stream b. As shown in fig. 7, the process includes the following steps:
1) The service side creates a main index file according to the media information in the media stream based on the HLS protocol; the main index file contains information on multiple streams, including multiple subtitle media streams, for example:
#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="Audio",NAME="chi",DEFAULT=NO,AUTOSELECT=YES,LANGUAGE="chi",URI="audio_ch.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="Audio",NAME="eng",DEFAULT=YES,AUTOSELECT=YES,LANGUAGE="eng",URI="audio_en.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="Subtitles",NAME="chi",DEFAULT=YES,AUTOSELECT=YES,FORCED=NO,LANGUAGE="chi",URI="smpte_ch.m3u8"
#EXT-X-MEDIA:TYPE=SUBTITLES,GROUP-ID="Subtitles",NAME="eng",DEFAULT=NO,AUTOSELECT=YES,FORCED=NO,LANGUAGE="eng",URI="smpte_en.m3u8"
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1129000,AUDIO="Audio",SUBTITLES="Subtitles"
video_1000kbps.m3u8
2) The service side slices and transcodes the video and audio in the media stream according to an HLS protocol and encapsulates the video and audio into a TS fragment file;
3) The server side carries out transcoding encapsulation on the subtitles in the media stream according to an m4s encapsulation format and an smpte-tt encoding format;
4) The terminal side subtitle parser downloads the main index file and parses information such as the multiple subtitle languages and the sub-index files;
5) The terminal side subtitle parser downloads the first subtitle stream a among the multiple subtitles by default;
6) The terminal side subtitle parser downloads and parses the sub-index file, and downloads the subtitle fragments according to the file names in the index file;
7) The terminal side subtitle parser decapsulates the downloaded subtitle file to obtain the decoding reference time and the subtitle information;
8) The terminal side subtitle parser decodes the subtitle information in smpte-tt format to obtain the subtitle display time, display style, display picture content, display picture coding format and other information;
9) The terminal side subtitle parser decodes the display picture according to the display picture coding format information obtained from smpte-tt decoding;
10) The terminal side subtitle parser stores the parsed subtitle information into a buffer;
11) The terminal side subtitle parser obtains the real-time playing time of the player, searches the subtitle information buffer for the subtitle meeting the display time condition, and displays it synchronously;
12) If the stream is a live stream, the terminal side subtitle parser updates the sub-index file at regular intervals;
13) The user selects to switch to subtitle stream b;
14) The terminal side subtitle parser empties the subtitle information buffer;
15) The terminal side subtitle parser updates the sub-index file of subtitle stream b according to the subtitle stream b information parsed from the main index file;
16) The terminal side subtitle parser selects the initial segment to download according to the current time of the player, and downloads the segments in sequence from there (a sketch of this segment selection follows this list);
17) After the terminal side subtitle parser downloads the fragments, steps 7)-12) are repeated.
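As a rough illustration of the segment selection in step 16), assuming fixed-duration segments (a real parser would read the actual segment durations and media sequence number from the sub-index file), with hypothetical names and values:
// Pick the first subtitle segment to download after switching streams,
// so that downloading starts at the segment covering the player's current position.
func initialSegmentIndex(currentPlayTime: Double,
                         segmentDuration: Double,
                         mediaSequence: Int) -> Int {
    guard segmentDuration > 0 else { return mediaSequence }
    let segmentsAlreadyPlayed = Int(currentPlayTime / segmentDuration)
    return mediaSequence + segmentsAlreadyPlayed
}

// Example: 6-second segments, player at 75.3 s, playlist starting at sequence 0
// -> download starts at segment 12, then 13, 14, ... in order.
let firstSegment = initialSegmentIndex(currentPlayTime: 75.3,
                                       segmentDuration: 6.0,
                                       mediaSequence: 0)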
Embodiments of the present invention also provide a computer-readable storage medium having a computer program stored thereon, wherein the computer program is arranged to perform the steps of any of the above-mentioned method embodiments when executed.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to: various media capable of storing computer programs, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
For specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and exemplary implementations, and details of this embodiment are not repeated herein.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented in a general purpose computing device, they may be centralized in a single computing device or distributed across a network of multiple computing devices, and they may be implemented in program code that is executable by a computing device, such that they may be stored in a memory device and executed by a computing device, and in some cases, the steps shown or described may be executed in an order different from that shown or described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps therein may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (16)

1. A subtitle display method based on HLS streams is characterized by comprising the following steps:
transcoding DASH media stream into HLS stream, and transcoding subtitle stream in the DASH media stream into subtitle file in picture coding format;
downloading and playing the video and audio files in the HLS stream through a player;
downloading and analyzing the subtitle file through a subtitle analyzer to obtain subtitle display information;
and acquiring the current playing time of the player through the subtitle parser, and selecting and synchronously displaying the corresponding subtitle according to the subtitle display information.
2. The method of claim 1, wherein transcoding a DASH media stream into an HLS stream and transcoding subtitle streams in the DASH media stream into a subtitle file in a picture coding format comprises:
and slicing and transcoding and packaging the DASH media stream according to an HLS protocol, wherein the video stream and the audio stream are transcoded into a media file according to an original coding format, the subtitle stream is transcoded into a subtitle file according to a picture subtitle coding format, and the index file is modified.
3. The method of claim 1, wherein the index file comprises a main index file and a sub index file, wherein the sub index file comprises a video index file, an audio index file, and a subtitle index file; modifying the index file includes: the custom extension field is used to identify the subtitle information.
4. The method of claim 1, wherein the subtitle display information comprises at least one of: the display time of the caption, the content of the display picture, the display style, the size of the display position and the coding format of the display picture.
5. The method of claim 1, wherein downloading and playing video and audio files in the HLS stream via a player comprises:
downloading the index file through the player, and downloading and analyzing each media fragment according to the index file;
and decoding and playing the video and audio files after downloading and analyzing through the player.
6. The method of claim 1, wherein downloading and parsing the subtitle file by a subtitle parser to obtain subtitle display information comprises:
downloading the index file through a subtitle parser, and downloading subtitle fragments according to the index file;
decapsulating the downloaded subtitle file, and acquiring decoding reference time and subtitle information;
and decoding the subtitle information according to the corresponding picture coding format to obtain the subtitle display information.
7. The method of claim 2, wherein after selecting the corresponding subtitle according to the subtitle display information and performing synchronous display, the method further comprises:
and updating the sub-index file at regular time through the subtitle parser.
8. The method of claim 2, wherein the DASH media stream includes a plurality of subtitle streams, and after selecting corresponding subtitles according to the subtitle display information and performing synchronous display, the method further includes:
judging whether switching from a current first subtitle stream to a second subtitle stream is needed, and if so, emptying subtitle information of the first subtitle stream in a buffer area;
updating a sub-index file of the second subtitle stream according to the second subtitle stream information analyzed in the main index file;
and downloading and decoding the fragments of the second subtitle stream through the subtitle parser, and synchronously displaying the subtitles corresponding to the second subtitle stream according to the current playing time of the player.
9. A subtitle display apparatus based on an HLS stream, comprising:
the transcoding module is used for transcoding the DASH media stream into the HLS stream and transcoding the subtitle stream in the DASH media stream into a subtitle file in a picture coding format;
the player is used for downloading and playing the video and audio files in the HLS stream;
and the subtitle parser is used for downloading and parsing the subtitle file to obtain subtitle display information, obtaining the current playing time of the player, and selecting and synchronously displaying the corresponding subtitle according to the subtitle display information.
10. The apparatus of claim 9,
the transcoding module is further configured to slice and transcode and encapsulate the DASH media stream according to the HLS protocol, where the video stream and the audio stream are transcoded into a media file according to an original encoding format, the subtitle stream is transcoded into a subtitle file in a picture subtitle encoding format, and the index file is modified.
11. The apparatus of claim 9, wherein the subtitle display information comprises at least one of: the display time of the caption, the content of the display picture, the display style, the size of the display position and the coding format of the display picture.
12. The apparatus of claim 9,
the player is also used for downloading the index file, downloading and analyzing each media fragment according to the index file, and decoding and playing the video and audio files after downloading and analyzing.
13. The apparatus of claim 9, wherein the subtitle parser comprises:
the downloading module is used for downloading the index file and downloading the subtitle fragments according to the index file;
the analysis module is used for de-encapsulating the downloaded subtitle file and acquiring decoding reference time and subtitle information;
the decoding module is used for decoding the subtitle information according to a corresponding picture coding format to acquire the subtitle display information;
the synchronous module is used for acquiring the current playing time of the player;
and the display module is used for selecting the corresponding subtitle according to the subtitle display information and carrying out synchronous display.
14. The apparatus of claim 9, wherein the transcoding module is located on a server side, and wherein the player and the subtitle parser are located on a terminal side.
15. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 8.
16. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the method as claimed in any of claims 1 to 8 are implemented when the computer program is executed by the processor.
CN202110611176.0A 2021-06-01 2021-06-01 Subtitle display method and device based on HLS (HTTP live streaming) Pending CN115442662A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110611176.0A CN115442662A (en) 2021-06-01 2021-06-01 Subtitle display method and device based on HLS (HTTP live streaming)
PCT/CN2022/095045 WO2022253079A1 (en) 2021-06-01 2022-05-25 Hls stream-based subtitle display method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110611176.0A CN115442662A (en) 2021-06-01 2021-06-01 Subtitle display method and device based on HLS (HTTP live streaming)

Publications (1)

Publication Number Publication Date
CN115442662A true CN115442662A (en) 2022-12-06

Family

ID=84271857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110611176.0A Pending CN115442662A (en) 2021-06-01 2021-06-01 Subtitle display method and device based on HLS (HTTP live streaming)

Country Status (2)

Country Link
CN (1) CN115442662A (en)
WO (1) WO2022253079A1 (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2903319A1 (en) * 2013-03-14 2014-09-18 Arris Technology, Inc. Devices, systems, and methods for converting or translating dynamic adaptive streaming over http (dash) to http live streaming (hls)
FR3025925B1 (en) * 2014-09-17 2016-12-23 France Brevets METHOD FOR CONTROLLING PRESENTATION MODES OF SUBTITLES
CN106162377B (en) * 2015-04-08 2019-06-21 中国移动通信集团公司 Conversion method, device, BM-SC and the terminal of adaptive stream media technology
US10673907B2 (en) * 2015-07-16 2020-06-02 Arris Enterprises Llc Systems and methods for providing DLNA streaming to client devices
CN108055574A (en) * 2017-11-29 2018-05-18 上海网达软件股份有限公司 Media file transcoding generates the method and system of multitone rail multi-subtitle on-demand content
CN111147896A (en) * 2018-11-05 2020-05-12 中兴通讯股份有限公司 Subtitle data processing method, device and equipment and computer storage medium

Also Published As

Publication number Publication date
WO2022253079A1 (en) 2022-12-08

Similar Documents

Publication Publication Date Title
US9712890B2 (en) Network video streaming with trick play based on separate trick play files
RU2652099C2 (en) Transmission device, transmission method, reception device and reception method
KR100800860B1 (en) Method and apparatus for preview service in digital broadcasting system using electronic service guide
US9247317B2 (en) Content streaming with client device trick play index
US11025982B2 (en) System and method for synchronizing content and data for customized display
CN107634930B (en) Method and device for acquiring media data
US8826346B1 (en) Methods of implementing trickplay
US10715571B2 (en) Self-adaptive streaming medium processing method and apparatus
KR20180089416A (en) Selection of next-generation audio data coded for transmission
US10887645B2 (en) Processing media data using file tracks for web content
US10674229B2 (en) Enabling personalized audio in adaptive streaming
KR101409023B1 (en) Method and System for providing Application Service
KR102499231B1 (en) Receiving device, sending device and data processing method
WO2014193996A2 (en) Network video streaming with trick play based on separate trick play files
KR102085192B1 (en) Rendering time control
KR20090009847A (en) Method and apparatus for re-constructing media from a media representation
CN109151614B (en) Method and device for reducing HLS live broadcast delay
EP3242490B1 (en) Self-adaptive streaming media processing method and device
CN103945260B (en) A kind of streaming media on demand editing system and order method
CN116233490A (en) Video synthesis method, system, device, electronic equipment and storage medium
CN115442662A (en) Subtitle display method and device based on HLS (HTTP live streaming)
KR102533674B1 (en) Receiving device, sending device and data processing method
WO2016199527A1 (en) Transmission device, transmission method, reception device, and reception method
CN107547917B (en) Channel playing and processing method and device and channel processing system
US9219931B2 (en) Method and apparatus for transmitting and receiving service discovery information in multimedia transmission system and file structure for the same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination