CN112930687B - Media stream processing method and device, storage medium and program product - Google Patents

Media stream processing method and device, storage medium and program product Download PDF

Info

Publication number
CN112930687B
CN112930687B CN201880098342.8A
Authority
CN
China
Prior art keywords
media stream
time stamp
display time
playing
web front
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201880098342.8A
Other languages
Chinese (zh)
Other versions
CN112930687A (en)
Inventor
鲁学研
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bitmain Technologies Inc
Original Assignee
Bitmain Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bitmain Technologies Inc filed Critical Bitmain Technologies Inc
Publication of CN112930687A publication Critical patent/CN112930687A/en
Application granted granted Critical
Publication of CN112930687B publication Critical patent/CN112930687B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A media stream processing method and device, storage medium and program product. The method includes the following steps: the Web (internet webpage) front end pulls a media stream and acquires the structured information carried in the media stream (S102); the Web front end then generates a display time stamp for the structured information according to the frame structure of the media stream, the display time stamp being aligned with the playing time of the media stream (S104); the Web front end format-encapsulates the media stream and plays the encapsulated media stream with an H5 player (S106); and the structured information is drawn synchronously on the currently playing picture according to the display time stamp (S108). The method provides a technical scheme in which synchronous drawing of structured information and video frames can be achieved without installing plug-ins, eliminating the technical barrier that synchronous drawing cannot be realized without plug-ins, and offering greater flexibility.

Description

Media stream processing method and device, storage medium and program product
Technical Field
The present disclosure relates to the field of media streaming and, for example, to a media stream processing method and apparatus, a storage medium, and a program product.
Background
With the popularization of Internet applications, data transmitted over networks is no longer limited to text or graphics; media streams such as audio and video have spread widely and become a new form of content for users.
Currently, playing media streams in a browser generally depends on browser controls or plug-ins. Specifically, a plug-in is installed in the browser, the media stream is pulled directly through the plug-in, the pulled media stream is decoded to obtain frame pictures, and the structured information is then drawn using frame synchronization, thereby realizing synchronous playing of the structured information and the video frames.
However, the existing media stream processing method requires installing a plug-in in the browser, and some browsers also restrict the plug-in version. If the traditional plug-in is not used, another way is needed to achieve synchronous drawing of the structured information and the video frames.
Disclosure of Invention
The embodiments of the present disclosure provide a media stream processing method and apparatus, a storage medium, and a program product, so as to provide a technical scheme in which structured information and video frames can be drawn synchronously without installing plug-ins, eliminating the technical barrier that synchronous drawing cannot be realized without plug-ins, and offering greater flexibility.
The embodiment of the disclosure provides a media stream processing method, which comprises the following steps:
the Web front end pulls the media stream and acquires the structural information carried in the media stream;
the Web front end generates a display time stamp PTS of the structured information according to the frame structure of the media stream, wherein the display time stamp is aligned with the playing time of the media stream;
the Web front end performs format encapsulation on the media stream and plays the encapsulated media stream by using an H5 player;
and the Web front end synchronously draws the structural information on the current playing picture according to the display time stamp.
The embodiment of the disclosure also provides a media stream processing device, which comprises:
the acquisition module is used for pulling the media stream and acquiring the structural information carried in the media stream;
the generation module is used for generating a display time stamp PTS of the structured information according to the frame structure of the media stream, wherein the display time stamp is aligned with the playing time of the media stream;
the playing module is used for carrying out format encapsulation on the media stream and playing the encapsulated media stream by using an H5 player;
and the drawing module is used for synchronously drawing the structural information on the current playing picture according to the display time stamp.
The embodiment of the disclosure also provides a computer, which comprises the media stream processing device.
The disclosed embodiments also provide a computer-readable storage medium storing computer-executable instructions configured to perform the above-described media stream processing method.
The disclosed embodiments also provide a computer program product comprising a computer program stored on a computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above-described media stream processing method.
The embodiment of the disclosure also provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor, which when executed by the at least one processor, cause the at least one processor to perform the media stream processing method described above.
According to the technical scheme provided by the embodiments of the present disclosure, the Web front end pulls the media stream and obtains the structured information of the media stream, and then uses the frame structure of the media stream to generate a display time stamp for the structured information, so that when the H5 player plays the media stream, the structured information is drawn synchronously according to the display time stamp. Synchronous drawing of the structured information and the media stream can therefore be achieved without installing any plug-in in the browser and without decoding frame pictures, which eliminates the technical barrier that synchronous drawing cannot be realized without plug-ins and provides high flexibility. Because this implementation occupies little CPU, the problem of CPU consumption is avoided, multi-channel video playing can be realized, and stuttering is avoided.
Drawings
One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like reference numerals refer to similar elements, and in which:
fig. 1 is a flow chart of a media stream processing method according to an embodiment of the disclosure;
fig. 2 is an interactive flow diagram of a media stream processing method according to an embodiment of the present disclosure;
fig. 3 is a flow chart of another media stream processing method according to an embodiment of the disclosure;
fig. 4 is a flow chart of another media stream processing method according to an embodiment of the disclosure;
fig. 5 is a schematic structural diagram of a media stream processing device according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of a computer according to an embodiment of the disclosure;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
So that the features and technical content of the embodiments of the present disclosure can be understood in more detail, a more particular description of the embodiments is given below with reference to the appended drawings, which are not intended to limit the embodiments of the present disclosure. In the following technical description, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the disclosed embodiments; however, one or more embodiments may be practiced without these details. In other instances, well-known structures and devices are shown in simplified form in order to simplify the drawings.
First, for ease of understanding, terms related to embodiments of the present disclosure will be specifically described.
Web (Internet web page) front end: may be embodied as a Web front-end processor.
H5: HTML5, formulated by the World Wide Web Consortium (W3C), is intended to replace the earlier HTML 4.01 and XHTML 1.0 standards, so that web standards keep up with current web requirements as Internet applications evolve rapidly. When HTML5 is referred to broadly, it actually refers to a set of technologies including the Hypertext Markup Language (HTML), Cascading Style Sheets (CSS), and JavaScript (an interpreted scripting language).
MSE (Media Source Extensions): i.e., Media Source Extensions, a browser interface (Web API) supported by mainstream browsers such as Chrome, Safari, and Edge. MSE conforms to the W3C standard and allows JavaScript to dynamically construct media streams for <video> and <audio>. It defines objects that allow JavaScript to transfer media stream fragments to an HTMLMediaElement (media element).
PTS (Presentation Time Stamp): i.e. a display time stamp, which may be used as a basis for the player to play information or a media stream, e.g. the player may determine when to display the frame data corresponding to the display time stamp based on the display time stamp.
RTSP (Real Time Streaming Protocol): i.e., the Real-Time Streaming Protocol, an application-layer protocol in the Transmission Control Protocol (TCP)/Internet Protocol (IP) suite, defined in an IETF RFC standard submitted by Columbia University, Netscape, and RealNetworks.
RTMP (Real Time Messaging Protocol): i.e., the Real-Time Messaging Protocol, a TCP-based protocol family that includes the RTMP base protocol and its RTMPT/RTMPS/RTMPE variants. RTMP is a network protocol designed for real-time data communication, and is mainly used for audio, video and data communication between the Flash/AIR (Adobe Integrated Runtime) platform and streaming or interactive servers supporting the RTMP protocol. Software supporting this protocol includes Adobe Media Server, Ultrant Media Server, red5, and the like.
FLV (Flash Video): a media stream format that developed with the promotion of Flash MX. Because the resulting files are small and load quickly, video files can be watched over the network; this effectively addresses the problem that SWF files (the format exported by the Flash authoring software) become huge after a video file is imported into Flash and cannot be used well on the network.
http-flv: an FLV media stream transmitted over the HTTP protocol.
websocket-flv: an FLV media stream transmitted over the WebSocket protocol.
NPAPI (Netscape Plugin Application Programming Interface): a plug-in interface used by Gecko-engine browsers such as Netscape Navigator, Mozilla Suite, Mozilla SeaMonkey and Mozilla Firefox, and by WebKit-engine browsers such as Apple Safari and Google Chrome.
PPAPI (Pepper Plugin API): a plug-in interface proposed because NPAPI poses security risks.
The following description of the embodiments of the present disclosure refers to the foregoing terms, and the foregoing meanings are not repeated.
In view of the problems in the prior art, the embodiments of the present disclosure adopt the following idea: the Web front end pulls the media stream and acquires the structured information, and then, by adding a display time stamp to the structured information, draws the structured information synchronously while the H5 player plays the media stream, thereby realizing synchronous playing.
The embodiment of the disclosure provides a media stream processing method. Referring to fig. 1, the method includes:
s102, the Web front end pulls the media stream and acquires the structural information carried in the media stream.
And S104, the Web front end generates a display time stamp PTS of the structural information according to the frame structure of the media stream, and the display time stamp is aligned with the playing time of the media stream.
S106, the Web front end packages the format of the media stream, and plays the packaged media stream by using an H5 player;
s108, the Web front end synchronously draws the structured information on the current playing picture according to the display time stamp.
In the embodiments of the present disclosure, the media streams involved in S102 and S104 are http-flv streams or websocket-flv streams, where http and websocket are protocol names and flv is the format of the media stream. The format encapsulation in step S106 essentially converts the format of the media stream so that the encapsulated media stream meets the playing requirements of the H5 player. In one possible design, the encapsulated media stream may be in fMP4 format.
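As an illustration of how the encapsulated fMP4 stream can be handed to the H5 player through the MSE interface described above, the following minimal browser JavaScript sketch is given; the codec string and the pushSegment entry point are assumptions for illustration only, not part of the original disclosure.

```javascript
// Minimal MSE playback sketch. The codec string and the way fMP4 fragments
// arrive (the pushSegment callback) are illustrative assumptions.
const video = document.querySelector('video');       // the H5 <video> element
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', () => {
  // fMP4 carrying H.264 video; the exact codec profile depends on the stream.
  const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E"');
  const queue = [];

  // Called by the remuxing layer whenever a new fMP4 fragment is ready.
  window.pushSegment = (arrayBuffer) => {
    if (sourceBuffer.updating || queue.length > 0) {
      queue.push(arrayBuffer);                        // buffer while an append is in flight
    } else {
      sourceBuffer.appendBuffer(arrayBuffer);
    }
  };

  sourceBuffer.addEventListener('updateend', () => {
    if (queue.length > 0) sourceBuffer.appendBuffer(queue.shift());
  });
});
```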
Before the Web front-end pulls the data stream, that is, before step S102 is performed, the streaming server needs to push the foregoing media stream, and the algorithm server needs to calculate the structural information of the media stream.
Specifically, referring to the interaction diagram shown in fig. 2, before executing S102, the method further includes the following steps:
s1011, the streaming media server pulls the real-time streaming protocol RTSP stream.
S1012, the streaming media server decodes the RTSP stream to obtain a video frame unit.
S1013, the streaming server sends the video frame unit to the algorithm server.
S1014, the algorithm server processes the video frame unit to obtain the structural information of the RTSP stream.
S1015, the algorithm server sends the structured information to the streaming server.
S1016, the streaming media server encapsulates the RTSP stream and the structured information to obtain a media stream.
S1017, the streaming media server pushes the media stream.
In the embodiments of the present disclosure, whether the streaming media server, the algorithm server and the Web front end are integrated together is not particularly limited; they may be independent of each other, or at least two of them may be integrated into one device.
In addition, the prior art also includes an H5 scheme for playing a media stream without relying on plug-ins. Specifically, a low-bit-rate stream in MPEG-1 format is transmitted over WebSocket, frame pictures are soft-decoded by the central processing unit (CPU), and synchronous drawing is performed on a canvas. However, in this implementation the CPU soft-decodes the frame pictures, which results in high CPU occupation, so the number of channels that can be played is limited; moreover, the mainstream H.264 coding scheme is not supported, and the streaming media server must transcode the stream into MPEG-1.
In contrast, as shown in fig. 2, in the technical scheme provided by the embodiments of the present disclosure the RTSP stream is decoded by the streaming media server to obtain video frame units, so the Web front end does not need to decode, does not occupy the CPU, and has no CPU performance problem; multiple channels of video can therefore be played and the stuttering problem is effectively solved.
In the interactive flow shown in fig. 2, the RTSP stream and the structured information are encapsulated in S1016, and the result of the encapsulation is the http-flv stream or websocket-flv stream described in the foregoing embodiments of the present disclosure.
Based on the push of the streaming media server, the Web front end can pull the media stream at the push address.
Specifically, the embodiment of the present disclosure provides the following implementation manner of S102:
the Web front end pulls the media stream through XHR2/Fetch or websocket.
Here, XHR2/Fetch or WebSocket serves as the network request and is used to request binary data from the streaming media server to obtain the media stream.
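A minimal sketch of these two pull paths in browser JavaScript follows; the URLs and the onChunk callback are hypothetical and only illustrate how binary data could be requested from the streaming media server.

```javascript
// Sketch of the two pull paths described above. The URLs and onChunk handler
// are illustrative assumptions.
function pullHttpFlv(url, onChunk) {
  // Fetch exposes the response body as a ReadableStream of binary chunks.
  fetch(url).then((response) => {
    const reader = response.body.getReader();
    const pump = () =>
      reader.read().then(({ done, value }) => {
        if (done) return;
        onChunk(value);            // hand the raw FLV bytes to the demuxing layer
        return pump();
      });
    return pump();
  });
}

function pullWebsocketFlv(url, onChunk) {
  const ws = new WebSocket(url);
  ws.binaryType = 'arraybuffer';   // receive binary frames instead of Blobs
  ws.onmessage = (event) => onChunk(new Uint8Array(event.data));
  return ws;
}

// Usage (hypothetical addresses and demuxer):
// pullHttpFlv('http://example.com/live/stream.flv', demuxer.feed);
// pullWebsocketFlv('ws://example.com/live/stream.flv', demuxer.feed);
```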
Further, depending on the protocol and format of the media stream encapsulated in S1016, the encapsulation position of the structured information in the media stream may differ. In one possible design, if the RTSP stream and the structured information are encapsulated as a media stream conforming to the H.264 specification, such as an http-flv stream, the structured information may be encapsulated into a supplemental enhancement information (SEI) unit of the http-flv stream. H.264 is a digital video compression format jointly proposed by the International Organization for Standardization and the International Telecommunication Union.
Correspondingly, when executing the step of acquiring the structured information, the Web front end may acquire the structured information carried in the media stream according to the protocol and format of the media stream, since the encapsulation position of the structured information differs among media streams of different protocols and formats.
That is, the protocol and format of the media stream are first determined, the encapsulation position of the structured information is then determined according to a preset correspondence between protocol/format and encapsulation position, and the structured information is acquired at that position. For example, in the foregoing possible design, since the Web front end pulls an http-flv stream, it can be determined that the SEI unit of the http-flv stream is used to encapsulate the structured information, and the structured information can therefore be obtained from the SEI unit.
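For illustration, the following simplified sketch shows how SEI NAL units could be located in an H.264 payload carried in FLV video tags. It assumes 4-byte length-prefixed NAL units (AVCC format) and omits FLV tag-header parsing, the AVC configuration record, and SEI emulation-prevention handling, all of which a real demuxer would need.

```javascript
// Simplified sketch of locating structured information inside the SEI NAL
// units of an H.264 payload. Assumes AVCC format (4-byte big-endian NAL
// lengths); this is an illustrative assumption, not a complete FLV demuxer.
function extractSeiPayloads(avccData) {
  const payloads = [];
  const view = new DataView(avccData.buffer, avccData.byteOffset, avccData.byteLength);
  let offset = 0;
  while (offset + 4 <= avccData.length) {
    const nalLength = view.getUint32(offset);        // big-endian length prefix
    const nalStart = offset + 4;
    if (nalStart + nalLength > avccData.length) break;
    const nalType = avccData[nalStart] & 0x1f;       // low 5 bits of the NAL header
    if (nalType === 6) {                             // type 6 = SEI
      payloads.push(avccData.subarray(nalStart + 1, nalStart + nalLength));
    }
    offset = nalStart + nalLength;
  }
  return payloads;
}
```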
The existing media stream processing manner is realized by means of plug-ins, such as an NPAPI plug-in, a PPAPI plug-in, or an Adobe Flash plug-in: frame pictures must be decoded after the media stream is pulled, and the structured information is drawn using frame synchronization. A plug-in therefore has to be installed in the browser, and the frame-picture decoding further increases the amount of computation and reduces processing efficiency.
Based on this, in the embodiments of the present disclosure no plug-in needs to be installed in the browser; instead, a display time stamp is generated for the structured information in the obtained media stream, so that the structured information can be drawn synchronously while the H5 player plays the media stream, without decoding the media stream into frame pictures.
The specific implementation process may refer to the flow shown in fig. 3, where step S104 may be:
s1042, the Web front end analyzes the SEI unit of the structured data and the video frame unit of the media stream according to the frame structure of the media stream.
And S1044, the Web front end generates a display time stamp for the SEI unit according to the playing time of the video frame unit, so that the SEI unit is aligned with the video frame unit.
In the disclosed embodiment, the final presentation results in the media stream being played in synchronization with the structured information, thus requiring the display time stamp of the SEI unit to be aligned with the play time of the video frame unit.
For example, if the video frame units are played at a time interval of 40 ms per frame, then when generating display time stamps for the SEI units, a display time stamp corresponding to the playing time of the video frame unit associated with each SEI unit may be generated, likewise at a 40 ms interval per SEI unit.
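A minimal sketch, assuming a 25 fps stream (40 ms frame interval), of assigning each SEI unit the display time stamp of its accompanying video frame; the decodeStructuredInfo helper and the JSON payload format are hypothetical.

```javascript
// Sketch of aligning SEI units with their video frames via a display time stamp.
// The 40 ms interval and the JSON-encoded SEI payload are illustrative assumptions.
const FRAME_INTERVAL_MS = 40;            // 25 fps => one frame every 40 ms
const structuredQueue = [];              // SEI units waiting to be drawn

function onSeiUnit(seiPayload, frameIndex) {
  structuredQueue.push({
    pts: frameIndex * FRAME_INTERVAL_MS, // same PTS as the accompanying video frame
    info: decodeStructuredInfo(seiPayload),
  });
}

// Hypothetical decoder for the application-specific SEI payload, e.g. UTF-8 JSON.
function decodeStructuredInfo(payload) {
  return JSON.parse(new TextDecoder('utf-8').decode(payload));
}
```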
Based on the foregoing settings, the embodiment of the disclosure uses an H5 player to play the media stream, and simultaneously draws the structured information above the current playing frame according to the foregoing generated display timestamp, so as to realize synchronous playing.
When the structured information is drawn synchronously according to the display time stamp, a three-dimensional or two-dimensional image may be drawn on a canvas as required, where the canvas is a browser DOM (Document Object Model) object whose displayed picture is overlaid on the currently playing picture. That is, the canvas presents a picture closer to the viewer than the picture played in the H5 player.
In one possible design, the two-dimensional image may be implemented with canvas. Canvas is a 2D (two-dimensional) drawing protocol that renders pixel by pixel, and 2D image drawing is implemented through JavaScript. Therefore, in the foregoing flow, if synchronous drawing of the structured information is implemented through canvas, a canvas element may be added to the H5 player so that 2D images can be drawn on the canvas through JavaScript, as illustrated in the sketch below.
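A minimal 2D-drawing sketch along these lines; the overlay canvas id and the shape of the structured information (bounding boxes with labels) are assumptions for illustration.

```javascript
// Sketch of drawing 2D structured information (e.g. labelled bounding boxes)
// on a <canvas> element positioned above the H5 player. The `info` shape and
// the element id are illustrative assumptions.
const canvas = document.getElementById('overlay');   // canvas overlaid on the <video>
const ctx = canvas.getContext('2d');

function drawStructuredInfo(info) {
  ctx.clearRect(0, 0, canvas.width, canvas.height);   // wipe the previous overlay
  ctx.strokeStyle = 'red';
  ctx.fillStyle = 'red';
  ctx.lineWidth = 2;
  ctx.font = '14px sans-serif';
  for (const box of info.boxes || []) {
    ctx.strokeRect(box.x, box.y, box.w, box.h);       // draw the bounding box
    ctx.fillText(box.label, box.x, box.y - 4);        // label above the box
  }
}
```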
Alternatively, in another design, three-dimensional images may be drawn through the WebGL (Web Graphics Library) protocol. WebGL is a 3D (three-dimensional) drawing protocol that eliminates the trouble of developing dedicated Web rendering plug-ins and can be used to create web pages with complex 3D structures and even to design 3D web games.
In a specific implementation of synchronous drawing, the Web front end may fetch the current playing time of the media stream at a preset frequency, and then draw the structured information synchronously on the canvas based on the display time stamp, overlaying and aligning it with the currently playing picture. The preset frequency may be set as needed; for example, the time may be fetched 60 times per second.
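A sketch of such a drawing loop using requestAnimationFrame (roughly 60 calls per second in typical browsers); the video, structuredQueue and drawStructuredInfo names refer to the earlier sketches and are assumptions.

```javascript
// Sketch of the drawing loop: about 60 times per second the current playing
// time is read from the <video> element and the structured information whose
// PTS has fallen due is drawn on the overlay canvas.
function drawLoop() {
  const nowMs = video.currentTime * 1000;              // current play position in ms
  while (structuredQueue.length > 0 && structuredQueue[0].pts <= nowMs) {
    drawStructuredInfo(structuredQueue.shift().info);  // draw the due SEI unit
  }
  requestAnimationFrame(drawLoop);                     // ~60 calls per second
}
requestAnimationFrame(drawLoop);
```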
However, in the synchronous drawing process, considering that the media stream and the playing of the structured information may fall out of sync, the embodiments of the present disclosure also provide a possible design as shown in fig. 4, so as to perform deviation correction on the drawing of the structured information and ensure synchronous drawing with the media stream.
As shown in fig. 4, the method further includes:
s110, the Web front end performs deviation rectifying processing on the drawing of the structured information according to a first frequency, wherein the first frequency is larger than a second frequency, and the second frequency is the frame playing frequency of the media stream.
In the design shown in fig. 4, the second frequency is the frame playing frequency of the media stream, and its time interval is the frame interval of the media stream, which is determined by the frame structure design of the media stream. The first frequency is greater than the second frequency, so the time interval of the first frequency is shorter than that of the second frequency, i.e., shorter than the frame interval of the media stream. In other words, the drawing of the structured information is corrected at a higher frequency so that the played media stream and the structured information are presented synchronously.
In addition, considering the computation time consumed by aligned drawing, in a specific implementation the correction processing may be performed at a time interval slightly longer than the actual operation duration. The correction processing can be realized with the Math.ceil function.
The Math.ceil function is a built-in JavaScript Math function: Math.ceil(x) returns the smallest integer greater than or equal to the parameter x, i.e., it rounds a floating-point number up. In the implementation scenario involved in the embodiments of the present disclosure, this can be expressed as aligning Math.ceil(video.currentTime / 40) * 40 with the PTS, where video.currentTime represents the current playing time of the media stream.
For example, in one possible implementation scenario, if the frame playing interval of the media stream is 40 ms, the correction processing may be performed with 10 ms as the time interval of the first frequency.
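A sketch of this correction step under the same assumptions (40 ms frame interval, 10 ms correction interval); the ptsToInfo map, and the reuse of the video element and drawStructuredInfo from the earlier sketches, are illustrative assumptions.

```javascript
// Sketch of the deviation-correction step: at a 10 ms interval (the first
// frequency) the current playing time is rounded up to the next 40 ms frame
// boundary with Math.ceil and, if the overlay has drifted from that PTS, the
// matching structured information is redrawn. ptsToInfo is assumed to be
// filled when SEI units are parsed.
const FRAME_INTERVAL_MS = 40;          // second frequency: one frame every 40 ms
const CORRECTION_INTERVAL_MS = 10;     // first frequency: faster than the frame rate
const ptsToInfo = new Map();           // pts (ms) -> structured information
let lastDrawnPts = -1;

setInterval(() => {
  const alignedPts =
    Math.ceil((video.currentTime * 1000) / FRAME_INTERVAL_MS) * FRAME_INTERVAL_MS;
  if (alignedPts !== lastDrawnPts && ptsToInfo.has(alignedPts)) {
    drawStructuredInfo(ptsToInfo.get(alignedPts));   // redraw the correct overlay
    lastDrawnPts = alignedPts;
  }
}, CORRECTION_INTERVAL_MS);
```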
As can be seen from the foregoing description, the technical scheme provided by the embodiments of the present disclosure does not require installing a plug-in in the browser, which suits the development direction of future browsers. Synchronous drawing is realized through display time stamps without the Web front end decoding frame pictures, which effectively reduces the amount of computation, shortens the processing time, and leaves more time for correction processing, thereby ensuring that the structured information and the media stream are played synchronously and remain aligned.
The embodiment of the disclosure also provides a media stream processing device. Referring to fig. 5, the media stream processing device 500 includes:
the obtaining module 51 is configured to pull the media stream and obtain structural information carried in the media stream;
the generating module 52 is configured to generate a display time stamp PTS of the structured information according to a frame structure of the media stream, where the display time stamp is aligned with a playing time of the media stream;
a playing module 53, configured to format-encapsulate the media stream, and play the encapsulated media stream by using an H5 player;
and the drawing module 54 is used for synchronously drawing the structural information on the current playing picture according to the display time stamp.
In one possible design, the generating module 52 is specifically configured to:
analyzing SEI units of the structured data and video frame units of the media stream according to the frame structure of the media stream;
and generating a display time stamp for the SEI unit according to the playing time of the video frame unit, so that the SEI unit is aligned with the video frame unit.
In another possible design, the media stream processing device further includes:
and the deviation rectifying module (not shown in fig. 5) is used for rectifying the drawing of the structured information according to a first frequency, wherein the first frequency is larger than a second frequency, and the second frequency is the frame playing frequency of the media stream.
In another possible design, the rendering module 54 is specifically configured to:
and drawing the three-dimensional image or the two-dimensional image on the canvas according to the display time stamp, wherein the canvas is overlaid and displayed on the current playing picture.
In another possible design, the obtaining module 51 is specifically configured to:
pulling the media stream;
and obtaining the structural information carried in the media stream according to the protocol and format of the media stream.
In another possible design, the media stream is: http-flv media stream or websocket-flv media stream.
In another possible design, the obtaining module 51 is specifically configured to:
the media stream is pulled through XHR2/Fetch or websocket.
The media stream processing device 500 shown in fig. 5 is provided at the Web front end.
In addition, referring to fig. 6, the embodiment of the disclosure further provides a computer 600 including the above-mentioned media stream processing device 500.
The disclosed embodiments also provide a computer-readable storage medium storing computer-executable instructions configured to perform the above media stream processing method.
The disclosed embodiments also provide a computer program product comprising a computer program stored on a computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above-described media stream processing method.
The computer readable storage medium may be a transitory computer readable storage medium or a non-transitory computer readable storage medium.
The embodiment of the disclosure also provides an electronic device, the structure of which is shown in fig. 7, the electronic device comprising:
at least one processor 73, one processor 73 being illustrated in FIG. 7; and a memory (memory) 71, which may also include a communication interface (Communication Interface) 72 and a bus. The processor 73, the communication interface 72, and the memory 71 may communicate with each other via a bus. The communication interface 72 may be used for information transfer. The processor 73 may call logic instructions in the memory 71 to perform the media stream processing method of the above-described embodiment.
Further, the logic instructions in the memory 71 described above may be implemented in the form of software functional units and may be stored in a computer readable storage medium when sold or used as a stand alone product.
The memory 71 is a computer-readable storage medium that can be used to store software programs, computer-executable programs, and program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 73 executes functional applications and data processing by running the software programs, instructions and modules stored in the memory 71, i.e., implements the media stream processing method in the above method embodiments.
The memory 71 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, at least one application program required for a function; the storage data area may store data created according to the use of the terminal device, etc. In addition, the memory 71 may include a high-speed random access memory, and may also include a nonvolatile memory.
Embodiments of the present disclosure may be embodied in a software product stored on a storage medium, including one or more instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of a method according to the embodiments of the present disclosure. The aforementioned storage medium may be a non-transitory storage medium, including media capable of storing program code such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk, or may be a transitory storage medium.
When used in this application, although the terms "first," "second," etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another element. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without changing the meaning of the description, so long as all occurrences of the "first element" are renamed consistently and all occurrences of the "second element" are renamed consistently. The first element and the second element are both elements, but may not be the same element.
The words used in this application merely describe embodiments and are not intended to limit the claims. As used in the description of the embodiments and the claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application is meant to encompass any and all possible combinations of one or more of the associated listed items. Furthermore, when used in this application, the terms "comprises," "comprising," and/or variations thereof mean that the stated features, integers, steps, operations, elements, and/or components are present, but the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof is not precluded.
The aspects, implementations, or features of the described embodiments can be used alone or in any combination. Aspects of the described embodiments may be implemented in software, hardware, or a combination of software and hardware. The described embodiments may also be embodied by a computer-readable medium having stored thereon computer-readable code comprising instructions executable by at least one computing device. The computer readable medium may be associated with any data storage device that can store data which can be thereafter read by a computer system. Computer readable media for example may include read-only memory, random-access memory, CD-ROM, HDD, DVD, magnetic tape, optical data storage devices, and the like. The computer readable medium can also be distributed over a network coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
The above technical description may refer to the accompanying drawings, which form a part of the present application, and in which are shown by way of illustration implementations in accordance with the described embodiments. While these embodiments are described in sufficient detail to enable those skilled in the art to practice them, these embodiments are non-limiting; other embodiments may be used, and changes may be made without departing from the scope of the described embodiments. For example, the order of operations described in the flowcharts is non-limiting, and thus the order of two or more operations illustrated in the flowcharts and described in accordance with the flowcharts may be changed in accordance with several embodiments. As another example, in several embodiments, one or more operations illustrated in the flowcharts and described in accordance with the flowcharts are optional or may be deleted. In addition, certain steps or functions may be added to the disclosed embodiments or more than two of the step sequences may be substituted. All such variations are considered to be encompassed by the disclosed embodiments and the claims.
Additionally, terminology is used in the above technical description to provide a thorough understanding of the described embodiments. However, no overly detailed details are required to implement the described embodiments. Accordingly, the foregoing description of the embodiments has been presented for purposes of illustration and description. The embodiments presented in the foregoing description and examples disclosed in accordance with these embodiments are provided separately to add context and aid in the understanding of the described embodiments. The foregoing description is not intended to be exhaustive or to limit the described embodiments to the precise form disclosed. Several modifications, alternative adaptations and variations are possible in light of the above teachings. In some instances, well known process steps have not been described in detail in order to avoid unnecessarily obscuring the described embodiments.

Claims (15)

1. A method for processing a media stream, comprising:
the method comprises the steps that a media stream is pulled by the front end of an Internet webpage Web, and structured information carried in the media stream is obtained;
the Web front end generates a display time stamp PTS of the structured information according to the frame structure of the media stream, wherein the display time stamp is aligned with the playing time of the media stream;
the Web front end performs format encapsulation on the media stream and plays the encapsulated media stream by using an H5 player;
the Web front end synchronously draws the structural information on the current playing picture according to the display time stamp;
the Web front end generates a display time stamp PTS of the structured information according to the frame structure of the media stream, including:
the Web front end analyzes an SEI unit of structured data and a video frame unit of the media stream according to the frame structure of the media stream;
and the Web front end generates the display time stamp for the SEI unit according to the playing time of the video frame unit, so that the display time stamp of the SEI unit is aligned with the playing time of the video frame unit.
2. The method according to claim 1, wherein the method further comprises:
and the Web front end performs deviation rectifying processing on the drawing of the structured information according to a first frequency, wherein the first frequency is larger than a second frequency, and the second frequency is the frame playing frequency of the media stream.
3. The method according to claim 1, wherein the Web front-end synchronously drawing the structured information on the current playing screen according to the display time stamp, including:
and the Web front end draws a three-dimensional image or a two-dimensional image on a canvas according to the display time stamp, wherein the canvas is displayed on the current playing picture in an overlaying manner.
4. The method of claim 1, wherein the Web front end pulls a media stream to obtain structured information carried in the media stream, comprising:
the Web front end pulls the media stream;
and the Web front end acquires the structural information carried in the media stream according to the protocol and the format of the media stream.
5. The method according to claim 1 or 4, wherein the media stream is: http-flv media stream or websocket-flv media stream.
6. The method of claim 5, wherein the Web front end pulling the media stream comprises:
the Web front end pulls the media stream through XHR2/Fetch or websocket.
7. A media stream processing device, comprising:
the acquisition module is used for pulling the media stream and acquiring the structural information carried in the media stream;
the generation module is used for generating a display time stamp PTS of the structured information according to the frame structure of the media stream, wherein the display time stamp is aligned with the playing time of the media stream;
the playing module is used for carrying out format encapsulation on the media stream and playing the encapsulated media stream by using an H5 player;
the drawing module is used for synchronously drawing the structural information on the current playing picture according to the display time stamp;
the generating module is specifically configured to:
analyzing SEI units of structured data and video frame units of the media stream according to the frame structure of the media stream;
and generating the display time stamp for the SEI unit according to the playing time of the video frame unit, so that the display time stamp of the SEI unit is aligned with the playing time of the video frame unit.
8. The apparatus of claim 7, wherein the apparatus further comprises:
and the deviation rectifying module is used for rectifying the drawing of the structured information according to a first frequency, wherein the first frequency is larger than a second frequency, and the second frequency is the frame playing frequency of the media stream.
9. The apparatus of claim 7, wherein the rendering module is specifically configured to:
and drawing a three-dimensional image or a two-dimensional image on a canvas according to the display time stamp, wherein the canvas is displayed in an overlaying manner on the current playing picture.
10. The apparatus of claim 7, wherein the obtaining module is specifically configured to:
pulling the media stream;
and acquiring the structural information carried in the media stream according to the protocol and the format of the media stream.
11. The apparatus according to claim 7 or 10, wherein the media stream is: http-flv media stream or websocket-flv media stream.
12. The apparatus of claim 11, wherein the obtaining module is specifically configured to:
the media stream is pulled through XHR2/Fetch or websocket.
13. A computer comprising the apparatus of any one of claims 7-12.
14. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor, which when executed by the at least one processor, cause the at least one processor to perform the method of any of claims 1-6.
15. A computer readable storage medium, characterized in that computer executable instructions are stored, said computer executable instructions being arranged to perform the method of any of claims 1-6.
CN201880098342.8A 2018-11-15 2018-11-15 Media stream processing method and device, storage medium and program product Active CN112930687B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/115660 WO2020097857A1 (en) 2018-11-15 2018-11-15 Media stream processing method and apparatus, storage medium, and program product

Publications (2)

Publication Number Publication Date
CN112930687A CN112930687A (en) 2021-06-08
CN112930687B true CN112930687B (en) 2023-04-28

Family

ID=70730368

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880098342.8A Active CN112930687B (en) 2018-11-15 2018-11-15 Media stream processing method and device, storage medium and program product

Country Status (2)

Country Link
CN (1) CN112930687B (en)
WO (1) WO2020097857A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113573088B (en) * 2021-07-23 2023-11-10 上海芯翌智能科技有限公司 Method and equipment for synchronously drawing identification object for live video stream
CN113360707A (en) * 2021-07-27 2021-09-07 北京睿芯高通量科技有限公司 Video structured information storage method and system
CN113938470B (en) * 2021-10-18 2023-09-12 成都小步创想慧联科技有限公司 Method and device for playing RTSP data source by browser and streaming media server
CN114461423A (en) * 2022-02-08 2022-05-10 腾讯科技(深圳)有限公司 Multimedia stream processing method, device, storage medium and program product
CN114697303B (en) * 2022-03-16 2023-11-03 北京金山云网络技术有限公司 Multimedia data processing method and device, electronic equipment and storage medium
CN114745361B (en) * 2022-03-25 2024-05-14 朗新数据科技有限公司 Audio and video playing method and system for HTML5 browser
CN115914748A (en) * 2022-10-18 2023-04-04 阿里云计算有限公司 Visual display method and device for visual recognition result and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106375793A (en) * 2016-08-29 2017-02-01 东方网力科技股份有限公司 Superposition method and superposition system of video structured information, and user terminal
CN107194006A (en) * 2017-06-19 2017-09-22 深圳警翼智能科技股份有限公司 A kind of video features structural management method
CN107277004A (en) * 2017-06-13 2017-10-20 重庆扬讯软件技术股份有限公司 A kind of browser is without plug-in unit net cast method
CN107832402A (en) * 2017-11-01 2018-03-23 武汉烽火众智数字技术有限责任公司 Dynamic exhibition system and its method during a kind of video structural fructufy

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007073347A1 (en) * 2005-12-19 2007-06-28 Agency For Science, Technology And Research Annotation of video footage and personalised video generation
US9712867B2 (en) * 2013-09-16 2017-07-18 Avago Technologies General Ip (Singapore) Pte. Ltd. Application specific policy implementation and stream attribute modification in audio video (AV) media
CN107682715B (en) * 2016-08-01 2019-12-24 腾讯科技(深圳)有限公司 Video synchronization method and device
CN106303430B (en) * 2016-08-21 2019-05-14 贵州大学 The method for playing real time monitoring without plug-in unit in browser

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106375793A (en) * 2016-08-29 2017-02-01 东方网力科技股份有限公司 Superposition method and superposition system of video structured information, and user terminal
CN107277004A (en) * 2017-06-13 2017-10-20 重庆扬讯软件技术股份有限公司 A kind of browser is without plug-in unit net cast method
CN107194006A (en) * 2017-06-19 2017-09-22 深圳警翼智能科技股份有限公司 A kind of video features structural management method
CN107832402A (en) * 2017-11-01 2018-03-23 武汉烽火众智数字技术有限责任公司 Dynamic exhibition system and its method during a kind of video structural fructufy

Also Published As

Publication number Publication date
CN112930687A (en) 2021-06-08
WO2020097857A1 (en) 2020-05-22

Similar Documents

Publication Publication Date Title
CN112930687B (en) Media stream processing method and device, storage medium and program product
US10567809B2 (en) Selective media playing method and apparatus according to live streaming and recorded streaming
US20220263885A1 (en) Adaptive media streaming method and apparatus according to decoding performance
US10200744B2 (en) Overlay rendering of user interface onto source video
US9137567B2 (en) Script-based video rendering
CN109889907B (en) HTML 5-based video OSD display method and device
US10979785B2 (en) Media playback apparatus and method for synchronously reproducing video and audio on a web browser
KR101323424B1 (en) Extended command stream for closed caption disparity
US9936231B2 (en) Trigger compaction
US9554175B2 (en) Method, computer program, reception apparatus, and information providing apparatus for trigger compaction
US11638066B2 (en) Method, device and computer program for encapsulating media data into a media file
KR20130127423A (en) Method of picture-in-picture for multimedia applications
EP2990958A1 (en) Reception device, information processing method in reception device, transmission device, information processing device, and information processing method
US11653054B2 (en) Method and apparatus for late binding in media content
KR101668283B1 (en) Method for displaying video considered latency, apparatus and cloud streaming service system therefor
KR20140133096A (en) Virtual web iptv and streaming method using thereof
CN113573100B (en) Advertisement display method, equipment and system
CN115665117A (en) Webpage-side video stream playing method
CN114125501A (en) Interactive video generation method and playing method and device thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant