CN112584088A - Method for transmitting media stream data, electronic device and storage medium - Google Patents


Info

Publication number
CN112584088A
CN112584088A
Authority
CN
China
Prior art keywords
frame data
video stream
stream
time stamp
media stream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110212230.4A
Other languages
Chinese (zh)
Other versions
CN112584088B (en)
Inventor
孙俊伟
王克彦
曹亚曦
吕少卿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Huachuang Video Signal Technology Co Ltd
Original Assignee
Zhejiang Huachuang Video Signal Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Huachuang Video Signal Technology Co Ltd filed Critical Zhejiang Huachuang Video Signal Technology Co Ltd
Priority to CN202110212230.4A priority Critical patent/CN112584088B/en
Publication of CN112584088A publication Critical patent/CN112584088A/en
Application granted granted Critical
Publication of CN112584088B publication Critical patent/CN112584088B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/231 Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H04N7/152 Multipoint control units therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Embodiments of the present application relate to a method for sending media stream data, an electronic device, and a storage medium. The media stream comprises frame data belonging to multiple video streams, and the frame data of each video stream carries a storage timestamp of each frame and channel identification information of the video stream to which the frame belongs; the storage timestamps of all of the video streams are determined based on the same time reference. Frame data of a preset video stream is selected from the media stream according to the storage timestamp and channel identification information of each frame and sent to the client, where the start and stop times of the storage timestamps of any two groups of pictures in the preset video stream do not overlap. As a result, the browser of the client can directly decode and play the preset video stream sent by the recording and playing server without installing a plug-in.

Description

Method for transmitting media stream data, electronic device and storage medium
Technical Field
The present application relates to the field of multimedia data processing technologies, and in particular, to a method for sending media stream data, an electronic device, and a computer-readable storage medium.
Background
The video conference system includes a Multipoint Control Unit (MCU) serving as the video conference server, various participating devices such as software and hardware terminals, and a recording and playing server. A terminal captures images and sound, encodes them, and sends them to the MCU. The MCU fuses (or does not fuse) the images sent by the terminals according to the requirements of the video conference, and sends the sound, mixed or unmixed, to each participating terminal, thereby enabling an audio and video conversation among multiple participants. The recording and playing server pulls the multiple media streams from the MCU by simulating a terminal joining the conference, and then records and stores them so that users can play the recording back.
With the rise of the mobile internet, there is a growing demand for playing video conference recordings on demand in a mobile browser without installing plug-ins. However, in a video conference service, the media streams exchanged between a terminal and the MCU often include multiple video streams, such as a video stream captured by a meeting-room camera and a presentation video stream of slides shared from a laptop. The encoding parameters of these video streams are often inconsistent, which makes plug-in-free playback of the conference recording in a browser very difficult.
In one method in the related art for processing media stream data containing multiple video streams, the browser irregularly extracts or drops frames under certain conditions to keep audio and video synchronized. However, in this approach the picture stutters irregularly, i.e., part of the video data is lost. Moreover, the browser-side implementation is complex and difficult to integrate with third parties.
Another method in the related art is for the streaming media server to irregularly drop frames or delay transmission under certain conditions, thereby keeping audio and video transmission synchronized. This likewise results in irregular picture stutter, i.e., part of the video data is lost.
In view of the above problems, no effective solution has been proposed in the art.
Disclosure of Invention
In view of the above, embodiments of the present application provide a method for transmitting media stream data, an electronic device, and a computer-readable storage medium to solve at least one problem in the background art.
In a first aspect, an embodiment of the present application provides a method for sending media stream data, which is applied to a recording and playing server, and the method includes:
acquiring a media stream, wherein the media stream comprises frame data belonging to multiple paths of video streams, and the frame data of each path of video stream carries a storage time stamp of each frame data and channel identification information of the video stream to which each frame data belongs; determining the storage time stamp of each path of video stream in the multiple paths of video streams based on the same time reference;
and selecting frame data of a preset video stream from the media stream according to the storage timestamp and the channel identification information of each frame of data, and sending the frame data to a client, wherein the start and stop times of the storage timestamps of any two groups of pictures (GOPs) in the preset video stream do not overlap.
In some embodiments, the acquiring the media stream specifically includes:
acquiring a single-channel composite media stream, wherein the single-channel composite media stream comprises frame data belonging to a plurality of channels of video streams.
In some embodiments, the frame data of the preset video stream carries an extended timestamp corresponding to the storage timestamp; the number of binary bits occupied by the storage timestamp in the frame header is less than the number of binary bits required by the extended timestamp.
After the media stream is obtained, the method further comprises: extending the storage timestamp of each frame of data in each video stream to a binary number with the same number of binary bits as the extended timestamp.
In some embodiments, extending the storage timestamp of each frame of data in each video stream to a binary number with the same number of binary bits as the extended timestamp specifically includes:
when the current frame data is the first frame data of the media stream, extending the storage timestamp of the current frame data into an extended timestamp whose value equals that of the storage timestamp;
when the current frame data is the first frame data of one particular video stream among the multiple video streams (but not the first frame of the media stream), determining a first frame timestamp increment from the storage timestamp of the current frame data and the storage timestamp of first associated frame data, and determining the extended timestamp of the current frame data as the sum of the first frame timestamp increment and the value of the extended timestamp of the first associated frame data, where the first associated frame data is the frame data immediately preceding the current frame data in the media stream;
when the current frame data is any frame other than the first frame of its video stream, determining a second frame timestamp increment from the storage timestamp of the current frame data and the storage timestamp of second associated frame data, and determining the extended timestamp of the current frame data as the sum of the second frame timestamp increment and the value of the extended timestamp of the second associated frame data, where the second associated frame data is the most recent preceding frame data in the media stream that has the same channel identification information as the current frame data.
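The three cases above can be sketched as follows — a minimal Python sketch, assuming 16-bit storage timestamps that wrap around at 65536; the frame representation and function name are illustrative, not the patent's exact format:

```python
WRAP = 1 << 16  # assumed width of the storage timestamp field (wraps at 65536)

def extend_timestamps(frames):
    """frames: list of (channel_id, stored_ts) in media-stream order.
    Returns one extended (monotonically growing) timestamp per frame."""
    extended = []
    last_per_channel = {}  # channel_id -> (stored_ts, ext_ts) of its previous frame
    prev_any = None        # (stored_ts, ext_ts) of the previous frame of any channel
    for channel, stored in frames:
        if prev_any is None:
            ext = stored  # case 1: first frame of the whole media stream
        elif channel not in last_per_channel:
            # case 2: first frame of this channel -- increment is measured
            # against the immediately preceding frame of any channel
            ext = prev_any[1] + (stored - prev_any[0]) % WRAP
        else:
            # case 3: ordinary frame -- increment is measured against the
            # previous frame with the same channel identification
            prev = last_per_channel[channel]
            ext = prev[1] + (stored - prev[0]) % WRAP
        extended.append(ext)
        last_per_channel[channel] = (stored, ext)
        prev_any = (stored, ext)
    return extended
```

The modulo in the increment is what lets the extended timestamp keep growing across a wrap of the narrow storage field, which is the point of the extension step.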
In some of these embodiments, the multiple video streams include at least a first video stream and a second video stream;
the selecting frame data of a preset video stream from the media stream and sending the frame data to the client specifically comprises: selecting frame data of the preset video stream from the media stream and sending it to the client according to a selection policy in which the sending priority of the first video stream is greater than that of the second video stream.
In some embodiments, selecting frame data of the preset video stream from the media stream and sending it to the client according to the selection policy in which the sending priority of the first video stream is greater than that of the second video stream specifically includes:
if the channel identification information indicates that the video stream to which the current frame data belongs is the first video stream, selecting the current frame data as part of the preset video stream and sending it to the client;
if the channel identification information indicates that the video stream to which the current frame data belongs is the second video stream, further determining, according to the current state flag bit, whether to select the current frame data as part of the preset video stream to send to the client. The state flag bit is changed based on the channel identification information of the current frame data and a first duration; the first duration is determined from the associated timestamp value difference and the current state flag bit, and represents the length of time during which no first video stream data has been present. The associated timestamp value difference is the difference between the storage timestamp of the current frame data and the storage timestamp of third associated frame data, where the third associated frame data is the most recent preceding frame data in the media stream that belongs to the first video stream.
In some embodiments, the status flag includes a first state and a second state, and the second state is an initial state of the status flag; the method further comprises the following steps:
if the state flag bit is in the second state and the video stream to which the current frame data belongs is determined, according to the channel identification information of the current frame data, to be the first video stream, changing the state flag bit from the second state to the first state;
if the state flag bit is in the first state and the first duration is greater than a preset time threshold, changing the state flag bit from the first state to the second state.
In some embodiments, determining, when the channel identification information indicates that the video stream to which the current frame data belongs is the second video stream, whether to select the current frame data as part of the preset video stream to send to the client according to the current state flag bit includes:
if the current state flag bit is in the first state, not selecting the current frame data as part of the preset video stream to send to the client;
if the current state flag bit is in the second state, selecting the current frame data as part of the preset video stream and sending it to the client.
In some embodiments, when the current state flag bit is in the first state, the first duration is equal to the associated timestamp value difference;
when the current state flag bit is in the second state, the first duration is equal to zero.
In some embodiments, the first video stream is a presentation video stream, and the second video stream is a camera video stream or a multi-picture fusion video stream.
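The selection policy and flag-bit transitions described in the embodiments above can be sketched in Python as follows; the class name, method signature, and threshold value are illustrative assumptions (the patent does not fix a concrete time threshold):

```python
FIRST, SECOND = "first", "second"  # the two states of the flag bit

class StreamSelector:
    """Sketch of the described policy: first-stream (e.g. presentation)
    frames are always sent; second-stream (e.g. camera) frames are sent
    only while no first-stream data has arrived recently."""

    def __init__(self, threshold_ms=1000):  # preset time threshold (assumed value)
        self.state = SECOND          # the second state is the initial state
        self.last_first_ts = None    # storage ts of the last first-stream frame
        self.threshold = threshold_ms

    def select(self, stored_ts, is_first_stream):
        """Return True if this frame is selected into the preset video stream."""
        if is_first_stream:
            self.state = FIRST       # seeing first-stream data sets the flag
            self.last_first_ts = stored_ts
            return True              # first-stream frames always have priority
        # second-stream frame: first duration = time without first-stream data
        duration = (stored_ts - self.last_first_ts) if self.state == FIRST else 0
        if self.state == FIRST and duration > self.threshold:
            self.state = SECOND      # first stream has stopped; fall back
        return self.state == SECOND  # send only when the flag is in second state
```

With the default threshold, camera frames are suppressed while presentation frames keep arriving, and sending resumes once the presentation stream has been silent for longer than the threshold.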
In some of these embodiments, the media stream further comprises frame data belonging to an audio stream; the frame data of the audio stream carries a storage time stamp of each frame data and channel identification information used for representing that each frame data belongs to the audio stream; the storage time stamp of each frame data in the audio stream and the storage time stamp of each video stream in the multiple video streams are determined based on the same time reference;
the method further comprises the following steps: and determining that the current frame data belongs to the frame data in the audio stream according to the channel identification information, and sending the current frame data to the client.
In a second aspect, embodiments of the present application provide an electronic device, which includes a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the computer program to perform the steps in any one of the above-mentioned method embodiments.
In a third aspect, embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps in any one of the above-mentioned method embodiments.
With the media stream data sending method, electronic device, and storage medium described above, a media stream is acquired that comprises frame data belonging to multiple video streams, the frame data of each video stream carrying a storage timestamp and channel identification information of the video stream to which each frame belongs, with the storage timestamps of all the video streams determined based on the same time reference. Frame data of a preset video stream is selected from the media stream according to the storage timestamp and channel identification information of each frame and sent to the client, where the start and stop times of the storage timestamps of any two groups of pictures in the preset video stream do not overlap. As a result, the client's browser can directly decode and play the preset video stream sent by the recording and playing server without installing a plug-in; the irregular picture stutter caused by the browser or streaming media server detecting audio-video synchronization conditions and triggering frame dropping is avoided; and on-demand switching among multiple video streams with different encoding parameters is achieved.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic flowchart of a method for sending media stream data according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a video conference play timeline;
FIG. 3 is a schematic diagram of a video conferencing system;
fig. 4 is a flow chart of the recording and broadcasting service.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms referred to herein shall have the ordinary meaning as understood by those of ordinary skill in the art to which this application belongs. Reference to "a," "an," "the," and similar words throughout this application are not to be construed as limiting in number, and may refer to the singular or the plural. The present application is directed to the use of the terms "including," "comprising," "having," and any variations thereof, which are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. Reference to "connected," "coupled," and the like in this application is not intended to be limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as referred to herein means two or more. "and/or" describes an association relationship of associated objects, meaning that three relationships may exist, for example, "A and/or B" may mean: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. Reference herein to the terms "first," "second," "third," and the like, are merely to distinguish similar objects and do not denote a particular ordering for the objects.
The embodiment of the application firstly provides a method for sending media stream data, which can be applied to a recording and broadcasting server. Fig. 1 is a schematic flow chart of a method for sending media stream data according to an embodiment of the present application, where as shown in the figure, the method includes:
step 101, acquiring a media stream, wherein the media stream comprises frame data belonging to multiple paths of video streams, and the frame data of each path of video stream carries a storage time stamp of each frame data and channel identification information of the video stream to which each frame data belongs; the storage time stamp of each of the multiple video streams is determined based on the same time reference.
In a video conference system, a presentation device, a camera device, a recording device, and the like can each produce media streams as synchronization sources (also referred to as media stream producers) of the video conference. When multiple media streams are produced in the same video conference, if there are multiple video streams, the video conference server fuses them (or not) and sends them to the participating devices, such as software and hardware terminals, that request video streams; if there are multiple audio streams, the video conference server mixes them (or not) and sends them to the participating devices that request audio streams. In a common case, the video conference server selects one camera video stream from the media streams as the main video stream and one presentation video stream as the presentation stream, and sends these together with one audio stream (or multiple audio streams mixed into one) — three media streams in total — to the participating device; in other cases there may be multiple camera video streams, presentation video streams, and audio streams. In addition, the video conference server may also send the participating devices other streams with control functions, such as auxiliary streams.
In an ongoing video conference, media streams are generated in real time by the synchronization sources and transmitted in real time by the video conference server to each participating device. The transmission generally uses real-time streaming so that the priority, frame rate, and so on of the transmitted media streams can be controlled according to network conditions to preserve their real-time character. The streaming media server used for real-time streaming may be a QuickTime Streaming Server, RealServer, or Windows Media Server; the streaming protocol may be RTSP (Real Time Streaming Protocol) or the MMS (Microsoft Media Server) protocol.
In this embodiment, the recording and playing server may request multiple media streams from the video conference server by simulating a terminal conference-in behavior, that is, for the video conference server, the recording and playing server may be a participating device that is indistinguishable from other participating devices, and the media streams requested by the recording and playing server are transmitted to the recording and playing server. The recording and broadcasting server receives the multi-path media stream transmitted by the video conference server in response to the terminal conference-in behavior.
It should be noted that, where its network transmission bandwidth allows, the recording and playing server can request from the video conference server the media streams produced by all synchronization sources participating in the same video conference; alternatively, it may request only the media streams produced by some of the synchronization sources.
For example, in a relatively common situation, the recording and playing server may request, as a main stream, one camera video stream, one presentation video stream, and one audio stream from the video conference server. In other embodiments, the recording and playing server may request one-channel multi-picture fusion video stream in addition to the three-channel media stream from the video conference server.
To ensure reliable transmission of the media stream data, the recording and playing server and the video conference server are deployed in the same network, and/or the available transmission bandwidth from the video conference server to the recording and playing server is no less than the bandwidth required for reliable transmission of the multiple media streams.
Here, the frame data is also referred to as a data frame and is a protocol data unit of a data link layer. The frame data includes three parts: a frame header, a data portion, and a frame trailer; the header and trailer of the frame data contain some necessary information. In the embodiment of the present application, the storage time stamp of each frame data and the channel identification information of the associated video stream are stored, for example, by the frame header of each frame data.
The channel identification information may be stored in the header or the trailer for identifying the media stream type of the current frame data. The channel identification information is used to identify, for example, the video stream to which the current frame data belongs as a presentation video stream or a camera video stream, where the current frame data is frame data in each video stream. The channel identification information of the frame data in the video stream may be determined based on a terminal that collects the video stream, for example, a video stream shared by a notebook computer on a desktop has first channel identification information, and the first channel identification information may characterize the video stream as a presentation video stream; and the video stream collected by the camera has second channel identification information, and the second channel identification information can represent that the video stream is a camera video stream. Thus, based on the channel identification information, the video stream to which the frame data belongs, that is, the media stream type of the frame data, can be determined. In the selection strategy that the sending priority of the presentation video stream is greater than that of the camera video stream, the camera video stream can be also called a video main stream.
In a specific embodiment, the acquiring the media stream specifically includes: and acquiring a single-path composite media stream, wherein the single-path composite media stream comprises frame data belonging to the multi-path video stream.
The acquired single-path composite media stream is formed by storing the multiple media streams in a queue, used as the data structure, in the chronological order of their storage timestamps. This guarantees that the recording and playing server reads the frame data of the different media streams in storage-timestamp order, and that the client plays the video in the same order, which avoids having to fill in missing intermediate images from the preceding and following frames when consecutive frames are not contiguous.
It can be understood that the single-path composite media stream differs from other media stream frame formats in that the initial frame header of the frame format preset in this embodiment includes transmission timing information of the initial frame data, namely the transmission time and transmission order of that frame within all the media streams of the same video conference as received at the application layer. In other words, when determining the transmission timing information, the multiple media streams of the same video conference are treated as a single media stream, without regard to differences in media stream type, producer, or frame type.
One way, therefore, is to determine the transmission timing information from a monotonic clock value of the recording and playing server. A monotonic clock is a clock whose value changes monotonically (usually monotonically increasing); unlike an NTP-disciplined clock, which can be calibrated and adjusted, a monotonic clock is normally not adjustable, so it can be used to judge the order in which events occur. One of the most common monotonic clocks counts from system startup and can be obtained with the clock_gettime function (which supports various clock types, including CLOCK_MONOTONIC). The monotonic clock value returned by clock_gettime has high precision, up to the nanosecond level. When the recording and playing server receives a piece of initial frame data of the multiple media streams at the application layer, it reads its monotonic clock value at the current moment and determines the transmission timing information of that initial frame data from this value.
A monotonic clock value is typically a long integer representing seconds, milliseconds, or nanoseconds. In this embodiment, millisecond granularity is sufficiently accurate to represent the transmission timing of each initial frame data; therefore, the recording and broadcasting server takes the total number of milliseconds of the monotonic clock value at the current moment as the transmission timing information of the current initial frame data.
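As a hedged sketch of this step: on Unix-like systems, Python's time module exposes the clock_gettime interface mentioned above, so the millisecond timing value could be obtained as follows (the function name is illustrative, not from the patent):

```python
import time

def transmission_timing_ms() -> int:
    """Monotonic clock value of the server, folded to whole milliseconds.

    CLOCK_MONOTONIC is not adjusted by NTP, so values returned for later
    frames are never smaller than values returned for earlier frames.
    """
    ns = time.clock_gettime_ns(time.CLOCK_MONOTONIC)  # nanosecond precision
    return ns // 1_000_000  # total number of milliseconds
```

Because the clock counts from system startup and cannot be stepped backwards, two successive calls always return non-decreasing values, which is exactly the property needed to order frames.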
Here, the storage time stamp of each frame data may be generated by the recording server from information of the transmission timing of each initial frame data.
In the case where the number of binary bits occupied by the storage time stamp in the frame header of the initial frame data is greater than the number of binary bits occupied by the transmission timing information, the transmission timing information of each initial frame data (for example, the total number of milliseconds of the monotonic clock value) can be used directly as the storage time stamp of that initial frame data, with the missing binary bits padded with 0.
Taking the monotonic clock value as an example, the monotonic clock value occupies 4 bytes (i.e., 32 binary bits) on a 32-bit system and 8 bytes (i.e., 64 binary bits) on a 64-bit system. To save bytes in the frame header of the media stream, the storage time stamp of the present application may be 4 bytes (32 bits) or 2 bytes (16 bits). Therefore, when the number of binary bits occupied by the storage time stamp is smaller than the number of binary bits occupied by the total number of milliseconds of the monotonic clock value, that total must be converted to the same number of binary bits as the storage time stamp. In this embodiment, the total number of milliseconds of the monotonic clock value of each initial frame data is taken modulo a preset value, and the resulting remainder is used as the storage timestamp of that initial frame data, where the preset value is the maximum number representable by the binary bits occupied by the storage timestamp in the frame header of the current initial frame data. For example, 16 binary bits can represent at most 2 to the power of 16, namely 65536, distinct values; so by taking the remainder of the total number of milliseconds of the monotonic clock value modulo 65536, that total can be converted into a storage timestamp representable in 16 binary bits.
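The modulo step can be sketched as follows (names are illustrative); for a 16-bit storage timestamp the preset value is 2^16 = 65536:

```python
STORE_TS_BITS = 16
PRESET = 1 << STORE_TS_BITS  # 65536, the count representable in 16 bits

def storage_timestamp(total_ms: int) -> int:
    """Fold a monotonic millisecond total into the 16-bit frame-header
    field by taking the remainder modulo the preset value."""
    return total_ms % PRESET
```

For example, storage_timestamp(65538) returns 2, since 65538 = 65536 + 2; the wraparound this introduces is undone later by the timestamp extension algorithm.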
And then, the recording and broadcasting server combines the update frame header corresponding to each initial frame, the data part of each initial frame, and the frame tail to obtain the single-channel composite media stream.
In practical application, a storage timestamp is generated from the transmission timing information of each initial frame data, the timestamp is added to or modified in the frame header to obtain update frame data for the different media streams, and the update frame data of the different media streams are arranged in order of transmission timing to obtain the single-channel composite media stream.
Step 102, selecting frame data of a preset video stream from the media stream according to the storage time stamp and the channel identification information of each frame data, and sending the frame data to the client, wherein the start-stop times of the storage time stamps of any two groups of pictures in the preset video stream do not overlap.
Here, a Group of Pictures (GOP) is a group of consecutive pictures composed of an I frame and subsequent B/P frames; it is the basic unit accessed by video encoders and decoders, and its arrangement pattern repeats until the end of the video. Each GOP begins with an I frame and ends with the last P or B frame before the next I frame. For example, the frames of one GOP may be arranged as IPPBPPPBPPPP. In a standard media stream protocol, the start-stop times of the presentation time stamps of any two GOPs in the media stream typically do not overlap. In this embodiment, the start-stop times of the storage timestamps of any two GOPs likewise do not overlap; that is, if the acquired media stream contains two or more GOPs whose storage-timestamp start-stop times overlap, the recording and broadcasting server selects only one of them as frame data of the preset video stream and sends it to the client. Therefore, the client can play the preset video stream using a standard media stream protocol.
In some embodiments, the frame data of the preset video stream carries an extended time stamp corresponding to the storage time stamp, and the number of binary bits occupied by the storage timestamp in the frame header is less than the number of binary bits occupied by the extended timestamp. After acquiring the media stream, the method further comprises: extending the storage time stamp of each frame data in each video stream to a binary number with the same number of binary bits as the extended time stamp occupies.
Furthermore, in an embodiment where the media stream further comprises frame data belonging to an audio stream, after acquiring the media stream, the method further comprises: extending the storage time stamp of each frame data in the audio stream to a binary number with the same number of binary bits as the extended time stamp occupies.
In a specific application scenario, after acquiring the media stream, the method may include: extending the storage time stamps of all frame data in the media stream to binary numbers with the same number of binary bits as the extended time stamp occupies.
Before extending the storage timestamp to the extended timestamp, the method may further include: initializing the extended time stamp of each frame data of each video stream to zero. Then, the corresponding extended time stamp is set based on the storage time stamp of the frame data.
It is understood that, to save bytes in the frame headers of the media stream, a smaller number of binary bits (e.g., 16 bits) is often used for the storage timestamp in each frame header. When the streamed media is played on demand, however, the timestamp needs to be converted into a display timestamp occupying 64 binary bits, so the storage timestamp in the frame header may occupy fewer binary bits than the display timestamp. In this case, before the recording and broadcasting server sends the single-channel composite media stream to the client, it may extend the number of binary bits of the storage timestamp in the update frame header of each update frame in at least part of the single-channel composite media stream to the same number of binary bits as the display timestamp.
In some embodiments, extending the storage timestamp of each frame data in each video stream to a binary number with the same number of binary bits as the extended timestamp occupies specifically includes:
when the current frame data is the first frame data of the media stream, extending the storage time stamp of the current frame data into an extended time stamp whose value is equal to the value of the storage time stamp;
when the current frame data is the first frame data of one of the multiple video streams but not the first frame data of the media stream, determining a first frame timestamp increment from the storage timestamp of the current frame data and the storage timestamp of first associated frame data, and determining the extended time stamp of the current frame data from the sum of the first frame timestamp increment and the value of the extended time stamp of the first associated frame data; the first associated frame data is the frame data immediately preceding the current frame data in the media stream;
when the current frame data is frame data other than the first frame data of any one of the multiple video streams, determining a second frame timestamp increment from the storage timestamp of the current frame data and the storage timestamp of second associated frame data, and determining the extended time stamp of the current frame data from the sum of the second frame timestamp increment and the value of the extended time stamp of the second associated frame data; the second associated frame data is the preceding frame data in the media stream that has the same media stream identifier as the current frame data.
Here, when the media stream further includes frame data of media streams other than the multiple video streams, for example frame data of an audio stream, the first frame data of the media stream may not be frame data of any of the multiple video streams; it may, for example, be the first frame data of the audio stream.
Determining a first frame timestamp increment according to the storage timestamp of the current frame data and the storage timestamp of the first associated frame data, which may specifically include:
when the value of the storage timestamp of the current frame data is greater than or equal to the value of the storage timestamp of the first associated frame data, determining the first frame timestamp increment as the difference between the two values;
when the value of the storage timestamp of the current frame data is less than the value of the storage timestamp of the first associated frame data, determining the first frame timestamp increment as the maximum number representable by the binary bits of the storage timestamp, plus the value of the storage timestamp of the current frame data, minus the value of the storage timestamp of the first associated frame data.
Here, description is given taking the case where the storage time stamp occupies 16 binary bits and the extended time stamp occupies 64 binary bits. The values of the storage time stamps of the current frame data and the first associated frame data each lie between 0 and 65535. Since the first associated frame data comes first and the current frame data after, in the general case the value of the storage time stamp of the current frame data is larger than that of the first associated frame data, and the difference between the two values is determined as the first frame timestamp increment. In other cases, because the storage timestamps wrap back to 0 after reaching 65535 (i.e., … 65533, 65534, 65535, 0, 1, 2 …), it may happen that the current frame data comes later yet its storage timestamp value is smaller than that of the first associated frame data; in that case the first frame timestamp increment is computed as the maximum number representable in 16 binary bits, 65536, plus the value of the storage timestamp of the current frame data, minus the value of the storage timestamp of the first associated frame data. For example, if the storage timestamp of the current frame data is 2 and that of the first associated frame data is 65535, the first frame timestamp increment equals 65536 + 2 - 65535, i.e., 3.
Determining the extension timestamp of the current frame data according to the sum of the increment of the first frame timestamp and the value of the extension timestamp of the first associated frame data, which specifically comprises the following steps: determining the sum of the first frame timestamp increment and the value of the extension timestamp of the first associated frame data as the value of the extension timestamp of the current frame data; and determining the extension time stamp of the current frame data according to the value of the extension time stamp of the current frame data. Continuing to take the value of the storage timestamp of the current frame data as 2, the value of the storage timestamp of the first associated frame data as 65535, and the increment of the first frame timestamp equals 3 as an example, then the value of the extended timestamp of the current frame data equals 65535+3, that is, 65538; converting 65538 to a 64-bit binary number may obtain an extended timestamp for the current frame data.
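The two-case increment rule and the extension step can be sketched as follows (function names are illustrative, not from the patent):

```python
MAX16 = 65536  # maximum count representable by a 16-bit storage timestamp

def timestamp_increment(cur16: int, prev16: int) -> int:
    """Frame timestamp increment with wraparound handling: the plain
    difference in the general case, and MAX16 + cur - prev after the
    16-bit storage timestamp has wrapped past 65535."""
    if cur16 >= prev16:
        return cur16 - prev16
    return MAX16 + cur16 - prev16

def extend_timestamp(prev64: int, cur16: int, prev16: int) -> int:
    """Extended timestamp of the current frame: the extended timestamp of
    the associated frame plus the frame timestamp increment."""
    return prev64 + timestamp_increment(cur16, prev16)
```

With the worked numbers above, extend_timestamp(65535, 2, 65535) yields 65538, matching the example.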
Similarly, determining the second frame timestamp increment according to the storage timestamp of the current frame data and the storage timestamp of the second associated frame data may specifically include:
when the value of the storage timestamp of the current frame data is greater than or equal to the value of the storage timestamp of the second associated frame data, determining the second frame timestamp increment as the difference between the two values;
when the value of the storage timestamp of the current frame data is less than the value of the storage timestamp of the second associated frame data, determining the second frame timestamp increment as the maximum number representable by the binary bits of the storage timestamp, plus the value of the storage timestamp of the current frame data, minus the value of the storage timestamp of the second associated frame data.
Here, the method is similar to the method for determining the timestamp increment of the first frame, and thus, the description is omitted.
Determining the extension timestamp of the current frame data according to the sum of the increment of the second frame timestamp and the value of the extension timestamp of the second associated frame data, which specifically comprises the following steps: determining the sum of the increment of the second frame time stamp and the value of the extension time stamp of the second associated frame data as the value of the extension time stamp of the current frame data; and determining the extension time stamp of the current frame data according to the value of the extension time stamp of the current frame data.
It is understood that the above-mentioned step of extending the storage time stamp of each frame data in each video stream to a binary number equal to the number of binary bits occupied by the extended time stamp is applicable to the extension of the storage time stamp of any frame data in the media stream, i.e. also applicable to the frame data of the audio stream.
In some of these embodiments, the multiple video streams include at least a first video stream and a second video stream; selecting frame data of a preset video stream from the media stream and sending it to the client specifically includes: selecting frame data of the preset video stream from the media stream and sending it to the client according to a selection strategy in which the sending priority of the first video stream is greater than the sending priority of the second video stream.
In a specific application scenario, the first video stream is, for example, a presentation video stream, and the second video stream is, for example, a camera video stream or a multi-picture fusion video stream. The selection policy is, for example, that the transmission priority of the presentation video stream is greater than the transmission priority of the camera video stream.
In some embodiments, selecting, according to the selection strategy in which the sending priority of the first video stream is greater than that of the second video stream, frame data of the preset video stream from the media stream to be sent to the client specifically includes: when the channel identification information indicates that the video stream to which the current frame data belongs is the first video stream, selecting the current frame data as part of the preset video stream and sending it to the client.
In some embodiments, selecting, according to the selection strategy in which the sending priority of the first video stream is greater than that of the second video stream, frame data of the preset video stream from the media stream to be sent to the client specifically includes: when the channel identification information indicates that the video stream to which the current frame data belongs is the second video stream, further determining according to the current state flag bit whether to select the current frame data as part of the preset video stream to be sent to the client; wherein the state flag bit is changed based on the channel identification information of the current frame data and a first duration; the first duration is determined by the associated timestamp value difference and the current state flag bit, and represents the duration for which there has been no first video stream data; the associated timestamp value difference corresponds to the value difference between the storage timestamps of the current frame data and third associated frame data, the third associated frame data being the most recent frame data belonging to the first video stream before the current frame data in the media stream.
Here, the associated timestamp value difference corresponds to the value difference between the storage timestamps of the current frame data and the third associated frame data; specifically, in an embodiment in which the storage time stamp of each frame data in each video stream is extended to a binary number with the same number of binary bits as the extended time stamp, the associated timestamp value difference is equal to the value difference between the extended time stamps of the current frame data and the third associated frame data.
The state flag bit may include a first state and a second state, the second state being its initial state; the method further comprises the following steps:
if the state flag bit is in the second state and it is determined from the channel identification information of the current frame data that the video stream to which the current frame data belongs is the first video stream, changing the state flag bit from the second state to the first state;
if the state flag bit is in the first state and the first duration is greater than a preset time threshold, changing the state flag bit from the first state to the second state.
In a specific application, when the channel identification information indicates that the video stream to which the current frame data belongs is the first video stream, the method may further include: further judging whether the current state flag bit is in the first state, and if so, selecting the current frame data as part of the preset video stream and sending it to the client. It should be understood that if the judgment result is negative, the state flag bit is changed from the second state to the first state, and the step of selecting the current frame data as part of the preset video stream and sending it to the client is still performed.
As will be understood with reference to fig. 2, taking the first video stream as the presentation video stream (i.e., "1 presentation stream" in the figure) as an example: the presentation video stream may exist only intermittently, and in a video conference the display priority of the presentation video stream is higher than that of the camera video stream (i.e., the video main stream in the figure); therefore, through the above steps, the effect of sending the presentation video stream whenever a presentation video stream exists can be achieved.
When the channel identification information indicates that the video stream to which the current frame data belongs is the second video stream, determining according to the current state flag bit whether to select the current frame data as part of the preset video stream to be sent to the client specifically includes:
if the current state flag bit is in the first state, not selecting the current frame data as part of the preset video stream to be sent to the client;
if the current state flag bit is in the second state, selecting the current frame data as part of the preset video stream and sending it to the client.
With continued reference to fig. 2, taking the second video stream as the camera video stream (i.e., "2 video main stream" in the figure) as an example: the camera video stream usually exists continuously throughout a video conference, and its display priority is often lower than that of the presentation video stream (i.e., the presentation stream in the figure); therefore, through the above steps, the effect of sending the presentation video stream (the first video stream) when it exists and sending the camera video stream when no presentation video stream exists can be achieved.
In addition, as shown in fig. 2, the media stream may further include frame data belonging to an audio stream (i.e., "0 audio stream" in the figure); since the audio stream usually exists continuously throughout the video conference, in this embodiment of the present application the audio stream may be selected in its entirety and sent to the client.
In some embodiments, the first duration is equal to the associated timestamp value difference when the current state flag bit is in the first state, and equal to zero when it is in the second state.
In other words, the first duration is not calculated while the current state flag bit is in the second state.
When the current state flag bit is in the first state, whether the first state has ended, i.e., whether the first video stream has ended, is judged by setting a preset time threshold and checking whether the first duration is greater than that threshold.
Here, the preset time threshold is, for example, 1500 msec.
In an embodiment, changing the state flag bit from the second state to the first state specifically includes: if the state flag bit is in the second state, the video stream to which the current frame data belongs is determined from the channel identification information to be the first video stream, and the current frame data is a video key frame (i.e., an I frame), changing the state flag bit from the second state to the first state. And/or, changing the state flag bit from the first state to the second state specifically includes: if the state flag bit is in the first state, the first duration is greater than the preset time threshold, and the current frame data is a video key frame (i.e., an I frame), changing the state flag bit from the first state to the second state.
In the following, the state flag bit is described as a presentation state flag: the flag bit being in the first state indicates that the current state is the presentation state, and being in the second state indicates that it is not. The first state may be represented by "yes/true", for example, and the second state by "no/false". In this embodiment, if the current presentation state flag is "no", the current data frame is a video key frame, and the video stream to which the current data frame belongs is determined to be the presentation video stream, the flag is changed from "no" to "yes", i.e., the non-presentation state becomes the presentation state; if the current presentation state flag is "yes", the current data frame is a video key frame, the video stream to which the current data frame belongs is determined to be the camera video stream, and the first duration is greater than the preset time threshold, the presentation is judged to have ended and the flag is changed from "yes" to "no", i.e., the presentation state becomes the non-presentation state.
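The flag transitions described above can be condensed into one function. The following is a minimal sketch (all names are illustrative), gating both transitions on video key frames as in this embodiment:

```python
PRESENTATION, MAIN = 1, 2   # illustrative channel identifiers
THRESHOLD_MS = 1500         # preset time threshold from the example

def next_state(in_presentation: bool, stream_type: int,
               is_key_frame: bool, no_presentation_ms: int) -> bool:
    """Return the presentation-state flag after examining one frame."""
    if not in_presentation and is_key_frame and stream_type == PRESENTATION:
        return True   # "no" -> "yes": presentation begins on an I frame
    if (in_presentation and is_key_frame and stream_type == MAIN
            and no_presentation_ms > THRESHOLD_MS):
        return False  # "yes" -> "no": presentation judged to have ended
    return in_presentation
```

Requiring an I frame for both transitions ensures that whichever stream the client switches to always starts decodable, at the beginning of a GOP.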
Specifically, the recording and broadcasting server reads frame data from the single-channel media stream in which the three media streams, namely the audio stream, the presentation stream and the video main stream, are stored, and judges the media stream type of the frame read this time; if the type of the currently read frame data is the presentation stream, it judges whether it is currently in the presentation state, and if so, sends the frame to the mobile terminal using the extended time stamp of the presentation stream as the frame time stamp.
In the embodiment in which the media stream further includes frame data belonging to the audio stream, the frame data of the audio stream carries a storage time stamp of each frame data and channel identification information representing that each frame data belongs to the audio stream; the storage time stamps of the frame data in the audio stream and in each of the multiple video streams are determined from the same time reference; the method further comprises: determining from the channel identification information that the current frame data belongs to the audio stream, and sending the current frame data to the client.
It will be appreciated that the frame data for the audio stream may be sent to the client in its entirety.
The present application is described and illustrated below with a specific example.
Fig. 3 shows a schematic structural diagram of a video conference system. As shown in the figure, the video conference system comprises a plurality of hardware and software terminals, an MCU, and a recording and broadcasting server; the recording and broadcasting server and the MCU are deployed in the same network, the bandwidth between them is sufficient, and they keep their system times synchronized. In addition, the recording and broadcasting server is also connected to a WEB browser, so that frame data can be sent to the WEB browser for playback.
In this particular example, the types of video conference media streams are shown in table 1.
Table 1 Video conference media stream types

Type    Media stream
0       Audio stream
1       Presentation stream (e.g., desktop sharing)
2       Video main stream (camera picture)
Wherein the unique media stream type can be determined by the channel identification information of the media stream.
In this specific example, the method for storing the video in the recording and playing server includes:
step S01, receiving any frame data of three media streams of audio stream, demonstration stream and video main stream;
step S02, obtaining the current monotonic clock millisecond value of the recording and broadcasting server, performing a modulo operation on that value with 65536 (here taking as an example that the storage timestamp in the frame header occupies 16 binary bits, in which case the maximum number those bits can represent is 65536) to obtain the storage timestamp newPTS, modifying the 16-bit PTS value of the frame header to the value of newPTS, and then storing the frame for video recording.
Based on this storage mode of the media stream, the recording and playing server of this specific example reads media stream data and sends it to the terminal requesting video playback through the following steps:
step S11, when playback starts, initializing the 16-bit storage timestamps pts16[0], pts16[1] and pts16[2] of the three media streams to 0, the 64-bit extended timestamps pts64[0], pts64[1] and pts64[2] to 0, the all-stream storage timestamp last_pts_all16 to 0, the all-stream extended timestamp last_pts_all64 to 0, and the presentation state flag inPresentation to false;
step S12, reading the frame data of each media stream in the session, acquiring the next frame data, and calling the frame timestamp conversion algorithm to convert the storage timestamp pts16 in the frame header into the extended timestamp pts64;
step S13, if inPresentation is false (i.e., not in the presentation state), the frame is an I frame (i.e., a video key frame), and the media stream type of the frame is 1 (i.e., presentation stream), setting inPresentation to true, i.e., entering the presentation state;
step S14, calculating the duration noPresentationInterval with no presentation stream in the current playback session;
step S14-1, if inPresentation is true (i.e., in the presentation state), noPresentationInterval = last_pts_all64 - pts64[1];
step S14-2, if inPresentation is false (i.e., not in the presentation state), noPresentationInterval = 0;
step S15, if inPresentation is true, the frame is an I frame, the media stream type of the frame is 2 (i.e., video main stream), and noPresentationInterval is greater than a threshold (e.g., 1500 ms), judging that the presentation has ended, setting inPresentation to false, and setting the extended timestamp pts64[1] of the presentation stream to 0;
step S16, judging the media stream type of the read frame;
step S16-1, if the media stream type of the read frame is 0 (i.e., audio stream), sending the frame directly using pts64[0] as the frame timestamp;
step S16-2, if the media stream type of the read frame is 1 (i.e., presentation stream), further judging whether inPresentation is true; if true, sending the frame using pts64[1] as the frame timestamp; if the media stream type is 1 and inPresentation is false, changing inPresentation to true and then sending the frame using pts64[1] as the frame timestamp;
step S16-3, if the media stream type of the read frame is 2 (i.e., video main stream), further judging whether inPresentation is false; if false, sending the frame using pts64[2] as the frame timestamp; if the media stream type is 2 and inPresentation is true, changing inPresentation from true to false and then sending the frame using pts64[2] as the frame timestamp.
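Steps S13 to S16 can be sketched as a single reading loop. This is a simplified illustration (the frame representation and the send callback are assumptions): it gates sending purely on the state set in steps S13/S15 and omits the extra flag adjustments inside S16-2/S16-3:

```python
AUDIO, PRESENTATION, MAIN = 0, 1, 2   # media stream types from Table 1
THRESHOLD_MS = 1500                   # presentation-end threshold

def select_frames(frames, send):
    """Replay loop: audio is always sent; presentation frames are sent in
    the presentation state, video main stream frames outside it."""
    in_presentation = False
    last_pts_all64 = 0                           # extended ts of last frame
    pts64 = {AUDIO: 0, PRESENTATION: 0, MAIN: 0}
    for f in frames:                             # f: {'type', 'key', 'pts64'}
        pts64[f['type']] = f['pts64']
        # S13: a presentation-stream I frame enters the presentation state
        if not in_presentation and f['key'] and f['type'] == PRESENTATION:
            in_presentation = True
        # S14: duration with no presentation stream in the session
        gap = last_pts_all64 - pts64[PRESENTATION] if in_presentation else 0
        # S15: a main-stream I frame after a long gap ends the presentation
        if (in_presentation and f['key'] and f['type'] == MAIN
                and gap > THRESHOLD_MS):
            in_presentation = False
            pts64[PRESENTATION] = 0
        # S16: select which streams reach the client
        if f['type'] == AUDIO:
            send(f)
        elif f['type'] == PRESENTATION and in_presentation:
            send(f)
        elif f['type'] == MAIN and not in_presentation:
            send(f)
        last_pts_all64 = f['pts64']
```

Feeding the loop a stream in which the presentation disappears for longer than the threshold shows the switch-back to the video main stream.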
Through the above steps, the browser side and the streaming media server side can switch on demand among multiple video streams with different coding parameters, without detecting audio-video synchronization conditions or triggering frame loss, achieving the conference video-on-demand effect of "watching the presentation stream when there is a presentation stream, and watching the video main stream when there is not". Meanwhile, the development workload on the browser side is small and most mobile phone browsers can be supported directly, so the application range is wider. In addition, there is no need to fuse the multiple video streams of a conference into a video stream with a unified coding format for storage and playback, which saves the server's coding resources, avoids one re-encoding pass, and preserves video quality.
Here, the logic of the HLS on-demand session of the recording and playing server is as follows: the transport stream (TS) slice files are constructed using the extended time stamp PTS64 as both the Presentation Time Stamp (PTS) and the Decoding Time Stamp (DTS). The recording and broadcasting service flow is shown in fig. 4.
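The slicing rule above boils down to stamping every frame written into a TS segment with PTS = DTS = PTS64. A minimal sketch of that mapping follows; the millisecond unit of PTS64 is an assumption (the text only gives thresholds in ms), and the 90 kHz clock is the standard MPEG-TS timebase rather than something the patent specifies:

```python
def to_ts_timestamps(pts64_ms):
    """Map an extended timestamp to the (PTS, DTS) pair written into a
    TS slice file, per the rule PTS = DTS = PTS64.

    Assumption: PTS64 is in milliseconds; MPEG-TS PES timestamps run at
    90 kHz, hence the factor of 90."""
    ticks = pts64_ms * 90   # ms -> 90 kHz ticks (assumed unit conversion)
    return ticks, ticks     # PTS and DTS are identical by construction
```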
As one possible implementation, the HLS service is first started at the client. After the HLS service is started, an HLS on-demand session is established and an HLS on-demand data source is specified for the session; once the data source is determined, the media stream data in the internal storage of the recording and playing server is read by the video reading module.
In this specific example, the timestamp is converted by a frame timestamp conversion algorithm, which includes the following steps. First, the parameters are input, where the parameters include: the media stream type, and the storage time stamp pts16 of the frame data;
step S31, if pts64[type] is 0, acquiring the all-stream extended timestamp value last_pts_all64;
step S31-1, if last_pts_all64 is 0, the frame is the first frame of the entire playback session, and the storage timestamp is used directly as the extended timestamp: pts64[type] = pts16;
step S31-2, otherwise, the frame is the first frame of this media stream type (but not of the session), and the frame timestamp increment delta is calculated using the last storage timestamp over all streams, last_pts_all16;
step S31-2-1, if pts16 >= last_pts_all16, delta = pts16 - last_pts_all16;
step S31-2-2, otherwise delta = 65536 + pts16 - last_pts_all16;
step S31-3, pts64[type] = last_pts_all64 + delta;
step S32-1, if pts64[type] is not 0, the frame timestamp increment delta is calculated using the last storage timestamp pts16[type] of this media stream type;
step S32-1-1, if pts16 >= pts16[type], delta = pts16 - pts16[type];
step S32-1-2, otherwise delta = 65536 + pts16 - pts16[type];
step S33-1, pts64[type] = pts64[type] + delta;
step S33-2, pts16[type] = pts16;
step S33-3, last_pts_all64 = pts64[type];
step S33-4, last_pts_all16 = pts16;
step S33-5, the frame timestamp conversion ends.
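The conversion in steps S31 to S33 turns a 16-bit storage timestamp that wraps at 65536 into a monotonically increasing 64-bit extended timestamp per media stream type, anchored across streams through last_pts_all64 and last_pts_all16. A minimal Python sketch of those steps follows (class and method names are illustrative; like the steps themselves, it assumes consecutive frames are less than one 16-bit wrap apart):

```python
class TimestampConverter:
    """Converts wrapping 16-bit storage timestamps into 64-bit extended
    timestamps, one counter per media stream type (steps S31-S33)."""

    WRAP = 65536  # storage timestamps are 16-bit, so they wrap at 2^16

    def __init__(self, num_types=3):
        self.pts64 = [0] * num_types   # extended timestamp per stream type
        self.pts16 = [0] * num_types   # last storage timestamp per stream type
        self.last_pts_all64 = 0        # last extended timestamp over all streams
        self.last_pts_all16 = 0        # last storage timestamp over all streams

    def _delta(self, pts16, prev16):
        # Wrap-aware increment (steps S31-2-1/2-2 and S32-1-1/1-2)
        return pts16 - prev16 if pts16 >= prev16 else self.WRAP + pts16 - prev16

    def convert(self, stream_type, pts16):
        if self.pts64[stream_type] == 0:      # S31: no frame of this type yet
            if self.last_pts_all64 == 0:      # S31-1: first frame of the session
                self.pts64[stream_type] = pts16
            else:                             # S31-2/3: anchor to the other streams
                self.pts64[stream_type] = (self.last_pts_all64
                                           + self._delta(pts16, self.last_pts_all16))
        else:                                 # S32/S33-1: advance this type's counter
            self.pts64[stream_type] += self._delta(pts16, self.pts16[stream_type])
        self.pts16[stream_type] = pts16       # S33-2..4: update the shared state
        self.last_pts_all64 = self.pts64[stream_type]
        self.last_pts_all16 = pts16
        return self.pts64[stream_type]
```

Note that a zero value of pts64[type] doubles as the "no frame yet" sentinel, which matches the patent's own use of resetting pts64[1] to 0 when a presentation ends so that the next presentation re-anchors to the other streams.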
In this way, the frame timestamp conversion method converts the storage timestamp of each frame of data into an extended timestamp that is maintained separately for each media stream type.
It should be noted that, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and optional implementations, and details are not described again in this embodiment.
In addition, in combination with the method for sending media stream data provided in the foregoing embodiments, an embodiment of the present application further provides an electronic device, which includes a processor and a memory, where the memory stores a computer program, and the processor implements the steps in any one of the foregoing method embodiments when executing the computer program.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
acquiring a media stream, wherein the media stream comprises frame data belonging to multiple paths of video streams, and the frame data of each path of video stream carries a storage time stamp of each frame data and channel identification information of the video stream to which each frame data belongs; determining the storage time stamp of each path of video stream in the multiple paths of video streams based on the same time reference;
and selecting frame data of a preset video stream from the media stream according to the storage time stamp and the channel identification information of each frame data, and sending the frame data to the client, wherein the start and stop times of the storage time stamps of any two groups of pictures (GOPs) in the preset video stream do not overlap.
Embodiments of the present application further provide a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps in any one of the above method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware related to instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, the computer program can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (13)

1. A method for sending media stream data is applied to a recording and broadcasting server, and is characterized by comprising the following steps:
acquiring a media stream, wherein the media stream comprises frame data belonging to multiple paths of video streams, and the frame data of each path of video stream carries a storage time stamp of each frame data and channel identification information of the video stream to which each frame data belongs; determining the storage time stamp of each path of video stream in the multiple paths of video streams based on the same time reference;
and selecting frame data of a preset video stream from the media stream according to the storage time stamp and the channel identification information of each frame data, and sending the frame data to a client, wherein the start and stop times of the storage time stamps of any two groups of pictures (GOPs) in the preset video stream do not overlap.
2. The method for sending media stream data according to claim 1, wherein the acquiring a media stream specifically includes:
acquiring a single-channel composite media stream, wherein the single-channel composite media stream comprises frame data belonging to a plurality of channels of video streams.
3. The method for sending media stream data according to claim 1, wherein frame data of the preset video stream carries an extended timestamp corresponding to the storage timestamp; the number of binary bits occupied by the storage timestamp in the frame header is less than the number of binary bits required to be occupied by the extension timestamp;
after the obtaining the media stream, the method further comprises: expanding the storage time stamp of each frame data in each path of the video stream into a binary number having the same number of binary bits as the extension timestamp.
4. The method for sending media stream data according to claim 3, wherein the expanding the storage timestamp of each frame data in each video stream into a binary number having the same number of binary bits as the extension timestamp specifically includes:
corresponding to the current frame data as the first frame data of the media stream, expanding the storage time stamp of the current frame data into an expansion time stamp, wherein the value of the expansion time stamp is equal to that of the storage time stamp;
corresponding to first frame data of which the current frame data is only one path of video stream in the multiple paths of video streams, determining a first frame timestamp increment according to a storage timestamp of the current frame data and a storage timestamp of first associated frame data; determining the extension time stamp of the current frame data according to the sum of the first frame time stamp increment and the value of the extension time stamp of the first associated frame data; wherein the first associated frame data is previous frame data of the current frame data in the media stream;
corresponding to other frame data of the current frame data except the first frame data of any one video stream in the multi-channel video streams, determining a second frame timestamp increment according to a storage timestamp of the current frame data and a storage timestamp of second associated frame data; determining the extension time stamp of the current frame data according to the value sum of the increment of the second frame time stamp and the extension time stamp of the second associated frame data; and the second associated frame data is previous frame data in the media stream, which is the same as the channel identification information of the current frame data.
5. The method for sending media stream data according to claim 1, wherein the multiple paths of video streams include at least: a first video stream and a second video stream;
the selecting frame data of a preset video stream from the media stream and sending the frame data to the client specifically comprises: and selecting frame data of a preset video stream from the media stream to send to a client according to a selection strategy that the sending priority of the first video stream is greater than that of the second video stream.
6. The method for sending media stream data according to claim 5, wherein the selecting frame data of a preset video stream from the media stream and sending the frame data to the client according to the selection policy that the sending priority of the first video stream is greater than that of the second video stream specifically comprises:
corresponding to the channel identification information representing that the video stream to which the current frame data belongs is the first video stream, selecting the current frame data as a part of the preset video stream and sending the selected current frame data to the client;
corresponding to the channel identification information representing that the video stream to which the current frame data belongs is the second video stream, further determining, according to the current state flag bit, whether to select the current frame data as a part of the preset video stream to send to the client; wherein the state flag bit is changed based on the channel identification information of the current frame data and a first duration; the first duration is determined by a relevant timestamp value difference and the current state flag bit, and is used for representing the duration for which no first video stream data is present; the relevant timestamp value difference corresponds to the value difference between the storage timestamp of the current frame data and the storage timestamp of third associated frame data, where the third associated frame data is the previous frame data of the current frame data in the media stream that belongs to the first video stream.
7. The method for sending media stream data according to claim 6, wherein the state flag bit includes a first state and a second state, and the second state is the initial state of the state flag bit; the method further comprises the following steps:
if the state flag bit is in the second state and the video stream to which the current frame data belongs is determined to be the first video stream according to the channel identification information of the current frame data, the state flag bit is changed from the second state to the first state;
and if the state flag bit is in a first state and the first duration is greater than a preset time threshold, changing the state flag bit from the first state to the second state.
8. The method for sending media stream data according to claim 7, wherein, corresponding to the channel identification information representing that the video stream to which the current frame data belongs is the second video stream, the further determining, according to the current state flag bit, whether to select the current frame data as a part of the preset video stream to send to the client specifically comprises:
if the current state flag bit is in the first state, the current frame data is not selected as a part of the preset video stream to be sent to the client;
and if the current state flag bit is in the second state, the current frame data is selected as a part of the preset video stream and sent to the client.
9. The transmission method of media stream data according to claim 7 or 8,
corresponding to the current state flag bit being in the first state, the first duration is equal to the relevant timestamp value difference;
and corresponding to the current state flag bit being in the second state, the first duration is equal to zero.
10. The method for transmitting media stream data according to claim 5, wherein the first video stream is a presentation video stream, and the second video stream is a camera video stream or a multi-picture fusion video stream.
11. The transmission method of media stream data according to claim 1, wherein the media stream further includes frame data belonging to an audio stream; the frame data of the audio stream carries a storage time stamp of each frame data and channel identification information used for representing that each frame data belongs to the audio stream; the storage time stamp of each frame data in the audio stream and the storage time stamp of each video stream in the multiple video streams are determined based on the same time reference;
the method further comprises the following steps: and determining that the current frame data belongs to the frame data in the audio stream according to the channel identification information, and sending the current frame data to the client.
12. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and the processor is configured to execute the computer program to perform the steps of the method for transmitting media stream data according to any of claims 1 to 11.
13. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for transmitting media stream data according to any one of claims 1 to 11.
CN202110212230.4A 2021-02-25 2021-02-25 Method for transmitting media stream data, electronic device and storage medium Active CN112584088B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110212230.4A CN112584088B (en) 2021-02-25 2021-02-25 Method for transmitting media stream data, electronic device and storage medium


Publications (2)

Publication Number Publication Date
CN112584088A true CN112584088A (en) 2021-03-30
CN112584088B CN112584088B (en) 2021-07-06

Family

ID=75114014

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110212230.4A Active CN112584088B (en) 2021-02-25 2021-02-25 Method for transmitting media stream data, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN112584088B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115174567A (en) * 2022-06-22 2022-10-11 浙江大华技术股份有限公司 Code sending method, device, equipment and storage medium
CN115250357A (en) * 2021-04-26 2022-10-28 海信集团控股股份有限公司 Terminal device, video processing method and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060227813A1 (en) * 2005-04-11 2006-10-12 Mavrogeanes Richard A Method and system for synchronized video recording/delivery
CN103428462A (en) * 2013-08-29 2013-12-04 中安消技术有限公司 Method and device for processing multichannel audio and video
CN105430537A (en) * 2015-11-27 2016-03-23 刘军 Method and server for synthesis of multiple paths of data, and music teaching system
CN111416994A (en) * 2020-03-27 2020-07-14 上海依图网络科技有限公司 Method and device for synchronously presenting video stream and tracking information and electronic equipment





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant