WO2022002070A1 - Adaptive real-time delivery method for media streams, and server - Google Patents


Info

Publication number
WO2022002070A1
Authority
WO
WIPO (PCT)
Prior art keywords
media
stream
substream
sub
target
Prior art date
Application number
PCT/CN2021/103196
Other languages
English (en)
Chinese (zh)
Inventor
姜红旗
辛振涛
姜红艳
申素辉
Original Assignee
北京开广信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京开广信息技术有限公司
Publication of WO2022002070A1


Classifications

    • H04N21/8456 — Structuring of content, e.g. decomposing content into time segments, by decomposing the content in the time domain
    • H04N21/43 — Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; elementary client operations; client middleware
    • H04N21/4307 — Synchronising the rendering of multiple content streams or additional data on devices
    • H04N21/6437 — Real-time Transport Protocol [RTP]
    • H04N21/84 — Generation or processing of descriptive data, e.g. content descriptors
    • H04N21/845 — Structuring of content, e.g. decomposing content into time segments
    • H04N21/8547 — Content authoring involving timestamps for synchronizing content
    • H04N21/858 — Linking data to content, e.g. by linking a URL to a video object, by creating a hotspot
    • H04N21/8586 — Linking data to content by using a URL

Definitions

  • the present application relates to the technical field of digital information transmission, and in particular, to an adaptive real-time delivery method and server of a media stream.
  • RTP: Real-time Transport Protocol
  • RTSP: Real Time Streaming Protocol
  • HTTP: HyperText Transfer Protocol
  • HTTP adaptive streaming includes various schemes: HLS (HTTP Live Streaming) proposed by Apple, Smooth Streaming proposed by Microsoft, HDS (HTTP Dynamic Streaming) proposed by Adobe, and DASH (Dynamic Adaptive Streaming over HTTP) proposed by MPEG.
  • The common feature of the above HTTP adaptive streaming schemes is that the media stream is cut into short (2 s–10 s) media segments, and an index file or manifest file describing these segments (such as the m3u8 playlist in HLS or the MPD file in DASH) is generated at the same time and saved to web servers. The client obtains the URL (Uniform Resource Locator) access addresses of these media segments by accessing the playlist or manifest file, and can then use the HTTP protocol to download and play the segments one by one.
  • The main difference between these schemes lies in the encapsulation format and manifest file format adopted for the media segments.
  • HTTP adaptive streaming is easy to deploy on common web servers and adapts to the existing Internet infrastructure, including CDNs, caches, firewalls and NATs, and can support large-scale user access.
  • The client can also select segments with suitable bit rates according to network conditions and terminal capabilities, thereby realizing bit-rate adaptation. HTTP adaptive streaming has therefore become the mainstream way of delivering real-time streaming media on the Internet.
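As a hedged illustration of the playlist-then-segments pattern described above, the following Python sketch parses a simple m3u8-style playlist and derives the segment URLs a client would download one by one. The playlist content, server address and file names are invented for the example, not taken from any scheme's normative format.

```python
# Minimal sketch of the HTTP adaptive streaming pull pattern:
# the client first reads a playlist, then downloads the listed segments.

PLAYLIST = """#EXTM3U
#EXT-X-TARGETDURATION:4
#EXT-X-MEDIA-SEQUENCE:100
#EXTINF:4.0,
seg100.ts
#EXTINF:4.0,
seg101.ts
#EXTINF:4.0,
seg102.ts
"""

def segment_uris(m3u8_text: str) -> list[str]:
    """Return the media-segment URIs listed in a simple m3u8 playlist
    (every non-empty line that is not a '#' tag line)."""
    return [line for line in m3u8_text.splitlines()
            if line and not line.startswith("#")]

if __name__ == "__main__":
    base = "https://example.com/live/"   # hypothetical web server
    for uri in segment_uris(PLAYLIST):
        print(base + uri)                # the client would HTTP-GET each URL
```

Because the playlist keeps changing during a live event, the client must re-fetch it periodically, which is exactly the real-time limitation the application criticizes below.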
  • In practice, a media stream transmitted on the Internet may include dozens of media substreams, which arises in the following ways: 1) Various types of media substreams: the same scene can generate multiple types of media substreams including video, audio, subtitles, pictures, auxiliary information and data, and these substreams need to be mixed together for transmission. 2) Multi-bit-rate encoding: to adapt to network bandwidth and the processing capabilities of different terminals, the same video stream can generate multiple encoded substreams at different resolutions, frame rates and bit rates, and the same audio stream can generate multiple encoded substreams for different languages, sampling rates and bit rates. 3) Multi-view video: to obtain a more realistic video experience, the same scene generates multiple video substreams from different viewpoints, as in 3D video or free-viewpoint video. 4) Multi-channel audio: to obtain an immersive audio experience, the same scene is sampled from different positions to generate multiple audio substreams. 5) Scalable Video Coding (SVC): to adapt to varying network bandwidth, one video stream is encoded into a base layer and several enhancement layers. Further, any combination of the above aspects (e.g. using multi-view video while each view also uses multi-rate or scalable coding) results in a surprisingly large number of media substreams in a media stream.
  • One approach is combined substream segmentation: video substream segments and audio substream segments covering the same time range are encapsulated in the same media segment, which corresponds to one HTTP URL.
  • The client then needs only one request to get the corresponding video and audio segments, which ensures the synchronization of the substreams and simplifies processing at the receiving end.
  • However, the number of combinations of different video and audio substreams grows rapidly, and each combination generates a new segment. This leads to repeated storage of video and audio substreams on the server side, increasing the server's storage overhead.
  • The other approach is independent substream segmentation: each substream is segmented independently, while time alignment between the segments of the different substreams is maintained, and each substream segment corresponds to its own URL.
  • The client can request the segments of each substream as needed, and the server does not need to store combined segments of the substreams. However, the client must submit multiple requests to obtain the different substream segments, which increases the transmission overhead and the difficulty of synchronization processing.
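The storage trade-off between the two segmentation strategies can be illustrated with a back-of-the-envelope count; the substream counts below are invented for the example.

```python
# Per segment duration: combined segmentation pre-builds one segment per
# (video substream, audio substream) pair, while independent segmentation
# stores one segment per substream.

def combined_segments(n_video: int, n_audio: int) -> int:
    # one pre-built segment for every video/audio combination
    return n_video * n_audio

def independent_segments(n_video: int, n_audio: int) -> int:
    # one segment per substream; the client combines them itself
    return n_video + n_audio

if __name__ == "__main__":
    for nv, na in [(3, 2), (6, 4), (10, 8)]:
        print(f"{nv} video x {na} audio: "
              f"combined={combined_segments(nv, na)}, "
              f"independent={independent_segments(nv, na)}")
```

The multiplicative growth of the combined approach (versus additive growth for independent segmentation) is the storage overhead the text describes.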
  • The above HTTP adaptive streaming schemes have another problem: to support real-time transmission, the server must continuously update its manifest file, and the client must obtain the manifest file before it can obtain the URL of the latest media segment. Since the manifest file takes time to reach the client, the copy the client holds does not reflect the latest media segments being generated on the server, which degrades the real-time performance of media stream delivery. When the number of substreams or combinations in the media stream reaches dozens, the manifest file becomes very complicated, further increasing the transmission and processing overhead of receiving the media stream.
  • Therefore, the HTTP adaptive streaming scheme based on pre-segmentation and manifest files is not suitable for adaptive real-time delivery of media streams containing many substreams, and a new delivery method needs to be designed.
  • The present application aims to solve, at least to a certain extent, one of the technical problems in the related art.
  • The first purpose of this application is to propose an adaptive real-time delivery method for media streams, which reduces storage overhead on the server while simplifying synchronous transmission between substreams, and supports adaptive real-time delivery of media streams with various types of substreams (e.g. using multi-rate coding, multi-view, multi-channel or scalable coding).
  • the second purpose of this application is to propose an adaptive real-time delivery server for media streams.
  • the third object of the present application is to propose a computer device.
  • the fourth object of the present application is to provide a non-transitory computer-readable storage medium.
  • an embodiment of the present application proposes an adaptive real-time delivery method for a media stream
  • the media stream includes at least one media sub-stream
  • each media sub-stream is a sequence of media units generated in real time on a server
  • each media sub-stream is associated with a sub-stream number
  • each media unit is associated with a generation time and/or a sequence number indicating the generation sequence of the media unit in the media sub-stream
  • The method includes the following steps: receiving a media segment request sent by a client, wherein the media segment request carries at least one pull command, each pull command carries zero or more control parameters, and the control parameters include a first-type parameter indicating the target media stream to be transmitted, a second-type parameter indicating the target media substream to be transmitted, and a third-type parameter indicating the candidate media units to be transmitted.
  • A media segment is generated according to the media segment request, wherein, for each pull command in the media segment request, the target media stream to be transmitted is selected, at least one target media substream to be transmitted in the target media stream is selected, the candidate media units to be transmitted in the target media substream are determined, and the candidate media units determined by each pull command are encapsulated into the media segment; and the media segment is sent to the client.
  • the adaptive real-time delivery method of the media stream according to the embodiment of the present application can arbitrarily combine the media units of each sub-stream according to the request of the client, generate the media segment in real time, and deliver the media segment to the client.
  • This means the server only needs to store the media units of each substream and does not need to generate segments for the various substream combinations in advance, which reduces the server's storage requirements. At the same time, the client's synchronization processing is simplified: with a single request the client can obtain a combined segment of the substreams for the same time period, making it easy to ensure synchronous reception of each substream.
  • Moreover, the client can dynamically adjust the target media substreams in the media segment request according to application needs and network conditions, so that adaptive delivery of various types of multi-substream media streams (such as multi-rate coding/multi-view/multi-channel/scalable coding) can be supported uniformly.
  • an embodiment of the present application proposes an adaptive real-time delivery server for a media stream
  • the media stream includes at least one media sub-stream
  • each media sub-stream is a sequence of media units generated in real time on the server, wherein , each media substream is associated with a substream number, and each media unit is associated with a generation time and/or a sequence number indicating the sequence in which the media unit is generated in the media substream
  • The server includes: a client interface component, configured to receive a media segment request sent by the client, wherein the media segment request carries at least one pull command, each pull command carries zero or more control parameters, and each control parameter includes a first-type parameter indicating the target media stream to be transmitted, a second-type parameter indicating the target media substream to be transmitted, or a third-type parameter indicating the candidate media units to be transmitted; and a media segment generating component, configured to generate a media segment according to the media segment request, wherein, for each pull command in the media segment request, the target media stream to be transmitted is selected, at least one target media substream is selected, the candidate media units to be transmitted are determined, and the candidate media units determined by each pull command are encapsulated into the media segment.
  • the adaptive real-time delivery server of the media stream in the embodiment of the present application can arbitrarily combine the media units of each substream according to the request of the client, generate media segments in real time, and deliver the media segments to the client.
  • This means the server only needs to store the media units of each substream and does not need to generate segments for the various substream combinations in advance, which reduces the server's storage requirements. At the same time, the client's synchronization processing is simplified: with a single request the client can obtain a combined segment of the substreams for the same time period, making it easy to ensure synchronous reception of each substream.
  • Moreover, the client can dynamically adjust the target media substreams in the media segment request according to application needs and network conditions, so that adaptive delivery of various types of multi-substream media streams (such as multi-rate coding/multi-view/multi-channel/scalable coding) can be supported uniformly.
  • An embodiment of the present application provides a computer device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executed to perform the adaptive real-time delivery method for media streams described in the above embodiments.
  • Embodiments of the present application provide a non-transitory computer-readable storage medium, where the non-transitory computer-readable storage medium stores computer instructions for causing a computer to execute the adaptive real-time delivery method for media streams described in the foregoing embodiments.
  • FIG. 1 is a schematic diagram of a processing process of a method for adaptive real-time delivery of media streams according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of an adaptive real-time transmission process of a media stream according to an embodiment of the present application
  • FIG. 3 is a schematic diagram of a substream pattern according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of media substream description information (including multi-rate coding substreams) according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of media substream description information (including multi-view video substreams) according to an embodiment of the present application
  • FIG. 6 is a schematic diagram of media substream description information (including scalable coding substreams) according to an embodiment of the present application
  • FIG. 7 is a schematic diagram of an adaptive real-time transmission process of a media stream according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of an adaptive real-time transmission process of a media stream according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of candidate media unit encapsulation under different media unit sorting modes according to an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an adaptive real-time delivery server for media streams according to an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an adaptive real-time delivery server for media streams according to a specific embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • On the Internet, it is often necessary to transfer various real-time audio streams, video streams or data streams from one network node to another. These network nodes include various terminals, such as PCs, mobile phones and tablet computers, as well as various application servers, such as video servers and audio servers.
  • the transmitted audio streams, video streams or data streams are collectively referred to as media streams.
  • the delivery process of the media stream can be described by a general client-server model: the server delivers the generated media stream to the client in real time.
  • the server and the client here refer to logical functional entities, wherein the server is a functional entity that sends a media stream, and the client is a functional entity that receives a media stream. Servers and clients can exist on any network node.
  • a live media stream of a concert includes at least one video stream and at least one audio stream.
  • When multi-rate coding, multi-view coding, multi-channel coding or scalable coding is adopted, there will be multiple video streams, audio streams or data streams in the live stream.
  • all synchronously transmitted video streams, audio streams or data streams in a live media stream to be transmitted are referred to as media sub-streams of the media stream.
  • Each delivered media substream is a sequence of media units generated in real time on the server.
  • Depending on the implementation, the media unit can be chosen to suit the substream:
  • if the media substream is a byte stream generated in real time, one byte can be chosen as the media unit;
  • if the media substream is an audio or video stream obtained by real-time sampling, the original audio frame or video frame can be chosen as the media unit;
  • if the media substream is an audio or video stream sampled and encoded in real time, the encoded audio frame, the encoded video frame or the Access Unit can be chosen as the media unit;
  • if the media substream is further encapsulated for transport, the encapsulated transport packet (such as an RTP packet or a PES/PS/TS packet) can be chosen as the media unit;
  • if the media substream is already segmented, a media segment (such as the TS-format segment used in the HLS protocol) can be chosen as the media unit.
  • Each media unit can be associated with a generation time, which is usually represented by a timestamp.
  • Each media unit may also be associated with a sequence number, which may be used to indicate the order in which the media units are generated in the media substream. When the sequence number is used to indicate the order in which the media unit is generated, the meaning of the sequence number needs to be defined according to the specific media unit.
  • When the media unit is a byte, its sequence number is the byte sequence number; when the media unit is an audio or video frame, its sequence number is the frame sequence number; when the media unit is a transport packet, its sequence number is the packet sequence number; when the media unit is a stream segment, its sequence number is the segment sequence number (such as the Media Sequence of each TS segment in HLS).
  • A media unit can also be associated with both a sequence number representing its generation order and a generation time.
  • For example, the RTP header has a Sequence Number field to indicate the order of the RTP packets, and a Timestamp field to indicate the generation time of the media data encapsulated in the RTP packet.
  • each media substream is associated with a unique substream number.
  • For example, if a media stream contains N media substreams, the corresponding substreams are numbered 1, 2, ..., N.
  • Within each media substream, the generation time and/or sequence number may be used to describe the generation order of the media units.
  • The generation times of media units in different media substreams may use synchronous timing or independent timing.
  • When independent timing is adopted, the generation times of different media substreams are derived from asynchronous clocks, so the correspondence between the generation times of these different media substreams must be recorded separately.
  • When synchronous timing is adopted, the generation times of different media substreams are derived from the same reference clock, and the synchronization relationship between media units of different media substreams is known directly from their generation times.
  • the generation time of all media substreams in one media stream uses the same reference clock on the server, which corresponds to the same time line, such as Greenwich Mean Time.
  • A media stream includes at least one media substream, where each media substream may be of any type, such as an audio stream, a video stream or a subtitle stream, and each media substream may also adopt any transport encapsulation type, such as an RTP packet stream or an MPEG2-TS stream.
  • For example, when the media substream is an RTP packet stream, the media unit is an RTP packet, the Sequence Number of the RTP packet is the sequence number of the media unit, and the Timestamp of the RTP packet is the generation time of the media unit.
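The RTP fixed-header layout is defined by RFC 3550, not by this application; as a hedged illustration of how a receiver could read the Sequence Number and Timestamp fields that serve here as the media unit's sequence number and generation time, a minimal Python sketch:

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Extract fields from the 12-byte fixed RTP header (RFC 3550)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,            # should be 2 for RTP
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,        # used as the media unit sequence number
        "timestamp": ts,               # used as the media unit generation time
        "ssrc": ssrc,
    }

if __name__ == "__main__":
    # Build an example header: version 2, payload type 96, seq 4660,
    # timestamp 305419896 (values are illustrative).
    pkt = struct.pack("!BBHII", 0x80, 96, 4660, 305419896, 0xDEADBEEF)
    print(parse_rtp_header(pkt))
```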
  • As another example, when the media substream is a segmented MPEG2-TS stream, each TS segment is regarded as a media unit. Each TS segment may include multiple media frames; the segments are numbered in order of generation to serve as the media unit sequence number, and the timestamp of the first media frame in each segment indicates the segment's generation time.
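A minimal sketch of this numbering rule, with invented frame timestamps: segments are numbered in generation order, and each segment's generation time is taken from its first frame's timestamp.

```python
def index_segments(segments: list[list[int]]) -> list[tuple[int, int]]:
    """Map each TS segment (given as a list of its frame timestamps) to
    (segment sequence number, segment generation time), where the
    generation time is the timestamp of the segment's first frame."""
    return [(seq, frames[0]) for seq, frames in enumerate(segments)]

if __name__ == "__main__":
    # three segments, each holding the timestamps of its media frames
    segs = [[0, 40, 80], [120, 160, 200], [240, 280, 320]]
    print(index_segments(segs))
```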
  • the server push method is adopted: once there is a new media unit on the server, it will be actively sent to the client.
  • The method of the embodiments of the present application is similar to the various HTTP adaptive streaming schemes (such as HLS and MPEG-DASH) in that the client pulls the media. The difference is that in existing HTTP adaptive streaming, the client requests or pulls pre-segmented segments according to the manifest file, and each segment is identified by a URL.
  • In the present method, the media segment is not pre-segmented but is generated by the server in real time according to the client's request, so the client can control the content of the media segment.
  • FIG. 1 is a schematic diagram of a processing process of a method for adaptive real-time delivery of a media stream provided by an embodiment of the present application.
  • The media stream includes at least one media substream, and each media substream is a sequence of media units generated in real time on the server, where each media substream is associated with a substream number and each media unit is associated with a generation time and/or a sequence number indicating its generation order in the media substream. The adaptive real-time delivery method of the media stream then comprises the following steps:
  • In step S101, a media segment request sent by the client is received, wherein the media segment request carries at least one pull command, each pull command carries zero or more control parameters, and the control parameters include a first-type parameter indicating the target media stream to be transmitted, a second-type parameter indicating the target media substream to be transmitted, and a third-type parameter indicating the candidate media units to be transmitted.
  • Control parameters that can serve as the first type include, but are not limited to: media stream identifier, media stream name, program identifier, etc. Control parameters that can serve as the second type include, but are not limited to: substream list, substream pattern, substream type, substream priority, etc. Control parameters that can serve as the third type include, but are not limited to: start sequence number, start time, maximum time offset, unit type, unit priority, etc. Those skilled in the art will understand that new control parameters can also be defined as needed in further implementations.
  • A media segment request may carry one or more pull commands, each of which may carry its own control parameters or none at all. Additionally, new commands other than the pull command can be defined as needed in further implementations.
  • The media segment request may be submitted using any network transport protocol, such as the common HTTP, TCP or UDP protocols. When HTTP is used, either the HTTP GET method or the HTTP POST method can be used.
  • When the pull command in the media segment request carries control parameters, certain encapsulation rules are needed to encapsulate the pull command and its control parameters into a string or byte stream before sending it to the server. For example, the command and its control parameters can be encapsulated in the URL as strings.
  • The following is an example of a media segment request carrying a pull command with multiple control parameters (the long URL string is split across multiple lines for display):
  • The parameter names streamID, substreamList, substreamPattern, seqBegin, timeBegin, maxTimeOffset, unitType and unitPrio respectively represent the media stream identifier, substream list, substream pattern, start sequence number, start time, maximum time offset, unit type and unit priority.
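The text does not fix an exact URL layout; the following Python sketch shows one plausible way a client could encapsulate a pull command and these control parameters into a request URL. The server address, path and parameter values are invented for the example.

```python
from urllib.parse import urlencode

def build_pull_url(server: str, **params: str) -> str:
    """Encapsulate a pull command's control parameters as a URL query
    string, using the parameter names listed in the text."""
    return f"{server}/pull?{urlencode(params)}"

if __name__ == "__main__":
    url = build_pull_url(
        "http://media.example.com",   # hypothetical server address
        streamID="concert1",
        substreamList="1,2",          # e.g. video substream 1 + audio substream 2
        seqBegin="1000",
        maxTimeOffset="5000",
    )
    print(url)
```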
  • On the server side, a web server can receive the media segment request from the client, extract the command and its control parameters from the requested URL, and classify the control parameters carried by each pull command: a media stream identifier or media stream name is a first-type parameter; a substream list or substream pattern is a second-type parameter; and a start sequence number, start time, maximum time offset, unit type or unit priority is a third-type parameter.
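The classification rule just described can be written as a small lookup. The camel-case names follow the parameter names listed earlier; streamName and programID are assumed by analogy with "media stream name" and "program identifier" and are not given in the text.

```python
# Classify a control parameter name into the three types described above.
FIRST_TYPE = {"streamID", "streamName", "programID"}          # target media stream
SECOND_TYPE = {"substreamList", "substreamPattern"}           # target substreams
THIRD_TYPE = {"seqBegin", "timeBegin", "maxTimeOffset",
              "unitType", "unitPrio"}                         # candidate units

def classify(param: str) -> int:
    """Return 1, 2 or 3 for first-, second- or third-type parameters."""
    if param in FIRST_TYPE:
        return 1
    if param in SECOND_TYPE:
        return 2
    if param in THIRD_TYPE:
        return 3
    raise ValueError(f"unknown control parameter: {param}")

if __name__ == "__main__":
    for p in ["streamID", "substreamPattern", "maxTimeOffset"]:
        print(p, "->", classify(p))
```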
  • A media segment is generated according to the media segment request, wherein, for each pull command in the media segment request, a target media stream to be transmitted is selected, at least one target media substream to be transmitted in the target media stream is selected, the candidate media units to be transmitted in the target media substream are determined, and the candidate media units determined by each pull command are encapsulated into media segments.
  • The media segment is generated according to the media segment request, and this step can be further divided into sub-steps S1021-S1024: for each pull command in the media segment request, step S1021 selects the target media stream to be transmitted according to the first type parameter, step S1022 selects the target media substream in the aforementioned target media stream according to the second type parameter, step S1023 determines the candidate media units to be transmitted in the aforementioned target media substream according to the third type parameter, and step S1024 encapsulates the candidate media units determined by all pull commands into media segments.
  • the target media stream to be transmitted may be selected according to the media stream identifier or the media stream name, and in step S1022, the target media substream may be selected according to parameters such as the substream list and substream pattern.
  • The candidate media units can be determined according to parameters such as the start sequence number, start time, and maximum time offset, and in step S1024, one or more media units can be encapsulated into media segments using a self-defined encapsulation protocol.
  • a simple encapsulation protocol is as follows: a media segment consists of a segment header and a segment payload, and the segment payload is formed by concatenating several media units.
  • the segment header indicates the starting position and length of each media unit.
  • When a media unit does not carry its generation time or sequence number, the sequence number and/or generation time of each media unit shall also be indicated in the segment header; likewise, when a media unit does not carry its substream number, the substream number of each media unit shall also be indicated in the segment header.
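A toy version of such an encapsulation is sketched below. The text only requires that the header indicate each unit's position and length (plus sequence and substream numbers when the units do not carry them); the concrete field widths are illustrative assumptions.

```python
import struct

def pack_segment(units):
    """units: list of (substream_no, seq_no, payload_bytes) tuples.
    Field widths (1-byte substream no, 4-byte seq, 2-byte length) are assumed."""
    header = struct.pack(">H", len(units))            # unit count
    for ssno, seq, payload in units:
        # per-unit header entry: substream number, sequence number, length
        header += struct.pack(">BIH", ssno, seq, len(payload))
    # segment payload: the media units concatenated in order
    return header + b"".join(p for _, _, p in units)

seg = pack_segment([(1, 27, b"frameA"), (4, 60, b"frameD")])
```

Because the header records the lengths, the client can recover the starting position of every unit by accumulating them.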
  • In step S103, the media segment is sent to the client.
  • The server can select an appropriate method to send the media segment to the client according to the protocol used by the client's media segment request. For example, when the received media segment request uses the HTTP GET method, the generated media segment can be sent in the HTTP GET response message, with the media segment placed in the entity body of the HTTP response message; if the media segment request is received through an established TCP connection, the generated media segment can be sent to the client directly through that TCP connection.
  • When the server receives continuous media segment requests from the client, it continues to generate new media segments according to the client's requests. These new media segments encapsulate the recently generated media units of the selected target media substreams that are waiting to be sent to the client.
  • the client can parse these media segments to recover the media units of each target media substream in the real-time media stream. This process is shown in FIG. 2 .
  • The client can continuously adjust the control parameters carried by the pull command in the media segment request according to application needs or network transmission conditions, such as changing the second type parameters (media substream list, etc.) and the third type parameters (start time, maximum time offset, unit priority, etc.), to ensure the continuity and real-time delivery of the media stream from server to client and its adaptability to dynamic network transmission.
  • the method of the embodiments of the present application no longer requires pre-segmentation and manifest files, and thus does not require the client to receive and process manifest files, thereby reducing transmission delay and saving overhead.
  • the client can arbitrarily combine media units in different media substreams through media segment requests, and only need one request to obtain the required media units of each media substream, which is easy to ensure synchronous reception of different media substreams.
  • Because the client can adjust at any time which media substreams and candidate media units need to be received, it can better meet the needs of terminal applications and adapt to changes in network bandwidth, enabling adaptive delivery of various multi-substream (multi-rate/multi-view/multi-channel coded/scalable coded) media streams.
  • each step may correspond to a functional entity that can run independently and interact with each other.
  • Generating the media segment according to the media segment request includes: if the pull command does not carry the first type parameter, the target media stream to be transmitted is the default specified media stream; if the pull command does not carry the second type parameter, the target media substreams to be transmitted are the at least one media substream specified by default in the target media stream; if the pull command does not carry the third type parameter, the candidate media units are the media units specified by default in the target media substream.
  • The default specified media units are all media units in the target media substream whose sequence number interval from the latest media unit is less than a first preset value, or all media units whose generation time interval from the latest media unit is less than a second preset value; both the first preset value and the second preset value are determined according to the target media substream.
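A sketch of the sequence-number variant of this default rule (the time-interval variant is analogous); the unit representation as dicts is an assumption.

```python
def default_candidates(units, first_preset):
    """units: media units of one substream, each a dict with a 'seq' key,
    listed in generation order; returns the default specified media units."""
    if not units:
        return []
    latest_seq = units[-1]["seq"]
    # keep every unit whose sequence-number interval from the latest unit
    # is less than the first preset value
    return [u for u in units if latest_seq - u["seq"] < first_preset]

stream = [{"seq": n} for n in range(55, 63)]   # units with seq 55..62
picked = default_candidates(stream, first_preset=3)
```

With a first preset value of 3, only the three newest units (seq 60, 61, 62) are selected.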
  • When the server contains only one media stream, the pull command sent by the client does not need to carry any first type parameter, and that media stream is the selected target media stream; when the server contains multiple media streams, one of the media streams to be transmitted can be designated as the default media stream.
  • When the pull command sent by the client does not carry any first type parameter, the default media stream is selected as the target media stream.
  • For a given media stream, the media substreams it contains may be of various kinds: it may contain different types of media substreams (video streams, audio streams, subtitle streams, additional information streams, picture streams, etc.); for the same type of media substream, it may contain different bit rates, for example, for video streams it may contain media substreams corresponding to different resolutions and frame rates, and for audio streams it may contain media substreams corresponding to different sampling rates; for video streams of the same type and bit rate, it may contain multiple coding layers (such as when using scalable video coding, SVC), and these different coding layers correspond to different priorities.
  • Among all media substreams, the server should select one or more media substreams suitable for display on most terminals and for normal transmission under most network bandwidth conditions as the default media substreams of the target media stream; when the client's pull command does not carry any second type parameter, these default media substreams are selected as the target media substreams.
  • the server may use the default specified media unit as a candidate media unit.
  • These default specified media units are all media units in the target media substream whose sequence number interval from the latest media unit is less than the first preset value, or all media units whose generation time interval from the latest media unit is less than the second preset value.
  • The first preset value or the second preset value set for each target media substream shall ensure synchronized sending of the target media substreams.
  • FIG. 2 is a schematic diagram of a real-time transmission process of a media stream according to an embodiment of the present application.
  • the server contains only one media stream S1.
  • When the server receives a media segment request MS_REQ1, because MS_REQ1 contains only one pull command and that pull command does not carry any parameters, the target media stream selected by the server is the default media stream S1, and the selected target media substreams are the default media substreams 1 and 4. For media substream 1 the first preset value is 3, and for media substream 4 the first preset value is 4; therefore, the server determines the candidate media units of media substream 1 and media substream 4 respectively, encapsulates them into the first media segment MS1, and returns it to the client.
  • Embodiment 3: In the following embodiment, it is described how the server selects the target media substream to be transmitted according to the second type of parameters.
  • the second type of parameters given in the embodiments of the present application include two types:
  • the sub-stream pattern is an N-bit bit stream, where N is the number of media sub-streams contained in the target media stream, and each bit of the sub-stream pattern is associated with a specific media sub-stream of the target media stream, and is used for indicating whether the specific media substream is a target media substream to be transmitted.
  • Suppose the substream pattern is the bit string 01101000 and, from left to right, each bit corresponds to substream 1 through substream 8. A bit value of 1 indicates that the associated substream is a target media substream; that is, the target media substreams selected by the above substream pattern are three: substream 2, substream 3 and substream 5.
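This bit-pattern interpretation can be sketched directly:

```python
def pattern_to_substreams(pattern):
    """pattern: N-bit string; bit i (left to right, 1-based) is associated
    with substream i, and a value of '1' marks it as a target substream."""
    return [i + 1 for i, bit in enumerate(pattern) if bit == "1"]

selected = pattern_to_substreams("01101000")
```

For the pattern 01101000 this yields substreams 2, 3 and 5, matching the example above.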
  • When the number of target media substreams is small, it is recommended to use the substream list to represent the target media substreams; when the number of target media substreams is large, it is recommended to use the substream pattern.
  • the characteristics of the sub-stream can also be defined as the second-type parameter.
  • the characteristics of these sub-streams include: sub-stream type, sub-stream priority, viewpoint number, channel number, video resolution, etc.
  • One or more substream feature parameters can be used to indicate the conditions that the target media substream needs to meet, from which the server selects the final target media substreams.
  • Embodiment 4: In the following embodiment, an example will be given to illustrate how the server transmits the substream-related information to the client.
  • the client needs to specify the target media substream through the second type of parameters.
  • the premise is that the client should know which target media substreams are included in the current media stream and the characteristics of these target media substreams.
  • the terminal can select the target media substream to be transmitted according to the application requirements and network transmission conditions.
  • These descriptive information about the media substreams in the media stream can be provided by the server application layer to the client application layer, and the client can obtain this information in a way independent of the current transmission process (such as submitting additional request messages or through a third-party server).
  • This descriptive information can also be obtained directly from the server during the transmission process.
  • a method for directly encapsulating the media substream description information into a media segment and transmitting it to the client is proposed.
  • the minimum information contained in the media substream description information is: which media substreams are included in the current media stream. If the numbers of the media substreams are consecutively numbered from 1 to N, the media substream description information only needs to include the number N of the media substreams to obtain the numbers of all the media substreams. When the media stream adopts various multi-substream encoding, more substream feature information will be introduced into the media substream description information:
  • the media component identifier is used to indicate different ways of obtaining information in a media stream.
  • the video information collected by different cameras in a live broadcast corresponds to different components.
  • Each media substream is associated with a media component, but the same media component can correspond to multiple media substreams.
  • a video captured from the same viewpoint can be represented by multiple media substreams encoded at different bit rates.
  • the types of media substreams include but are not limited to: video, audio, picture, subtitle, etc. or mixed types; the mixed types refer to a media substream that contains multiple types of media units, for example, a substream may contain both video and audio.
  • The substream code rate is used to indicate the code rate of the media substream; if a media substream uses a variable code rate (VBR), the substream code rate indicates the average bit rate of this substream over a period of time.
  • CBR: constant bit rate; VBR: variable bit rate.
  • The priority of the media substream is used to indicate the importance of different media substreams in the transmission process.
  • When the media stream adopts scalable coding, such as Scalable Video Coding (SVC), it will generate multiple levels of coded streams, including a base layer and multiple enhancement layers, and each media substream corresponds to a coding level.
  • When the media stream adopts multi-view encoding, such as 3D video, it will generate multiple encoded streams of different viewpoints, and each media substream corresponds to a viewpoint.
  • When multiple viewpoints are jointly encoded into one media substream, there may be multiple viewpoint identifiers in one media substream.
  • When the type of a media substream is video, the frame rate used for video coding.
  • When the media stream adopts multi-channel encoding, it will generate encoded data on multiple channels; several channels form a channel group for multi-channel joint encoding, and each media substream corresponds to one or more channel identifiers.
  • When the media substream is an audio stream, the sampling rate used for encoding.
  • When the media substream is an audio stream containing vocals, the language of the vocals.
  • each media stream can customize its own media sub-stream description information according to the actual situation.
  • Examples of media substream description information under three application scenarios are given in Fig. 4 to Fig. 6. In Fig. 4, substream 1, substream 2 and substream 3 are substreams of the same media content encoded at three different code rates (the media component identifiers are all 10).
  • In Fig. 5, substream 4 and substream 5 correspond to two different channels of the same media content.
  • In Fig. 6, substreams 1 to 4 correspond to the base layer and three enhancement layers of the same media content (the media component identifiers are all 30) when scalable video coding is used.
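As an illustration of the Fig. 6 scenario, the media substream description information might be represented as below; the field names are assumptions, and only the number of substreams and the shared media component identifier 30 come from the text.

```python
# Hypothetical field names; the substream count and component identifier
# follow the Fig. 6 example (one base layer plus three enhancement layers).
substream_info = [
    {"ssno": 1, "component": 30, "type": "video", "layer": 0},  # base layer
    {"ssno": 2, "component": 30, "type": "video", "layer": 1},  # enhancement 1
    {"ssno": 3, "component": 30, "type": "video", "layer": 2},  # enhancement 2
    {"ssno": 4, "component": 30, "type": "video", "layer": 3},  # enhancement 3
]

# all four substreams describe the same media content (same component)
components = {entry["component"] for entry in substream_info}
```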
  • After receiving the media segment, the client parses the media substream description information from it, and then selects the target media substreams to be transmitted in real time according to the actual needs of the service layer, terminal performance and network conditions, so as to support adaptive delivery of various multi-substream encoded media streams.
  • the media sub-stream description information of a media stream generally remains unchanged, therefore, it is not necessary to encapsulate the above-mentioned media sub-stream description information in each media segment.
  • When the server receives the first media segment request from the client, it can encapsulate the media substream description information in the first returned media segment, and need not encapsulate the media substream description information in subsequent media segments.
  • Embodiment 5: In the following embodiment, an example will be given to illustrate how the server determines the candidate media units to be transmitted through the third type of parameters.
  • Generating the media segment according to the media segment request further includes: if the pull command carries at least one third type parameter, wherein each third type parameter corresponds to at least one constraint condition on the candidate media units, the candidate media units to be transmitted include all media units in each target media substream that simultaneously satisfy all the constraints corresponding to the third type parameters.
  • the constraint condition corresponding to the start sequence number is: if the start sequence number is valid, the sequence number of the candidate media unit is after the start sequence number or equal to the start sequence number.
  • the constraint condition corresponding to the start time is: if the start time is valid, the generation time of the candidate unit is after the start time.
  • the constraint condition corresponding to the maximum time offset is: if the maximum time offset is valid, in the target media substream, the generation time interval between the candidate media unit and the latest media unit is less than the maximum time offset.
  • The validity or invalidity of the above third type parameters refers to whether the value of the parameter is within a specified range. Taking the start sequence number as an example, its value cannot exceed the sequence number of the current latest media unit; on the other hand, to ensure real-time performance, its value cannot be earlier than the sequence number of the earliest existing media unit. A start sequence number within this range is valid. If a third type parameter is invalid, it is equivalent to not carrying that parameter; when all third type parameters are invalid, the candidate media units to be transmitted in the target media substream are the default specified media units.
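The constraint logic for the three parameters above can be sketched as a single predicate; here `None` stands for "not carried or invalid", in which case that constraint is skipped, and the dict-based unit representation is an assumption.

```python
def is_candidate(unit, latest, seq_begin=None, time_begin=None, max_offset=None):
    """unit and latest are dicts with 'seq' and 'time' keys; latest is the
    most recently generated unit of the same target media substream."""
    if seq_begin is not None and unit["seq"] < seq_begin:
        return False                       # before the start sequence number
    if time_begin is not None and unit["time"] <= time_begin:
        return False                       # not after the start time
    if max_offset is not None and latest["time"] - unit["time"] >= max_offset:
        return False                       # too far behind the latest unit
    return True

latest = {"seq": 62, "time": 10_000}
ok = is_candidate({"seq": 60, "time": 9_400}, latest, seq_begin=59, max_offset=1_000)
```

A unit is a candidate only if it satisfies every carried (valid) constraint simultaneously, matching the "all constraints" rule above.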
  • each pull command may carry one or more of the third type parameters.
  • The pull command is not limited to the above and may carry other self-defined third type parameters; for example, based on the characteristics of the media units, other third type parameters such as media unit type, minimum priority, and priority range can be defined as constraints on the media units.
  • When there is only one target media substream selected according to the second type parameter, it is only necessary to judge whether the media units in that target media substream satisfy the constraints corresponding to the various third type parameters.
  • When multiple target media substreams are selected, these target media substreams should use synchronized numbering, or use the same clock for timing.
  • the synchronization number refers to: on the server, every time a specified time period elapses, all media units generated by each target media substream within the time period are associated with the same new sequence number.
  • the above-mentioned specified time period may be of fixed length or variable length, may be preset, or may be dynamically determined according to the actual generation of the media unit.
  • The sequence number of a media unit can be used not only to indicate the generation order of the media units in each media substream, but also to indicate the synchronization relationship between media units in different target media substreams.
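A sketch of synchronized numbering under the assumption of a fixed-length period: all units generated within the same period, across all synchronized substreams, are associated with the same new sequence number.

```python
def sync_seq(gen_time_ms, period_ms=40):
    """Assign a synchronized sequence number: every unit generated within the
    same period shares one sequence number (fixed-length period assumed)."""
    return gen_time_ms // period_ms

video_unit_seq = sync_seq(1_000)   # a video unit generated at t = 1000 ms
audio_unit_seq = sync_seq(1_030)   # a unit of a synchronized substream, same window
```

The two units fall in the same 40 ms window and therefore get the same sequence number, which is how the client recognizes them as synchronous.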
  • Figure 2 shows a real-time delivery process of a media stream.
  • The client requests the media data of the target media stream S1, wherein the target media stream S1 is the default media stream on the server and includes 4 media substreams: substream 1, substream 2 and substream 3 are three synchronously numbered media substreams (for example, three video streams encoded at different bit rates), while substream 4 uses independent numbering (for example, an independently encoded audio stream); the default specified media substreams are substream 1 and substream 4. Since substream 1 and substream 4 are not synchronously numbered, after the client receives the media segment, the sequence numbers of the latest media units of substream 1 and substream 4 differ.
  • Each pull command carries a different media substream list and the corresponding start sequence number, which are respectively used to specify the target media substream and the media units to be sent, so that continuous reception of substream 1 and substream 4 can each be guaranteed.
  • The target media stream in FIG. 7 is similar to that in FIG. 2, except that the client actively requests the media data of substream 1 and substream 2 (for example, substream 1, substream 2 and substream 3 are respectively the base layer and two enhancement layers of scalable video coding).
  • Since substream 1 and substream 2 are synchronously numbered, when the client submits a media segment request, only one pull command is used; the substream list it carries includes the two target media substreams, substream 1 and substream 2, and the start sequence number it carries can simultaneously indicate the candidate media units in substream 1 and substream 2.
  • The target media stream in FIG. 8 is similar to those in FIG. 2 and FIG. 7; the difference is that the client simultaneously requests the synchronized media data of three substreams (substream 1, substream 2 and substream 4), although substream 4 and substreams 1 and 2 are not synchronously numbered.
  • The generation times of all substreams in the target media stream S1 use the same reference clock; therefore, the client can still use only one pull command in the media segment request to realize simultaneous pulling of the three media substreams.
  • three target media sub-streams are specified in the sub-stream list carried by the pull command, and the start time carried by the pull command is the latest generation time of the media unit currently received by the client. The start time can ensure that all newly generated media units to be sent are continuously encapsulated into media segments and sent to the client.
  • The client can receive media streams in real time by continuously submitting media segment requests, and can adapt to changes in application requirements and network status by adjusting the target media substream list. As shown in FIG. 2, at the beginning substream 1 and substream 4 are received; later, the target media substream list can be modified in MS_REQ4 to include only substream 4, after which only the media units of substream 4 are received.
  • Embodiment 6: In the following embodiment, the processing procedure when the server encapsulates the candidate media units into a media segment is described.
  • Encapsulating the candidate media units determined by each pull command into media segments includes: the candidate media units are encapsulated into the media segment according to the order in which the pull commands appear in the media segment request; if a parameter carried by a pull command includes a unit sorting method, the determined candidate media units are sorted according to that unit sorting method and then encapsulated into the media segment, and if a pull command does not carry a unit sorting method, the determined candidate media units are sorted according to the default sorting method and then encapsulated into the media segment.
  • the candidate media units are sorted according to the generation time of the candidate media units, and the earlier the candidate media units are generated, the earlier they are encapsulated into the media segment.
  • the order is reversed according to the generation time of the candidate media units, and the candidate media units generated later are encapsulated into the media segment first.
  • Or the candidate media units are sorted according to their sequence numbers, and the candidate media units with earlier sequence numbers are encapsulated into the media segment first.
  • Or the order is reversed according to the sequence numbers of the candidate media units, and the candidate media units with later sequence numbers are encapsulated into the media segment first.
  • the candidate media units of each sub-stream are encapsulated in sequence according to the order of the sub-stream numbers.
  • the candidate media units of the multiple substreams are encapsulated sequentially according to the order in which the substream numbers appear in the substream list.
  • the unit sorting method can also be a cascade of the above basic sorting methods, such as SSLIST_ORDER+SEQ_BACKWARD.
  • The meaning of this cascade is: first, the candidate media units are sorted according to the first basic sorting method; candidate media units that occupy the same position after this sorting are then ordered according to the second basic sorting method, and so on until the ordering is complete. Whether a basic sorting method or a cascaded sorting method is used, if candidate media units with the same position still remain after sorting, those candidate media units are ordered according to the default sorting method.
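Such a cascade maps naturally onto tuple sort keys. The sketch below implements SSLIST_ORDER + SEQ_BACKWARD (substream-list position first, ties broken by descending sequence number); the pair representation of units is an assumption.

```python
def cascade_sort(units, sslist):
    """units: list of (substream_no, seq_no) pairs;
    sslist: target substream numbers in the order they appear in the list."""
    # first key: position in the substream list; second key: reverse seq order
    return sorted(units, key=lambda u: (sslist.index(u[0]), -u[1]))

units = [(1, 25), (2, 25), (1, 27), (2, 27), (1, 26)]
ordered = cascade_sort(units, sslist=[2, 1])
```

With the substream list [2, 1], all of substream 2's units come first, and within each substream the newest units are encapsulated first.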
  • Sorting method 1 The media segment request consists of two pull commands.
  • The target media substream of the first pull command is substream 4, and the target media substreams of the second pull command are substream 1 and substream 2. Therefore, according to the order of the pull commands, the candidate media units of substream 4 are encapsulated into the media segment first; since the first pull command does not specify any unit sorting method, the default sorting method, time forward, is used to encapsulate candidate media units D58 to D62. Then, since the unit sorting method carried by the second pull command is time reverse (TIME_BACKWARD), the media units of substream 1 and substream 2 are encapsulated in time-reverse order: A27/B27, A26/B26, A25/B25. Media units with the same position are sorted by default according to their substream numbers. The final encapsulation order of the candidate media units is shown as sorting mode 1 in FIG. 9.
  • Ordering method 2 The media segment request includes only one pull command, and the unit ordering method carried by the pull command is a cascade of two basic ordering methods: SSLIST_ORDER+SEQ_FORWARD.
  • The first basic sorting method is substream list order (SSLIST_ORDER): the candidate media units of substream 4 are encapsulated first, then the candidate media units of substream 1, and then the candidate media units of substream 2.
  • the second basic sorting method is sequence number forward (SEQ_FORWARD), that is, for candidate media units belonging to the same substream, the candidate media units are sorted in the order of their sequence numbers from front to back.
  • The final encapsulation order of the candidate media units is shown as sorting method 2 in FIG. 9.
  • Sorting mode 3 The media segment request includes only one pull command, and the unit sorting mode carried by the pull command is a cascade of two basic sorting modes: SSNO_ORDER+SEQ_BACKWARD.
  • The first basic sorting method is substream number order (SSNO_ORDER), which indicates that the candidate media units of each substream are encapsulated in order of substream number from small to large: the candidate media units of substream 1 are encapsulated first, then those of substream 2, and then those of substream 3.
  • The second basic sorting method is sequence number reverse (SEQ_BACKWARD); that is, candidate media units belonging to the same substream are sorted according to their sequence numbers from back to front. The final encapsulation order of the candidate media units is shown as sorting mode 3 in FIG. 9.
  • Sorting method 4: The media segment request includes only one pull command, and the pull command carries only one unit sorting method, TIME_FORWARD; that is, all candidate media units are sorted from front to back according to their generation times. The final encapsulation order of the candidate media units is shown as sorting mode 4 in FIG. 9.
  • This embodiment does not preclude the definition of new unit sorting methods; for example, a high-priority-unit-first method (HGH_PRIOR_FIRST) or a substream priority order (SS_PRIOR_ORDER) can be defined.
  • The candidate media units determined by each pull command need not be encapsulated according to the order in which the pull commands appear in the media segment request; for example, without distinguishing between pull commands, all candidate media units can be ordered together and packed into the media segment.
  • The order in which media units are encapsulated into media segments is controlled by the pull commands and the unit sorting methods, so that when the network transmission bandwidth is insufficient, specific candidate media units of specific substreams can be sent preferentially: for example, high-priority media substreams can be sent first; when a video substream and an audio substream are transmitted at the same time, audio transmission can be guaranteed first; and when base layer and enhancement layer code streams are transmitted at the same time, the candidate media units of the base layer are sent preferentially.
  • the delivery of the newly generated candidate media unit is prioritized to improve user experience.
  • the media units of each sub-stream can be arbitrarily combined according to the request of the client, and the media segment can be generated in real time, and the media segment can be delivered to the client.
  • This means the server only needs to store the media units of each substream and does not need to generate fragments of various substream combinations in advance, which reduces the storage requirements of the server; at the same time, it simplifies the synchronization processing of the client, since the client needs only one request to obtain a combined segment of the substreams for the same time period, making it easy to ensure synchronous reception of each substream.
  • The client can dynamically adjust the target media substreams in the media segment request according to application needs and network conditions, so that adaptive delivery of various types of multi-substream media streams (such as multi-rate encoded/multi-view/multi-channel/scalable coded media streams) can be uniformly supported.
  • FIG. 10 is a schematic structural diagram of an adaptive real-time delivery server for media streams according to an embodiment of the present application.
  • the media stream includes at least one media sub-stream, and each media sub-stream is a sequence of media units generated in real time on the server, wherein each media sub-stream is associated with a sub-stream number, and each media unit is associated with a generation time and/or a sequence number indicating the order in which the media units are generated in the media sub-stream. The server 10 includes a client interface component 100, a media segment generating component 200, and a media segment sending component 300.
  • the client interface component 100 is configured to receive a media segment request sent by the client, wherein the media segment request carries at least one pull command, each pull command carries zero or more control parameters, and the control parameters include a first-type parameter indicating the target media stream to be delivered, a second-type parameter indicating the target media sub-streams to be delivered, and a third-type parameter indicating the candidate media units to be delivered.
  • a media segment generating component 200 configured to generate a media segment according to the media segment request: for each pull command in the media segment request, the target media stream to be transmitted is selected, at least one target media sub-stream to be transmitted is selected in the target media stream, the candidate media units to be transmitted in each target media sub-stream are determined, and finally the candidate media units determined by each pull command are encapsulated into the media segment.
  • the media segment sending component 300 is configured to send the generated media segment to the client.
  • the server 10 in this embodiment of the present application can combine the media units of each sub-stream arbitrarily according to the client's request, generate media segments in real time, and return them to the client, thereby reducing storage overhead on the server, simplifying synchronous transmission across sub-streams, and effectively reducing media stream delivery delay and overhead.
  • the client interface component 100 is used to receive a media segment request from a client;
  • the media segment request can carry one or more pull commands, and each pull command can carry zero, one, or more control parameters;
  • the control parameters fall into the following categories: the first-type parameter, which indicates the target media stream to be transmitted; the second-type parameter, which indicates the target media sub-streams to be transmitted in the target media stream; and the third-type parameter, which indicates the candidate media units to be transmitted in the target media sub-streams.
  • the client interface component 100 can use any specified protocol to receive the media segment request. For example, when the HTTP protocol is used, the client interface component 100 can be a Web server that receives any media segment request over HTTP; when another protocol is used, the client interface component can be a TCP server that provides a fixed service port.
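The application does not mandate a concrete wire format for the media segment request. As an illustration only, a request might be carried in an HTTP query string, with one pull command per `;`-separated group; the parameter names used here (`stream`, `subs`, `start_seq`) are hypothetical stand-ins for the three parameter types, not names defined by the application:

```python
from urllib.parse import parse_qs

def parse_segment_request(query: str) -> list[dict]:
    """Split a hypothetical query-string request into pull commands.

    Each ';'-separated group is treated as one pull command carrying
    zero or more control parameters.
    """
    commands = []
    for group in query.split(";"):
        params = {k: v[0] for k, v in parse_qs(group).items()}
        commands.append({
            # first-type parameter: target media stream
            "stream": params.get("stream"),
            # second-type parameter: target media sub-streams
            "substreams": params["subs"].split(",") if "subs" in params else [],
            # third-type parameter: e.g. a start sequence number
            "start_seq": int(params["start_seq"]) if "start_seq" in params else None,
        })
    return commands

# Two pull commands in one request: sub-streams 1 and 2 of "cam1" from
# sequence number 120, plus sub-stream 3 with default candidate units.
cmds = parse_segment_request("stream=cam1&subs=1,2&start_seq=120;stream=cam1&subs=3")
```

A missing parameter is left as `None` or empty, which matches the default-selection behavior described below: the server falls back to its default stream, sub-streams, or candidate units.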
  • the media segment generating component 200 is configured to generate the required media segment according to the media segment request of the client.
  • the media segment request is obtained from the client interface component 100, and the pull commands and their control parameters are parsed out. Then the target media stream to be transmitted is selected according to the first-type parameter, the target media sub-streams to be transmitted are selected according to the second-type parameter, and the candidate media units to be transmitted in each target media sub-stream are determined according to the third-type parameter. Finally, the candidate media units determined by each pull command are extracted from the media stream storage unit, encapsulated into a media segment, and passed directly to the media segment sending component 300 for delivery.
  • the server 10 in this embodiment of the present application further includes at least one media stream real-time generating component for generating one or more media streams in real time, or receiving them in real time from other servers;
  • the media stream includes at least one media sub-stream, and each media sub-stream is a sequence of media units generated in real time on the server;
  • each media sub-stream is associated with a sub-stream number, and each media unit is associated with a generation time and/or a sequence number, the sequence number indicating the generation order of the media units in the media sub-stream;
  • the media stream real-time generating component includes one or more media sub-stream real-time generating components, and each media sub-stream real-time generating component includes one or more processing steps for generating the media sub-stream in real time.
  • the processing steps include, but are not limited to: real-time acquisition of media signals, encoding and compression, transport encapsulation, and pre-segmentation.
  • the media sub-stream real-time generating component can also receive media streams from other devices in real time, or convert existing media stream files on the server into sequences of media units generated in real time.
  • the media segment generating component 200 is further configured so that, when a pull command does not carry the first-type parameter, the target media stream to be transmitted is the media stream specified by default; when the pull command does not carry the second-type parameter, the target media sub-streams to be transmitted are at least one media sub-stream specified by default in the target media stream; and when the pull command does not carry the third-type parameter, the candidate media units are the media units specified by default in the target media sub-stream. The default media units are all media units in the target media sub-stream whose sequence-number gap to the latest media unit is less than a first preset value, or whose generation-time gap to the latest media unit is less than a second preset value, where both preset values are obtained according to the target media sub-stream.
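A minimal sketch of this default candidate selection, assuming the two preset values are supplied by the caller (in practice they would be derived from the target sub-stream itself, e.g. from its frame rate or segment duration):

```python
from dataclasses import dataclass

@dataclass
class MediaUnit:
    seq: int          # sequence number within the sub-stream
    gen_time: float   # server-side generation timestamp (seconds)

def default_candidates(units, max_seq_gap=None, max_age=None):
    """Select the default candidate units of one target sub-stream.

    Keeps units whose sequence-number gap to the latest unit is below
    max_seq_gap (the first preset value), or, failing that, units whose
    generation-time gap to the latest unit is below max_age (the second
    preset value). `units` is assumed ordered oldest to newest.
    """
    if not units:
        return []
    latest = units[-1]
    if max_seq_gap is not None:
        return [u for u in units if latest.seq - u.seq < max_seq_gap]
    return [u for u in units if latest.gen_time - u.gen_time < max_age]

units = [MediaUnit(seq=i, gen_time=float(i)) for i in range(10)]
recent = default_candidates(units, max_seq_gap=3)  # keeps the 3 newest units
```

Either preset alone suffices; a server could also take the union of both selections, a choice this sketch leaves open.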
  • the second-type parameter includes a sub-stream list,
  • and the sub-stream list includes the sub-stream number of at least one target media sub-stream.
  • the second-type parameter includes a sub-stream pattern,
  • the sub-stream pattern is an N-bit bit string, where N is the number of media sub-streams included in the target media stream, and
  • each bit of the pattern is associated with a specific media sub-stream of the target media stream and indicates whether that media sub-stream is a target media sub-stream to be transmitted.
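As an illustration of the sub-stream pattern, the sketch below interprets the N-bit pattern as a left-to-right bitmask over the media stream's sub-streams; the concrete bit ordering is an assumption here, since the text only requires that each bit be associated with one specific sub-stream:

```python
def select_by_pattern(substream_numbers, pattern: str):
    """Return the sub-stream numbers whose pattern bit is set.

    Bit i of the pattern (left to right) is associated with the i-th
    sub-stream of the target media stream; a '1' marks that sub-stream
    as a target sub-stream to be transmitted.
    """
    return [n for n, bit in zip(substream_numbers, pattern) if bit == "1"]

# A stream with a video sub-stream (1), an audio sub-stream (2), and an
# enhancement-layer sub-stream (3); pattern "110" requests video + audio only.
targets = select_by_pattern([1, 2, 3], "110")
```

Compared with the sub-stream list, the pattern is a fixed-size encoding of the same selection, which keeps the request compact when the number of sub-streams is known to both sides.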
  • the media segment generating component 200 is further configured to encapsulate media sub-stream description information into the media segment, where the media sub-stream description information includes at least one entry, each entry corresponding to one media sub-stream of the media stream and containing at least one field: the media sub-stream number.
  • each entry further includes at least one of the following fields: media component identifier, sub-stream type, sub-stream bit rate, sub-stream priority, coding level, viewpoint identifier, video resolution, video frame rate, channel identifier, audio sample rate, and language type.
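One plausible in-memory representation of a description entry is sketched below; the field names are invented for illustration, and only the sub-stream number is mandatory, mirroring the entry structure described above:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SubstreamEntry:
    """One entry of the media sub-stream description information.

    Only `substream_number` is required; the remaining fields are a
    subset of the optional fields listed in the text.
    """
    substream_number: int
    component_id: Optional[str] = None    # media component identifier
    substream_type: Optional[str] = None  # e.g. "video", "audio"
    bitrate_kbps: Optional[int] = None    # sub-stream bit rate
    priority: Optional[int] = None        # sub-stream priority
    language: Optional[str] = None        # language type

# Description information for a stream with one video and one audio sub-stream.
description = [
    SubstreamEntry(1, substream_type="video", bitrate_kbps=2500, priority=1),
    SubstreamEntry(2, substream_type="audio", bitrate_kbps=128, priority=0),
]
```

Because the description travels inside the media segment itself, the client can learn the sub-stream layout from the first segment it receives and adjust its next request accordingly.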
  • the media segment generating component 200 is further configured so that, when a pull command carries at least one third-type parameter, each third-type parameter corresponds to at least one constraint on the candidate media units,
  • and the candidate media units to be transmitted include all media units in each target media sub-stream that simultaneously satisfy all the constraints corresponding to the third-type parameters.
  • the media units in the target media sub-streams adopt synchronized numbering: each time a specified time period elapses, all media units generated by each target media sub-stream within that period are associated with the same new sequence number.
  • the third-type parameter includes a start sequence number,
  • and the constraint corresponding to the start sequence number is: if the start sequence number is valid, the sequence number of a candidate media unit is equal to or after the start sequence number.
  • the generation times of the media units in all target media sub-streams are derived from the same clock on the server,
  • the third-type parameter includes a start time,
  • and the constraint corresponding to the start time is: if the start time is valid, the generation time of a candidate media unit is after the start time.
  • the third-type parameter includes a maximum time offset,
  • and the constraint corresponding to the maximum time offset is: if the maximum time offset is valid, then within the target media sub-stream, the generation-time gap between a candidate media unit and the latest media unit is less than the maximum time offset.
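The three constraints above (start sequence number, start time, maximum time offset) can be combined into a single filter, since a candidate unit must satisfy all carried constraints simultaneously. The sketch treats a missing (`None`) parameter as invalid, so it imposes no constraint:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MediaUnit:
    seq: int          # sequence number within the sub-stream
    gen_time: float   # server-side generation timestamp (seconds)

def filter_candidates(units,
                      start_seq: Optional[int] = None,
                      start_time: Optional[float] = None,
                      max_time_offset: Optional[float] = None):
    """Keep units satisfying ALL valid third-type-parameter constraints:
    seq at or after start_seq, gen_time after start_time, and gen_time
    within max_time_offset of the newest unit in the sub-stream."""
    latest_time = max(u.gen_time for u in units) if units else 0.0
    out = []
    for u in units:
        if start_seq is not None and u.seq < start_seq:
            continue
        if start_time is not None and u.gen_time <= start_time:
            continue
        if max_time_offset is not None and latest_time - u.gen_time >= max_time_offset:
            continue
        out.append(u)
    return out

units = [MediaUnit(i, float(i)) for i in range(10)]
picked = filter_candidates(units, start_seq=4, max_time_offset=4.0)
```

A typical client use: the first request carries only a maximum time offset (join near the live edge), and subsequent requests carry the start sequence number just past the last unit received, so no unit is duplicated or skipped.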
  • the media segment generating component 200 is further configured to encapsulate the candidate media units determined by each pull command into the media segment according to the order in which the pull commands appear in the media segment request,
  • wherein, if the parameters carried by a pull command include a unit sorting method, the candidate media units determined by that pull command are sorted according to the unit sorting method and then encapsulated into the media segment; if no unit sorting method is carried, the candidate media units determined by the pull command are sorted according to the default sorting method and then encapsulated into the media segment.
  • the unit sorting method is a cascade of one or more basic sorting methods, and the basic sorting methods include the following types: time forward sorting, time reverse sorting, sequence-number forward sorting, sequence-number reverse sorting, sub-stream number order sorting, and sub-stream list order sorting.
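A cascade of basic sorting methods maps naturally onto a composite sort key, where each later method breaks ties left by the earlier ones. The sketch below covers a subset of the basic methods, with units represented as `(substream_number, seq, gen_time)` tuples; sub-stream list order sorting would additionally need an index lookup into the requested sub-stream list:

```python
def cascade_sort(units, methods):
    """Sort units by a cascade of basic methods.

    `units` are (substream_number, seq, gen_time) tuples; `methods` is an
    ordered list of basic-method names, earliest = most significant.
    """
    key_parts = {
        "time_forward":    lambda u: u[2],
        "time_reverse":    lambda u: -u[2],
        "seq_forward":     lambda u: u[1],
        "seq_reverse":     lambda u: -u[1],
        "substream_order": lambda u: u[0],
    }
    return sorted(units, key=lambda u: tuple(key_parts[m](u) for m in methods))

units = [(2, 5, 5.0), (1, 5, 5.0), (1, 4, 4.0), (2, 4, 4.0)]
# Group by sub-stream number first; within a sub-stream, newest units first.
ordered = cascade_sort(units, ["substream_order", "seq_reverse"])
```

This is how the priority behavior described earlier can be realized: sorting by sub-stream number (or priority) first means that if the connection stalls mid-segment, the high-priority units have already been sent.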
  • clients and servers are generally remote from each other and typically interact through a communication network.
  • the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
  • the server can be a cloud server, also known as a cloud computing server or cloud host; it is a host product in the cloud computing service system that addresses the defects of traditional physical hosts and VPS services, namely difficult management and weak business scalability.
  • the media units of each sub-stream can be combined arbitrarily according to the client's request, media segments can be generated in real time, and the media segments can be delivered to the client.
  • as a result, the server only needs to store media units per sub-stream and does not need to generate fragments for the various sub-stream combinations in advance, which reduces the server's storage requirements. It also simplifies synchronization at the client: with a single request, the client obtains a combined segment of the selected sub-streams over the same time period, making synchronous reception of the sub-streams easy to guarantee.
  • the client can dynamically adjust the target media sub-streams in the media segment request according to application needs and network conditions, so that adaptive delivery of various types of multi-sub-stream media streams (such as multi-rate encoding, multi-view, multi-channel, and scalable coding streams) is uniformly supported.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device may include:
  • memory 1201, processor 1202, and a computer program stored on the memory 1201 and executable on the processor 1202.
  • when the processor 1202 executes the program, the adaptive real-time delivery method for a media stream provided in the above embodiments is implemented.
  • the electronic device also includes:
  • the communication interface 1203 is used for communication between the memory 1201 and the processor 1202 .
  • the memory 1201 is used to store computer programs that can be executed on the processor 1202 .
  • the memory 1201 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk memory.
  • the bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in FIG. 12, but this does not mean that there is only one bus or one type of bus.
  • if the memory 1201, the processor 1202, and the communication interface 1203 are integrated on one chip, the memory 1201, the processor 1202, and the communication interface 1203 can communicate with each other through an internal interface.
  • the processor 1202 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
  • this embodiment also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the above adaptive real-time delivery method for a media stream is implemented.
  • the terms “first” and “second” are used for descriptive purposes only and should not be construed as indicating or implying relative importance or the number of indicated technical features. Thus, a feature delimited with “first” or “second” may expressly or implicitly include at least one such feature.
  • “N” means at least two, such as two, three, and so on, unless otherwise expressly and specifically defined.
  • a "computer-readable medium” can be any device that can contain, store, communicate, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or apparatus.
  • computer-readable media include the following: an electrical connection (electronic device) with one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM).
  • the computer-readable medium may even be paper or another suitable medium on which the program is printed, since the paper or other medium can be optically scanned and then edited, interpreted, or otherwise processed as necessary to obtain the program electronically, after which it is stored in computer memory.
  • the steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system.
  • for example, if implemented in hardware, as in another embodiment, they can be implemented by any one of the following techniques known in the art, or a combination thereof: discrete logic circuits with logic gates for implementing logic functions on data signals, application-specific integrated circuits with suitable combinational logic gates, programmable gate arrays (PGA), field-programmable gate arrays (FPGA), and so on.
  • each functional unit in each embodiment of the present application may be integrated into one processing module, or each unit may exist physically alone, or two or more units may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules. If the integrated modules are implemented in the form of software functional modules and sold or used as independent products, they may also be stored in a computer-readable storage medium.
  • the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present application relates to an adaptive real-time delivery method for a media stream, and a server. The method includes: receiving a media segment request sent by a client, the media segment request carrying at least one pull command; generating a media segment according to the media segment request, which includes: for each pull command in the media segment request, selecting a target media stream to be transmitted, selecting at least one target media sub-stream to be transmitted in the target media stream, determining the candidate media units to be transmitted in the target media sub-stream, and encapsulating the candidate media units determined by each pull command into the media segment; and sending the media segment to the client. According to embodiments of the present application, the media units of the selected sub-streams can be combined in real time according to the client's request to generate a media segment, which simplifies synchronous transmission across sub-streams while reducing storage overhead on the server, and adaptive real-time delivery of various multi-sub-stream media streams is uniformly supported.
PCT/CN2021/103196 2020-06-30 2021-06-29 Adaptive real-time delivery method for media stream, and server WO2022002070A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010614997.5 2020-06-30
CN202010614997.5A CN113873343B (zh) Adaptive real-time delivery method for media stream and server

Publications (1)

Publication Number Publication Date
WO2022002070A1 true WO2022002070A1 (fr) 2022-01-06

Family

ID=78981470

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/103196 WO2022002070A1 (fr) 2020-06-30 2021-06-29 Adaptive real-time delivery method for media stream, and server

Country Status (2)

Country Link
CN (1) CN113873343B (fr)
WO (1) WO2022002070A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102857478A (zh) * 2011-06-30 2013-01-02 Huawei Technologies Co., Ltd. Media data control method and apparatus
US20170244772A1 (en) * 2012-11-30 2017-08-24 Google Technology Holdings LLC Multi-streaming multimedia data
CN110545492A (zh) * 2018-09-05 2019-12-06 Beijing Kaiguang Information Technology Co., Ltd. Real-time delivery method for media stream and server
CN110881018A (zh) * 2018-09-05 2020-03-13 Beijing Kaiguang Information Technology Co., Ltd. Real-time receiving method for media stream and client
CN111193686A (zh) * 2018-11-14 2020-05-22 Beijing Kaiguang Information Technology Co., Ltd. Delivery method for media stream and server

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109964471B (zh) * 2016-10-18 2022-03-22 Expway Method for sending content to mobile user equipment
CN111193684B (zh) * 2018-11-14 2021-12-21 Beijing Kaiguang Information Technology Co., Ltd. Real-time delivery method for media stream and server


Also Published As

Publication number Publication date
CN113873343B (zh) 2023-02-24
CN113873343A (zh) 2021-12-31

Similar Documents

Publication Publication Date Title
CN107409234B (zh) File-format-based streaming with DASH formats based on LCT
TWI668982B (zh) 用於多媒體和檔案傳輸的傳輸介面的方法及伺服器設備、及用於記錄相關指令於其上的電腦可讀取儲存媒體
RU2632394C2 (ru) Система и способ для поддержки различных схем захвата и доставки в сети распределения контента
CN107251562B (zh) 检索媒体数据的方法及装置、发信媒体信息的方法及装置
CN108141455B (zh) 用于媒体数据的流式发射的期限信令
KR102301333B1 (ko) 브로드캐스트 채널을 통한 dash 콘텐츠 스트리밍 방법 및 장치
US10659502B2 (en) Multicast streaming
US20200112753A1 (en) Service description for streaming media data
US11284135B2 (en) Communication apparatus, communication data generation method, and communication data processing method
JP2015136060A (ja) Communication apparatus, communication data generation method, and communication data processing method
KR20140008237A (ko) 엠엠티의 하이브리드 전송 서비스에서 패킷 전송 및 수신 장치 및 방법
US11457051B2 (en) Streaming media data processing method, processing system and storage server
CN112104885B (zh) System and method for accelerating m3u8 playback start speed in live streaming
CN108494792A (zh) Conversion system for playing HLS video streams with a Flash player and working method thereof
CN110086797B (zh) Real-time receiving method for media stream, client, computer device, and storage medium
CN113285947B (zh) Method and apparatus for continuity between HLS live streaming and multicast live streaming
KR102176404B1 (ko) 통신 장치, 통신 데이터 생성 방법, 및 통신 데이터 처리 방법
WO2022002070A1 (fr) Adaptive real-time delivery method for media stream, and server
WO2020098455A1 (fr) Real-time delivery method for media stream and server
CN110545492B (zh) Real-time delivery method for media stream and server
WO2020048268A1 (fr) Real-time transmission method and real-time receiving method for media stream, server and client
WO2020216035A1 (fr) Real-time push method and real-time receiving method for media stream, server and client

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21833863

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21833863

Country of ref document: EP

Kind code of ref document: A1