WO2022002070A1 - Adaptive real-time delivery method for media stream, and server - Google Patents

Adaptive real-time delivery method for media stream, and server

Info

Publication number
WO2022002070A1
Authority
WO
WIPO (PCT)
Prior art keywords
media
stream
substream
sub
target
Prior art date
Application number
PCT/CN2021/103196
Other languages
French (fr)
Chinese (zh)
Inventor
姜红旗
辛振涛
姜红艳
申素辉
Original Assignee
北京开广信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京开广信息技术有限公司
Publication of WO2022002070A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456 Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643 Communication protocols
    • H04N21/6437 Real-time Transport Protocol [RTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/84 Generation or processing of descriptive data, e.g. content descriptors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/858 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/858 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
    • H04N21/8586 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot by using a URL

Definitions

  • the present application relates to the technical field of digital information transmission, and in particular, to an adaptive real-time delivery method and server of a media stream.
  • RTP: Real-time Transport Protocol
  • RTSP: Real Time Streaming Protocol
  • HTTP: HyperText Transfer Protocol
  • HTTP Adaptive Streaming
  • HTTP adaptive streaming includes various schemes: HLS (HTTP Live Streaming) proposed by Apple, Smooth Streaming proposed by Microsoft, HDS (HTTP Dynamic Streaming) proposed by Adobe, and DASH (Dynamic Adaptive Streaming over HTTP) proposed by MPEG.
  • the common feature of the above HTTP adaptive streaming schemes is that the media stream is cut into short (2 s to 10 s) media segments, and an index file or manifest file describing these media segments (such as the m3u8 playlist in HLS or the MPD file in DASH) is generated at the same time; the segments and the manifest are then stored on web servers. The client obtains the URL (Uniform Resource Locator) of each media segment by accessing the playlist or manifest file, and can then download and play the media segments one by one over HTTP.
  • the main difference between these schemes is reflected in the encapsulation format and manifest file format adopted by the media segment.
  • HTTP adaptive streaming is easy to deploy on common web servers and fits the existing Internet infrastructure, including CDNs, caches, firewalls and NATs, and can support large-scale user access.
  • the client can also select segments with a suitable bit rate according to network conditions and terminal capabilities, so as to realize bit rate adaptation. Therefore, HTTP adaptive streaming has become the mainstream way of real-time streaming media delivery on the Internet.
  • a media stream transmitted on the Internet may include dozens of media sub-streams, for the following reasons:
  • 1) Multiple sub-stream types: the same scene can generate media sub-streams of several types, including video, audio, subtitles, pictures, auxiliary information and data, and these media sub-streams need to be transmitted together.
  • 2) Multi-bit-rate encoding: in order to adapt to the available network bandwidth and to the processing capabilities of different terminals, the same video stream can be encoded into multiple sub-streams with different resolutions, frame rates and bit rates, and audio can be encoded into multiple sub-streams with different languages, sampling rates and bit rates.
  • 3) Multi-view video: in order to obtain a more realistic video experience, the same scene generates multiple video sub-streams from different viewpoints, such as 3D video or free-view video.
  • 4) Multi-channel audio: in order to obtain an immersive audio experience, the same scene is sampled from different positions to generate multiple audio sub-streams.
  • 5) Scalable Video Coding (SVC): in order to adapt to the network bandwidth, one video channel is encoded into a base layer and several enhancement layers.
  • further, any combination of the above aspects (e.g. using multi-view video while using multi-rate video coding or scalable coding for each view) results in a very large number of media sub-streams and sub-stream combinations.
  • one approach is combined sub-stream segmentation, that is, encapsulating the video sub-stream segment and the audio sub-stream segment of the same time range in the same media segment, which corresponds to one HTTP URL.
  • the client only needs to make one request to get the corresponding video and audio segments, which ensures the synchronization of the sub-streams and simplifies processing at the receiving end.
  • however, the number of combinations of different video sub-streams and audio sub-streams increases rapidly, and each combination generates a new segment, which leads to repeated storage of video sub-streams and audio sub-streams on the server side and increases the storage overhead of the server.
  • another approach is independent sub-stream segmentation, that is, each sub-stream is segmented independently while the time alignment between the segments of different sub-streams is maintained, and each sub-stream segment corresponds to a URL.
  • the client can request the segments of each sub-stream as needed, and the server does not need to store combined segments of the sub-streams; however, because the client must submit multiple requests to obtain the segments of different sub-streams, the transmission overhead and the difficulty of synchronization processing increase.
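  • the trade-off between the two segmentation approaches can be illustrated with a small calculation. The sketch below is only an illustration under simple assumptions: one video sub-stream and one audio sub-stream are combined per segment, and the function names and the example counts (6 video variants, 4 audio variants) are hypothetical.

```python
def combined_segment_variants(video_variants: int, audio_variants: int) -> int:
    """Segments stored per time range when every video/audio pair is pre-combined."""
    return video_variants * audio_variants


def independent_segment_variants(video_variants: int, audio_variants: int) -> int:
    """Segments stored per time range when each sub-stream is segmented on its own."""
    return video_variants + audio_variants


# Hypothetical example: 6 video variants and 4 audio variants.
combined = combined_segment_variants(6, 4)
independent = independent_segment_variants(6, 4)
print(combined, independent)
```

  • under these assumptions, combined segmentation stores one segment per pair of sub-streams, whereas independent segmentation stores one segment per sub-stream but requires one request per selected sub-stream for each time range.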
  • the above HTTP adaptive streaming schemes have another problem: in order to support real-time transmission, the server needs to continuously update its manifest file, and the client needs to obtain the manifest file before it can obtain the URL of the latest media segment. Because the manifest file reaches the client only after some delay, the manifest file obtained by the client does not reflect the latest media segments currently generated on the server, which degrades the real-time transmission performance of the media stream. When the number of sub-streams or sub-stream combinations in the media stream reaches dozens, the manifest file becomes very complicated, further increasing the transmission overhead and the processing overhead of the client receiving the media stream.
  • the HTTP adaptive streaming transmission scheme based on pre-segmentation and manifest file is not suitable for adaptive real-time delivery of media streams containing many sub-streams, and a new delivery method needs to be designed for it.
  • the present application aims to solve one of the technical problems in the related art at least to a certain extent.
  • the first purpose of this application is to propose an adaptive real-time delivery method for media streams, which simplifies synchronous transmission between sub-streams while reducing the storage overhead on the server, and supports adaptive real-time delivery of various types of multi-sub-stream media streams (e.g. using multi-rate coding, multi-view, multi-channel or scalable coding).
  • the second purpose of this application is to propose an adaptive real-time delivery server for media streams.
  • the third object of the present application is to propose a computer device.
  • the fourth object of the present application is to provide a non-transitory computer-readable storage medium.
  • an embodiment of the present application proposes an adaptive real-time delivery method for a media stream, wherein:
  • the media stream includes at least one media sub-stream;
  • each media sub-stream is a sequence of media units generated in real time on a server;
  • each media sub-stream is associated with a sub-stream number;
  • each media unit is associated with a generation time and/or a sequence number indicating the generation order of the media unit in the media sub-stream.
  • the method includes the following steps: receiving a media segment request sent by a client, wherein the media segment request carries at least one pull command, each pull command carries zero or more control parameters, and the control parameters include a first type of parameter indicating the target media stream to be transmitted;
  • generating a media segment according to the media segment request, wherein, for each pull command in the media segment request, the target media stream to be transmitted is selected, at least one target media sub-stream to be transmitted in the target media stream is selected, the candidate media units to be transmitted in the target media sub-stream are determined, and the candidate media units determined by each pull command are encapsulated into the media segment; and sending the media segment to the client.
  • the adaptive real-time delivery method of the media stream according to the embodiment of the present application can arbitrarily combine the media units of the sub-streams according to the client's request, generate the media segment in real time, and deliver it to the client.
  • as a result, the server only needs to store media units per sub-stream and does not need to generate segments for the various sub-stream combinations in advance, which reduces the storage requirements of the server. At the same time, synchronization processing on the client is simplified: with a single request, the client obtains a combined segment of the sub-streams for the same time period, which makes it easy to ensure synchronous reception of the sub-streams.
  • the client can dynamically adjust the target media sub-streams in the media segment request according to application needs and network conditions, so that adaptive delivery of various types of multi-sub-stream media streams (such as multi-rate coding, multi-view, multi-channel or scalable coding) can be uniformly supported.
  • an embodiment of the present application proposes an adaptive real-time delivery server for a media stream, wherein:
  • the media stream includes at least one media sub-stream;
  • each media sub-stream is a sequence of media units generated in real time on the server, each media sub-stream is associated with a sub-stream number, and each media unit is associated with a generation time and/or a sequence number indicating the generation order of the media unit in the media sub-stream.
  • the server includes: a client interface component, used to receive a media segment request sent by the client, wherein the media segment request carries at least one pull command, each pull command carries zero or more control parameters, and the control parameters include a first type of parameter indicating the target media stream to be transmitted, a second type of parameter indicating the target media sub-stream to be transmitted, and a third type of parameter indicating the candidate media units to be transmitted; a media segment generating component, used to generate a media segment according to the media segment request, wherein, for each pull command in the media segment request, the target media stream
  • the adaptive real-time delivery server of the media stream in the embodiment of the present application can arbitrarily combine the media units of the sub-streams according to the client's request, generate media segments in real time, and deliver them to the client.
  • as a result, the server only needs to store media units per sub-stream and does not need to generate segments for the various sub-stream combinations in advance, which reduces the storage requirements of the server. At the same time, synchronization processing on the client is simplified: with a single request, the client obtains a combined segment of the sub-streams for the same time period, which makes it easy to ensure synchronous reception of the sub-streams.
  • the client can dynamically adjust the target media sub-streams in the media segment request according to application needs and network conditions, so that adaptive delivery of various types of multi-sub-stream media streams (such as multi-rate coding, multi-view, multi-channel or scalable coding) can be uniformly supported.
  • An embodiment of the present application provides a computer device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are arranged to perform the adaptive real-time delivery method for media streams described in the above embodiments.
  • Embodiments of the present application provide a non-transitory computer-readable storage medium, where the non-transitory computer-readable storage medium stores computer instructions, and the computer instructions are used to cause a computer to execute the adaptive real-time delivery method for media streams described in the foregoing embodiments.
  • FIG. 1 is a schematic diagram of a processing process of a method for adaptive real-time delivery of media streams according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of an adaptive real-time transmission process of a media stream according to an embodiment of the present application
  • FIG. 3 is a schematic diagram of a sub-stream pattern according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of media substream description information (including multi-rate coding substreams) according to an embodiment of the present application
  • FIG. 5 is a schematic diagram of media substream description information (including multi-view video substreams) according to an embodiment of the present application
  • FIG. 6 is a schematic diagram of media substream description information (including scalable coding substreams) according to an embodiment of the present application
  • FIG. 7 is a schematic diagram of an adaptive real-time transmission process of a media stream according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of an adaptive real-time transmission process of a media stream according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of candidate media unit encapsulation under different media unit sorting modes according to an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of an adaptive real-time delivery server for media streams according to an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an adaptive real-time delivery server for media streams according to a specific embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • on the Internet, it is often necessary to transfer various real-time audio streams, video streams or data streams from one network node to another.
  • These network nodes include various terminals, such as PCs, mobile phones and tablet computers, as well as various application servers, such as video servers and audio servers.
  • the transmitted audio streams, video streams or data streams are collectively referred to as media streams.
  • the delivery process of the media stream can be described by a general client-server model: the server delivers the generated media stream to the client in real time.
  • the server and the client here refer to logical functional entities, wherein the server is a functional entity that sends a media stream, and the client is a functional entity that receives a media stream. Servers and clients can exist on any network node.
  • a live media stream of a concert includes at least one video stream and at least one audio stream.
  • when multi-rate coding, multi-view coding, multi-channel coding or scalable coding is adopted, there will be multiple data streams, video streams or audio streams in the live stream.
  • all synchronously transmitted video streams, audio streams or data streams in a live media stream to be transmitted are referred to as media sub-streams of the media stream.
  • Each delivered media substream is a sequence of media units generated in real time on the server.
  • for different media sub-streams, the corresponding media unit can be chosen as appropriate:
  • when the media sub-stream is a byte stream generated in real time, one byte can be selected as the media unit;
  • when the media sub-stream is an audio stream or video stream obtained by real-time sampling, the original audio frame or video frame can be selected as the media unit;
  • when the media sub-stream is an audio stream or video stream sampled and encoded in real time, the encoded audio frame, the encoded video frame or the Access Unit can be selected as the media unit;
  • when the media sub-stream has been encapsulated for transport, the encapsulated transport packet (such as an RTP packet or a PES/PS/TS packet) can be selected as the media unit;
  • a segmented media segment such as the TS format segment used in the HLS protocol and the
  • Each media unit can be associated with a generation time, which is usually a timestamp.
  • Each media unit may also be associated with a sequence number, which may be used to indicate the order in which the media units are generated in the media substream. When the sequence number is used to indicate the order in which the media unit is generated, the meaning of the sequence number needs to be defined according to the specific media unit.
  • when the media unit is a byte, the sequence number of the media unit is the byte sequence number; when the media unit is an audio frame or a video frame, the sequence number of the media unit is the frame sequence number; when the media unit is a transport packet, the sequence number of the media unit is the packet sequence number; when the media unit is a stream segment, the sequence number of the media unit is the segment sequence number (such as the Media Sequence of each TS segment in HLS).
  • a media unit can be associated with both a sequence number representing the generation order and a generation time.
  • for example, the RTP header has a Sequence Number field that indicates the order of the RTP packets and a Timestamp field that indicates the generation time of the media data encapsulated in the RTP packet.
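  • as an illustration of how a sequence number and a generation time can be read from an RTP packet, the following minimal sketch parses the 12-byte fixed RTP header (layout per RFC 3550). The names `RtpFixedHeader` and `parse_rtp_fixed_header` and the sample field values are assumptions made for this example and are not part of the application.

```python
import struct
from typing import NamedTuple


class RtpFixedHeader(NamedTuple):
    version: int
    payload_type: int
    sequence_number: int  # usable as the media unit sequence number
    timestamp: int        # usable as the media unit generation time
    ssrc: int


def parse_rtp_fixed_header(packet: bytes) -> RtpFixedHeader:
    """Parse the 12-byte fixed RTP header; CSRC list and extensions are ignored."""
    if len(packet) < 12:
        raise ValueError("an RTP packet has at least a 12-byte fixed header")
    first, second, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpFixedHeader(first >> 6, second & 0x7F, seq, ts, ssrc)


# Hypothetical packet: version 2, payload type 96, sequence 4711, timestamp 90000.
example = struct.pack("!BBHII", 0x80, 96, 4711, 90000, 0x1234ABCD) + b"payload"
header = parse_rtp_fixed_header(example)
print(header.sequence_number, header.timestamp)
```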
  • each media substream is associated with a unique substream number.
  • for example, the sub-streams of a media stream may be numbered 1, 2, . . . , N.
  • within each media sub-stream, the generation time and/or sequence number of each media unit may be used to describe the order in which the media units are generated.
  • the generation time of the media units of different media substreams may be synchronous timing or independent timing.
  • when independent timing is adopted, the generation times of different media sub-streams are derived from asynchronous clocks, so the correspondence between the generation times of these different media sub-streams must be recorded separately.
  • when synchronous timing is used, the generation times of different media sub-streams are derived from the same reference clock, and the synchronization relationship of media units in different media sub-streams can be known from their generation times.
  • the generation time of all media substreams in one media stream uses the same reference clock on the server, which corresponds to the same time line, such as Greenwich Mean Time.
  • a media stream includes at least one media substream, wherein each media substream may be of any type, such as an audio stream, a video stream, or a subtitle stream, and each media substream may also adopt any transmission encapsulation type, such as an RTP packet stream or an MPEG2-TS stream.
  • for example, when the media substream is an RTP packet stream, the media unit is an RTP packet, the Sequence Number of the RTP packet is the sequence number of the media unit, and the Timestamp of the RTP packet is the timestamp of the media unit.
  • each TS segment is regarded as a media unit.
  • Each TS segment may include multiple media frames. The segments are numbered in order of generation, and this number serves as the sequence number of the media unit; the timestamp of the first media frame in each segment indicates the generation time of the segment.
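  • the relationship between media sub-streams and media units described above can be sketched as a small data model. This is only an illustrative sketch: the class names `MediaUnit` and `MediaSubstream`, the field names, and the choice to number units from 1 are assumptions of this example, not definitions taken from the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class MediaUnit:
    substream_no: int  # number of the sub-stream this unit belongs to
    seq: int           # generation order within the sub-stream
    gen_time: float    # generation time (seconds on a shared reference clock)
    payload: bytes


@dataclass
class MediaSubstream:
    substream_no: int
    units: List[MediaUnit] = field(default_factory=list)

    def append(self, gen_time: float, payload: bytes) -> MediaUnit:
        """Store a newly generated unit and assign the next sequence number."""
        unit = MediaUnit(self.substream_no, len(self.units) + 1, gen_time, payload)
        self.units.append(unit)
        return unit


# Hypothetical video sub-stream with two frames generated 40 ms apart.
video = MediaSubstream(substream_no=1)
video.append(10.00, b"frame-a")
video.append(10.04, b"frame-b")
print([(u.seq, u.gen_time) for u in video.units])
```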
  • the server push method is adopted: once there is a new media unit on the server, it will be actively sent to the client.
  • the method of the embodiment of the present application is similar to the various HTTP adaptive streaming schemes (such as HLS and MPEG-DASH) in that it adopts client pull. The difference is that, in the existing HTTP adaptive streaming schemes, the client requests or pulls pre-segmented segments according to the manifest file, and each segment is identified by a URL.
  • in the embodiment of the present application, the media segment is not pre-segmented but is generated by the server in real time according to the client's request, so the client can control the content of the media segment.
  • FIG. 1 is a schematic diagram of a processing process of a method for adaptive real-time delivery of a media stream provided by an embodiment of the present application.
  • the media stream includes at least one media substream, and each media substream is a sequence of media units generated in real time on the server, wherein each media substream is associated with a substream number, and each media unit is associated with a generation time and/or a sequence number indicating the generation order of the media unit in the media substream. The adaptive real-time delivery method of the media stream comprises the following steps:
  • in step S101, a media segment request sent by the client is received, wherein the media segment request carries at least one pull command, each pull command carries zero or more control parameters, and the control parameters include a first type of parameter indicating the target media stream to be transmitted, a second type of parameter indicating the target media sub-stream to be transmitted, and a third type of parameter indicating the candidate media units to be transmitted.
  • control parameters that can be used as the first type of parameter include but are not limited to: media stream identifier, media stream name, program identifier, etc.; control parameters that can be used as the second type of parameter include but are not limited to: sub-stream list, sub-stream pattern, sub-stream type, sub-stream priority, etc.; control parameters that can be used as the third type of parameter include but are not limited to: start sequence number, start time, maximum time offset, unit type, unit priority, etc. It should be understood by those skilled in the art that new control parameters can also be defined as needed in a particular implementation.
  • a media segment request may carry one or more pull commands, and these pull commands all carry respective control parameters, or a pull command may not carry any control parameters. Additionally, new commands other than pull commands can be defined as needed for further implementation.
  • the media segment request may be submitted using any network transmission protocol, such as common HTTP protocol, TCP protocol, UDP protocol, and so on.
  • when the HTTP protocol is used, either the HTTP GET method or the HTTP POST method can be used.
  • when the pull command in the media segment request carries control parameters, encapsulation rules are needed to encapsulate the pull command and its control parameters into a string or byte stream, which is then sent to the server.
  • for example, the command and its control parameters can be encapsulated in the URL as strings.
  • a media segment request carrying a pull command (with multiple control parameters) (split the long URL string into multiple lines for easy display):
  • the parameter names streamID, substreamList, substreamPattern, seqBegin, timeBegin, maxTimeOffset, unitType, and unitPrio respectively represent the media stream ID, substream list, substream pattern, start sequence number, start time, maximum time offset, unit type, and unit priority.
  • the server side can use a web server to receive the media segment request from the above client, extract the corresponding command and its control parameters from the requested URL, and classify the control parameters carried by each pull command: if it is a media stream identifier or media stream name, this parameter is the first type parameter; if it is a substream list or substream pattern, this parameter is the second type parameter; if it is one of the following parameters: start sequence number, start time, maximum time offset, unit type, unit priority, then this parameter is the third type of parameter.
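  • the extraction and classification of control parameters described above might look roughly as follows. This is a hedged sketch, not the application's own example: only the parameter names (streamID, substreamList, seqBegin, and so on) come from the text, while the host name, the `/pull` path, the sample values and the function name `classify_control_parameters` are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Parameter names taken from the description; the grouping follows its classification rule.
FIRST_TYPE = {"streamID"}
SECOND_TYPE = {"substreamList", "substreamPattern"}
THIRD_TYPE = {"seqBegin", "timeBegin", "maxTimeOffset", "unitType", "unitPrio"}


def classify_control_parameters(url: str) -> dict:
    """Split the query parameters of a pull request into the three parameter types."""
    query = parse_qs(urlsplit(url).query)
    classified = {"first": {}, "second": {}, "third": {}, "unknown": {}}
    for name, values in query.items():
        value = values[-1]  # if a parameter is repeated, keep the last value
        if name in FIRST_TYPE:
            classified["first"][name] = value
        elif name in SECOND_TYPE:
            classified["second"][name] = value
        elif name in THIRD_TYPE:
            classified["third"][name] = value
        else:
            classified["unknown"][name] = value
    return classified


# Hypothetical request URL; host, path and values are made up for illustration.
request_url = (
    "http://server.example/pull"
    "?streamID=concert1&substreamList=1,3&seqBegin=120&maxTimeOffset=2"
)
result = classify_control_parameters(request_url)
print(result)
```

  • parameter values are kept as strings here; a real server would still need to validate them and convert them to the types it expects.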
  • A media segment is generated according to the media segment request: for each pull command in the media segment request, a target media stream to be transmitted is selected, at least one target media substream to be transmitted is selected in the target media stream, and the candidate media units to be transmitted in the target media substream are determined; the candidate media units determined by each pull command are then encapsulated into the media segment.
  • Generating the media segment according to the media segment request can be further divided into sub-steps S1021 to S1024. For each pull command in the media segment request, step S1021 selects the target media stream to be transmitted according to the first type parameter, step S1022 selects the target media substream in that target media stream according to the second type parameter, and step S1023 determines the candidate media units to be transmitted in that target media substream according to the third type parameter. Step S1024 then encapsulates the candidate media units determined by all pull commands into a media segment.
  • In step S1021, the target media stream to be transmitted may be selected according to the media stream identifier or the media stream name, and in step S1022, the target media substream may be selected according to parameters such as the substream list and substream pattern.
  • In step S1023, the candidate media units can be determined according to parameters such as the start sequence number, start time and maximum time offset, and in step S1024, one or more media units can be encapsulated into a media segment using a self-defined encapsulation protocol.
  • a simple encapsulation protocol is as follows: a media segment consists of a segment header and a segment payload, and the segment payload is formed by concatenating several media units.
  • the segment header indicates the starting position and length of each media unit.
  • When the media units do not carry a generation time or sequence number, the segment header shall also indicate the sequence number and/or generation time of each media unit; when the media units do not carry a substream number, the segment header shall also indicate the substream number of each media unit.
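  • As a minimal sketch of the simple encapsulation protocol described above, the following Python code packs media units behind a segment header that records the starting position and length of each unit. The field widths (a 16-bit unit count, 32-bit offsets and lengths), the network byte order and the function names are assumptions made for illustration; the text does not define a concrete byte layout, and the optional per-unit sequence number, generation time and substream number fields are omitted.

```python
import struct
from typing import List


def pack_segment(units: List[bytes]) -> bytes:
    """Build a segment: header (count, then offset/length per unit) + payload."""
    header = struct.pack("!H", len(units))  # assumed 16-bit unit count
    offset = 0
    for unit in units:
        # Offsets are relative to the start of the payload (an assumption).
        header += struct.pack("!II", offset, len(unit))
        offset += len(unit)
    return header + b"".join(units)


def unpack_segment(segment: bytes) -> List[bytes]:
    """Recover the media units from a segment produced by pack_segment."""
    (count,) = struct.unpack_from("!H", segment, 0)
    payload_start = 2 + 8 * count  # 2 bytes count + 8 bytes per header entry
    units = []
    for i in range(count):
        offset, length = struct.unpack_from("!II", segment, 2 + 8 * i)
        start = payload_start + offset
        units.append(segment[start:start + length])
    return units
```

  • Because the header lists every offset and length, a client can locate any media unit without scanning the payload.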
  • In step S103, the media segment is sent to the client.
  • The server can select an appropriate method to send the media segment to the client according to the protocol used by the client's media segment request. For example, when the received media segment request uses the HTTP GET method, the generated media segment can be sent in the HTTP response message by putting it into the entity body of the response; if the media segment request is received over an established TCP connection, the generated media segment can be sent to the client directly over that TCP connection.
  • When the server receives continuous media segment requests from the client, it keeps generating new media segments according to the client's requests. These new media segments encapsulate the recently generated media units of the selected target media substreams that are waiting to be sent to the client.
  • the client can parse these media segments to recover the media units of each target media substream in the real-time media stream. This process is shown in FIG. 2 .
  • The client can continuously adjust the control parameters carried by the pull command in the media segment request according to application needs or network transmission conditions, for example by changing the second type of parameters (such as the media substream list) and the third type of parameters (such as start time, maximum time offset and unit priority), so that delivery of the media stream from server to client stays continuous, real-time and adaptive to dynamic network conditions.
  • the method of the embodiments of the present application no longer requires pre-segmentation and manifest files, and thus does not require the client to receive and process manifest files, thereby reducing transmission delay and saving overhead.
  • The client can arbitrarily combine media units from different media substreams through media segment requests, and needs only one request to obtain the required media units of each media substream, which makes it easy to ensure synchronous reception of different media substreams.
  • By adjusting at any time the media substreams and candidate media units to be received, the client can better meet the needs of terminal applications and adapt to changes in network bandwidth, which supports adaptive delivery of media streams that use multi-substream encoding (such as multi-channel or scalable coding).
  • each step may correspond to a functional entity that can run independently and interact with each other.
  • Generating the media segment according to the media segment request includes: if the pull command does not carry the first type parameter, the target media stream to be transmitted is the media stream specified by default; if the pull command does not carry the second type parameter, the target media substream to be transmitted is at least one media substream specified by default in the target media stream; if the pull command does not carry the third type parameter, the candidate media units include the media units specified by default in the target media substream.
  • The media units specified by default are all the media units in the target media substream whose sequence number interval from the latest media unit is less than a first preset value, or all the media units in the target media substream whose generation time interval from the latest media unit is less than a second preset value; both the first preset value and the second preset value are obtained according to the target media substream.
  • When there is only one media stream in the server, the pull command sent by the client does not need to carry the first type of parameters, and that media stream is the selected target media stream. When there are multiple media streams in the server, one of the media streams to be transmitted can be designated as the default media stream.
  • When the pull command sent by the client does not carry any first-type parameters, the default media stream is selected as the target media stream.
  • For a target media stream, the media substreams it contains may be varied.
  • It may contain different types of media substreams: video stream, audio stream, subtitle stream, additional information stream, picture stream, etc. For the same type of media substream, it may contain substreams with different bit rates: for video streams, it may contain media substreams corresponding to different resolutions and frame rates; for audio streams, it may contain media substreams corresponding to different sampling rates.
  • For video streams of the same type and bit rate, it may contain multiple coding layers (for example when scalable video coding, SVC, is used), and these different coding layers correspond to different priorities.
  • The server should select, among all media substreams, one or more media substreams that suit most terminal displays and can be transmitted normally under most network bandwidth conditions, as the default media substreams of the target media stream. When the client does not carry any second type parameters, these default media substreams are selected as the target media substreams.
  • the server may use the default specified media unit as a candidate media unit.
  • These default specified media units are all media units in the target media substream whose sequence number interval from the latest media unit is less than the first preset value, or all media units in the target media substream whose generation time interval from the latest media unit is less than the second preset value.
  • The first preset value or the second preset value set for each target media substream shall keep the sending of the target media substreams synchronized.
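  • The default selection rule can be illustrated with a short Python sketch. The function name `default_units` and the use of plain numbers to stand for media units are assumptions for illustration; the same function serves both variants of the rule, since each keeps the units whose distance from the latest unit (in sequence numbers or in generation time) is less than a preset value.

```python
def default_units(values, preset):
    """Return the sequence numbers (or generation times) of the default units.

    values: sequence numbers or generation times of the units in one
            target media substream.
    preset: the first preset value (for sequence numbers) or the second
            preset value (for generation times).
    """
    values = sorted(values)
    if not values:
        return []
    latest = values[-1]
    # Keep units whose interval from the latest unit is strictly below preset.
    return [v for v in values if latest - v < preset]
```

  • For instance, with a first preset value of 3 and units numbered 20 to 27, the default units are those numbered 25, 26 and 27.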
  • FIG. 2 is a schematic diagram of a real-time transmission process of a media stream according to an embodiment of the present application.
  • As shown in FIG. 2, the server contains only one media stream, S1.
  • When the server receives a media segment request MS_REQ1, MS_REQ1 contains only one pull command and that pull command carries no parameters, so the target media stream selected by the server is the default media stream S1, and the selected target media substreams are the default media substreams 1 and 4. For media substream 1 the first preset value is 3, and for media substream 4 the first preset value is 4. The server therefore determines the candidate media units of media substream 1 and media substream 4 respectively, encapsulates them into the first media segment MS1, and returns it to the client.
  • Embodiment 3 in the following embodiment, how the server selects the target media substream to be transmitted according to the second type of parameters will be described.
  • The second type of parameters given in the embodiments of the present application include two types: the substream list and the substream pattern.
  • The substream pattern is an N-bit bit stream, where N is the number of media substreams contained in the target media stream. Each bit of the substream pattern is associated with a specific media substream of the target media stream and indicates whether that media substream is a target media substream to be transmitted.
  • For example, suppose the substream pattern is the bit stream 01101000. From left to right, the bits correspond to substream 1 to substream 8, and a bit value of 1 indicates that the associated substream is a target media substream. The pattern above therefore selects three target media substreams: substream 2, substream 3 and substream 5.
  • The substream list can also be used to represent the target media substreams; when the number of target media substreams is large and the substreams need to be specified at the same time, the substream pattern is recommended.
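  • The relation between the substream pattern and the substream list can be sketched in Python as follows. The function names are hypothetical and the pattern is represented as a text string of '0' and '1' characters, which is an assumption about the encoding; the left-to-right bit order follows the example in the text.

```python
def substreams_from_pattern(pattern):
    """Return the substream numbers selected by a pattern such as '01101000'."""
    # Bit i (counted from the left, starting at 1) is associated with substream i.
    return [number for number, bit in enumerate(pattern, start=1) if bit == "1"]


def pattern_from_substreams(selected, total):
    """Build an N-bit pattern string from a collection of substream numbers."""
    chosen = set(selected)
    return "".join("1" if n in chosen else "0" for n in range(1, total + 1))
```

  • The pattern has a fixed length of N bits regardless of how many substreams are selected, whereas a list grows with the number of selected substreams.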
  • The characteristics of a substream can also be defined as second-type parameters.
  • These substream characteristics include: substream type, substream priority, viewpoint number, channel number, video resolution, etc.
  • One or more substream feature parameters can be used to indicate the conditions that a target media substream must meet, and the server selects the final target media substreams accordingly.
  • Embodiment 4 In the following embodiments, an example will be given to illustrate how the server transmits the sub-stream related information to the client.
  • The client needs to specify the target media substream through the second type of parameters.
  • The premise is that the client knows which media substreams are included in the current media stream and the characteristics of these media substreams.
  • Only then can the client select the target media substream to be transmitted according to the application requirements and network transmission conditions.
  • This descriptive information about the media substreams in the media stream can be provided by the server application layer to the client application layer, and the client can obtain it in a way independent of the current transmission process (such as submitting additional request messages or through a third-party server).
  • The information can also be obtained directly from the server during the transmission process.
  • In this embodiment, a method is proposed for directly encapsulating the media substream description information into a media segment and transmitting it to the client.
  • the minimum information contained in the media substream description information is: which media substreams are included in the current media stream. If the numbers of the media substreams are consecutively numbered from 1 to N, the media substream description information only needs to include the number N of the media substreams to obtain the numbers of all the media substreams. When the media stream adopts various multi-substream encoding, more substream feature information will be introduced into the media substream description information:
  • The media component identifier indicates the different ways in which information in a media stream is obtained.
  • For example, the video information collected by different cameras in a live broadcast corresponds to different media components.
  • Each media substream is associated with one media component, but the same media component can correspond to multiple media substreams.
  • For example, a video captured from the same viewpoint can be represented by multiple media substreams encoded at different bit rates.
  • the types of media substreams include but are not limited to: video, audio, picture, subtitle, etc. or mixed types; the mixed types refer to a media substream that contains multiple types of media units, for example, a substream may contain both video and audio.
  • The substream code rate indicates the code rate of the media substream. If a media substream has a fixed code rate (CBR), the substream code rate is the code rate of that media substream; if a media substream has a variable code rate (VBR), the substream code rate is the average bit rate of that substream over a period of time.
  • The priority of the media substream indicates the importance of different media substreams in the transmission process.
  • When the media stream adopts scalable coding, such as Scalable Video Coding (SVC), the media stream produces coded streams at multiple levels, namely a base layer and multiple enhancement layers, and each media substream corresponds to one coding level.
  • When the media stream adopts multi-view encoding, such as 3D video, the media stream produces multiple encoded streams for different viewpoints, and each media substream corresponds to a viewpoint.
  • When multiple viewpoints are jointly encoded into one media substream, one media substream may have multiple viewpoint identifiers.
  • When the type of a media substream is video, the frame rate used for video coding.
  • When the media stream adopts multi-channel encoding, the media stream produces encoded data on multiple channels respectively. Several channels can form a channel group for multi-channel joint encoding, so each media substream corresponds to one or more channel IDs.
  • When the media substream is an audio stream, the sampling rate used for encoding.
  • When the media substream is an audio stream containing vocals, the language of the vocals.
  • each media stream can customize its own media sub-stream description information according to the actual situation.
  • Examples of media substream description information in three application scenarios are given in Fig. 4 to Fig. 6. In Fig. 4, substream 1, substream 2 and substream 3 are substreams of the same media content (the media component identifiers are all 10) encoded at three different code rates. In Fig. 5, which also lists substream 1, substream 2 and substream 3, substream 4 and substream 5 correspond to two different channels of the same media content. In Fig. 6, substreams 1 to 4 correspond to a base layer and three enhancement layers of the same media content (the media component identifiers are all 30) when scalable video coding is used.
  • After receiving the media segment, the client parses the media substream description information from it, and then selects the target media substreams to be transmitted in real time according to the actual needs of the service layer, terminal performance and network conditions, which supports adaptive delivery of media streams that use various kinds of multi-substream encoding.
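  • One way a client might use the description information to choose a target media substream is sketched below in Python. Everything concrete here is hypothetical: the dictionary keys, the code rates in kbit/s and the selection policy (highest code rate that fits the estimated bandwidth, otherwise the lowest available) are illustrative assumptions, not values or rules taken from the figures.

```python
def pick_substream(descriptions, component_id, available_kbps):
    """Pick one substream of a media component that fits the bandwidth.

    descriptions: list of dicts with hypothetical keys
                  'substream', 'component' and 'kbps'.
    Returns the chosen substream number, or None if the component is unknown.
    """
    candidates = [d for d in descriptions if d["component"] == component_id]
    if not candidates:
        return None
    affordable = [d for d in candidates if d["kbps"] <= available_kbps]
    if affordable:
        # Highest code rate that still fits the estimated bandwidth.
        return max(affordable, key=lambda d: d["kbps"])["substream"]
    # Nothing fits: fall back to the lowest code rate.
    return min(candidates, key=lambda d: d["kbps"])["substream"]


# Made-up description data for three substreams of one media component.
DESCRIPTIONS = [
    {"substream": 1, "component": 10, "kbps": 500},
    {"substream": 2, "component": 10, "kbps": 1000},
    {"substream": 3, "component": 10, "kbps": 2000},
]
```

  • The chosen substream number would then be placed in the substream list or substream pattern of the next pull command.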
  • the media sub-stream description information of a media stream generally remains unchanged, therefore, it is not necessary to encapsulate the above-mentioned media sub-stream description information in each media segment.
  • For example, when the server receives the first media segment request from the client, it can encapsulate the media substream description information in the first returned media segment, and need not encapsulate it again in subsequent media segments.
  • Embodiment 5 In the following embodiments, an example will be given to illustrate how the server determines the candidate media unit to be transmitted through the third type of parameters.
  • Generating the media segment according to the media segment request further includes: if the pull command carries at least one third-type parameter, where each third-type parameter corresponds to at least one constraint condition on the candidate media units, the candidate media units to be transmitted include all media units in each target media substream that simultaneously satisfy all the constraint conditions corresponding to the third-type parameters.
  • the constraint condition corresponding to the start sequence number is: if the start sequence number is valid, the sequence number of the candidate media unit is after the start sequence number or equal to the start sequence number.
  • The constraint condition corresponding to the start time is: if the start time is valid, the generation time of the candidate media unit is after the start time.
  • the constraint condition corresponding to the maximum time offset is: if the maximum time offset is valid, in the target media substream, the generation time interval between the candidate media unit and the latest media unit is less than the maximum time offset.
  • the above-mentioned third type of parameter validity and invalidity refers to whether the value of the parameter is within a specified range. Taking the start sequence number as an example, the value of the start sequence number cannot exceed the sequence number of the current latest media unit. On the other hand, to ensure real-time performance, the value of the start sequence number cannot be earlier than the sequence number of an existing media unit. The starting sequence number within the above range is valid. If a third-type parameter is invalid, it is equivalent to not carrying the third-type parameter. When all the third-type parameters are invalid, the candidate media unit to be transmitted in the target media substream is the default specified media unit.
  • Each pull command may carry one or more of the third type parameters.
  • The pull command may also carry other self-defined third type parameters. For example, further third-type parameters such as media unit type, minimum priority or priority range can be defined from the characteristics of the media unit and used as constraints on the media unit.
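  • The three constraint conditions for the start sequence number, start time and maximum time offset can be combined as in the following Python sketch. The `MediaUnit` class, its field names and the function name are hypothetical. As a simplification, a parameter that is absent or invalid is passed as None; the validity checks and the fallback to the default media units when every parameter is invalid are left out.

```python
from dataclasses import dataclass


@dataclass
class MediaUnit:
    substream: int
    seq: int
    gen_time: float
    payload: bytes = b""


def select_candidates(units, seq_begin=None, time_begin=None, max_time_offset=None):
    """Keep the units of one target media substream that satisfy all constraints."""
    if not units:
        return []
    latest_time = max(u.gen_time for u in units)
    selected = []
    for u in units:
        # Start sequence number: the unit is at or after the start sequence number.
        if seq_begin is not None and u.seq < seq_begin:
            continue
        # Start time: the unit was generated after the start time.
        if time_begin is not None and u.gen_time <= time_begin:
            continue
        # Maximum time offset: the interval from the latest unit is below the offset.
        if max_time_offset is not None and latest_time - u.gen_time >= max_time_offset:
            continue
        selected.append(u)
    return selected
```

  • Each additional self-defined third type parameter would add one more test inside the loop, since a candidate must satisfy every constraint at once.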
  • When only one target media substream is selected according to the second type parameter, it is only necessary to judge whether the media units in that target media substream satisfy the constraints corresponding to the various third type parameters.
  • When multiple target media substreams are selected, these target media substreams should use synchronized numbering, or use the same clock for timing.
  • the synchronization number refers to: on the server, every time a specified time period elapses, all media units generated by each target media substream within the time period are associated with the same new sequence number.
  • the above-mentioned specified time period may be of fixed length or variable length, may be preset, or may be dynamically determined according to the actual generation of the media unit.
  • the serial number of the media unit can not only be used to indicate the generation sequence of the media units in each media substream, but also the synchronization relationship between the media units in different target media substreams.
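  • Synchronized numbering can be illustrated with a small Python sketch. It assumes a fixed-length time period that starts at time 0 and sequence numbers that start at 1; the text also allows variable-length or dynamically determined periods, which this sketch does not cover. The function name and the tuple layout are hypothetical.

```python
def assign_sync_numbers(units, period):
    """Attach a synchronized sequence number to each media unit.

    units:  list of (substream_number, generation_time) tuples.
    period: length of the fixed time period (an assumption of this sketch).
    Returns a list of (substream_number, generation_time, sequence_number).
    """
    numbered = []
    for substream, gen_time in units:
        # All units generated within the same period share one sequence number.
        seq = int(gen_time // period) + 1
        numbered.append((substream, gen_time, seq))
    return numbered
```

  • Units of different substreams that fall in the same period receive the same number, so one start sequence number can address several substreams at once.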
  • Figure 2 shows a real-time delivery process of a media stream.
  • The client requests the media data of the target media stream S1, where the target media stream S1 is the default media stream on the server and includes 4 media substreams.
  • Substream 1, substream 2 and substream 3 are three media substreams that are numbered synchronously (for example, three video streams encoded at different bit rates), and substream 4 uses independent numbering (for example, an independently encoded audio stream); the media substreams specified by default are substream 1 and substream 4. Since substream 1 and substream 4 are not numbered synchronously, the sequence numbers of the latest media units of substream 1 and substream 4 differ after the client receives the media segment.
  • In subsequent media segment requests, each pull command therefore carries a different media substream list and the corresponding start sequence number, which specify the target media substream and the media units to be sent respectively, so that continuous reception of substream 1 and substream 4 can each be guaranteed.
  • the target media stream in Figure 7 is similar to Figure 2, except that the client actively requests the media data of sub-stream 1 and sub-stream 2 (for example, sub-stream 1, sub-stream 2 and sub-stream 3 respectively use scalable video coded base layer and two enhancement layers).
  • Since substream 1 and substream 2 are numbered synchronously, the client uses only one pull command when it submits a media segment request. The substream list carried by that command includes two target media substreams, substream 1 and substream 2, and the start sequence number it carries indicates the candidate media units in substream 1 and substream 2 at the same time.
  • The target media stream in Fig. 8 is similar to Fig. 2 and Fig. 7; the difference is that the client simultaneously requests the synchronized media data of three substreams (substream 1, substream 2 and substream 4). Although substream 4 is not numbered synchronously with substreams 1 and 2, the generation time of all substreams in the target media stream S1 uses the same reference clock.
  • Therefore, the client can still use only one pull command in the media segment request to pull the three media substreams simultaneously.
  • three target media sub-streams are specified in the sub-stream list carried by the pull command, and the start time carried by the pull command is the latest generation time of the media unit currently received by the client. The start time can ensure that all newly generated media units to be sent are continuously encapsulated into media segments and sent to the client.
  • The client can receive media streams in real time by continuously submitting media segment requests, and can adapt to changes in application requirements and network status by adjusting the target media substream list. As shown in FIG. 2, substream 1 and substream 4 are received at the beginning; the target media substream can then be modified in MS_REQ4 to include only substream 4, after which reception automatically switches to only the media units of substream 4.
  • Embodiment 6 in the following embodiment, the processing procedure when the server encapsulates the candidate media unit into a media segment will be described.
  • Encapsulating the candidate media units determined by each pull command into media segments includes: encapsulating the candidate media units into the media segment according to the order in which the pull commands appear in the media segment request. If the parameters carried by a pull command include a unit sorting method, the determined candidate media units are sorted according to that unit sorting method and then encapsulated into the media segment; if a pull command does not carry a unit sorting method, the determined candidate media units are sorted according to the default sorting method and then encapsulated into the media segment.
  • Time forward (TIME_FORWARD): the candidate media units are sorted by generation time, and the earlier a candidate media unit was generated, the earlier it is encapsulated into the media segment. This is the default sorting method.
  • Time reverse (TIME_BACKWARD): the candidate media units are sorted in reverse order of generation time, and the candidate media units generated later are encapsulated into the media segment first.
  • Sequence number forward (SEQ_FORWARD): the candidate media units are sorted by sequence number, and the candidate media units with earlier sequence numbers are encapsulated into the media segment first.
  • Sequence number reverse (SEQ_BACKWARD): the candidate media units are sorted in reverse order of sequence number, and the candidate media units with later sequence numbers are encapsulated into the media segment first.
  • Substream number order (SSNO_ORDER): the candidate media units of each substream are encapsulated in sequence according to the order of the substream numbers.
  • Substream list order (SSLIST_ORDER): the candidate media units of the multiple substreams are encapsulated sequentially according to the order in which the substream numbers appear in the substream list.
  • the unit sorting method can also be a cascade of the above basic sorting methods, such as SSLIST_ORDER+SEQ_BACKWARD.
  • The meaning of this cascade is that the candidate media units are first sorted according to the first basic sorting method, then candidate media units that share the same position after that sort are ordered according to the second basic sorting method, and so on until the sorting is complete. For both basic and cascaded sorting methods, if candidate media units still share the same position after sorting, they are sorted according to the default sorting method.
  • Sorting method 1: The media segment request consists of two pull commands.
  • The target media substream of the first pull command is substream 4, and the target media substreams of the second pull command are substream 1 and substream 2. Following the order of the pull commands, the candidate media units of substream 4 are encapsulated into the media segment first. Since the first pull command does not specify any unit sorting method, the default sorting method, time forward, is used to encapsulate candidate media units D58 to D62. Then, since the unit sorting method carried by the second pull command is time reverse (TIME_BACKWARD), the media units of substream 1 and substream 2 are encapsulated in reverse time order as A27/B27, A26/B26, A25/B25. Media units with the same position are sorted by substream number by default. The final encapsulation order of the candidate media units is shown as sorting method 1 in FIG. 9.
  • Sorting method 2: The media segment request includes only one pull command, and the unit sorting method carried by the pull command is a cascade of two basic sorting methods: SSLIST_ORDER+SEQ_FORWARD.
  • The first basic sorting method is substream list order (SSLIST_ORDER): after the candidate media units of substream 4, the candidate media units of substream 1 are encapsulated, followed by those of substream 2.
  • The second basic sorting method is sequence number forward (SEQ_FORWARD), that is, for candidate media units belonging to the same substream, the candidate media units are sorted by sequence number from front to back.
  • The resulting encapsulation order is shown as sorting method 2 in Figure 9.
  • Sorting method 3: The media segment request includes only one pull command, and the unit sorting method carried by the pull command is a cascade of two basic sorting methods: SSNO_ORDER+SEQ_BACKWARD.
  • The first basic sorting method is substream number order (SSNO_ORDER), which means the candidate media units of each substream are encapsulated in order of substream number from small to large: the candidate media units of substream 1 first, then those of substream 2, and then those of substream 3.
  • The second basic sorting method is sequence number reverse (SEQ_BACKWARD), that is, for candidate media units belonging to the same substream, the candidate media units are sorted by sequence number from back to front. The final encapsulation order of the candidate media units is shown as sorting method 3 in Fig. 9.
  • Sorting method 4: The media segment request includes only one pull command, and the pull command carries only one unit sorting method, TIME_FORWARD, that is, all candidate media units are sorted from front to back according to their generation time. The final encapsulation order of the candidate media units is shown as sorting method 4 in FIG. 9.
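  • The basic and cascaded unit sorting methods can be sketched in Python with a composite sort key. The method names come from the text; the tuple layout of a media unit, the function name and the sample values in the usage note are hypothetical. Ties that remain after the cascade are resolved here by generation time and then by substream number, which is one reading of the default tie-breaking described above.

```python
def sort_units(units, methods="", substream_list=()):
    """Sort media units given as (substream_number, seq, gen_time) tuples.

    methods: one basic method name or a '+'-joined cascade,
             e.g. 'SSLIST_ORDER+SEQ_FORWARD'. Empty means the default.
    substream_list: substream numbers in list order, used by SSLIST_ORDER.
    """
    order = list(substream_list)
    basic = {
        "TIME_FORWARD": lambda u: u[2],
        "TIME_BACKWARD": lambda u: -u[2],
        "SEQ_FORWARD": lambda u: u[1],
        "SEQ_BACKWARD": lambda u: -u[1],
        "SSNO_ORDER": lambda u: u[0],
        "SSLIST_ORDER": lambda u: order.index(u[0]),
    }
    names = methods.split("+") if methods else ["TIME_FORWARD"]
    keys = [basic[name] for name in names]
    # Cascade keys first, then the assumed default tie-breakers.
    return sorted(units, key=lambda u: tuple(k(u) for k in keys) + (u[2], u[0]))
```

  • For example, with units from substreams 1, 2 and 4 and the substream list 4, 1, 2, the cascade SSLIST_ORDER+SEQ_FORWARD places all units of substream 4 first, then those of substream 1 and substream 2, each in ascending sequence number order.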
  • This embodiment does not limit the definition of new unit sorting methods.
  • For example, new unit sorting methods can be defined such as high-priority unit first (HGH_PRIOR_FIRST) and substream priority order (SS_PRIOR_ORDER).
  • The candidate media units determined by each pull command need not be encapsulated according to the order in which the pull commands appear in the media segment request. For example, all candidate media units can be ordered and encapsulated into the media segment without distinguishing between pull commands.
  • The order in which media units are encapsulated into media segments is controlled through the pull commands and the unit sorting methods, so that when the network transmission bandwidth is insufficient, specific candidate media units of specific substreams can be sent preferentially. For example, high-priority media substreams can be sent first: when a video substream and an audio substream are transmitted at the same time, audio transmission can be guaranteed first; when the base layer and enhancement layer code streams are transmitted at the same time, the candidate media units of the base layer are sent preferentially.
  • Delivery of newly generated candidate media units can also be prioritized to improve user experience.
  • The media units of each substream can be arbitrarily combined according to the client's request, and the media segment can be generated in real time and delivered to the client.
  • As a result, the server only needs to store the media units by substream and does not need to generate fragments for the various substream combinations in advance, which reduces the storage requirements of the server. It also simplifies synchronization on the client: with a single request the client can obtain the combined segment of each substream for the same time period, which makes it easy to ensure synchronous reception of the substreams.
  • The client can dynamically adjust the target media substreams in the media segment request according to application needs and network conditions, so that adaptive delivery of various types of multi-substream media streams (such as multi-rate encoding, multi-view, multi-channel and scalable coding) can be supported uniformly.
  • FIG. 10 is a schematic structural diagram of an adaptive real-time delivery server for media streams according to an embodiment of the present application.
  • the media stream includes at least one media substream, and each media substream is a sequence of media units generated in real time on the server. Each media substream is associated with a substream number, and each media unit is associated with a generation time and/or a sequence number indicating the order in which the media unit was generated in the media substream. The server 10 includes a client interface component 100, a media segment generating component 200, and a media segment sending component 300.
  • the client interface component 100 is configured to receive a media segment request sent by the client, wherein the media segment request carries at least one pull command, each pull command carries zero or more control parameters, and the control parameters include a first type of parameter indicating the target media stream to be transmitted, a second type of parameter indicating the target media substream to be transmitted, and a third type of parameter indicating the candidate media units to be transmitted.
  • the media segment generating component 200 is configured to generate a media segment according to the media segment request: first, for each pull command in the media segment request, it selects the target media stream to be transmitted, selects at least one target media substream to be transmitted in that target media stream, and determines the candidate media units to be transmitted in each target media substream; then, it encapsulates the candidate media units determined by all pull commands into the media segment.
  • the media segment sending component 300 is configured to send the generated media segment to the client.
  • the server 10 in this embodiment of the present application can arbitrarily combine the media units of each substream according to the client's request, generate media segments in real time, and return them to the client, thereby reducing storage overhead on the server, simplifying synchronous transmission between substreams, and effectively reducing media stream transmission delay and overhead.
  • the client interface component 100 is used to receive a media segment request from a client;
  • the media segment request can contain one or more pull commands, and each pull command can carry zero, one, or more control parameters;
  • the control parameters fall into three categories: the first type of parameter indicates the target media stream to be transmitted; the second type of parameter indicates the target media substreams to be transmitted in the target media stream; the third type of parameter indicates the candidate media units to be transmitted in the target media substream.
  • the client interface component 100 can use any specified protocol to receive the media segment request. For example, when HTTP is used, the client interface component 100 can be a web server that receives media segment requests sent over HTTP; when a TCP-based protocol is used, the client interface component is a TCP server that provides a fixed service port.
  • the media segment generating component 200 is configured to generate the required media segment according to the media segment request of the client.
  • the media segment request is obtained from the client interface component 100, and the pull commands and their control parameters are parsed out. Then the target media stream to be transmitted is selected according to the first type of parameter, the target media substreams to be transmitted are selected according to the second type of parameter, and the candidate media units to be transmitted in each target media substream are determined according to the third type of parameter.
  • finally, the candidate media units determined by each pull command are extracted from the media stream storage unit, encapsulated into a media segment, and passed directly to the media segment sending component 300 for sending.
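The parse, select, and encapsulate steps described here can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `MediaUnit`, `PullCommand`, `generate_segment`, the in-memory `store` layout, and the default stream id `"main"` are all hypothetical, and the defaults are simplified (a missing substream list selects every substream, and a missing start sequence number selects every stored unit).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MediaUnit:
    substream: int   # substream number
    seq: int         # sequence number within the substream
    payload: bytes


@dataclass
class PullCommand:
    stream_id: Optional[str] = None    # first type of parameter (assumed name)
    substreams: Optional[list] = None  # second type of parameter (assumed name)
    start_seq: Optional[int] = None    # third type of parameter (assumed name)


def generate_segment(store, commands, default_stream="main"):
    """Collect candidate units for each pull command, in command order.

    `store` is a hypothetical mapping:
    stream id -> substream number -> list of MediaUnit in generation order.
    """
    segment = []
    for cmd in commands:
        # Step 1: select the target media stream.
        stream = store[cmd.stream_id or default_stream]
        # Step 2: select the target media substreams.
        targets = cmd.substreams if cmd.substreams is not None else sorted(stream)
        # Step 3: determine the candidate units and append them to the segment.
        for number in targets:
            for unit in stream[number]:
                if cmd.start_seq is None or unit.seq >= cmd.start_seq:
                    segment.append(unit)
    return segment
```

In this sketch the segment is simply a list of units; a real server would serialize it into whatever container format the client expects.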
  • the server 10 in this embodiment of the present application further includes at least one media stream real-time generating component, which generates one or more media streams in real time by itself or receives them from other servers;
  • the media stream includes at least one media substream, and each media substream is a sequence of media units generated in real time on the server;
  • each media substream is associated with a substream number, and each media unit is associated with a generation time and/or a sequence number, where the sequence number indicates the order in which the media units are generated in the media substream;
  • the media stream real-time generation component includes one or more media substream real-time generation components, each of which performs one or more processing steps for generating a media substream in real time.
  • the processing steps include, but are not limited to: real-time acquisition of media signals, encoding and compression, transmission encapsulation, and pre-segmentation.
  • the real-time media sub-stream generation component can also receive media streams from other devices in real time, or convert existing media stream files on a server into real-time generated media unit sequences.
  • the media segment generation component 200 is further configured as follows: when the pull command does not carry the first type of parameter, the target media stream to be transmitted is a media stream specified by default; when the pull command does not carry the second type of parameter, the target media substreams to be transmitted are at least one media substream specified by default in the target media stream; and when the pull command does not carry the third type of parameter, the candidate media units are the media units specified by default in the target media substream.
  • the media units specified by default are either all media units in the target media substream whose sequence number differs from that of the latest media unit by less than a first preset value, or all media units in the target media substream whose generation time differs from that of the latest media unit by less than a second preset value. Both the first preset value and the second preset value are obtained according to the target media substream.
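The two default rules for candidate media units can be illustrated as follows. This is a sketch only: the function names and the parameters `max_gap` (standing in for the first preset value) and `max_age` (standing in for the second preset value) are hypothetical, and units are represented just by their sequence numbers or generation times.

```python
def default_units_by_seq(seqs, max_gap):
    """Keep sequence numbers whose distance from the latest one is below max_gap."""
    if not seqs:
        return []
    latest = max(seqs)
    return [s for s in seqs if latest - s < max_gap]


def default_units_by_time(times, max_age):
    """Keep generation times whose distance from the latest one is below max_age."""
    if not times:
        return []
    latest = max(times)
    return [t for t in times if latest - t < max_age]
```

With `max_gap = 3`, a substream holding sequence numbers 5 through 10 yields the three newest units (8, 9, 10), so a client that sends no third type of parameter receives only recent data.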
  • the second type of parameter includes a substream list, and the substream list includes the substream number of at least one target media substream.
  • the second type of parameter includes a substream pattern. The substream pattern is an N-bit string, where N is the number of media substreams included in the target media stream; each bit of the pattern is associated with a specific media substream of the target media stream and indicates whether that media substream is a target media substream to be transmitted.
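A substream pattern can be decoded into a substream list with a few lines of Python. The sketch assumes, since the source does not fix a bit order, that the pattern is written as a string of `0` and `1` characters and that the i-th character (left to right) corresponds to the i-th entry of the stream's substream numbers; `decode_substream_pattern` is a hypothetical name.

```python
def decode_substream_pattern(pattern, substream_numbers):
    """Return the substream numbers whose bit is set in the pattern.

    pattern           -- string of '0'/'1', one character per substream
    substream_numbers -- substream numbers of the target media stream, in order
    """
    if len(pattern) != len(substream_numbers):
        raise ValueError("pattern length must equal the number of substreams")
    if set(pattern) - {"0", "1"}:
        raise ValueError("pattern may contain only '0' and '1'")
    return [n for bit, n in zip(pattern, substream_numbers) if bit == "1"]
```

For a stream with substreams 0 to 3, the pattern `1010` selects substreams 0 and 2. Compared with an explicit list, the pattern has a fixed size of N bits regardless of how many substreams are selected.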
  • the media segment generation component 200 is further configured to encapsulate media substream description information into the media segment. The description information includes at least one entry; each entry corresponds to one media substream of the media stream and contains at least one field: the media substream number.
  • each entry further includes at least one of the following fields: media component identifier, substream type, substream bit rate, substream priority, coding level, viewpoint identifier, video resolution, video frame rate, channel identifier, audio sample rate, language type.
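One possible in-memory shape for a description entry is a record with a mandatory substream number and optional fields. The sketch below models only a few of the listed fields; the class name, field names, and the dictionary output are assumptions for illustration, and the source does not prescribe a serialization format.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SubstreamEntry:
    number: int                           # media substream number (mandatory)
    substream_type: Optional[str] = None  # e.g. "video" or "audio"
    bitrate_bps: Optional[int] = None     # substream bit rate
    priority: Optional[int] = None        # substream priority
    language: Optional[str] = None        # language type


def to_description(entries):
    """Turn entries into plain dicts, omitting fields that are not set."""
    return [
        {key: value for key, value in asdict(entry).items() if value is not None}
        for entry in entries
    ]
```

Carrying such entries inside the segment lets the client learn which substreams exist and what they contain without fetching a separate manifest.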
  • the media segment generation component 200 is further configured so that, when the pull command carries at least one third type of parameter, each such parameter corresponds to at least one constraint on the candidate media units, and the candidate media units to be transmitted are all media units in each target media substream that satisfy all of the constraints corresponding to the third type of parameters simultaneously.
  • the media units in the target media substreams use synchronized sequence numbering: each time a specified time period elapses, all media units generated by every target media substream within that period are associated with the same new sequence number.
  • the third type of parameter includes a start sequence number. The corresponding constraint is: if the start sequence number is valid, the sequence number of a candidate media unit is equal to or after the start sequence number.
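One way to realize synchronized numbering is to derive the sequence number from the generation time. This is an assumption for illustration, since the source only requires that all units generated in the same period share one new number; the function name, the millisecond units, and the `epoch_ms` starting point are hypothetical.

```python
def sync_sequence_number(gen_time_ms, epoch_ms, period_ms):
    """Map a generation time to the sequence number shared by its period."""
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    # Integer division groups all times in one period under the same number.
    return (gen_time_ms - epoch_ms) // period_ms
```

With `epoch_ms = 10000` and `period_ms = 2000`, a video unit generated at 12010 ms and an audio unit generated at 13990 ms both receive sequence number 1, so a single start sequence number addresses the same time period in every target substream.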
  • the generation times of the media units in all target media substreams are derived from the same clock on the server, and the third type of parameter includes a start time. The corresponding constraint is: if the start time is valid, the generation time of a candidate media unit is after the start time.
  • the third type of parameter includes a maximum time offset. The corresponding constraint is: if the maximum time offset is valid, then within the target media substream, the interval between the generation time of a candidate media unit and that of the latest media unit is less than the maximum time offset.
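The three constraints (start sequence number, start time, maximum time offset) combine as a conjunction. The following sketch filters the units of one target substream; the names `Unit` and `select_candidates` are hypothetical, times are assumed to be integer milliseconds, and a parameter is treated as "valid" when it is not `None`, which is an assumption since the source does not define validity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    seq: int      # sequence number
    time_ms: int  # generation time in milliseconds


def select_candidates(units, start_seq=None, start_time_ms=None, max_offset_ms=None):
    """Return the units that satisfy every constraint that was supplied."""
    if not units:
        return []
    latest_time = max(u.time_ms for u in units)

    def satisfies_all(u):
        if start_seq is not None and u.seq < start_seq:
            return False  # sequence number must be at or after the start
        if start_time_ms is not None and u.time_ms <= start_time_ms:
            return False  # generation time must be after the start time
        if max_offset_ms is not None and latest_time - u.time_ms >= max_offset_ms:
            return False  # must be close enough to the latest unit
        return True

    return [u for u in units if satisfies_all(u)]
```

A client that fell behind can send only a maximum time offset to skip stale units, while a client that is keeping up can send the start sequence number that follows the last unit it received.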
  • the media segment generation component 200 is further configured to encapsulate the candidate media units determined by each pull command into the media segment in the order in which the pull commands appear in the media segment request. If the parameters carried by a pull command include a unit sorting method, the candidate media units determined by that pull command are sorted according to it before being encapsulated into the media segment; if no unit sorting method is carried, the candidate media units are sorted according to a default sorting method before being encapsulated.
  • the unit sorting method is a cascade of one or more basic sorting methods. The basic sorting methods are: time forward sorting, time reverse sorting, sequence number forward sorting, sequence number reverse sorting, substream number order sorting, and substream list order sorting.
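A cascade of basic sorting methods behaves like a multi-key sort: the first method decides the order, and each later method breaks ties left by the earlier ones. The sketch below is illustrative only; the method identifiers such as `"seq_desc"` and the names `Unit` and `sort_units` are hypothetical and do not come from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    substream: int
    seq: int
    time_ms: int


def sort_units(units, methods, substream_list=()):
    """Sort units by a cascade of basic methods, earliest method first."""
    # Position of each substream in the request's substream list.
    rank = {number: index for index, number in enumerate(substream_list)}
    key_functions = {
        "time_asc": lambda u: u.time_ms,
        "time_desc": lambda u: -u.time_ms,
        "seq_asc": lambda u: u.seq,
        "seq_desc": lambda u: -u.seq,
        "substream_number": lambda u: u.substream,
        # Substreams missing from the list sort after all listed ones.
        "substream_list": lambda u: rank.get(u.substream, len(rank)),
    }
    return sorted(units, key=lambda u: tuple(key_functions[m](u) for m in methods))
```

For example, the cascade `["seq_desc", "substream_list"]` with the substream list `[1, 0]` places the newest units first and, within each sequence number, puts substream 1 (say, audio) ahead of substream 0 (say, video), which matches the goal of sending new and high-priority data first when bandwidth is short.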
  • clients and servers are generally remote from each other and typically interact through a communication network.
  • the relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.
  • the server can be a cloud server, also known as a cloud computing server or cloud host. It is a host product in the cloud computing service system that addresses the difficult management and weak business scalability of traditional physical hosts and VPS services.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device may include:
  • a memory 1201, a processor 1202, and a computer program stored on the memory 1201 and executable on the processor 1202.
  • when the processor 1202 executes the program, the adaptive real-time delivery method of the media stream provided in the above embodiments is implemented.
  • the electronic device also includes:
  • the communication interface 1203 is used for communication between the memory 1201 and the processor 1202 .
  • the memory 1201 is used to store computer programs that can be executed on the processor 1202 .
  • the memory 1201 may include high-speed RAM memory, and may also include non-volatile memory, such as at least one disk memory.
  • the bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like.
  • the bus can be divided into address bus, data bus, control bus and so on. For ease of representation, only one thick line is shown in FIG. 12, but it does not mean that there is only one bus or one type of bus.
  • if the memory 1201, the processor 1202, and the communication interface 1203 are integrated on one chip, they can communicate with each other through an internal interface.
  • the processor 1202 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
  • This embodiment also provides a computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the above-mentioned adaptive real-time delivery method of a media stream is implemented.
  • the terms “first” and “second” are used for descriptive purposes only and should not be construed as indicating or implying relative importance or the number of the indicated technical features. Thus, a feature qualified by “first” or “second” may explicitly or implicitly include at least one such feature.
  • “N” means at least two, for example two or three, unless otherwise expressly and specifically defined.
  • a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or device.
  • examples of computer-readable media include the following: electrical connections (electronic devices) with one or N wires, portable computer disk cartridges (magnetic devices), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CD-ROM).
  • the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the paper or other medium can be optically scanned, for example, and then edited, interpreted, or otherwise processed as necessary to obtain the program electronically and store it in computer memory.
  • N steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they can be implemented by any one or a combination of the following techniques known in the art: discrete logic circuits with logic gates for implementing logic functions on data signals, application-specific integrated circuits with suitable combinational logic gates, programmable gate arrays (PGA), field-programmable gate arrays (FPGA), and so on.
  • each functional unit in each embodiment of the present application may be integrated into one processing module, or each unit may exist physically alone, or two or more units may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules. If the integrated modules are implemented in the form of software functional modules and sold or used as independent products, they may also be stored in a computer-readable storage medium.
  • the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

Disclosed in the present application are an adaptive real-time delivery method for a media stream, and a server. The method comprises: receiving a media segment request sent by a client, wherein the media segment request carries at least one pulling command; generating a media segment according to the media segment request, comprising: for each pulling command in the media segment request, selecting a target media stream to be transmitted, selecting at least one target media sub-stream to be transmitted in the target media stream, determining a candidate media unit to be transmitted in the target media sub-stream, and encapsulating the candidate media unit determined by each pulling command into the media segment; and sending the media segment to the client. According to embodiments of the present application, the media units of the selected sub-streams can be combined in real time according to the request of the client to generate a media segment, thereby simplifying the synchronous transmission between the sub-streams while reducing the storage overhead on the server, and adaptive real-time transmission of various multi-sub-stream media streams is supported uniformly.

Description

Adaptive real-time delivery method and server for media stream
Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 202010614997.5, entitled "Adaptive Real-time Delivery Method and Server for Media Streams", filed by Beijing Kaiguang Information Technology Co., Ltd. on June 30, 2020.
Technical Field
The present application relates to the technical field of digital information transmission, and in particular to an adaptive real-time delivery method and server for media streams.
Background
With the rapid development of the Internet, especially the mobile Internet, real-time transmission of multimedia information such as audio, video, and images over the Internet has become a basic requirement of many applications (such as live webcasting, real-time monitoring, and video conferencing). To meet this requirement, various real-time streaming media transmission technologies have been proposed. Three classes are currently in wide use: RTP (Real-time Transport Protocol)/RTSP (Real Time Streaming Protocol), RTMP (Real Time Messaging Protocol), and HTTP (HyperText Transfer Protocol) Adaptive Streaming (HAS). HTTP adaptive streaming in turn includes several schemes: HLS (HTTP Live Streaming) proposed by Apple, Smooth Streaming proposed by Microsoft, HDS (HTTP Dynamic Streaming) proposed by Adobe, and DASH (Dynamic Adaptive Streaming over HTTP) proposed by MPEG.
The common feature of the above HTTP adaptive streaming schemes is that the media stream is cut into short media segments (2 s to 10 s), and an index file or manifest file describing these segments is generated at the same time (for example, the m3u8 playlist in HLS or the MPD file in DASH). The segments and the manifest are then saved on web servers. By accessing the playlist or manifest file, the client obtains the URLs (Uniform Resource Locators) of the media segments and can then download and play them one by one over HTTP. The main differences between these schemes lie in the encapsulation format of the media segments and the format of the manifest file.
Compared with RTP/RTSP and RTMP, HTTP adaptive streaming is easy to deploy on ordinary web servers and fits the existing Internet infrastructure, including CDNs, caches, firewalls, and NATs, so it can support large-scale user access. In addition, by providing media segments at multiple bit rates, it lets the client choose segments of a suitable bit rate according to network conditions and terminal capabilities, which achieves bit rate adaptation. HTTP adaptive streaming has therefore become the mainstream approach to real-time streaming media delivery on the Internet.
With the development of multimedia technology, media streams are taking increasingly complex forms. The earliest media streams usually contained only one audio track and/or one video track, but a media stream delivered over the future Internet may contain dozens of media substreams, for the following reasons: 1) Multiple types of media substreams: the same scene can produce substreams of many types, including video, audio, subtitles, pictures, auxiliary information, and data, and these substreams need to be delivered together. 2) Multi-rate coding: to match the available network bandwidth and the processing capabilities of different terminals, the same video stream can be encoded into multiple substreams that differ in resolution, frame rate, and bit rate, and audio can be encoded into multiple substreams that differ in language, sampling rate, and bit rate. 3) Multi-view video: to provide a more realistic video experience, the same scene is captured from different viewpoints to produce multiple video substreams, as in 3D video or free-viewpoint video. 4) Multi-channel audio: to provide an immersive audio experience, the same scene is sampled at different positions to produce multiple audio substreams. 5) Scalable Video Coding (SVC): to adapt to network bandwidth, encoding one video produces a base layer and several enhancement layers. Furthermore, any combination of the above (for example, multi-view video with multi-rate or scalable coding applied to each view) produces a very large number of media substreams and media streams.
In the related art, when HTTP adaptive streaming protocols transmit such multi-substream media streams, they all need to pre-segment the media stream and generate the corresponding manifest file (such as the M3U8 playlist in HLS or the MPD file in DASH). There are two pre-segmentation schemes:
Scheme 1, combined substream segmentation: the video substream segment and the audio substream segment for the same time range are encapsulated in one media segment, which corresponds to one HTTP URL. The client needs only one request to obtain the corresponding video and audio segments, which keeps the substreams synchronized and simplifies processing at the receiver. However, as the number of video and audio substreams grows, the number of combinations of video and audio substreams rises quickly, and each combination produces a new segment. The video and audio substreams are therefore stored repeatedly on the server, which increases its storage overhead.
Scheme 2, independent substream segmentation: each substream is segmented independently, while the segments of different substreams are kept time-aligned, and every segment of every substream corresponds to a URL. With this scheme, the client can request the segments of each substream as needed, and the server does not need to store combined segments. However, the client must submit multiple requests to obtain the segments of different substreams, which increases transmission overhead and makes synchronization harder. In addition, pre-segmentation must strictly maintain synchronization between the segments of different substreams, which makes the pre-segmentation of each substream more complex.
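A small worked count makes the storage trade-off between the two schemes concrete. The sketch assumes, as a simplification not stated in the source, that every combined segment pairs exactly one video substream with one audio substream; `segments_per_interval` is a hypothetical helper name.

```python
def segments_per_interval(video_substreams, audio_substreams):
    """Count the segments a server stores for one time interval.

    combined    -- Scheme 1: one segment per (video, audio) pair
    independent -- Scheme 2: one segment per substream
    """
    return {
        "combined": video_substreams * audio_substreams,
        "independent": video_substreams + audio_substreams,
    }
```

With 6 video substreams and 4 audio substreams, Scheme 1 stores 24 segments per interval and Scheme 2 stores 10. Under the same assumption, a client that wants one video and one audio substream issues one request per interval under Scheme 1 and two under Scheme 2, which is the overhead the second scheme trades for lower storage.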
In addition, the above HTTP adaptive streaming schemes have another problem. To support real-time delivery, the server must keep updating its manifest file, and the client must first obtain the manifest file before it can learn the URLs of the latest media segments. Because the manifest file takes time to reach the client, the manifest the client holds does not reflect the media segments most recently generated on the server, which degrades the real-time delivery performance of the media stream. When the number of substreams or substream combinations in a media stream reaches several dozen, the manifest file becomes very complex, which further increases the transmission and processing overhead of receiving the media stream on the client.
In summary, HTTP adaptive streaming schemes based on pre-segmentation and manifest files are not well suited to adaptive real-time delivery of media streams that contain many substreams, and a new delivery method needs to be designed for them.
Summary of the Invention
The present application aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, the first object of the present application is to propose an adaptive real-time delivery method for media streams that reduces storage overhead on the server while simplifying synchronous transmission between substreams, and that uniformly supports adaptive real-time delivery of various types of multi-substream media streams (such as those using multi-rate coding, multi-view video, multi-channel audio, or scalable coding).
The second object of the present application is to propose an adaptive real-time delivery server for media streams.
The third object of the present application is to propose a computer device.
The fourth object of the present application is to propose a non-transitory computer-readable storage medium.
To achieve the above objects, an embodiment of the present application proposes an adaptive real-time delivery method for a media stream. The media stream includes at least one media substream, and each media substream is a sequence of media units generated in real time on a server, wherein each media substream is associated with a substream number, and each media unit is associated with a generation time and/or a sequence number indicating the order in which the media unit was generated in the media substream. The method includes the following steps: receiving a media segment request sent by a client, wherein the media segment request carries at least one pull command, each pull command carries zero or more control parameters, and the control parameters include a first type of parameter indicating the target media stream to be transmitted, a second type of parameter indicating the target media substream to be transmitted, and a third type of parameter indicating the candidate media units to be transmitted; generating a media segment according to the media segment request, wherein, for each pull command in the media segment request, the target media stream to be transmitted is selected, at least one target media substream to be transmitted in the target media stream is selected, the candidate media units to be transmitted in the target media substream are determined, and the candidate media units determined by each pull command are encapsulated into the media segment; and sending the media segment to the client.
The adaptive real-time delivery method for a media stream according to the embodiment of the present application can combine the media units of the substreams in any way requested by the client, generate media segments in real time, and deliver them to the client. First, the server only needs to store media units separately for each substream and does not need to pre-generate segments for every combination of substreams, which reduces the server's storage requirements. It also simplifies synchronization on the client: a single request returns a combined segment of the requested substreams for the same time period, which makes synchronized reception of the substreams easy to guarantee. Second, the client can dynamically adjust the target media substreams in its media segment requests according to application needs and network conditions, so adaptive delivery of various types of multi-substream media streams (for example, streams using multi-bitrate coding, multi-view coding, multi-channel coding, or scalable coding) is supported in a uniform way. Finally, because each media segment is generated in response to a client request, no manifest file is needed regardless of how many substreams the media stream contains, and the client does not need to request or parse one. This significantly reduces the transmission and processing overhead caused by complex manifest files, and thereby effectively reduces the real-time transmission delay and transmission overhead of the media stream.
To achieve the above objects, an embodiment of the present application provides an adaptive real-time delivery server for a media stream. The media stream includes at least one media substream, and each media substream is a sequence of media units generated in real time on the server, wherein each media substream is associated with a substream number, and each media unit is associated with a generation time and/or a sequence number indicating the order in which the media unit was generated in the media substream. The server includes: a client interface component, configured to receive a media segment request sent by a client, wherein the media segment request carries at least one pull command, the pull command carries either no control parameter or at least one control parameter, and the control parameters include a first-type parameter indicating the target media stream to be transmitted, a second-type parameter indicating the target media substream to be transmitted, and a third-type parameter indicating the candidate media units to be transmitted; a media segment generating component, configured to generate a media segment according to the media segment request, wherein, for each pull command in the media segment request, the target media stream to be transmitted is selected, at least one target media substream to be transmitted in the target media stream is selected, and the candidate media units to be transmitted in the target media substream are determined, and the candidate media units determined by all pull commands are encapsulated into the media segment; and a media segment sending component, configured to send the generated media segment to the client.
The adaptive real-time delivery server for a media stream according to the embodiment of the present application can combine the media units of the substreams in any way requested by the client, generate media segments in real time, and deliver them to the client. First, the server only needs to store media units separately for each substream and does not need to pre-generate segments for every combination of substreams, which reduces the server's storage requirements. It also simplifies synchronization on the client: a single request returns a combined segment of the requested substreams for the same time period, which makes synchronized reception of the substreams easy to guarantee. Second, the client can dynamically adjust the target media substreams in its media segment requests according to application needs and network conditions, so adaptive delivery of various types of multi-substream media streams (for example, streams using multi-bitrate coding, multi-view coding, multi-channel coding, or scalable coding) is supported in a uniform way. Finally, because each media segment is generated in response to a client request, no manifest file is needed regardless of how many substreams the media stream contains, and the client does not need to request or parse one. This significantly reduces the transmission and processing overhead caused by complex manifest files, and thereby effectively reduces the real-time transmission delay and transmission overhead of the media stream.
An embodiment of the present application provides a computer device, including: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are configured to perform the adaptive real-time delivery method for a media stream described in the above embodiments.
An embodiment of the present application provides a non-transitory computer-readable storage medium that stores computer instructions, the computer instructions being used to cause a computer to perform the adaptive real-time delivery method for a media stream described in the above embodiments.
Additional aspects and advantages of the present application will be set forth in part in the following description, and in part will become apparent from that description or be learned through practice of the present application.
Description of the Drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic diagram of the processing flow of an adaptive real-time delivery method for a media stream according to an embodiment of the present application;
FIG. 2 is a schematic diagram of an adaptive real-time transmission process of a media stream according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a substream pattern according to an embodiment of the present application;
FIG. 4 is a schematic diagram of media substream description information (including multi-bitrate coded substreams) according to an embodiment of the present application;
FIG. 5 is a schematic diagram of media substream description information (including multi-view video substreams) according to an embodiment of the present application;
FIG. 6 is a schematic diagram of media substream description information (including scalable coded substreams) according to an embodiment of the present application;
FIG. 7 is a schematic diagram of an adaptive real-time transmission process of a media stream according to an embodiment of the present application;
FIG. 8 is a schematic diagram of an adaptive real-time transmission process of a media stream according to an embodiment of the present application;
FIG. 9 is a schematic diagram of candidate media unit encapsulation under different media unit ordering modes according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an adaptive real-time delivery server for a media stream according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of an adaptive real-time delivery server for a media stream according to a specific embodiment of the present application;
FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
Embodiments of the present application are described in detail below, and examples of the embodiments are illustrated in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present application; they should not be construed as limiting it.
On the Internet, audio streams, video streams, or data streams generated in real time often need to be transferred immediately from one network node to another. These network nodes include terminals such as PCs, mobile phones, and tablet computers, as well as application servers such as video servers and audio servers. Here, the transferred audio streams, video streams, and data streams are collectively referred to as media streams. The delivery of a media stream can be described with a general client-server model: the server delivers the generated media stream to the client in real time. The server and the client here are logical functional entities: the server is the functional entity that sends the media stream, and the client is the functional entity that receives it. A server or a client may reside on any network node.
In many application scenarios, there are multiple audio streams, video streams, or data streams that need to be transmitted synchronously. For example, the live media stream of a concert contains at least one video stream and at least one audio stream. Once any one or more of multi-bitrate coding, multi-view coding, multi-channel coding, and scalable coding is used, the live stream will contain multiple data streams, video streams, or audio streams. Here, all synchronously transmitted video streams, audio streams, and data streams in a live media stream to be transmitted are referred to as the media substreams of that media stream.
Each transmitted media substream is a sequence of media units generated in real time on the server. The media unit may be chosen differently for different media substreams. When the media substream is a byte stream generated in real time, a single byte may be chosen as the media unit. When the media substream is an audio or video stream obtained by real-time sampling, a raw audio frame or video frame may be chosen as the media unit. When the media substream is an audio or video stream that has been sampled and encoded in real time, an encoded audio frame, an encoded video frame, or an access unit (Access Unit) may be chosen as the media unit. When the media substream is an audio or video stream that has been sampled, encoded, and encapsulated in real time, an encapsulated transport packet (such as an RTP packet or a PES/PS/TS packet) may be chosen as the media unit. When the media substream is an audio or video stream that has been sampled, encoded, encapsulated, and pre-segmented in real time, an already divided media segment (such as a TS-format segment used in the HLS protocol or an fMP4-format segment used in the DASH protocol) may be chosen as the media unit.
Each media unit may be associated with a generation time, which is usually a timestamp. Each media unit may also be associated with a sequence number, which can be used to indicate the order in which the media units are generated in the media substream. When the sequence number indicates generation order, its meaning depends on the specific media unit. When the media unit is a byte, the sequence number is the byte sequence number; when the media unit is an audio frame or a video frame, the sequence number is the frame sequence number; when the media unit is a transport packet, the sequence number is the packet sequence number; when the media unit is a stream segment, the sequence number is the segment sequence number (such as the Media Sequence of each TS segment in HLS). A media substream may be associated with both a sequence number indicating generation order and a generation time. For example, when the media substream is an RTP packet stream, the RTP header contains both a Sequence Number field indicating the order of the RTP packets and a Timestamp field indicating the generation time of the media data encapsulated in the RTP packet.
To distinguish the different media substreams in a media stream, the media substreams need to be numbered: each media substream is associated with a unique substream number. Generally, when a media stream includes N media substreams, the corresponding substream numbers are 1, 2, ..., N. Each media substream may use a generation time and/or a sequence number to describe the order in which each of its media units is generated.
The generation times of media units in different media substreams may use synchronous timing or independent timing. With independent timing, the generation times of different media substreams come from unsynchronized clocks, so the correspondence between the generation times of the different media substreams must be recorded separately. With synchronous timing, the generation times of different media substreams come from the same reference clock, and the synchronization relationship between media units in different media substreams can be determined from the generation times alone. For simplicity, it is assumed in all of the following embodiments that the generation times of all media substreams in a media stream use the same reference clock on the server and correspond to the same timeline, such as Greenwich Mean Time.
In the embodiments of the present application, a media stream contains at least one media substream. Each media substream may be of any type, such as an audio stream, a video stream, or a subtitle stream, and may use any transport encapsulation, such as an RTP packet stream or an MPEG2-TS stream. When the media substream is an RTP packet stream, the media unit is an RTP packet, the RTP Sequence Number is the sequence number of the media unit, and the RTP Timestamp is the timestamp of the media unit. When the media substream is an MPEG2-TS stream, an approach similar to HLS/DASH may be used: the TS stream is divided into TS segments of fixed duration (for example, about 1 second), each TS segment is treated as one media unit and may contain multiple media frames, the segments are numbered in order of generation to give the media unit sequence numbers, and the timestamp of the first media frame contained in each segment indicates the generation time of that segment.
Traditional real-time streaming protocols such as RTP or RTMP use server push: as soon as a new media unit is available on the server, the server actively sends it to the client. The method of the embodiments of the present application instead uses client pull, like the various HTTP adaptive streaming schemes (such as HLS and MPEG-DASH). The difference is that, in existing HTTP adaptive streaming, the client requests or pulls pre-divided segments according to a manifest file, and each segment can be identified by a URL. In the embodiments of the present application, media segments are not divided in advance; the server generates them on the fly according to the client's request, and the client can control the content of each media segment.
Specifically, FIG. 1 is a schematic diagram of the processing flow of an adaptive real-time delivery method for a media stream provided by an embodiment of the present application.
As shown in FIG. 1, the media stream includes at least one media substream, and each media substream is a sequence of media units generated in real time on the server, wherein each media substream is associated with a substream number, and each media unit is associated with a generation time and/or a sequence number indicating the order in which the media unit was generated in the media substream. The adaptive real-time delivery method for the media stream includes the following steps:
In step S101, a media segment request sent by the client is received, wherein the media segment request carries at least one pull command, the pull command carries either no control parameter or at least one control parameter, and the control parameters include a first-type parameter indicating the target media stream to be transmitted, a second-type parameter indicating the target media substream to be transmitted, and a third-type parameter indicating the candidate media units to be transmitted.
In some examples, control parameters that can serve as first-type parameters include, but are not limited to: media stream identifier, media stream name, and program identifier. Control parameters that can serve as second-type parameters include, but are not limited to: substream list, substream pattern, substream type, and substream priority. Control parameters that can serve as third-type parameters include, but are not limited to: start sequence number, start time, maximum time offset, unit type, and unit priority. Those skilled in the art will understand that new control parameters may also be defined as further implementations require.
In some cases, a media segment request may carry one or more pull commands, each carrying its own control parameters; alternatively, a pull command may carry no control parameters at all. In addition, new commands other than the pull command may be defined as further implementations require.
In some embodiments, the media segment request may be submitted over any network transport protocol, such as the common HTTP, TCP, or UDP protocols. When HTTP is used to submit the media segment request, either the HTTP GET method or the HTTP POST method may be used.
When a pull command in the media segment request carries control parameters, the pull command and its control parameters must be encapsulated into a string or byte stream according to some encapsulation rule before being sent to the server. For example, when HTTP GET is used to send the media segment request, the command and its control parameters can be encoded as a string in the URL.
Examples of media segment requests using HTTP GET are as follows:
A media segment request carrying one pull command (no control parameters):
GET "http://www.xxx-server.com/msreq?cmd=PULL"
A media segment request carrying one pull command (one control parameter):
GET "http://www.xxx-server.com/msreq?cmd=PULL&streamID=601"
GET "http://www.xxx-server.com/msreq?cmd=PULL&substreamList=2,3"
GET "http://www.xxx-server.com/msreq?cmd=PULL&substreamPattern=01100100"
GET "http://www.xxx-server.com/msreq?cmd=PULL&seqBegin=1003"
GET "http://www.xxx-server.com/msreq?cmd=PULL&timeBegin=31000"
GET "http://www.xxx-server.com/msreq?cmd=PULL&maxTimeOffset=1000"
A media segment request carrying one pull command (multiple control parameters); longer URL strings are split across lines for readability:
GET "http://www.xxx-server.com/msreq?cmd=PULL&streamID=601&timeBegin=32000"
GET "http://www.xxx-server.com/msreq?cmd=PULL&substreamList=1,2&seqBegin=1005"
GET "http://www.xxx-server.com/msreq?
cmd=PULL&substreamList=3&seqBegin=1005&unitType=3"
GET "http://www.xxx-server.com/msreq?
cmd=PULL&substreamPattern=1101&timeBegin=32000&unitPrio=1"
A media segment request carrying multiple pull commands (longer URL strings are split across lines for readability):
GET "http://www.xxx-server.com/msreq?
cmd=PULL&streamID=601&substreamList=1&timeBegin=32000&
cmd=PULL&streamID=601&substreamList=2&seqBegin=1005"
GET "http://www.xxx-server.com/msreq?
cmd=PULL&streamID=601&substreamPattern=0100001&timeBegin=32000&
cmd=PULL&streamID=601&substreamList=2&seqBegin=1005"
GET "http://www.xxx-server.com/msreq?
cmd=PULL&streamID=601&substreamPattern=0100001&timeBegin=32000&
cmd=PULL&streamID=602&substreamPattern=0011&timeBegin=32000"
In the request URLs above, the parameter names streamID, substreamList, substreamPattern, seqBegin, timeBegin, maxTimeOffset, unitType, and unitPrio stand for media stream identifier, substream list, substream pattern, start sequence number, start time, maximum time offset, unit type, and unit priority, respectively.
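As a minimal client-side sketch, a request URL of the form shown in the examples could be assembled as follows. The helper name build_media_segment_request and the convention that each pull command is serialized as cmd=PULL followed by its own parameters are assumptions made for illustration; the host is the placeholder used in the examples.

```python
from urllib.parse import urlencode

# Placeholder endpoint taken from the example URLs; not a real service.
BASE_URL = "http://www.xxx-server.com/msreq"


def build_media_segment_request(pull_commands):
    """Serialize pull commands (dicts of control parameters) into one URL.

    Assumption: every command is written as "cmd=PULL" followed by its own
    control parameters, in the order given.
    """
    pairs = []
    for params in pull_commands:
        pairs.append(("cmd", "PULL"))
        pairs.extend(params.items())
    # safe="," keeps list values such as "1,2" unescaped, as in the examples.
    return BASE_URL + "?" + urlencode(pairs, safe=",")


url = build_media_segment_request([
    {"streamID": 601, "substreamList": "1", "timeBegin": 32000},
    {"streamID": 601, "substreamList": "2", "seqBegin": 1005},
])
```

Passing a single empty dict yields the parameterless form of the request, in which the server falls back to its defaults.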
On the server side, a web server may be used to receive the client's media segment request, extract the commands and their control parameters from the request URL, and classify the control parameters carried by each pull command: a media stream identifier or media stream name is a first-type parameter; a substream list or substream pattern is a second-type parameter; and any of start sequence number, start time, maximum time offset, unit type, or unit priority is a third-type parameter.
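The extraction and classification described above might be sketched as follows. This is an illustration under stated assumptions, not the implementation of the application: the function name parse_pull_commands, the parameter name streamName (the text mentions a media stream name but gives no URL parameter for it), and the rule that each cmd key opens a new pull command are all hypothetical.

```python
from urllib.parse import urlsplit, parse_qsl

# Parameter names from the request examples; "streamName" is hypothetical.
FIRST_TYPE = {"streamID", "streamName"}
SECOND_TYPE = {"substreamList", "substreamPattern"}
THIRD_TYPE = {"seqBegin", "timeBegin", "maxTimeOffset", "unitType", "unitPrio"}


def parse_pull_commands(url):
    """Split a request URL into pull commands with classified parameters.

    Assumption: each "cmd" key starts a new command, and the parameters that
    follow it (up to the next "cmd") belong to that command.
    """
    commands = []
    # parse_qsl keeps repeated keys and their order, unlike parse_qs.
    for key, value in parse_qsl(urlsplit(url).query):
        if key == "cmd":
            commands.append({"cmd": value, "stream": {}, "substream": {}, "unit": {}})
        elif commands:
            current = commands[-1]
            if key in FIRST_TYPE:
                current["stream"][key] = value
            elif key in SECOND_TYPE:
                current["substream"][key] = value
            elif key in THIRD_TYPE:
                current["unit"][key] = value
            # Unknown parameters are ignored in this sketch.
    return commands


commands = parse_pull_commands(
    "http://www.xxx-server.com/msreq?"
    "cmd=PULL&streamID=601&substreamPattern=0100001&timeBegin=32000&"
    "cmd=PULL&streamID=602&substreamPattern=0011&timeBegin=32000"
)
```

An empty group for a given type signals that the command did not carry that type of parameter, so a default would apply.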
In step S102, a media segment is generated according to the media segment request, wherein, for each pull command in the media segment request, the target media stream to be transmitted is selected, at least one target media substream to be transmitted in the target media stream is selected, and the candidate media units to be transmitted in the target media substream are determined, and the candidate media units determined by all pull commands are encapsulated into the media segment.
Specifically, the step of generating a media segment according to the media segment request can be further divided into sub-steps S1021 to S1024. For each pull command in the media segment request, step S1021 selects the target media stream to be transmitted according to the first-type parameters, step S1022 selects the target media substreams in that target media stream according to the second-type parameters, and step S1023 determines the candidate media units to be transmitted in those target media substreams according to the third-type parameters. Step S1024 then encapsulates the candidate media units determined by all pull commands into the media segment.
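Sub-steps S1021 to S1023 can be illustrated with a small in-memory sketch. The store layout, the function name select_candidates, and the reading of seqBegin as an inclusive lower bound on the sequence number are assumptions for illustration only; this passage does not define the exact semantics of the third-type parameters.

```python
# Hypothetical store: stream ID -> substream number -> units in generation order.
# Each unit is a (substream number, sequence number) tuple; payloads are omitted.
STORE = {
    601: {
        1: [(1, seq) for seq in range(1003, 1008)],
        2: [(2, seq) for seq in range(1003, 1008)],
    },
}


def select_candidates(store, commands):
    """Collect the candidate media units selected by all pull commands."""
    candidates = []
    for cmd in commands:
        stream = store[cmd["streamID"]]              # S1021: target media stream
        for number in cmd["substreamList"]:          # S1022: target substreams
            substream = stream[number]
            candidates.extend(                       # S1023: candidate units
                unit for unit in substream
                if unit[1] >= cmd["seqBegin"]        # assumed: inclusive start
            )
    # S1024 would encapsulate these units into a single media segment.
    return candidates


units = select_candidates(
    STORE, [{"streamID": 601, "substreamList": [1, 2], "seqBegin": 1005}]
)
```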
In some examples, in step S1021 the target media stream to be transmitted may be selected according to the media stream identifier or media stream name; in step S1022 the target media substreams may be selected according to parameters such as the substream list or substream pattern; in step S1023 the candidate media units may be determined according to parameters such as the start sequence number, start time, and maximum time offset; and in step S1024 a custom encapsulation protocol may be used to encapsulate one or more media units into a media segment. One simple encapsulation protocol is as follows: a media segment consists of a segment header and a segment payload; the segment payload is the concatenation of several media units, and the segment header indicates the start position and length of each media unit. When the media units do not themselves carry a generation time or sequence number, the segment header should also indicate the sequence number and/or generation time of each media unit; when the media units do not themselves carry a substream number, the segment header should also indicate the substream number of each media unit.
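One possible byte layout for the simple encapsulation protocol just described is sketched below. The field widths, the field order, and the names MediaUnit, pack_segment, and unpack_segment are assumptions chosen for illustration; the application does not fix a wire format.

```python
import struct
from dataclasses import dataclass


@dataclass
class MediaUnit:
    substream: int  # substream number
    seq: int        # sequence number within the substream
    time_ms: int    # generation time in milliseconds
    data: bytes     # media unit payload


# Assumed layout, network byte order:
# header = unit count (uint16) + one entry per unit;
# entry  = substream (uint16), seq (uint32), time (uint64),
#          offset into payload (uint32), length (uint32).
_COUNT = struct.Struct("!H")
_ENTRY = struct.Struct("!HIQII")


def pack_segment(units):
    """Build segment header + payload from a list of media units."""
    header = [_COUNT.pack(len(units))]
    payload = bytearray()
    for u in units:
        header.append(
            _ENTRY.pack(u.substream, u.seq, u.time_ms, len(payload), len(u.data))
        )
        payload += u.data
    return b"".join(header) + bytes(payload)


def unpack_segment(segment):
    """Recover the media units from a packed segment."""
    (count,) = _COUNT.unpack_from(segment, 0)
    payload_start = _COUNT.size + count * _ENTRY.size
    units = []
    for i in range(count):
        sub, seq, time_ms, offset, length = _ENTRY.unpack_from(
            segment, _COUNT.size + i * _ENTRY.size
        )
        start = payload_start + offset
        units.append(MediaUnit(sub, seq, time_ms, segment[start:start + length]))
    return units


original = [
    MediaUnit(1, 1005, 32000, b"video"),
    MediaUnit(2, 1005, 32000, b"audio-data"),
]
segment = pack_segment(original)
```

Storing offsets relative to the start of the payload lets the client locate any unit without scanning the preceding ones.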
In step S103, the media segment is sent to the client.
Specifically, the server may choose an appropriate way to send the media segment to the client according to the protocol used by the client's media segment request. For example, when the media segment request is received as an HTTP GET, the generated media segment can be sent in the HTTP response to that request by placing the media segment in the entity body of the response message. If the media segment request was received over an established TCP connection, the generated media segment can be sent to the client directly over that TCP connection.
When the server receives successive media segment requests from the client, it keeps generating new media segments in response. Each new media segment encapsulates the media units of the selected target media substreams that were most recently generated and are waiting to be sent to the client. By parsing these media segments, the client recovers the media units of each target media substream of the real-time media stream. This process is shown in FIG. 2. The client can continually adjust the control parameters carried by the pull commands in its media segment requests according to application needs or network conditions, for example by changing second-type parameters (such as the media substream list) and third-type parameters (such as the start time, maximum time offset, and unit priority), to keep the delivery of the media stream from server to client continuous, real-time, and adaptive to a dynamic network.
Because media segments are generated on the fly, the method of the embodiments of the present application needs neither pre-segmentation nor manifest files, so the client does not have to receive and process a manifest file, which reduces transmission delay and saves overhead. In addition, the client can combine media units from different media substreams arbitrarily through a media segment request and obtain the required media units of every media substream with a single request, which makes it easy to keep the reception of different media substreams synchronized. Finally, by adjusting at any time the media substreams and candidate media units to be received, the method better meets the needs of terminal applications and adapts to changes in network bandwidth, and it can support adaptive delivery of media streams that use various kinds of multi-substream coding (such as multi-bitrate coding, multi-view coding, multi-channel coding, and scalable coding).
It should be understood that steps S101, S102 and S103 are numbered only for convenience of description and do not limit the execution order of the method. In a specific implementation, each step may correspond to a functional entity that runs independently and interacts with the others.
The above is a detailed description of Embodiment 1. Embodiment 2 is described in detail below; it illustrates by example how the server generates the media segment according to the media segment request.
Optionally, in an embodiment of the present application, generating the media segment according to the media segment request includes: if the pull command does not carry a first-type parameter, the target media stream to be transmitted is the media stream specified by default; if the pull command does not carry a second-type parameter, the target media substreams to be transmitted are at least one media substream specified by default in the target media stream; if the pull command does not carry a third-type parameter, the candidate media units include the media units specified by default in the target media substream. The media units specified by default are either all media units in the target media substream whose sequence-number distance from the latest media unit is less than a first preset value, or all media units in the target media substream whose generation-time distance from the latest media unit is less than a second preset value. Both the first preset value and the second preset value are determined according to the target media substream.
Specifically, when the server holds only one media stream to be transmitted, the pull command sent by the client does not need to carry a first-type parameter, and that media stream is the selected target media stream. When the server holds multiple media streams to be transmitted, one of them can be designated as the default media stream; when the pull command sent by the client carries no first-type parameter, the default media stream is selected as the target media stream.
In practice, a target media stream may contain many kinds of media substreams. It may contain substreams of different types: video, audio, subtitle, additional information, picture, and so on. Substreams of the same type may have different bit rates: for video, there may be substreams with different resolutions and frame rates; for audio, there may be substreams with different sampling rates. A video stream of a given type and bit rate may include multiple coding layers (for example, when scalable video coding, SVC, is used), and these coding layers correspond to different priorities. In general, the server should select, from all the media substreams, one or more that suit the display of most terminals and can be delivered normally under most network bandwidth conditions, and use them as the default media substreams of the target media stream. When the pull command sent by the client carries no second-type parameter, these default media substreams are selected as the target media substreams.
For any target media substream, in order to guarantee real-time delivery of the media stream, the one or more media units most recently generated in that substream should be transmitted first. When the pull command sent by the client carries no third-type parameter, the server can use the media units specified by default as the candidate media units. These are all media units in the target media substream whose sequence-number distance from the latest media unit is less than the first preset value, or all media units whose generation-time distance from the latest media unit is less than the second preset value. When the pull command includes multiple target media substreams and carries no third-type parameter, the first or second preset value set for each target media substream should keep the transmission of the target media substreams synchronized.
As can be understood from the description of the related embodiments, with the above implementation the server can select the default target media stream, target media substreams, or candidate media units and generate a media segment for the client even when a pull command in the client's media segment request does not carry a certain type of control parameter. FIG. 2 is a schematic diagram of the real-time delivery process of a media stream according to an embodiment of the present application. The server holds only one media stream, S1. When the server receives the media segment request MS_REQ1, which contains a single pull command that carries no parameters, the target media stream selected by the server is the default media stream S1, and the selected target media substreams are the default media substreams 1 and 4. The first preset value is 3 for media substream 1 and 4 for media substream 4. The server therefore determines the candidate media units of media substream 1 and media substream 4 separately, encapsulates them into the first media segment MS1, and returns it to the client.
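The default-selection rules of this embodiment can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary keys "stream" and "substreams", the MediaUnit type, and the function names are hypothetical and do not come from the application; the default values mirror the FIG. 2 example (default stream S1, default substreams 1 and 4, first preset values 3 and 4), and the buffered sequence numbers are made up.

```python
from dataclasses import dataclass

# Hypothetical server-side defaults, mirroring the FIG. 2 example.
DEFAULT_STREAM = "S1"
DEFAULT_SUBSTREAMS = {"S1": [1, 4]}
FIRST_PRESET = {1: 3, 4: 4}  # first preset value per substream


@dataclass(frozen=True)
class MediaUnit:
    substream: int
    seq: int


def resolve_targets(pull: dict) -> tuple[str, list[int]]:
    """Fall back to the default stream/substreams when the pull command omits them."""
    stream = pull.get("stream", DEFAULT_STREAM)
    substreams = pull.get("substreams", DEFAULT_SUBSTREAMS[stream])
    return stream, list(substreams)


def default_candidates(units: list[MediaUnit], substream: int) -> list[MediaUnit]:
    """Units whose sequence-number distance to the latest unit is below the preset."""
    own = [u for u in units if u.substream == substream]
    if not own:
        return []
    latest = max(u.seq for u in own)
    return [u for u in own if latest - u.seq < FIRST_PRESET[substream]]


# Made-up buffer contents: substream 1 holds units 20..27, substream 4 holds 50..62.
buffered = [MediaUnit(1, s) for s in range(20, 28)] + [MediaUnit(4, s) for s in range(50, 63)]
stream, targets = resolve_targets({})  # empty pull command, as in MS_REQ1
for n in targets:
    print(stream, n, [u.seq for u in default_candidates(buffered, n)])
```

A time-based variant would compare generation times against the second preset value instead of sequence numbers.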
Embodiment 3. This embodiment describes how the server selects the target media substreams to be transmitted according to the second-type parameters. The embodiments of the present application give two kinds of second-type parameters:
1) Substream list
The substream list directly gives the numbers of the target media substreams. For example, substreamList=1,4 indicates that the selected target media substreams are substream 1 and substream 4.
2) Substream pattern
The substream pattern is an N-bit bit string, where N is the number of media substreams contained in the target media stream. Each bit of the substream pattern is associated with one specific media substream of the target media stream and indicates whether that substream is a target media substream to be transmitted. FIG. 3 gives an example of a substream pattern with N=8: the pattern is the bit string 01101000, in which the bits from left to right correspond to substream 1 through substream 8, and a bit value of 1 indicates that the associated substream is a target media substream. The pattern therefore selects three target media substreams: substream 2, substream 3 and substream 5.
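The two second-type parameters can be decoded along the following lines. This is a minimal sketch; the function names are hypothetical, and the textual encodings (a comma-separated list and a string of '0'/'1' characters) are assumptions based on the examples given above.

```python
def parse_substream_list(value: str) -> list[int]:
    """Parse a substream list such as "1,4", preserving the order given."""
    return [int(tok) for tok in value.split(",") if tok.strip()]


def parse_substream_pattern(pattern: str) -> list[int]:
    """Map an N-bit pattern to substream numbers; the leftmost bit is substream 1."""
    if set(pattern) - {"0", "1"}:
        raise ValueError("pattern must contain only '0' and '1'")
    return [idx for idx, bit in enumerate(pattern, start=1) if bit == "1"]


print(parse_substream_list("1,4"))          # [1, 4]
print(parse_substream_pattern("01101000"))  # [2, 3, 5]
```

The list form keeps the order in which the substreams were named, which matters when that order is later used as the encapsulation order; the pattern form cannot express an order.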
In a specific implementation, the substream list is recommended when the number of target media substreams is small or when the encapsulation order of the target media substreams needs to be specified; the substream pattern is recommended when the number of target media substreams is large and the substreams are timed by the same clock.
In other implementations, besides substream numbers, substream features can also be defined as second-type parameters. These features include substream type, substream priority, viewpoint number, channel number, video resolution, and so on. One or more substream feature parameters can be used to state the conditions that the target media substreams must satisfy, and the server then selects the final target media substreams.
Embodiment 4. This embodiment illustrates by example how the server passes substream-related information to the client.
In a specific implementation, the client specifies the target media substreams through the second-type parameters. This presupposes that the client knows which media substreams the current media stream contains and what the characteristics of each are; only then can it choose the target media substreams to be transmitted according to application requirements and network transmission conditions.
This descriptive information about the media substreams of a media stream can be provided by the server application layer to the client application layer. The client can obtain it by means independent of the current delivery process (for example, by submitting an additional request message or through a third-party server), or it can obtain it directly from the server during delivery. The embodiments of the present application propose a method in which the media substream description information is encapsulated directly into a media segment and delivered to the client.
The minimum information that the media substream description information contains is which media substreams the current media stream includes. If the media substreams are numbered consecutively from 1 to N, the description information only needs to contain the number of media substreams, N, for the numbers of all media substreams to be known. When the media stream uses any kind of multi-substream coding, more substream feature information is introduced into the media substream description information:
1) Media component identifier
The media component identifier indicates the different sources from which the information in a media stream is acquired; for example, video captured by different cameras at a live event corresponds to different components. Each media substream is associated with one media component, but one media component can correspond to multiple media substreams; for example, video captured from one viewpoint can be represented by several media substreams encoded at different bit rates.
2) Substream type
The types of media substream include, but are not limited to, video, audio, picture, subtitle, and mixed. A mixed type means that one media substream contains several types of media units; for example, a substream may contain both video and audio.
3) Substream bit rate
When a media substream has a constant bit rate (CBR), the substream bit rate indicates that bit rate; when a media substream has a variable bit rate (VBR), the substream bit rate indicates the average bit rate of the substream over a period of time.
4) Substream priority
The priority of a media substream indicates the importance of the different media substreams during delivery.
5) Coding layer
When the media stream uses scalable coding, such as scalable video coding (SVC), it produces encoded streams at multiple layers: one base layer and several enhancement layers. Each media substream corresponds to one coding layer.
6) Viewpoint identifier
When the media stream uses multi-view coding, as in 3D video, it produces multiple encoded streams for different viewpoints, and each media substream corresponds to one viewpoint. When several viewpoints are jointly encoded into one media substream, that substream may have several viewpoint identifiers.
7) Video resolution
When the type of a media substream is video, the video resolution used in its encoding.
8) Video frame rate
When the type of a media substream is video, the frame rate used in its video encoding.
9) Channel identifier
When the media stream uses multi-channel coding, it produces encoded data for multiple channels. Several channels can form a channel group that is jointly encoded, so each media substream corresponds to one or more channel identifiers.
10) Audio sampling rate
When a media substream is an audio stream, the sampling rate used in its encoding.
11) Language
When a media substream is an audio stream containing speech, the language of that speech.
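One way a client might hold this description information and use it to pick a substream is sketched below. The SubstreamInfo type, its field names, and the selection policy are all hypothetical; the application lists the features but does not prescribe a data layout or a policy, and the bit rates used here are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubstreamInfo:
    """Hypothetical container for a subset of the features listed above."""
    number: int
    component_id: int
    kind: str                      # "video", "audio", "subtitle", ...
    bitrate_kbps: int
    priority: int = 0
    layer: Optional[int] = None    # coding layer, if scalable coding is used
    language: Optional[str] = None


def pick_video_substream(infos: list[SubstreamInfo], bandwidth_kbps: int) -> Optional[SubstreamInfo]:
    """Highest-bitrate video substream that fits the bandwidth, else the lowest one."""
    videos = [i for i in infos if i.kind == "video"]
    if not videos:
        return None
    fitting = [i for i in videos if i.bitrate_kbps <= bandwidth_kbps]
    if fitting:
        return max(fitting, key=lambda i: i.bitrate_kbps)
    return min(videos, key=lambda i: i.bitrate_kbps)


# Three encodings of one component at invented bit rates.
infos = [
    SubstreamInfo(1, 10, "video", 500),
    SubstreamInfo(2, 10, "video", 1500),
    SubstreamInfo(3, 10, "video", 3000),
]
print(pick_video_substream(infos, 2000).number)
```

The chosen substream number would then be sent back to the server as a second-type parameter in the next pull command.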
In a specific implementation, each media stream can tailor its media substream description information to its actual situation. Examples of media substream description information for three application scenarios are given in FIG. 4 to FIG. 6. In FIG. 4, substreams 1, 2 and 3 are three substreams of the same media content (media component identifier 10 for all three) encoded at different bit rates. In FIG. 5, substreams 1, 2 and 3 correspond to three different viewpoints of the same media content, and substreams 4 and 5 correspond to two different channels of the same media content. In FIG. 6, substreams 1 to 4 correspond to one base layer and three enhancement layers of the same media content (media component identifier 30 for all four) under scalable video coding. After receiving a media segment, the client parses the media substream description information from it and can then select, in real time, the target media substreams to be transmitted according to the actual needs of the service layer, the terminal performance, and the network conditions, which supports adaptive delivery of media streams that use various kinds of multi-substream coding.
The media substream description information of a media stream generally stays unchanged, so it does not need to be encapsulated in every media segment. Typically, when the server receives the client's first media segment request, it encapsulates the media substream description information in the first media segment it returns and can omit it from subsequent media segments.
Embodiment 5. This embodiment illustrates by example how the server determines the candidate media units to be transmitted through the third-type parameters.
Optionally, in an embodiment of the present application, generating the media segment according to the media segment request further includes: if the pull command carries at least one third-type parameter, where each third-type parameter corresponds to at least one constraint on the candidate media units, the candidate media units to be transmitted include all media units in each target media substream that simultaneously satisfy all the constraints corresponding to the third-type parameters.
Several third-type parameters are given below, together with the constraint corresponding to each:
1) Start sequence number
The constraint corresponding to the start sequence number is: if the start sequence number is valid, the sequence number of a candidate media unit is after, or equal to, the start sequence number.
2) Start time
The constraint corresponding to the start time is: if the start time is valid, the generation time of a candidate media unit is after the start time.
3) Maximum time offset
The constraint corresponding to the maximum time offset is: if the maximum time offset is valid, then within the target media substream the interval between the generation time of a candidate media unit and that of the latest media unit is less than the maximum time offset.
Whether a third-type parameter is valid or invalid depends on whether its value lies within a specified range. Take the start sequence number as an example: its value must not exceed the sequence number of the current latest media unit and, to preserve real-time behavior, must not be earlier than the sequence number of a certain existing media unit; a start sequence number within this range is valid. An invalid third-type parameter is treated as if it were not carried at all. When all third-type parameters are invalid, the candidate media units to be transmitted in the target media substream are the media units specified by default.
In addition, in a specific implementation each pull command can carry one or more of these third-type parameters. Pull commands are not restricted from carrying other, self-defined third-type parameters; for example, further third-type parameters such as media unit type, minimum priority, or priority range can be defined from the characteristics of the media units and used as constraints on them.
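The three constraints and the validity rule can be combined as in the following sketch for a single target media substream. Several points are assumptions rather than statements of the application: the lower bound for a valid start sequence number is taken to be the oldest unit still buffered, the validity tests for start time and maximum time offset are invented, the default rule uses a sequence-number gap, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MediaUnit:
    seq: int
    gen_time: int  # generation time, e.g. in milliseconds


def select_candidates(units: list[MediaUnit],
                      start_seq: Optional[int] = None,
                      start_time: Optional[int] = None,
                      max_offset: Optional[int] = None,
                      default_gap: int = 3) -> list[MediaUnit]:
    if not units:
        return []
    oldest_seq = min(u.seq for u in units)
    latest_seq = max(u.seq for u in units)
    latest_time = max(u.gen_time for u in units)

    checks = []
    # Assumed validity range: between the oldest buffered unit and the latest unit.
    if start_seq is not None and oldest_seq <= start_seq <= latest_seq:
        checks.append(lambda u: u.seq >= start_seq)
    # Assumed validity test: the start time must not lie in the future.
    if start_time is not None and start_time <= latest_time:
        checks.append(lambda u: u.gen_time > start_time)
    # Assumed validity test: the offset must be positive.
    if max_offset is not None and max_offset > 0:
        checks.append(lambda u: latest_time - u.gen_time < max_offset)

    if not checks:
        # No valid third-type parameter: fall back to the default units.
        return [u for u in units if latest_seq - u.seq < default_gap]
    # A candidate must satisfy every valid constraint at once.
    return [u for u in units if all(check(u) for check in checks)]


units = [MediaUnit(seq=s, gen_time=s * 100) for s in range(1, 11)]
print([u.seq for u in select_candidates(units, start_seq=8)])
print([u.seq for u in select_candidates(units, start_seq=50, default_gap=2)])
```

In the second call the start sequence number lies beyond the latest unit, so it is treated as not carried and the default rule applies.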
It should further be noted that when only one target media substream is selected according to the second-type parameters, it is only necessary to check whether the media units in that substream satisfy the constraints corresponding to the third-type parameters. If, however, multiple target media substreams are selected according to the second-type parameters, then for the sequence-number-based or generation-time-based constraints to apply to all of them at once, these target media substreams should be synchronously numbered or timed by the same clock. Synchronous numbering means that, on the server, each time a specified time period elapses, all media units generated by each target media substream within that period are associated with the same new sequence number. The specified time period may be of fixed or variable length, and may be preset or determined dynamically from the actual generation of media units. With synchronous numbering, the sequence number of a media unit indicates not only the generation order of the media units within each media substream but also the synchronization relationship between media units in different target media substreams.
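Synchronous numbering with a fixed-length period can be sketched as follows. The fixed period, the tuple layout, and the function name are assumptions made for illustration; the application also allows variable-length and dynamically determined periods, which this sketch does not cover.

```python
def assign_sync_numbers(units, period):
    """Give all units generated within the same period the same sequence number.

    units:  iterable of (substream, gen_time) pairs, gen_time measured on a shared clock
    period: length of the numbering period, in the same time unit as gen_time
    """
    if period <= 0:
        raise ValueError("period must be positive")
    return [(sub, t, int(t // period) + 1) for sub, t in units]


# Units of substreams 1-3 generated around t=0 and t=40 (made-up values), period 40.
generated = [(1, 0), (2, 5), (1, 40), (2, 45), (3, 50)]
for sub, t, seq in assign_sync_numbers(generated, period=40):
    print(f"substream {sub} t={t} seq={seq}")
```

Units of different substreams that carry the same number were generated in the same period, so one start sequence number can address all of them.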
FIG. 2 shows the real-time delivery process of a media stream. The client requests the media data of the target media stream S1, which is the default media stream on the server. The target media stream includes four media substreams: substreams 1, 2 and 3 are three synchronously numbered media substreams (for example, three video streams encoded at different bit rates), whereas substream 4 is numbered independently of the others (for example, an independently encoded audio stream). The media substreams specified by default are substream 1 and substream 4. Because substream 1 and substream 4 are not synchronously numbered, the sequence numbers of the latest media units of substream 1 and substream 4 that the client obtains from a media segment differ. To continue receiving new media units, the client therefore has to carry two pull commands in its media segment request, each with a different media substream list and the corresponding start sequence number, which specify a target media substream and the characteristics of its media units to be sent. This guarantees continuous reception of both substream 1 and substream 4.
The target media stream in FIG. 7 is similar to that in FIG. 2. The difference is that the client actively requests the media data of substream 1 and substream 2 (for example, substreams 1, 2 and 3 correspond to the base layer and two enhancement layers of scalable video coding). Because substream 1 and substream 2 are synchronously numbered, the client uses only one pull command when it submits a media segment request. The substream list carried by that command contains the two target media substreams, substream 1 and substream 2, and the start sequence number it carries indicates the candidate media units in both substreams at once.
The target media stream in FIG. 8 is similar to those in FIG. 2 and FIG. 7. The difference is that the client requests synchronized media data of three substreams at the same time (substream 1, substream 2 and substream 4). Although substream 4 is not numbered synchronously with substreams 1 and 2, the generation times of all substreams in the target media stream S1 use the same reference clock, so the client can still pull the three media substreams simultaneously with a single pull command in its media segment request. In this case the substream list carried by the pull command specifies the three target media substreams, and the start time carried by the pull command is the latest generation time among the media units the client has received so far. Based on this start time, the server can keep encapsulating all newly generated media units that are waiting to be sent into media segments for the client.
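The client-side choice between one and several pull commands in the FIG. 2, FIG. 7 and FIG. 8 scenarios could be sketched as below. The dictionary representation of a pull command and the keys "startSeq" and "startTime" are hypothetical (only "substreamList" echoes a name used in the examples above), and the received sequence numbers are invented.

```python
def build_seq_pulls(last_seq: dict, sync_groups: list) -> list:
    """One pull command per group of synchronously numbered substreams.

    last_seq:    {substream number: last sequence number received}
    sync_groups: lists of substream numbers that share one numbering
    """
    commands = []
    for group in sync_groups:
        wanted = [s for s in group if s in last_seq]
        if wanted:
            # Candidates have a sequence number >= startSeq, so ask for last + 1.
            start = max(last_seq[s] for s in wanted) + 1
            commands.append({"substreamList": wanted, "startSeq": start})
    return commands


def build_time_pull(substreams: list, last_gen_time: int) -> dict:
    """Single pull command for substreams that share one reference clock."""
    # Candidates are generated strictly after startTime.
    return {"substreamList": list(substreams), "startTime": last_gen_time}


groups = [[1, 2, 3], [4]]
print(build_seq_pulls({1: 27, 4: 62}, groups))   # FIG. 2 style: two commands
print(build_seq_pulls({1: 27, 2: 27}, groups))   # FIG. 7 style: one command
print(build_time_pull([1, 2, 4], 5400))          # FIG. 8 style: one command by time
```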
In this embodiment, the client receives the media stream in real time by submitting media segment requests continuously, and it adapts to changes in application requirements and network state by adjusting the target media substream list. In FIG. 2, for example, substream 1 and substream 4 are received by default at the start. When the application layer only needs substream 4, or a drop in network bandwidth is detected, the client can change the target media substreams in MS_REQ4 to include only substream 4, and reception automatically switches to the media units of substream 4 alone.
Embodiment 6. This embodiment describes how the server processes candidate media units when encapsulating them into a media segment.
Further, in an embodiment of the present application, encapsulating the candidate media units determined by the pull commands into a media segment includes: encapsulating the candidate media units determined by each pull command into the media segment in the order in which the pull commands appear in the media segment request. If the parameters carried by a pull command include a unit ordering mode, the candidate media units it determines are sorted according to that mode before being encapsulated into the media segment; if a pull command carries no unit ordering mode, the candidate media units it determines are sorted according to the default ordering mode before being encapsulated into the media segment.
The embodiments of the present application give six basic unit ordering modes:
1) Time forward (TIME_FORWARD)
Candidate media units are sorted by generation time; the earlier a candidate media unit was generated, the earlier it is encapsulated into the media segment.
2) Time backward (TIME_BACKWARD)
Candidate media units are sorted by generation time in reverse; the later a candidate media unit was generated, the earlier it is encapsulated into the media segment.
3) Sequence number forward (SEQ_FORWARD)
Candidate media units are sorted by sequence number; candidate media units with earlier sequence numbers are encapsulated into the media segment first.
4) Sequence number backward (SEQ_BACKWARD)
Candidate media units are sorted by sequence number in reverse; candidate media units with later sequence numbers are encapsulated into the media segment first.
5) Substream number order (SSNO_ORDER)
When there are multiple target media substreams, the candidate media units of each substream are encapsulated in turn, in order of substream number.
6) Substream list order (SSLIST_ORDER)
When there are multiple target media substreams and they are defined by the substream list parameter (SubStreamList), the candidate media units of the substreams are encapsulated in turn, in the order in which the substream numbers appear in the substream list.
The unit ordering mode can also be a cascade of the above basic ordering modes, such as SSLIST_ORDER+SEQ_BACKWARD. A cascade means that the candidate media units are first sorted by the first basic ordering mode, candidate media units that end up at the same position are then sorted by the second basic ordering mode, and so on until the ordering is complete. Whether a basic or a cascaded ordering mode is used, if candidate media units still share the same position after sorting, they are sorted among themselves according to the default ordering mode.
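A cascaded ordering can be expressed as a composite sort key, as in the sketch below. The MediaUnit type and function name are hypothetical. The default tie-break used here (time forward, then substream number) is an assumption drawn from the ordering examples of this embodiment, and the sample units (sequence numbers 25 to 27 in substreams 1 and 2, 58 to 62 in substream 4, with generation time set equal to the sequence number) are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaUnit:
    substream: int
    seq: int
    gen_time: float


def order_units(units, order="", substream_list=()):
    """Sort units by a '+'-separated cascade of basic modes, then by the default order."""
    rank = {n: i for i, n in enumerate(substream_list)}
    keys = {
        "TIME_FORWARD": lambda u: u.gen_time,
        "TIME_BACKWARD": lambda u: -u.gen_time,
        "SEQ_FORWARD": lambda u: u.seq,
        "SEQ_BACKWARD": lambda u: -u.seq,
        "SSNO_ORDER": lambda u: u.substream,
        "SSLIST_ORDER": lambda u: rank[u.substream],
    }
    cascade = [keys[m] for m in order.split("+")] if order else []
    # Assumed default tie-break: time forward, then substream number.
    cascade += [keys["TIME_FORWARD"], keys["SSNO_ORDER"]]
    return sorted(units, key=lambda u: tuple(k(u) for k in cascade))


units = (
    [MediaUnit(1, s, float(s)) for s in (25, 26, 27)]
    + [MediaUnit(2, s, float(s)) for s in (25, 26, 27)]
    + [MediaUnit(4, s, float(s)) for s in range(58, 63)]
)
ordered = order_units(units, "SSLIST_ORDER+SEQ_FORWARD", substream_list=[4, 1, 2])
print([(u.substream, u.seq) for u in ordered])
```

Because Python compares tuples element by element, each later key is consulted only when all earlier keys tie, which matches the cascade semantics described above.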
FIG. 9 shows how candidate media units are encapsulated into a media segment under different unit ordering modes. In each case the media segment MS3 to be generated contains the same candidate media units but corresponds to a different media segment request, so the order in which the media units are finally encapsulated into the media segment also differs.
Ordering mode 1: The media segment request consists of two pull commands. The target media substream of the first pull command is substream 4, and the target media substreams of the second pull command are substream 1 and substream 2. Following the order of the pull commands, the candidate media units of substream 4 are encapsulated into the media segment first. Because the first pull command does not specify a unit ordering mode, the default ordering mode, time forward, is used to encapsulate candidate media units D58 to D62. Then, because the second pull command carries the unit ordering mode time backward (TIME_BACKWARD), the media units of substream 1 and substream 2 are encapsulated in reverse time order: A27/B27, A26/B26, A25/B25. Media units at the same position are ordered by substream number by default. The resulting encapsulation order of the candidate media units is shown as ordering mode 1 in FIG. 9.
Ordering mode 2: The media segment request contains only one pull command, and the unit ordering mode it carries is a cascade of two basic ordering modes: SSLIST_ORDER+SEQ_FORWARD. The first basic ordering mode is substream list order (SSLIST_ORDER), which indicates that the candidate media units of each substream are encapsulated in the order of the substream numbers in the substream list (SubStreamList=4,1,2): the candidate media units of substream 4 first, then those of substream 1, then those of substream 2. The second basic ordering mode is sequence number forward (SEQ_FORWARD): candidate media units belonging to the same substream are sorted by sequence number from earliest to latest. The resulting encapsulation order of the candidate media units is shown as ordering mode 2 in FIG. 9.
Ordering mode 3: The media segment request contains only one pull command, and the unit ordering mode it carries is a cascade of two basic ordering modes: SSNO_ORDER+SEQ_BACKWARD. The first basic ordering mode is substream number order (SSNO_ORDER), which indicates that the candidate media units of each substream are encapsulated in ascending order of substream number: the candidate media units of substream 1 first, then those of substream 2, then those of substream 3. The second basic ordering mode is sequence number backward (SEQ_BACKWARD): candidate media units belonging to the same substream are sorted by sequence number from latest to earliest. The resulting encapsulation order of the candidate media units is shown as ordering mode 3 in FIG. 9.
Ordering mode 4: The media segment request contains only one pull command, and that pull command carries a single unit ordering mode, TIME_FORWARD: all candidate media units are sorted by generation time from earliest to latest. The resulting encapsulation order of the candidate media units is shown as ordering mode 4 in FIG. 9.
Of course, this embodiment does not preclude defining new unit ordering modes. For example, when each media unit is associated with a priority, the candidate media units can be sorted by unit priority, and a new unit ordering mode can be defined: high-priority unit first (HIGH_PRIOR_FIRST). When each media substream is also associated with a priority, the multiple target media substreams of a single pull command can be sorted by substream priority, and another new unit ordering mode can be defined: substream priority order (SS_PRIOR_ORDER). In addition, when generating a media segment, the candidate media units determined by the pull commands need not be encapsulated in the order in which the pull commands appear in the media segment request; for example, all candidate media units may be sorted and encapsulated into the media segment without distinguishing between pull commands.
Controlling the order in which media units are encapsulated into a media segment through pull commands and unit ordering modes makes it possible, when network bandwidth is insufficient, to send particular candidate media units of particular substreams first, for example those of a high-priority media substream. When a video substream and an audio substream are delivered together, audio delivery can be guaranteed first; when base-layer and enhancement-layer bitstreams are delivered together, the base-layer candidate media units are sent first; and where real-time requirements are strict, the most recently generated candidate media units are delivered first. This improves the user experience.
With the adaptive real-time delivery method for media streams proposed in the embodiments of the present application, the media units of the substreams can be combined in any way the client requests, media segments can be generated in real time, and the media segments can be delivered to the client. First, the server only needs to store media units separately for each substream and does not need to pre-generate segments for every combination of substreams, which reduces the server's storage requirements. It also simplifies synchronization on the client: a single request returns the combined segment of all requested substreams for the same time period, which makes synchronized reception of the substreams easy to guarantee. Second, the client can dynamically adjust the target media substreams in its media segment requests according to application needs and network conditions, so adaptive delivery of all kinds of multi-substream media streams (for example, multi-bitrate coding, multi-view, multi-channel, or scalable coding) is supported in a uniform way. Finally, because each media segment is generated in response to a client request, no manifest file is needed however many substreams the media stream contains, and the client does not need to request or parse one. This significantly reduces the transmission and processing overhead caused by complex manifest files, and thus effectively reduces the real-time transmission delay and transmission overhead of the media stream.
Next, the adaptive real-time delivery server for media streams proposed in the embodiments of the present application is described with reference to the accompanying drawings.
FIG. 10 is a schematic structural diagram of an adaptive real-time delivery server for media streams according to an embodiment of the present application.
As shown in FIG. 10, the media stream includes at least one media substream, and each media substream is a sequence of media units generated in real time on the server. Each media substream is associated with a substream number, and each media unit is associated with a generation time and/or a sequence number indicating the order in which the media unit was generated within its media substream. The server 10 includes a client interface component 100, a media segment generation component 200, and a media segment sending component 300.
The client interface component 100 is configured to receive a media segment request sent by a client. The media segment request carries at least one pull command; each pull command carries zero or more control parameters, and the control parameters include a first type of parameter indicating the target media stream to be delivered, a second type of parameter indicating the target media substreams to be delivered, and a third type of parameter indicating the candidate media units to be delivered. The media segment generation component 200 is configured to generate a media segment according to the media segment request: first, for each pull command in the request, it selects the target media stream to be delivered, selects at least one target media substream to be delivered within that media stream, and determines the candidate media units to be delivered in each target media substream; second, it encapsulates the candidate media units determined by all the pull commands into a media segment. The media segment sending component 300 is configured to send the generated media segment to the client. The server 10 of this embodiment can thus combine the media units of the substreams in any way the client requests, generate media segments in real time, and return them to the client, which reduces storage overhead on the server, simplifies synchronized delivery of the substreams, and effectively reduces media stream transmission delay and overhead.
Specifically, the client interface component 100 receives media segment requests from clients. A media segment request may carry one or more pull commands, and each pull command may carry zero, one, or more control parameters. Control parameters fall into three categories: the first type indicates the target media stream to be delivered; the second type indicates the target media substreams to be delivered within the target media stream; the third type indicates the candidate media units to be delivered within the target media substreams. The client interface component 100 may use any specified protocol to receive media segment requests. For example, when HTTP is used, the client interface component 100 may be a web server that accepts any media segment request sent over HTTP; when TCP is used, the client interface component is a TCP server that provides a fixed service port.
The media segment generation component 200 generates the required media segment according to the client's media segment request. It obtains the media segment request from the client interface component 100 and parses out the pull commands and their control parameters. It then selects the target media stream to be delivered according to the first type of parameter, selects the target media substreams to be delivered according to the second type of parameter, and determines the candidate media units to be delivered in each target media substream according to the third type of parameter. Finally, it extracts the candidate media units determined by each pull command from the media stream storage unit, encapsulates them into a media segment, and hands the segment directly to the media segment sending component 300 for sending.
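The generation steps above can be sketched as a small loop over the pull commands. This is only an illustration under assumptions: the `PullCommand` fields, the in-memory `store` layout, the use of all substreams as the default target, and the `default_window` value are hypothetical and are not defined by this application.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory store: store[stream_id][substream_no] is a list of
# (sequence number, payload) pairs, oldest first.
Store = Dict[str, Dict[int, List[Tuple[int, bytes]]]]


@dataclass
class PullCommand:
    stream_id: Optional[str] = None         # first-type parameter (assumed name)
    substreams: Optional[List[int]] = None  # second-type parameter (assumed name)
    start_seq: Optional[int] = None         # third-type parameter (assumed name)


def generate_segment(store: Store, commands: List[PullCommand],
                     default_stream: str, default_window: int = 3):
    """Build a media segment as a list of (substream, seq, payload) tuples."""
    segment = []
    for cmd in commands:
        # Step 1: select the target media stream.
        stream = store[cmd.stream_id or default_stream]
        # Step 2: select the target substreams (assumed default: all of them).
        targets = cmd.substreams if cmd.substreams is not None else sorted(stream)
        for ss in targets:
            units = stream[ss]
            if not units:
                continue
            # Step 3: determine the candidate media units.
            if cmd.start_seq is not None:
                chosen = [u for u in units if u[0] >= cmd.start_seq]
            else:
                latest = units[-1][0]
                chosen = [u for u in units if latest - u[0] < default_window]
            # Step 4: encapsulate, in pull-command order.
            segment.extend((ss, seq, payload) for seq, payload in chosen)
    return segment


store = {"live": {1: [(1, b"a1"), (2, b"a2"), (3, b"a3")],
                  2: [(1, b"b1"), (2, b"b2"), (3, b"b3")]}}
request = [PullCommand(substreams=[2], start_seq=2), PullCommand(substreams=[1])]
print(generate_segment(store, request, default_stream="live", default_window=1))
```

A real server would serialize the selected units into its segment format before handing them to the sending component; the list of tuples here only makes the selection order visible.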
Further, as shown in FIG. 11, the server 10 of this embodiment also includes at least one media stream real-time generation component, configured to generate one or more media streams itself or to receive them in real time from other servers. A media stream includes at least one media substream, and each media substream is a sequence of media units generated in real time on the server. Each media substream is associated with a substream number, and each media unit is associated with a generation time and/or a sequence number, where the sequence number indicates the order in which the media unit was generated within the media substream.
Specifically, the media stream real-time generation component includes one or more media substream real-time generation components, each of which implements one or more processing steps for generating a media substream in real time. The processing steps include, but are not limited to, real-time capture of the media signal, encoding and compression, transport encapsulation, and pre-segmentation. In addition, a media substream real-time generation component may receive media streams from other devices in real time, or convert a media stream file that already exists on a server into a sequence of media units generated in real time.
Optionally, in one embodiment of the present application, the media segment generation component 200 is further configured as follows. When a pull command carries no first-type parameter, the target media stream to be delivered is a media stream specified by default. When a pull command carries no second-type parameter, the target media substreams to be delivered are at least one media substream specified by default in the target media stream. When a pull command carries no third-type parameter, the candidate media units are the media units specified by default in the target media substream: either all media units in the target media substream whose sequence number differs from that of the latest media unit by less than a first preset value, or all media units in the target media substream whose generation time differs from that of the latest media unit by less than a second preset value. Both preset values are determined according to the target media substream.
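The default candidate selection described above amounts to a window measured back from the latest media unit. A minimal sketch, assuming each unit is a `(sequence number, generation time)` pair and using the hypothetical parameter names `max_seq_gap` and `max_time_gap` for the first and second preset values:

```python
from typing import List, Optional, Tuple

Unit = Tuple[int, float]  # (sequence number, generation time in seconds)


def default_candidates(units: List[Unit],
                       max_seq_gap: Optional[int] = None,
                       max_time_gap: Optional[float] = None) -> List[Unit]:
    """Pick the default candidate units of one target substream.

    Exactly one of the two preset values is expected (an assumption of
    this sketch); the comparison is strict, matching "less than".
    """
    if not units:
        return []
    if max_seq_gap is not None:
        latest_seq = max(u[0] for u in units)
        return [u for u in units if latest_seq - u[0] < max_seq_gap]
    if max_time_gap is not None:
        latest_time = max(u[1] for u in units)
        return [u for u in units if latest_time - u[1] < max_time_gap]
    raise ValueError("one preset value is required")


units = [(10, 0.0), (11, 0.5), (12, 1.0), (13, 1.5)]
print(default_candidates(units, max_seq_gap=2))
print(default_candidates(units, max_time_gap=1.0))
```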
Optionally, in one embodiment of the present application, the second type of parameter includes a substream list, and the substream list contains the number of at least one target media substream.
Optionally, in one embodiment of the present application, the second type of parameter includes a substream pattern. The substream pattern is an N-bit bit string, where N is the number of media substreams contained in the target media stream. Each bit of the substream pattern is associated with one specific media substream of the target media stream and indicates whether that media substream is a target media substream to be delivered.
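A substream pattern can be decoded with a few lines of code. The sketch below assumes the pattern arrives as a string of `'0'` and `'1'` characters, that the leftmost bit corresponds to substream 1, and that substreams are numbered from 1; the text fixes none of these details, and both function names are hypothetical.

```python
from typing import Iterable, List


def pattern_to_substreams(pattern: str) -> List[int]:
    """Return the substream numbers whose bit is set in the pattern."""
    if set(pattern) - {"0", "1"}:
        raise ValueError("pattern must contain only '0' and '1'")
    # Assumption: bit i (counting from the left, starting at 0) maps to substream i + 1.
    return [i + 1 for i, bit in enumerate(pattern) if bit == "1"]


def substreams_to_pattern(substreams: Iterable[int], n: int) -> str:
    """Build an n-bit pattern that selects the given substream numbers."""
    chosen = set(substreams)
    return "".join("1" if i in chosen else "0" for i in range(1, n + 1))


print(pattern_to_substreams("1101"))
print(substreams_to_pattern([4, 1, 2], 4))
```

Unlike a substream list, a pattern carries no ordering information, so it cannot drive the SSLIST_ORDER ordering mode on its own.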
Optionally, in one embodiment of the present application, the media segment generation component 200 is further configured to encapsulate media substream description information into the media segment. The media substream description information includes at least one entry; each entry corresponds to one media substream of the media stream and contains at least one field, the media substream number.
Optionally, in one embodiment of the present application, each entry further includes at least one of the following fields: media component identifier, substream type, substream bitrate, substream priority, coding layer, viewpoint identifier, video resolution, video frame rate, audio channel identifier, audio sampling rate, and language type.
Optionally, in one embodiment of the present application, the media segment generation component 200 is further configured such that, when a pull command carries at least one third-type parameter, each third-type parameter corresponds to at least one constraint on the candidate media units, and the candidate media units to be delivered are all media units in each target media substream that simultaneously satisfy all the constraints corresponding to the third-type parameters.
Optionally, in one embodiment of the present application, the media units in the target media substreams use synchronized numbering: each time a specified time period elapses, all media units generated by each target media substream within that period are associated with the same new sequence number. The third type of parameter includes a start sequence number, and the corresponding constraint is: if the start sequence number is valid, the sequence number of a candidate media unit is after or equal to the start sequence number.
Optionally, in one embodiment of the present application, the generation times of the media units in all target media substreams are derived from the same clock on the server. The third type of parameter includes a start time, and the corresponding constraint is: if the start time is valid, the generation time of a candidate media unit is after the start time.
Optionally, in one embodiment of the present application, the third type of parameter includes a maximum time offset, and the corresponding constraint is: if the maximum time offset is valid, then within the target media substream the interval between the generation time of a candidate media unit and that of the latest media unit is less than the maximum time offset.
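Taken together, the third-type parameters act as a conjunction of filters on one target substream. The following is a minimal sketch in which a parameter set to `None` stands for "not carried or not valid"; that convention, the `Unit` class, and the function name `select_candidates` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Unit:
    seq: int     # sequence number
    time: float  # generation time in seconds (same server clock for all substreams)


def select_candidates(units: List[Unit],
                      start_seq: Optional[int] = None,
                      start_time: Optional[float] = None,
                      max_time_offset: Optional[float] = None) -> List[Unit]:
    """Keep the units that satisfy every constraint that is present."""
    if not units:
        return []
    latest_time = max(u.time for u in units)

    def ok(u: Unit) -> bool:
        if start_seq is not None and u.seq < start_seq:
            return False  # sequence number must be after or equal to the start
        if start_time is not None and u.time <= start_time:
            return False  # generation time must be after the start time
        if max_time_offset is not None and latest_time - u.time >= max_time_offset:
            return False  # gap to the latest unit must be below the maximum offset
        return True

    return [u for u in units if ok(u)]


units = [Unit(1, 0.0), Unit(2, 1.0), Unit(3, 2.0), Unit(4, 3.0)]
print(select_candidates(units, start_seq=2, max_time_offset=2.0))
```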
Optionally, in one embodiment of the present application, the media segment generation component 200 is further configured to encapsulate the candidate media units determined by each pull command into the media segment in the order in which the pull commands appear in the media segment request. If the parameters carried by a pull command include a unit ordering mode, the candidate media units determined by that pull command are sorted according to the unit ordering mode before being encapsulated into the media segment; if no unit ordering mode is carried, the candidate media units determined by that pull command are sorted according to the default ordering mode before being encapsulated into the media segment.
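The interaction between pull-command order and per-command ordering can be illustrated with a simplified reconstruction of ordering mode 1 of FIG. 9. This sketch handles only the two time-based ordering modes, treats time forward as the default, breaks ties by substream number, and uses made-up generation times equal to the sequence numbers; the names `order_units` and `pack_segment` are hypothetical.

```python
from typing import List, Optional, Tuple

Unit = Tuple[int, int, float]  # (substream number, sequence number, generation time)


def order_units(units: List[Unit], ordering: Optional[str]) -> List[Unit]:
    """Sort the candidates of one pull command; ties fall back to substream number."""
    if ordering in (None, "TIME_FORWARD"):  # default: time forward
        return sorted(units, key=lambda u: (u[2], u[0]))
    if ordering == "TIME_BACKWARD":
        return sorted(units, key=lambda u: (-u[2], u[0]))
    raise ValueError(f"ordering mode not covered by this sketch: {ordering}")


def pack_segment(commands: List[Tuple[List[Unit], Optional[str]]]) -> List[Unit]:
    """Append each command's sorted candidates in request order."""
    segment: List[Unit] = []
    for units, ordering in commands:
        segment.extend(order_units(units, ordering))
    return segment


# First pull command: substream 4, no ordering mode carried.
d_units = [(4, s, float(s)) for s in (60, 58, 59)]
# Second pull command: substreams 1 and 2, TIME_BACKWARD.
ab_units = [(ss, s, float(s)) for s in (25, 26, 27) for ss in (2, 1)]

segment = pack_segment([(d_units, None), (ab_units, "TIME_BACKWARD")])
print([(ss, seq) for ss, seq, _ in segment])
```

The units of substream 4 come first because their pull command appears first, regardless of how their generation times compare with those of substreams 1 and 2.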
Optionally, in one embodiment of the present application, the unit ordering mode is one of the basic ordering modes or a cascade of several basic ordering modes. The basic ordering modes are: time forward, time backward, sequence number forward, sequence number backward, substream number order, and substream list order.
In addition, clients and servers are generally remote from each other and typically interact over a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in the cloud computing service system that addresses the management difficulty and weak service scalability of traditional physical hosts and VPS services.
It should be noted that the foregoing explanation of the embodiments of the adaptive real-time delivery method for media streams also applies to the adaptive real-time delivery server for media streams of this embodiment, and is not repeated here.
With the adaptive real-time delivery server for media streams proposed in the embodiments of the present application, the media units of the substreams can be combined in any way the client requests, media segments can be generated in real time, and the media segments can be delivered to the client. First, the server only needs to store media units separately for each substream and does not need to pre-generate segments for every combination of substreams, which reduces the server's storage requirements. It also simplifies synchronization on the client: a single request returns the combined segment of all requested substreams for the same time period, which makes synchronized reception of the substreams easy to guarantee. Second, the client can dynamically adjust the target media substreams in its media segment requests according to application needs and network conditions, so adaptive delivery of all kinds of multi-substream media streams (for example, multi-bitrate coding, multi-view, multi-channel, or scalable coding) is supported in a uniform way. Finally, because each media segment is generated in response to a client request, no manifest file is needed however many substreams the media stream contains, and the client does not need to request or parse one. This significantly reduces the transmission and processing overhead caused by complex manifest files, and thus effectively reduces the real-time transmission delay and transmission overhead of the media stream.
FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. The electronic device may include:
a memory 1201, a processor 1202, and a computer program stored in the memory 1201 and executable on the processor 1202.
When the processor 1202 executes the program, it implements the adaptive real-time delivery method for media streams provided in the above embodiments.
Further, the electronic device also includes:
a communication interface 1203, used for communication between the memory 1201 and the processor 1202.
The memory 1201 stores computer programs that can run on the processor 1202.
The memory 1201 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.
If the memory 1201, the processor 1202, and the communication interface 1203 are implemented independently, they can be connected to one another and communicate over a bus. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. Buses can be divided into address buses, data buses, control buses, and so on. For ease of illustration, only one thick line is drawn in FIG. 12, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 1201, the processor 1202, and the communication interface 1203 are integrated on a single chip, they can communicate with one another through an internal interface.
The processor 1202 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
This embodiment also provides a computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the adaptive real-time delivery method for media streams described above.
In the description of this specification, a description that refers to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present application. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or N embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of the different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present application, "N" means at least two, for example two or three, unless otherwise expressly and specifically limited.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or N executable instructions for implementing the steps of a custom logical function or process. The scope of the preferred embodiments of the present application includes other implementations in which functions may be performed out of the order shown or discussed, including substantially concurrently or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present application belong.
The logic and/or steps represented in a flowchart or otherwise described herein may, for example, be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in connection with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any apparatus that can contain, store, communicate, propagate, or transport a program for use by, or in connection with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) having one or N wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that the various parts of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, N steps or methods may be implemented in software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, may each exist physically on their own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although embodiments of the present application have been shown and described above, it should be understood that these embodiments are exemplary and are not to be construed as limiting the present application. Those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present application.

Claims (26)

  1. An adaptive real-time delivery method for a media stream, characterized in that the media stream comprises at least one media substream, each media substream being a sequence of media units generated in real time on a server, wherein each media substream is associated with a substream number, and each media unit is associated with a generation time and/or a sequence number indicating the order in which the media unit was generated within the media substream, the method comprising the following steps:
    receiving a media segment request sent by a client, wherein the media segment request carries at least one pull command, the pull command carries either no control parameter or at least one control parameter, and the control parameters include a first-type parameter indicating the target media stream to be transmitted, a second-type parameter indicating the target media substream to be transmitted, and a third-type parameter indicating the candidate media units to be transmitted;
    generating a media segment according to the media segment request, wherein, for each pull command in the media segment request, the target media stream to be transmitted is selected, at least one target media substream to be transmitted is selected from the target media stream, and the candidate media units to be transmitted in the target media substream are determined, and the candidate media units determined by the respective pull commands are encapsulated into the media segment; and
    sending the media segment to the client.
  2. The method according to claim 1, characterized in that generating the media segment according to the media segment request comprises:
    if the pull command does not carry the first-type parameter, the target media stream to be transmitted is a media stream specified by default;
    if the pull command does not carry the second-type parameter, the target media substream to be transmitted is at least one media substream specified by default in the target media stream;
    if the pull command does not carry the third-type parameter, the candidate media units include the media units specified by default in the target media substream, the media units specified by default being all media units in the target media substream whose sequence number differs from that of the latest media unit by less than a first preset value, or all media units in the target media substream whose generation time differs from that of the latest media unit by less than a second preset value, the first preset value and the second preset value both being obtained according to the target media substream.
  3. The method according to claim 1 or 2, characterized in that the second-type parameter includes a substream list, the substream list containing the number of at least one target media substream.
  4. The method according to claim 1, characterized in that the second-type parameter includes a substream pattern, the substream pattern being an N-bit bit string, where N is the number of media substreams contained in the target media stream, and each bit of the substream pattern is associated with one specific media substream of the target media stream and indicates whether that media substream is a target media substream to be transmitted.
  5. The method according to claim 1, characterized in that generating the media segment according to the media segment request further comprises:
    encapsulating media substream description information into the media segment, the media substream description information including at least one entry, wherein each entry corresponds to one media substream of the media stream and contains at least one field: the media substream number.
  6. The method according to claim 5, characterized in that each entry further includes at least one of the following fields: media component identifier, substream type, substream bit rate, substream priority, coding layer, viewpoint identifier, video resolution, video frame rate, audio channel identifier, audio sampling rate, language type.
  7. The method according to claim 1, characterized in that generating the media segment according to the media segment request further comprises:
    if the pull command carries at least one third-type parameter, wherein each third-type parameter corresponds to at least one constraint on the candidate media units, the candidate media units to be transmitted include all media units in each target media substream that simultaneously satisfy all constraints corresponding to the third-type parameters.
  8. The method according to claim 7, characterized in that the media units in the target media substreams use synchronized numbering, wherein, each time a specified time period elapses, all media units generated by each target media substream within that time period are associated with the same new sequence number; the third-type parameters include a start sequence number, and the constraint corresponding to the start sequence number is:
    if the start sequence number is valid, the sequence number of a candidate media unit is after, or equal to, the start sequence number.
  9. The method according to claim 7, characterized in that the generation times of the media units in all target media substreams are derived from the same clock on the server; the third-type parameters include a start time, and the constraint corresponding to the start time is:
    if the start time is valid, the generation time of a candidate media unit is after the start time.
  10. The method according to claim 7, characterized in that the third-type parameters include a maximum time offset, and the constraint corresponding to the maximum time offset is:
    if the maximum time offset is valid, then within the target media substream the interval between the generation time of a candidate media unit and that of the latest media unit is less than the maximum time offset.
  11. The method according to claim 1, characterized in that encapsulating the candidate media units determined by the respective pull commands into the media segment comprises:
    encapsulating the candidate media units determined by each pull command into the media segment in the order in which the pull commands appear in the media segment request, wherein, if the parameters carried by a pull command include a unit sorting mode, the candidate media units determined by that pull command are sorted according to the unit sorting mode before being encapsulated into the media segment, and if the pull command does not carry a unit sorting mode, the candidate media units determined by that pull command are sorted according to a default sorting mode before being encapsulated into the media segment.
  12. The method according to claim 11, characterized in that the unit sorting mode is one basic sorting mode or a cascade of several basic sorting modes, the basic sorting modes including: forward time order, reverse time order, forward sequence-number order, reverse sequence-number order, substream-number order, and substream-list order.
  13. An adaptive real-time delivery server for a media stream, characterized in that the media stream comprises at least one media substream, each media substream being a sequence of media units generated in real time on the server, wherein each media substream is associated with a substream number, and each media unit is associated with a generation time and/or a sequence number indicating the order in which the media unit was generated within the media substream, the server comprising:
    a client interface component, configured to receive a media segment request sent by a client, wherein the media segment request carries at least one pull command, the pull command carries either no control parameter or at least one control parameter, and the control parameters include a first-type parameter indicating the target media stream to be transmitted, a second-type parameter indicating the target media substream to be transmitted, and a third-type parameter indicating the candidate media units to be transmitted;
    a media segment generation component, configured to generate a media segment according to the media segment request, wherein, for each pull command in the media segment request, the target media stream to be transmitted is selected, at least one target media substream to be transmitted is selected from the target media stream, and the candidate media units to be transmitted in the target media substream are determined, and the candidate media units determined by the respective pull commands are encapsulated into the media segment; and
    a media segment sending component, configured to send the generated media segment to the client.
  14. The server according to claim 13, characterized in that the media segment generation component is further configured such that: when the pull command does not carry the first-type parameter, the target media stream to be transmitted is a media stream specified by default; when the pull command does not carry the second-type parameter, the target media substream to be transmitted is at least one media substream specified by default in the target media stream; and when the pull command does not carry the third-type parameter, the candidate media units include the media units specified by default in the target media substream, the media units specified by default being all media units in the target media substream whose sequence number differs from that of the latest media unit by less than a first preset value, or all media units in the target media substream whose generation time differs from that of the latest media unit by less than a second preset value, the first preset value and the second preset value both being obtained according to the target media substream.
  15. The server according to claim 13 or 14, characterized in that the second-type parameter includes a substream list, the substream list containing the number of at least one target media substream.
  16. The server according to claim 13, characterized in that the second-type parameter includes a substream pattern, the substream pattern being an N-bit bit string, where N is the number of media substreams contained in the target media stream, and each bit of the substream pattern is associated with one specific media substream of the target media stream and indicates whether that media substream is a target media substream to be transmitted.
  17. The server according to claim 13, characterized in that the media segment generation component is further configured to encapsulate media substream description information into the media segment, the media substream description information including at least one entry, wherein each entry corresponds to one media substream of the media stream and contains at least one field: the media substream number.
  18. The server according to claim 17, characterized in that each entry further includes at least one of the following fields: media component identifier, substream type, substream bit rate, substream priority, coding layer, viewpoint identifier, video resolution, video frame rate, audio channel identifier, audio sampling rate, language type.
  19. The server according to claim 13, characterized in that the media segment generation component is further configured such that, when the pull command carries at least one third-type parameter, each third-type parameter corresponds to at least one constraint on the candidate media units, and the candidate media units to be transmitted include all media units in each target media substream that simultaneously satisfy all constraints corresponding to the third-type parameters.
  20. The server according to claim 19, characterized in that the media units in the target media substreams use synchronized numbering, wherein, each time a specified time period elapses, all media units generated by each target media substream within that time period are associated with the same new sequence number; the third-type parameters include a start sequence number, and the constraint corresponding to the start sequence number is:
    if the start sequence number is valid, the sequence number of a candidate media unit is after, or equal to, the start sequence number.
  21. The server according to claim 19, characterized in that the generation times of the media units in all target media substreams are derived from the same clock on the server; the third-type parameters include a start time, and the constraint corresponding to the start time is:
    if the start time is valid, the generation time of a candidate media unit is after the start time.
  22. The server according to claim 19, characterized in that the third-type parameters include a maximum time offset, and the constraint corresponding to the maximum time offset is:
    if the maximum time offset is valid, then within the target media substream the interval between the generation time of a candidate media unit and that of the latest media unit is less than the maximum time offset.
  23. The server according to claim 13, characterized in that the media segment generation component is further configured to encapsulate the candidate media units determined by each pull command into the media segment in the order in which the pull commands appear in the media segment request, wherein, if the parameters carried by a pull command include a unit sorting mode, the candidate media units determined by that pull command are sorted according to the unit sorting mode before being encapsulated into the media segment, and if the pull command does not carry a unit sorting mode, the candidate media units determined by that pull command are sorted according to a default sorting mode before being encapsulated into the media segment.
  24. The server according to claim 23, characterized in that the unit sorting mode is one basic sorting mode or a cascade of several basic sorting modes, the basic sorting modes including: forward time order, reverse time order, forward sequence-number order, reverse sequence-number order, substream-number order, and substream-list order.
  25. A computer device, characterized by comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the adaptive real-time delivery method for a media stream according to any one of claims 1 to 12.
  26. A non-transitory computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the adaptive real-time delivery method for a media stream according to any one of claims 1 to 12.
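
The selection and ordering logic recited in claims 1, 2, 4, and 7 to 12 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. All names (`MediaUnit`, `PullCommand`, `select_units`, `build_segment`, the sort-mode strings) are hypothetical, as are the default stream name, the value chosen for the first preset value, the default of selecting every substream, the default sort mode, and the convention that the leftmost pattern bit maps to the lowest substream number. Parameters left as `None` are treated as not carried.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class MediaUnit:
    substream: int    # substream number
    seq: int          # sequence number within the substream
    gen_time: float   # generation time, seconds on a shared server clock


@dataclass
class PullCommand:
    stream_id: Optional[str] = None         # first-type parameter
    substreams: Optional[List[int]] = None  # second-type: substream list
    pattern: Optional[str] = None           # second-type: N-bit pattern, e.g. "01"
    start_seq: Optional[int] = None         # third-type: start sequence number
    start_time: Optional[float] = None      # third-type: start time
    max_offset: Optional[float] = None      # third-type: maximum time offset
    order: Optional[List[str]] = None       # cascade of basic sorting modes


DEFAULT_STREAM = "main"  # hypothetical default stream name
SEQ_WINDOW = 2           # hypothetical "first preset value"
Streams = Dict[str, Dict[int, List[MediaUnit]]]  # stream -> substream -> units


def select_units(streams: Streams, cmd: PullCommand) -> List[MediaUnit]:
    """Resolve one pull command into an ordered list of candidate units."""
    stream = streams[cmd.stream_id or DEFAULT_STREAM]
    numbers = sorted(stream)
    if cmd.substreams is not None:
        targets = [n for n in cmd.substreams if n in stream]
    elif cmd.pattern is not None:
        # Assumed mapping: leftmost bit <-> lowest substream number.
        targets = [n for n, bit in zip(numbers, cmd.pattern) if bit == "1"]
    else:
        targets = numbers  # assumed default: every substream
    third = (cmd.start_seq, cmd.start_time, cmd.max_offset)
    has_third = any(v is not None for v in third)
    chosen: List[MediaUnit] = []
    for n in targets:
        units = stream[n]  # stored in generation order
        if not units:
            continue
        latest = units[-1]
        for u in units:
            if not has_third:
                # Default: units close to the latest one by sequence number.
                keep = latest.seq - u.seq < SEQ_WINDOW
            else:
                # Every carried constraint must hold at the same time.
                keep = (
                    (cmd.start_seq is None or u.seq >= cmd.start_seq)
                    and (cmd.start_time is None or u.gen_time > cmd.start_time)
                    and (cmd.max_offset is None
                         or latest.gen_time - u.gen_time < cmd.max_offset)
                )
            if keep:
                chosen.append(u)
    keys = {
        "time_asc": lambda u: u.gen_time,
        "time_desc": lambda u: -u.gen_time,
        "seq_asc": lambda u: u.seq,
        "seq_desc": lambda u: -u.seq,
        "substream_number": lambda u: u.substream,
        "substream_list": lambda u: targets.index(u.substream),
    }
    order = cmd.order or ["time_asc"]  # assumed default sorting mode
    chosen.sort(key=lambda u: tuple(keys[m](u) for m in order))
    return chosen


def build_segment(streams: Streams, commands: List[PullCommand]) -> List[MediaUnit]:
    """Concatenate the units of each pull command in request order."""
    segment: List[MediaUnit] = []
    for cmd in commands:
        segment.extend(select_units(streams, cmd))
    return segment
```

In this sketch a cascade of sorting modes is expressed as a tuple sort key, so `["substream_list", "seq_asc"]` groups units by their position in the requested substream list and orders them by sequence number within each group. A real server would also serialize the units and any substream description information into the segment format, which is outside the scope of the sketch.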
PCT/CN2021/103196 2020-06-30 2021-06-29 Adaptive real-time delivery method for media stream, and server WO2022002070A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010614997.5 2020-06-30
CN202010614997.5A CN113873343B (en) 2020-06-30 2020-06-30 Self-adaptive real-time delivery method of media stream and server

Publications (1)

Publication Number Publication Date
WO2022002070A1 true WO2022002070A1 (en) 2022-01-06

Family

ID=78981470

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/103196 WO2022002070A1 (en) 2020-06-30 2021-06-29 Adaptive real-time delivery method for media stream, and server

Country Status (2)

Country Link
CN (1) CN113873343B (en)
WO (1) WO2022002070A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102857478A (en) * 2011-06-30 2013-01-02 华为技术有限公司 Method and device for controlling media data
US20170244772A1 (en) * 2012-11-30 2017-08-24 Google Technology Holdings LLC Multi-streaming multimedia data
CN110545492A (en) * 2018-09-05 2019-12-06 北京开广信息技术有限公司 real-time delivery method and server of media stream
CN110881018A (en) * 2018-09-05 2020-03-13 北京开广信息技术有限公司 Real-time receiving method and client of media stream
CN111193686A (en) * 2018-11-14 2020-05-22 北京开广信息技术有限公司 Media stream delivery method and server

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109964471B (en) * 2016-10-18 2022-03-22 埃克斯普韦公司 Method for transmitting content to a mobile user equipment
CN111193684B (en) * 2018-11-14 2021-12-21 北京开广信息技术有限公司 Real-time delivery method and server of media stream

Also Published As

Publication number Publication date
CN113873343B (en) 2023-02-24
CN113873343A (en) 2021-12-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21833863; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21833863; Country of ref document: EP; Kind code of ref document: A1)