WO2021143362A1 - Resource transmission method and terminal - Google Patents

Resource transmission method and terminal

Info

Publication number
WO2021143362A1
WO2021143362A1, PCT/CN2020/131840, CN2020131840W
Authority
WO
WIPO (PCT)
Prior art keywords
multimedia resource
code rate
media
target
media description
Prior art date
Application number
PCT/CN2020/131840
Other languages
English (en)
French (fr)
Inventor
Zhou Chao (周超)
Original Assignee
Beijing Dajia Internet Information Technology Co., Ltd. (北京达佳互联信息技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co., Ltd.
Priority to EP20914247.0A (published as EP3930335A4)
Publication of WO2021143362A1
Priority to US17/473,189 (published as US11528311B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238 Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/23805 Controlling the feeding rate to the network, e.g. by controlling the video pump
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/61 Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
    • H04L65/613 Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for the control of the source by the destination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/61 Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
    • H04L65/612 Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for unicast
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 Responding to QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26258 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists for generating a list of items to be played back in a given order, e.g. playlist, or scheduling item distribution according to such list
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/458 Scheduling content for creating a personalised stream, e.g. by combining a locally stored advertisement with an incoming stream; Updating operations, e.g. for OS modules; time-related management operations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/64 Addressing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/647 Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load, bridging between two different networks, e.g. between IP and wireless
    • H04N21/64723 Monitoring of network processes or resources, e.g. monitoring of network load
    • H04N21/64738 Monitoring network characteristics, e.g. bandwidth, congestion level
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8455 Structuring of content, e.g. decomposing content into time segments involving pointers to the content, e.g. pointers to the I-frames of the video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83 Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845 Structuring of content, e.g. decomposing content into time segments
    • H04N21/8456 Structuring of content, e.g. decomposing content into time segments by decomposing the content in the time domain, e.g. in time segments
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/858 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot
    • H04N21/8586 Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot by using a URL
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/2866 Architectures; Arrangements
    • H04L67/289 Intermediate processing functionally located close to the data consumer application, e.g. in same machine, in same home or in same sub-network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching

Definitions

  • the present disclosure relates to the field of communication technology, and in particular to a resource transmission method and terminal.
  • Fragmentation-based media transmission methods include the common DASH (Dynamic Adaptive Streaming over HTTP), an HTTP-based adaptive streaming media transmission standard formulated by MPEG.
  • MPEG stands for Moving Picture Experts Group.
  • HLS (HTTP Live Streaming) is an HTTP-based adaptive streaming media transmission standard developed by Apple.
  • In these methods, the server divides an audio/video resource into audio/video segments, and each segment can be transcoded into different bit rates.
  • the terminal accesses the URLs of the segments into which the audio/video resource is divided.
  • Different segments can correspond to the same or different bit rates, so the terminal can conveniently switch between audio/video segments of different bit rates; this is also called adaptive bit-rate adjustment based on the terminal's own bandwidth.
  • the present disclosure provides a resource transmission method and terminal.
  • the technical solutions of the present disclosure are as follows:
  • a resource transmission method, including: determining, based on a media description file of a multimedia resource, the target address information of the multimedia resource at a target bit rate, where the media description file is used to provide address information of the multimedia resource at different bit rates; and sending a frame acquisition request carrying the target address information to the server, where the frame acquisition request is used to instruct the server to return media frames of the multimedia resource at the target bit rate.
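The claimed client-side flow can be sketched as follows. This is a hypothetical illustration: the dictionary layout of the media description file and the field names ("adaptationSet", "bitrate", "url") are assumptions for the example, not the real FAS schema.

```python
def target_address(media_description: dict, target_bitrate: int) -> str:
    """Determine the target address information of the multimedia
    resource at the target bit rate from the media description file."""
    for meta in media_description["adaptationSet"]:
        if meta["bitrate"] == target_bitrate:
            return meta["url"]
    raise KeyError(f"no stream at {target_bitrate} kbps")

def build_frame_acquisition_request(media_description: dict,
                                    target_bitrate: int) -> dict:
    """Assemble the frame acquisition request carrying the target
    address information; sending it instructs the server to return
    media frames at the target bit rate."""
    return {"url": target_address(media_description, target_bitrate)}

# Usage with a made-up media description file:
mpd = {"adaptationSet": [
    {"bitrate": 500,  "url": "https://cdn.example.com/stream_500.flv"},
    {"bitrate": 1000, "url": "https://cdn.example.com/stream_1000.flv"},
]}
assert build_frame_acquisition_request(mpd, 1000)["url"].endswith("_1000.flv")
```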
  • a resource transmission device, including: a determining unit configured to determine, based on a media description file of a multimedia resource, the target address information of the multimedia resource at a target bit rate, where the media description file is used to provide address information of the multimedia resource at different bit rates; and a sending unit configured to send a frame acquisition request carrying the target address information to the server, where the frame acquisition request is used to instruct the server to return media frames of the multimedia resource at the target bit rate.
  • a terminal, including: one or more processors; and one or more memories for storing executable instructions of the one or more processors; wherein the one or more processors are configured to perform the following operations: determining, based on a media description file of a multimedia resource, the target address information of the multimedia resource at a target bit rate, where the media description file is used to provide address information of the multimedia resource at different bit rates; and sending a frame acquisition request carrying the target address information to the server, where the frame acquisition request is used to instruct the server to return media frames of the multimedia resource at the target bit rate.
  • a storage medium, where, when at least one instruction in the storage medium is executed by one or more processors of a terminal, the terminal can perform the following operations: determining, based on a media description file of a multimedia resource, the target address information of the multimedia resource at a target bit rate, where the media description file is used to provide address information of the multimedia resource at different bit rates; and sending a frame acquisition request carrying the target address information to the server, where the frame acquisition request is used to instruct the server to return media frames of the multimedia resource at the target bit rate.
  • a computer program product including one or more instructions that can be executed by one or more processors of a terminal, so that the terminal can perform the above resource transmission method.
  • Fig. 1 is a schematic diagram showing an implementation environment of a resource transmission method according to an embodiment.
  • Fig. 2 is a schematic diagram of a FAS framework provided by an embodiment of the present disclosure.
  • Fig. 3 is a flow chart showing a resource transmission method according to an embodiment.
  • Fig. 4 is an interaction flowchart of a resource transmission method according to an embodiment.
  • Fig. 5 is a block diagram showing a logical structure of a resource transmission device according to an embodiment.
  • Fig. 6 is a structural block diagram of a terminal 600 provided by an embodiment of the present disclosure.
  • the user information involved in the present disclosure may be information authorized by the user or fully authorized by all parties.
  • FLV is a streaming media format.
  • The FLV streaming media format is a video format developed alongside the introduction of Flash MX (an animation production software). Because of its extremely small file size and fast loading speed, it makes watching video files online (that is, browsing videos online) possible. Its emergence effectively solved the problem that SWF files (a special Flash format) exported after importing video files into Flash were too large to be used well on the Internet.
  • Streaming media uses a streaming transmission method, which refers to a technology and process of compressing a series of multimedia resources and sending resource packets over the network, so that multimedia resources are transmitted over the Internet in real time for viewing.
  • This technology makes the resource packets flow like water; without it, the entire media file would have to be downloaded before use, so multimedia resources could only be watched offline.
  • Streaming can transmit live multimedia resources or multimedia resources pre-stored on the server. When viewer users watch these multimedia resources, the multimedia resources are played by specific playback software after being delivered to the viewer terminal of the viewer user.
  • FAS (FLV Adaptive Streaming): an FLV-based adaptive streaming media transmission standard.
  • FAS is a streaming resource transmission standard (or resource transmission protocol) proposed in this disclosure. Unlike traditional fragment-based media transmission methods, the FAS standard achieves frame-level multimedia resource transmission: the server does not need to wait until a complete video segment arrives before it can send a resource packet to the terminal. Instead, a target timestamp is determined after the terminal's frame acquisition request is parsed. If the target timestamp is less than zero, all media frames that have been buffered starting from the target timestamp are packaged and sent to the terminal (without fragmentation). After that, if the target timestamp is greater than or equal to zero, or once there is a real-time stream in addition to the buffered media frames, the media frames of the multimedia resource are sent to the terminal frame by frame.
  • the target bit rate is specified in the frame acquisition request.
  • When the network state changes, the bit rate to switch to can be adjusted adaptively, and a frame acquisition request corresponding to that bit rate can be re-sent, achieving the effect of adaptively adjusting the bit rate of the multimedia resource.
  • the FAS standard can realize frame-level transmission and reduce end-to-end delay. A new frame acquisition request needs to be sent only when the bit rate is switched, which greatly reduces the number of requests and the communication overhead of the resource transmission process.
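The target-timestamp rule described above can be sketched as a small server-side helper. This is one plausible reading of the text, with an assumed in-memory buffer of `(timestamp, payload)` pairs; the real FAS server logic is not specified here.

```python
from typing import List, Tuple

Frame = Tuple[int, bytes]  # (timestamp, payload)

def initial_batch(buffer: List[Frame], target_ts: int) -> List[Frame]:
    """A negative target timestamp asks for the already-buffered media
    frames, which the server packages and sends in one batch (without
    fragmentation); a non-negative target timestamp means delivery
    starts frame by frame from that timestamp onward."""
    if target_ts < 0:
        return list(buffer)
    return [f for f in buffer if f[0] >= target_ts]

buf = [(100, b"key"), (200, b"p1"), (300, b"p2")]
assert initial_batch(buf, -1) == buf                       # whole buffered batch
assert initial_batch(buf, 200) == [(200, b"p1"), (300, b"p2")]
```

After this initial batch, newly arriving frames of the real-time stream would be forwarded one at a time over the same connection.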
  • Live broadcast: multimedia resources are recorded in real time.
  • the host user "pushes" the media stream to the server through the host terminal ("push" based on the streaming transmission method).
  • the media stream is "pulled" from the server to the audience terminal ("pull" based on the streaming transmission method), and the audience terminal decodes and plays the multimedia resource, realizing real-time video playback.
  • On-demand: also known as Video On Demand (VOD).
  • multimedia resources are pre-stored on the server, and the server can provide the multimedia resources specified by the audience user according to the requirements of the audience user.
  • the audience terminal sends an on-demand request to the server; after the multimedia resource specified by the on-demand request is found, the server sends it to the audience terminal. That is, the audience user can selectively play a specific multimedia resource.
  • the playback progress of on-demand content can be controlled arbitrarily, while that of live broadcast content cannot.
  • the pace of live broadcast content depends on the real-time live broadcast progress of the host user.
  • Fig. 1 is a schematic diagram showing an implementation environment of a resource transmission method according to an embodiment.
  • the implementation environment may include at least one terminal 101 and a server 102, which will be described in detail below:
  • the terminal 101 is used for multimedia resource transmission, and each terminal may be equipped with a media codec component and a media playback component.
  • the media codec component is used to receive multimedia resources (delivered either as fragments or by frame-level transmission of media frames) and to decode them after unpacking, and the media playback component is used to play the multimedia resource after it is decoded.
  • the terminal 101 can be divided into a host terminal and a viewer terminal.
  • the host terminal corresponds to the host user
  • the viewer terminal corresponds to the viewer user.
  • the terminal can be the host terminal.
  • It can also be a viewer terminal; for example, the terminal is the host terminal when the user is recording a live broadcast, and the viewer terminal when the user is watching a live broadcast.
  • the terminal 101 and the server 102 may be connected through a wired network or a wireless network.
  • the server 102 is used to provide multimedia resources to be transmitted, and the server 102 may include at least one of a server, multiple servers, a cloud computing platform, or a virtualization center.
  • the server 102 is responsible for the main computing work and the terminal 101 for the secondary computing work; or the server 102 is responsible for the secondary computing work and the terminal 101 for the main computing work; or a distributed computing architecture is adopted between the terminal 101 and the server 102 for collaborative computing.
  • the server 102 may be a clustered CDN (Content Delivery Network, Content Delivery Network) server.
  • the CDN server includes a central platform and edge servers deployed in various places.
  • the central platform performs load balancing, content distribution, scheduling, and other functions, enabling the terminal where the user is located to obtain the required content (i.e., multimedia resources) from a nearby local edge server.
  • the CDN server adds a caching mechanism between the terminal and the central platform.
  • the caching mechanism is also an edge server (such as a WEB server) deployed in different geographic locations.
  • the central platform, based on the distances between the terminal and the edge servers, dispatches the edge server closest to the terminal to provide services to it, which distributes content to the terminal more effectively.
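The nearest-edge scheduling described above can be illustrated with a toy distance-based picker. Real CDN scheduling also weighs load, latency, and network topology; the coordinates and server names below are made up for the example.

```python
import math

def nearest_edge(terminal_pos, edge_servers):
    """Pick the edge server geographically closest to the terminal.
    Positions are (latitude, longitude) pairs; plain Euclidean distance
    is enough for this sketch."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    return min(edge_servers, key=lambda s: dist(terminal_pos, s["pos"]))

servers = [{"name": "beijing",  "pos": (39.9, 116.4)},
           {"name": "shanghai", "pos": (31.2, 121.5)}]
assert nearest_edge((40.0, 116.0), servers)["name"] == "beijing"
```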
  • the multimedia resources involved in the embodiments of the present disclosure include, but are not limited to: at least one of video resources, audio resources, image resources, or text resources.
  • the embodiments of the present disclosure do not specifically limit the types of multimedia resources.
  • the multimedia resource is a live video stream of a network host, or a historical on-demand video pre-stored on the server, or a live audio stream of a radio host, or a historical on-demand audio pre-stored on the server.
  • the device types of the terminal 101 include, but are not limited to, at least one of: a TV, a smart phone, a smart speaker, a vehicle-mounted terminal, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop portable computer, or a desktop computer.
  • the following description takes a smart phone as an example of the terminal 101.
  • the number of the foregoing terminal 101 may be only one, or the number of the terminal 101 may be tens or hundreds, or more.
  • the embodiments of the present disclosure do not limit the number of terminals 101 and device types.
  • FIG. 2 is a schematic diagram of a FAS framework provided by an embodiment of the present disclosure. Please refer to FIG. 2.
  • An embodiment of the present disclosure provides a FAS (streaming-based multi-rate adaptive) framework, in which multimedia resource transmission is carried out between the terminal 101 and the server 102 through the FAS protocol.
  • FAS Streaming-based Multi-rate Adaptive
  • an application (also known as a FAS client)
  • the application is used to browse multimedia resources.
  • the application can be a short video application, a live broadcast application, a video-on-demand application, a social application, a shopping application, etc.; the embodiments of the present disclosure do not specifically limit the type of application.
  • the user can start the application on the terminal and display the resource push interface (such as the home page or function interface of the application).
  • the resource push interface includes abbreviated information of at least one multimedia resource.
  • the abbreviated information includes at least one of a title, an introduction, a release date, a poster, a trailer, or a highlight clip.
  • in response to a user's touch operation on the abbreviated information of a multimedia resource, the terminal can jump from the resource push interface to a resource playing interface. The resource playing interface includes a playback option for the multimedia resource; in response to the user's touch operation on the playback option, the terminal downloads the media presentation description (MPD) file of the multimedia resource from the server and determines the target bit rate based on the media description file.
  • MPD: media presentation description file.
  • the terminal then determines the target address information of the multimedia resource and sends the server a frame acquisition request (or FAS request) carrying the target address information, so that the server processes the frame acquisition request based on a certain specification (the processing specification for FAS requests) and, after locating the media frames of the multimedia resource (a continuous media stream), returns them to the terminal.
  • the media stream requested by the terminal is usually a live video stream that the host user pushes to the server in real time.
  • After receiving the host user's live video stream, the server can transcode it into live video streams of multiple bit rates, assign different address information to the streams of different bit rates, and record them in the media description file, so that frame acquisition requests carrying different address information can be answered with media streams of the corresponding bit rates.
  • In the FAS framework, a mechanism for adaptively adjusting the bit rate is provided.
  • the bit rate to switch to, matching the current network bandwidth, is determined adaptively. For example, when the bit rate needs to be switched, the terminal can disconnect the media stream transmission link of the current bit rate, send the server a frame acquisition request carrying the address information corresponding to the bit rate to switch to, and establish a media stream transmission link based on that bit rate. Alternatively, the terminal can keep the current link open, directly re-initiate the frame acquisition request carrying the new address information, and establish a media stream transmission link based on the new bit rate (used to transmit the new media stream), with the original media stream serving as a backup stream; if transmission of the new media stream becomes abnormal, the backup stream can continue to be played.
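The second switching strategy above (keep the old link as a backup while the new-bit-rate link is established) can be sketched as follows. The class and method names are assumptions for illustration, not part of the FAS standard.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class StreamLink:
    bitrate: int
    healthy: bool = True

@dataclass
class Player:
    active: Optional[StreamLink] = None
    backup: Optional[StreamLink] = None

    def switch_bitrate(self, new_bitrate: int) -> None:
        """Re-initiate a frame acquisition request at the new bit rate;
        the old link is kept as the backup stream."""
        self.backup = self.active
        self.active = StreamLink(new_bitrate)

    def playing_link(self) -> Optional[StreamLink]:
        """If the new media stream becomes abnormal, fall back to the
        backup stream so playback can continue."""
        if self.active and self.active.healthy:
            return self.active
        return self.backup

p = Player(active=StreamLink(1000))
p.switch_bitrate(2000)
assert p.playing_link().bitrate == 2000
p.active.healthy = False            # new stream becomes abnormal
assert p.playing_link().bitrate == 1000  # backup stream keeps playing
```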
  • Fig. 3 is a flow chart showing a method for resource transmission according to an embodiment. The method for resource transmission is applied to a terminal in the FAS framework involved in the foregoing implementation environment.
  • the terminal determines the target address information of the multimedia resource with the target code rate based on the media description file of the multimedia resource, and the media description file is used to provide the address information of the multimedia resource with different code rates.
  • the terminal sends a frame acquisition request carrying the target address information to the server, where the frame acquisition request is used to instruct the server to return the media frame of the multimedia resource at the target code rate.
  • determining the target address information of the multimedia resource with the target code rate includes:
  • determining the target code rate includes:
  • the value carried in the code rate selection instruction is determined as the target code rate; or,
  • the target code rate is adjusted to the code rate corresponding to the current network bandwidth information.
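The second option above, adjusting the target bit rate to match current network bandwidth, can be sketched as a simple selection rule. The headroom factor is an illustrative assumption; the patent does not specify the selection heuristic.

```python
def adapt_bitrate(bandwidth_kbps: float, available: list, headroom: float = 0.8) -> int:
    """Pick the highest available bit rate that fits within the measured
    bandwidth, leaving some headroom for fluctuation; fall back to the
    lowest rate when nothing fits."""
    usable = bandwidth_kbps * headroom
    fitting = [r for r in sorted(available) if r <= usable]
    return fitting[-1] if fitting else min(available)

rates = [500, 1000, 2000, 4000]
assert adapt_bitrate(3000, rates) == 2000   # 3000 * 0.8 = 2400 -> 2000 fits
assert adapt_bitrate(100, rates) == 500     # nothing fits -> lowest rate
```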
  • the method further includes:
  • the frame acquisition request further includes at least one of a first extended parameter or a second extended parameter
  • the first extended parameter is used to indicate whether the media frame is an audio frame
  • the second extended parameter is used to indicate that media frames of the multimedia resource are transmitted starting from the target timestamp indicated by the second extended parameter.
  • the frame acquisition request only includes the first extended parameter, or the frame acquisition request only includes the second extended parameter, or the frame acquisition request includes both the first extended parameter and the second extended parameter.
  • the target timestamp is greater than the current time
  • the target timestamp is the timestamp of the key frame or audio frame closest to the current moment
  • the target timestamp is smaller than the current time, and the media frame includes media frames that have been buffered starting from the target timestamp.
  • the media description file includes a version number and a media description set, where the version number includes at least one of the version number of the media description file or the version number of the resource transmission standard, and the media description set includes multiple pieces of media description meta-information. Each piece of media description meta-information corresponds to a multimedia resource of one code rate, and includes the length of the group of pictures and the attribute information of the multimedia resource of the code rate corresponding to that media description meta-information.
  • the version number only includes the version number of the media description file, or the version number only includes the version number of the resource transmission standard, or the version number includes both the version number of the media description file and the version number of the resource transmission standard.
  • each attribute information includes the identification information of the multimedia resource, the encoding method of the multimedia resource, the code rate supported by the multimedia resource, and the address information of the multimedia resource of the code rate.
  • each attribute information further includes at least one of the quality type of the multimedia resource, the hidden option of the multimedia resource, the first adaptive function option, or the default playback function option, wherein the first adaptive function option is used to indicate whether the multimedia resource is visible to the adaptive function.
  • each attribute information includes the identification information of the multimedia resource, the encoding method of the multimedia resource, the code rate supported by the multimedia resource, and the address information of the multimedia resource of that code rate, together with any one of, or a combination of at least two of, the quality type of the multimedia resource, the hidden option of the multimedia resource, the first adaptive function option, and the default playback function option.
  • the media description file further includes at least one of a service type, a second adaptive function option, or a third adaptive function option, where the service type includes at least one of live broadcast or on-demand, the second adaptive function option is used to indicate whether to turn on the adaptive function, and the third adaptive function option is used to indicate whether to turn on the adaptive function by default at the start of playback.
  • the media description file also includes any one of the service type, the second adaptive function option, or the third adaptive function option, or a combination of at least two of the service type, the second adaptive function option, and the third adaptive function option.
  • Fig. 4 is an interaction flow chart of a method for resource transmission according to an embodiment.
  • the method for resource transmission can be used in the FAS framework involved in the foregoing implementation environment.
  • the embodiment includes the following content.
  • the terminal displays a resource playback interface, and the resource playback interface includes playback options of multimedia resources.
  • an application program may be installed on the terminal, and the application program is used to browse multimedia resources.
  • the application program may include at least one of a short video application, a live broadcast application, a video-on-demand application, a social application, or a shopping application; the embodiment of the present disclosure does not specifically limit the type of the application.
  • the multimedia resources involved in the embodiments of the present disclosure include, but are not limited to: at least one of video resources, audio resources, image resources, or text resources.
  • the embodiments of the present disclosure do not specifically limit the types of multimedia resources.
  • the multimedia resource is a live video stream of a network host, or a historical on-demand video pre-stored on the server, or a live audio stream of a radio host, or a historical on-demand audio pre-stored on the server.
  • the user can start the application on the terminal, and the application displays the resource push interface.
  • the resource push interface can be the homepage or function interface of the application.
  • the embodiment of the present disclosure does not specifically limit the type of the resource push interface.
  • the resource pushing interface may include abbreviated information of at least one multimedia resource, and the abbreviated information includes at least one of a title, a brief introduction, a poster, a trailer, or a highlight segment of the multimedia resource.
  • the user can click on the abbreviated information of the multimedia resource of interest.
  • the terminal can jump from the resource push interface to the resource play interface.
  • the resource play interface may include a play area and a comment area, the play area may include the play options of the multimedia resource, and the comment area may include other users' viewing comments on the multimedia resource.
  • detailed information of the multimedia resource may also be included in the play area.
  • the detailed information may include at least one of the title, brief introduction, keyword, publisher information, or current popularity of the multimedia resource.
  • the publisher information may include the nickname of the publisher, the avatar of the publisher, the number of followers of the publisher, etc. The embodiment of the present disclosure does not specifically limit the content of the detailed information or the publisher information.
  • the play area may also include a barrage input area and barrage setting options.
  • the user can use the barrage setting options to control at least one of whether to display the barrage, the movement speed of the barrage, the barrage display area, or the barrage display mode (transparency, font size, etc.); the user can also input the content he or she wants to comment by clicking on the barrage input area.
  • the form of the barrage is not limited to text or emoticon images; the embodiment of the present disclosure does not specifically limit the barrage setting options, or the content or form of the barrage input by the user.
  • the play area may also include a favorite option and a follow option. If the user clicks the favorite option, the terminal is triggered to send a favorite request to the server, and the server, in response to the favorite request, adds the multimedia resource to the favorites corresponding to the user; if the user clicks the follow option, the terminal is triggered to send a follow request to the server, and the server, in response to the follow request, adds the publisher of the multimedia resource to the follow list corresponding to the user.
  • a virtual gift giving option may also be included in the play area. If the user clicks the gift option, a selection column for the gift category and the gift amount of the virtual gift can be displayed; after selecting a category and an amount of the virtual gift, the user can click the confirm button to trigger the terminal to send a virtual gift giving request to the server. The server settles the gift request, deducts a corresponding value from the user's account, and distributes a corresponding value to the host's account; after settlement is completed, the terminal can display the special-effect animation of the virtual gift in a floating layer in the play area.
  • the resource playback interface may have more or fewer layout modes.
  • the embodiment of the present disclosure does not specifically limit the layout of the resource playback interface.
  • the terminal downloads the media description file of the multimedia resource from the server in response to the user's touch operation on the playback option, and the media description file is used to provide address information of the multimedia resource with multiple code rates.
  • the media description file is used to provide address information of multimedia resources with different code rates.
  • the terminal can download the media description file only when the user clicks the play option for the first time.
  • since the media description file may change and cause version changes, the terminal can also download the media description file again every time the user clicks the play option; the embodiment of the present disclosure does not specifically limit the download timing of the media description file.
  • the media description file may be a file in JSON (JavaScript Object Notation) format; of course, it may also be a file in another format, and the embodiment of the present disclosure does not specifically limit the format of the media description file.
  • the media description file may include a version number (@version) and a media description set (@adaptationSet), which are described in detail below:
  • the version number may include at least one of the version number of the media description file or the version number of the resource transmission standard (FAS standard).
  • for example, the version number may include only the version number of the FAS standard, only the version number of the media description file, or a combination of the version number of the media description file and the version number of the FAS standard.
  • the media description set is used to represent the meta-information of multimedia resources.
  • the media description set may include multiple media description meta-information.
  • Each media description meta-information corresponds to a multimedia resource with a bit rate.
  • the media description meta information may include the length of the group of pictures (@gopDuration) and the attribute information (@representation) of the multimedia resource of the bit rate corresponding to the media description meta information.
  • the length of the Group Of Pictures here refers to the distance between two key frames.
  • a key frame refers to an intra-coded picture (I frame) in the video coding sequence; the encoding and decoding of an I frame does not need to refer to other image frames and can be realized using only the information of this frame.
  • in contrast, the encoding and decoding of a predictive-coded picture (P frame) and a bidirectionally predicted picture (B frame) need to refer to other image frames and cannot be completed using only the information of this frame.
  • each attribute information may include the identification information of the multimedia resource, the encoding method of the multimedia resource, the code rate supported by the multimedia resource, and the address information of the multimedia resource of that code rate, which are described below:
  • Identification information: refers to the unique identifier of each multimedia resource; the identification information can be allocated by the server.
  • Encoding method (@codec): refers to the codec standard that the multimedia resource complies with, such as H.263, H.264, H.265, or MPEG.
  • Code rate supported by the multimedia resource: the code rate (also known as the bit rate) refers to the number of bits of data transmitted per unit time during resource transmission. Taking an audio resource as an example, the higher the code rate, the less the audio resource is compressed, the smaller the loss of sound quality, and the closer the sound quality is to the sound source (the better the sound quality). The same applies to video resources; however, since a video resource is assembled from image resources and audio resources, the calculation of its code rate must add the corresponding image resources in addition to the audio resources.
  • Address information (@url) of the multimedia resource of a certain code rate: refers to the URL (Uniform Resource Locator) or domain name of the multimedia resource of that code rate, obtained after the server transcodes the multimedia resource into the multimedia resource of that code rate.
  • each attribute information may further include at least one of the quality type of the multimedia resource, the hidden option of the multimedia resource, the first adaptive function option, or the default playback function option.
  • for example, each attribute information includes the identification information of the multimedia resource, the encoding method of the multimedia resource, the code rate supported by the multimedia resource, and the address information of the multimedia resource of that code rate, together with any one of, or a combination of at least two of, the quality type of the multimedia resource, the hidden option of the multimedia resource, the first adaptive function option, and the default playback function option.
  • Quality type: includes quality evaluation indicators such as the resolution or frame rate of the multimedia resource.
  • Hidden option of the multimedia resource (@hiden): used to indicate whether the multimedia resource is exposed. If set to true, the multimedia resource of the corresponding code rate is not exposed; the user cannot manually select the multimedia resource of this code rate, but it can still be selected through the adaptive function. If set to false, the multimedia resource of the corresponding code rate is displayed; in addition to the user manually selecting the multimedia resource of this code rate, it can also be selected through the adaptive function.
  • the adaptive function involved in this application refers to the function of the terminal dynamically adjusting the code rate of the played media stream according to the current network bandwidth situation, which will not be described in detail later.
  • First adaptive function option (@enableAdaptive): used to indicate whether the multimedia resource is visible to the adaptive function. If set to true, the multimedia resource of the corresponding code rate is visible to the adaptive function and can be selected by it; if set to false, the multimedia resource of the corresponding code rate is not visible to the adaptive function and cannot be selected by it.
  • Default playback function option (@defaultSelect): used to indicate whether to play the multimedia resource of the corresponding code rate by default at the start of playback. If set to true, the multimedia resource of the corresponding code rate is played by default at the start of playback; if set to false, it is not played by default. Since the media player component cannot play multimedia resources of two code rates by default (there would be a playback conflict), in the attribute information of all media description meta-information, the default playback function option (@defaultSelect) of the multimedia resource of at most one code rate can be true.
  • the media description file may also include at least one of a service type, a second adaptive function option, or a third adaptive function option; for example, it also includes any one of the service type, the second adaptive function option, or the third adaptive function option, or a combination of at least two of them.
  • Service type (@type): used to specify the service type of the multimedia resource, including at least one of live broadcast or on-demand. For example, when set to "dynamic" it indicates live broadcast, and when set to "static" it indicates on-demand; if not specified, "dynamic" can be used as the default value.
  • Second adaptive function option (@hideAuto): used to indicate whether to turn on the adaptive function. If set to true, the adaptive function is turned off and the adaptive option is not displayed; if set to false, the adaptive function is turned on and the adaptive option is displayed. If not specified, "false" can be used as the default value.
  • Third adaptive function option (@autoDefaultSelect): used to indicate whether the adaptive function is turned on by default at the start of playback. If set to true, playback starts based on the adaptive function by default; if set to false, playback does not start based on the adaptive function by default, that is, the adaptive function is turned off by default at the start of playback. It should be noted that the third adaptive function option is the premise of the above default playback function option: only when the third adaptive function option is set to false (the adaptive function is turned off by default at the start of playback) does the default playback function option take effect.
  • the multimedia resource of the code rate whose @defaultSelect is set to true will be played by default at the start of playback; otherwise, if the third adaptive function option is set to true, the adaptive function selects the multimedia resource of the code rate most suitable for the current network bandwidth situation.
  • the above media description file is a data file provided by the server to the terminal based on business requirements and pre-configured by the server according to those requirements; it provides the terminal with a set of data and business-related descriptions for the streaming media service.
  • the media description file includes the encoded, transmittable media streams and the corresponding meta-information descriptions, so that the terminal can construct a frame acquisition request (FAS request) based on the media description file, and the server responds to the frame acquisition request according to the processing specifications of the FAS standard to provide the terminal with the streaming media service.
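  • As a concrete illustration, a media description file carrying the fields described above might look like the following hypothetical JSON, parsed here with Python's standard json module. The field names mirror the @-prefixed fields in this description (version, type, hideAuto, autoDefaultSelect, adaptationSet, gopDuration, representation, codec, url, hiden, enableAdaptive, defaultSelect), but the concrete layout and all values are assumptions for illustration, not a normative FAS file:

```python
import json

# A hypothetical FAS media description file; all values are illustrative.
MEDIA_DESCRIPTION = json.loads("""
{
  "version": "1.0",
  "type": "dynamic",
  "hideAuto": false,
  "autoDefaultSelect": true,
  "adaptationSet": [
    {
      "gopDuration": 2000,
      "representation": [
        {"id": "stream-720p", "codec": "H.264", "bitrate": 1500,
         "url": "https://example.com/live/stream-720p.flv",
         "qualityType": "720p", "hiden": false,
         "enableAdaptive": true, "defaultSelect": true},
        {"id": "stream-1080p", "codec": "H.264", "bitrate": 3000,
         "url": "https://example.com/live/stream-1080p.flv",
         "qualityType": "1080p", "hiden": false,
         "enableAdaptive": true, "defaultSelect": false}
      ]
    }
  ]
}
""")

# Rule stated above: at most one code rate may be the default, because the
# media player component cannot start playback at two code rates at once.
defaults = [r for s in MEDIA_DESCRIPTION["adaptationSet"]
            for r in s["representation"] if r["defaultSelect"]]
assert len(defaults) <= 1
```

  The final assertion sketches the constraint that at most one representation may set @defaultSelect to true.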
  • the terminal determines the target bit rate.
  • the above target code rate refers to the code rate of the multimedia resource requested this time. Depending on the business scenario: if the user plays the multimedia resource for the first time, the target code rate refers to the code rate at the start of playback; if the user chooses to switch the code rate during playback, or the terminal adjusts the code rate based on the adaptive strategy, the target code rate may also be a code rate to be switched to.
  • the terminal may provide a code rate selection list to the user.
  • when the user clicks any value in the code rate selection list, a code rate selection instruction carrying that value is triggered, and the terminal, in response to the code rate selection instruction, determines the value carried by the instruction as the target code rate.
  • the terminal can also adjust the target bit rate to the bit rate corresponding to the current network bandwidth information through the adaptive function.
  • the target bitrate with the best playback effect can be dynamically selected based on the playback status information of the terminal.
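  • A minimal sketch of such an adaptive selection, assuming the terminal has an estimate of the current network bandwidth in kbps and the candidate code rates come from the media description file; the function name and the 0.8 safety margin are illustrative choices, not part of the FAS standard:

```python
def pick_target_bitrate(bandwidth_kbps, candidate_bitrates_kbps, margin=0.8):
    """Pick the highest code rate that fits within the estimated bandwidth.

    A safety margin keeps short-term bandwidth dips from immediately
    stalling playback; if even the lowest candidate does not fit, fall
    back to the lowest code rate available.
    """
    budget = bandwidth_kbps * margin
    fitting = [b for b in candidate_bitrates_kbps if b <= budget]
    return max(fitting) if fitting else min(candidate_bitrates_kbps)
```

  With candidates of 1500 and 3000 kbps, an estimated bandwidth of 4000 kbps selects 3000, while 1000 kbps falls back to 1500.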
  • the terminal obtains the target address information of the multimedia resource of the target code rate from the address information of the multimedia resource of the multiple code rates included in the media description file.
  • the terminal obtains the target address information of the multimedia resource with the target code rate from the address information of the multimedia resource with different code rates included in the media description file.
  • after determining the target code rate, the terminal can index the target code rate in the media description file to query the media description meta-information corresponding to the multimedia resource of the target code rate, and extract the target address information stored in the @url field of that meta-information.
  • this is equivalent to the terminal determining, based on the media description file of the multimedia resource, the target address information of the multimedia resource with the target code rate.
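  • The indexing step above can be sketched as follows, reusing the hypothetical media-description layout (the dictionary keys are assumptions for illustration; in the real file the address lives in the @url field):

```python
def find_target_url(media_description, target_bitrate):
    """Index the media description by code rate and return the @url field
    of the matching representation, or None if no representation matches."""
    for adaptation in media_description["adaptationSet"]:
        for representation in adaptation["representation"]:
            if representation["bitrate"] == target_bitrate:
                return representation["url"]
    return None
```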
  • the terminal sends a frame acquisition request carrying the target address information to the server, where the frame acquisition request is used to instruct the server to return the media frame of the multimedia resource at the target code rate.
  • the terminal may generate a frame obtaining request carrying the target address information, and then send the frame obtaining request carrying the target address to the server.
  • the frame acquisition request may also include an extended parameter (@extParam), which is used to specify different request methods to achieve different functions. The extended parameter may include at least one of the first extended parameter or the second extended parameter, which are described in detail below:
  • the first extended parameter is an audio parameter used to indicate whether the media frames are audio frames. If set to true, the media frames pulled by the terminal are audio frames, that is, only a pure audio stream is pulled; if set to false, the media frames pulled by the terminal are audio and video frames, that is, both the audio stream and the video picture stream are pulled. If not specified, "false" can be used as the default value.
  • the terminal can obtain the type of the multimedia resource: if the type is video, the first extended parameter can be set to "false" or left at its default value; if the type is audio, the first extended parameter can be set to "true".
  • the terminal can also detect the type of the application: if the application is a video application, the first extended parameter can be set to "false" or left at its default value; if the application is an audio application, the first extended parameter can be set to "true".
  • the second extended parameter is a pull position parameter used to indicate that the media frames of the multimedia resource are transmitted starting from the target timestamp indicated by the second extended parameter.
  • the data type of the second extended parameter may be int64_t, and of course it may also be another data type; the embodiment of the present disclosure does not specifically limit the data type of the second extended parameter.
  • the second extended parameter can be specified in the frame acquisition request; if it is not specified in the frame acquisition request, the default value of the second extended parameter is configured by the server.
  • when the second extended parameter is greater than zero (@fasSpts > 0), the target timestamp pts is greater than the current time, and the terminal starts pulling the media stream from the media frame whose pts equals @fasSpts (some moment in the future);
  • when the second extended parameter equals zero, the target timestamp pts is the timestamp of the key frame or audio frame closest to the current moment: when pulling audio frames (pure audio mode), the terminal starts pulling the media stream from the latest audio frame, and when pulling audio and video frames (not pure audio mode), the terminal starts pulling the media stream from the latest video I frame;
  • when the second extended parameter is less than zero, the target timestamp is smaller than the current time, and the media frames include the media frames that have been buffered starting from the target timestamp; that is, the terminal pulls the media stream that has been buffered starting from the target timestamp.
  • the terminal can determine the second extended parameter according to the service type (@type) field in the media description file. If the service type is "dynamic" (live) and the user does not specify the playback progress, the terminal can set the second extended parameter to 0; if the service type is "dynamic" (live) and the user specifies the playback progress, the terminal can set the second extended parameter to the timestamp corresponding to the playback progress (the target timestamp); if the service type is "static" (on-demand) and the user does not specify the playback progress, the terminal can query the historical playback progress at which the multimedia resource was last closed and set the second extended parameter to the corresponding timestamp (the target timestamp), and it should be noted that if the user views the multimedia resource for the first time and no historical playback progress can be queried, the terminal can set the second extended parameter to the timestamp of the first media frame (the target timestamp); if the service type is "static" (on-demand) and the user specifies the playback progress, the terminal can set the second extended parameter to the timestamp corresponding to the specified playback progress (the target timestamp).
  • the format of the frame acquisition request is the URL address of the multimedia resource of the target code rate plus the extension field, which can be visually expressed as "url&extParam".
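  • The construction of the "url&extParam" request string might be sketched like this. The parameter name fasSpts follows @fasSpts above; the name onlyAudio for the first extended parameter and the exact wire format are assumptions for illustration:

```python
def build_fas_request(url, only_audio=None, fas_spts=None):
    """Append the extended parameters to the target URL as 'url&extParam'.

    only_audio: first extended parameter (True -> pull a pure audio stream);
    fas_spts:   second extended parameter (@fasSpts, an int64 target
                timestamp, which may be positive, zero, or negative).
    Unspecified parameters are omitted so the server applies its defaults.
    """
    params = []
    if only_audio is not None:
        params.append("onlyAudio=%s" % ("true" if only_audio else "false"))
    if fas_spts is not None:
        params.append("fasSpts=%d" % fas_spts)
    return url if not params else url + "&" + "&".join(params)
```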
  • after receiving the frame acquisition request, the server can respond to the frame acquisition request in accordance with the processing specifications stipulated by the FAS standard; refer to 406 below.
  • the server returns the media frame of the multimedia resource to the terminal at the target bit rate in response to the frame acquisition request carrying the target address information.
  • the server can parse the frame acquisition request to obtain target address information. Based on the target address information, the server locates the media frame of the multimedia resource with the target code rate from the resource library.
  • the server may also parse the frame acquisition request to obtain at least one of the first extended parameter or the second extended parameter; if the frame acquisition request does not carry one of these parameters, the server can configure its default value. Whether parsed from the frame acquisition request or configured by the server itself, the server can determine both the first extended parameter and the second extended parameter.
  • based on the first extended parameter, the server determines whether to send media frames in audio form, and based on the second extended parameter, the server determines from which timestamp to start sending media frames, so that the server can, starting from the target timestamp indicated by the second extended parameter, return media frames in the form indicated by the first extended parameter to the terminal at the target code rate.
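  • The server-side selection described above can be sketched with a simplified model in which buffered media frames are (pts, is_key_frame, is_audio_frame) tuples sorted by pts; the helper name, the tuple layout, and the relative interpretation of a negative @fasSpts are assumptions for illustration:

```python
def select_frames(frames, fas_spts, only_audio, now):
    """Return the media frames to send, starting from the target timestamp.

    fas_spts > 0:  start from the future frame whose pts reaches fas_spts;
    fas_spts == 0: start from the latest key frame (or the latest audio
                   frame in pure-audio mode) at or before 'now';
    fas_spts < 0:  send already-buffered history (here modeled as starting
                   fas_spts time units before 'now' - an assumption).
    """
    if fas_spts > 0:
        return [f for f in frames if f[0] >= fas_spts]
    if fas_spts == 0:
        anchor = 0
        for pts, is_key, is_audio in frames:
            if pts <= now and ((only_audio and is_audio) or
                               (not only_audio and is_key)):
                anchor = pts  # keep the latest qualifying anchor frame
        return [f for f in frames if f[0] >= anchor]
    return [f for f in frames if f[0] >= now + fas_spts]
```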
  • the target address information can be a domain name
  • the terminal can send the frame acquisition request to the central platform of the CDN server; the central platform calls the DNS (Domain Name System, essentially a domain name resolution library) to parse the domain name and obtain the CNAME (alias) record corresponding to the domain name, and then, based on the geographic location information of the terminal, parses the CNAME record again to obtain the IP (Internet Protocol) address of the edge server closest to the terminal.
  • the central platform directs the frame acquisition request to the above-mentioned edge server, and the edge server responds to the frame acquisition request to provide the terminal with the media frame of the multimedia resource at the target bit rate.
  • after the terminal receives the media frames of the multimedia resource of the target code rate, it plays the media frames of the multimedia resource of the target code rate.
  • the terminal can store the media frames in a buffer area, call the media codec component to decode the media frames to obtain decoded media frames, and call the media playback component to play the media frames in the buffer area in ascending order of timestamp (pts).
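  • The buffer-then-play-in-ascending-pts behavior above can be sketched with a small min-heap (the class and method names are illustrative):

```python
import heapq

class PlaybackBuffer:
    """Buffer decoded media frames and release them in ascending pts order,
    modeling the order in which the media playback component consumes them."""

    def __init__(self):
        self._heap = []  # min-heap keyed on pts

    def push(self, pts, frame):
        heapq.heappush(self._heap, (pts, frame))

    def pop_next(self):
        # Smallest pts first; None once the buffer is drained.
        return heapq.heappop(self._heap) if self._heap else None
```

  Frames may arrive slightly out of order, but the heap guarantees they leave the buffer in timestamp order.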
  • the terminal can determine the encoding method of the multimedia resource from the @codec field of the media description file, and determine the corresponding decoding method according to the encoding method, so as to decode the media frame according to the determined decoding method.
  • in the FAS framework, the audio parameter and the pull position parameter can be specified by defining the extended parameters; if the network condition changes and the code rate needs to be switched, the following 408-412 can be used to perform a seamless code rate switch.
  • the terminal determines the code rate to be switched in response to the code rate switching instruction.
  • the above code rate switching instruction may be triggered by the user in the code rate switching list, or automatically triggered by the terminal according to an adaptive strategy.
  • the embodiment of the present disclosure does not specifically limit the trigger condition of the code rate switching instruction.
  • the above 408 refers to the above 403.
  • the terminal obtains the address information to be switched of the multimedia resource of the code rate to be switched from the address information of the multimedia resource of the multiple code rates included in the media description file.
  • the terminal obtains the to-be-switched address information of the multimedia resource with the to-be-switched code rate from the address information of the multimedia resource with different code rates included in the media description file.
  • the above 409 refers to the above 404.
  • the terminal sends a frame acquisition request carrying the address information to be switched to the server.
  • the frame acquisition request is used to instruct the server to return the media frame of the multimedia resource at the code rate to be switched.
  • the above 410 refers to the above 405.
  • the server responds to the frame acquisition request carrying the address information to be switched, and returns the media frame of the multimedia resource to the terminal at the code rate to be switched.
  • the above 411 refers to the above 406.
  • if the terminal receives the media frames of the multimedia resource of the code rate to be switched, it plays the media frames of the multimedia resource of the code rate to be switched.
  • the above 412 refers to the above 407.
  • Fig. 5 is a block diagram showing a logical structure of a resource transmission device according to an embodiment.
  • the device includes a determining unit 501 and a sending unit 502:
  • the determining unit 501 is configured to determine, based on the media description file of a multimedia resource, target address information of the multimedia resource at a target code rate, where the media description file is used to provide address information of the multimedia resource at different code rates;
  • the sending unit 502 is configured to send a frame acquisition request carrying the target address information to the server, where the frame acquisition request is used to instruct the server to return the media frame of the multimedia resource at the target code rate.
  • the determining unit 501 includes:
  • the determining subunit is configured to determine the target code rate;
  • the obtaining subunit is configured to obtain the target address information from the address information of the multimedia resources with different code rates included in the media description file.
  • the determining sub-unit is configured to perform:
  • in response to a code rate selection instruction, the value carried by the code rate selection instruction is determined as the target code rate; or,
  • the target code rate is adjusted to the code rate corresponding to the current network bandwidth information.
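The two ways of determining the target code rate above — taking the value carried by a code rate selection instruction, or matching the current network bandwidth — can be sketched as follows. The rate ladder and the 0.8 safety margin are assumptions of this illustration, not values fixed by the disclosure:

```python
def determine_target_code_rate(available_rates, selected_rate=None,
                               bandwidth_kbps=None, margin=0.8):
    """Pick a target code rate (kbps) from the rates listed in the
    media description file.

    - If the user issued a code rate selection instruction, use the
      value it carries.
    - Otherwise adapt: choose the highest rate not exceeding the
      measured bandwidth scaled by a safety margin (the margin is an
      assumption of this sketch).
    """
    rates = sorted(available_rates)
    if selected_rate is not None:
        if selected_rate not in rates:
            raise ValueError("selected rate not offered in the media description file")
        return selected_rate
    if bandwidth_kbps is None:
        return rates[0]  # no bandwidth estimate: fall back to the lowest rate
    budget = bandwidth_kbps * margin
    feasible = [r for r in rates if r <= budget]
    return max(feasible) if feasible else rates[0]
```

For example, with a rate ladder of 500/1000/2000/4000 kbps and a measured bandwidth of 2600 kbps, the sketch would settle on 2000 kbps.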
  • the determining unit 501 is further configured to: in response to a code rate switching instruction, determine the code rate to be switched to; and obtain, from the address information of the multimedia resource at different code rates included in the media description file, the to-be-switched address information of the multimedia resource at the to-be-switched code rate;
  • the sending unit 502 is further configured to execute: sending a frame acquisition request carrying the information of the address to be switched to the server.
  • the frame acquisition request further includes at least one of a first extended parameter or a second extended parameter, where the first extended parameter is used to indicate whether the media frame is an audio frame, and the second extended parameter is used to indicate that the media frames of the multimedia resource are transmitted starting from the target timestamp indicated by the second extended parameter.
  • when the second extended parameter is greater than zero, the target timestamp is later than the current moment;
  • when the second extended parameter is equal to zero, the target timestamp is the timestamp of the key frame or audio frame closest to the current moment;
  • when the second extended parameter is less than zero, the target timestamp is earlier than the current moment, and the media frames include the media frames already buffered starting from the target timestamp.
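A frame acquisition request carrying the two extended parameters might be assembled as below. The query-parameter names `onlyAudio` and `fromTimestamp` are hypothetical placeholders; the disclosure does not fix their spelling:

```python
def build_frame_request(address, only_audio=None, start_ts_ms=None):
    """Append the two optional extended parameters to the address taken
    from the media description file.

    `only_audio` stands in for the first extended parameter (whether the
    requested media frames are audio frames); `start_ts_ms` stands in for
    the second extended parameter (the target timestamp from which media
    frames should be transmitted).
    """
    params = []
    if only_audio is not None:
        params.append(f"onlyAudio={'true' if only_audio else 'false'}")
    if start_ts_ms is not None:
        # > 0: a timestamp later than the current moment;
        # == 0: the nearest key frame or audio frame;
        # < 0: also send the frames already buffered from that point on.
        params.append(f"fromTimestamp={start_ts_ms}")
    if not params:
        return address
    sep = '&' if '?' in address else '?'
    return address + sep + '&'.join(params)
```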
  • the media description file includes a version number and a media description set, where the version number includes at least one of the version number of the media description file or the version number of the resource transmission standard, and the media description set includes multiple pieces of media description meta-information; each piece of media description meta-information corresponds to the multimedia resource at one code rate, and includes the group-of-pictures length and attribute information of the multimedia resource at the code rate corresponding to that media description meta-information.
  • each attribute information includes the identification information of the multimedia resource, the encoding method of the multimedia resource, the code rate supported by the multimedia resource, and the address information of the multimedia resource of the code rate.
  • each attribute information further includes at least one of the quality type of the multimedia resource, the hidden option of the multimedia resource, the first adaptive function option, or the default playback function option, where the first adaptive function option is used to indicate whether the multimedia resource is visible to the adaptive function.
  • the media description file further includes at least one of a service type, a second adaptive function option, or a third adaptive function option, where the service type includes at least one of live streaming or video on demand, the second adaptive function option is used to indicate whether to enable the adaptive function, and the third adaptive function option is used to indicate whether to enable the adaptive function by default when playback starts.
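To make the structure above concrete, here is a minimal, hypothetical media description file and a lookup of the target address information. The JSON key spellings loosely follow the @-names used later in the description (@version, @adaptationSet, @gopDuration, @representation, @id, @codec); the remaining keys and all values are illustrative assumptions:

```python
import json

# A minimal media description file with the fields enumerated above.
MPD_TEXT = """
{
  "version": "1.0",
  "type": "live",
  "adaptationSet": [{
    "gopDuration": 2000,
    "representation": [
      {"id": 1, "codec": "avc1", "bitrate": 500,
       "url": "https://cdn.example.com/stream_500.flv"},
      {"id": 2, "codec": "avc1", "bitrate": 2000,
       "url": "https://cdn.example.com/stream_2000.flv"}
    ]
  }]
}
"""

def target_address(mpd_text, target_bitrate):
    """Return the address information of the representation whose code
    rate equals the target code rate."""
    mpd = json.loads(mpd_text)
    for adaptation in mpd["adaptationSet"]:
        for rep in adaptation["representation"]:
            if rep["bitrate"] == target_bitrate:
                return rep["url"]
    raise KeyError(f"no representation at {target_bitrate} kbps")
```

A client would then place the returned URL into the frame acquisition request for the chosen code rate.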
  • FIG. 6 shows a structural block diagram of a terminal 600 provided by an embodiment of the present disclosure.
  • the terminal 600 may be: a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, or a desktop computer.
  • the terminal 600 may also be called user equipment, portable terminal, laptop terminal, desktop terminal and other names.
  • the terminal 600 includes a processor 601 and a memory 602.
  • the processor 601 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on.
  • the processor 601 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array).
  • the processor 601 may also include a main processor and a coprocessor.
  • the main processor is a processor used to process data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor used to process data in the standby state.
  • the processor 601 may be integrated with a GPU (Graphics Processing Unit), which is used for rendering and drawing the content that needs to be displayed on the display screen.
  • the processor 601 may further include an AI (Artificial Intelligence) processor, and the AI processor is used to process computing operations related to machine learning.
  • the memory 602 may include one or more computer-readable storage media, which may be non-transitory.
  • the memory 602 may also include a high-speed random access memory and a non-volatile memory, such as one or more magnetic disk storage devices and flash memory storage devices.
  • the non-transitory computer-readable storage medium in the memory 602 is used to store at least one instruction, and the at least one instruction is used to be executed by the processor 601 to implement the resource transmission provided by the various embodiments of the present disclosure. method.
  • the terminal 600 may optionally further include: a peripheral device interface 603 and at least one peripheral device.
  • the processor 601, the memory 602, and the peripheral device interface 603 may be connected by a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 603 through a bus, a signal line, or a circuit board.
  • the peripheral device includes at least one of a radio frequency circuit 604, a display screen 605, a camera component 606, an audio circuit 607, a positioning component 608, and a power supply 609.
  • the peripheral device interface 603 can be used to connect at least one peripheral device related to I/O (Input/Output) to the processor 601 and the memory 602.
  • in some embodiments, the processor 601, the memory 602, and the peripheral device interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral device interface 603 may be implemented on a separate chip or circuit board, which is not limited in the embodiments of the present disclosure.
  • the radio frequency circuit 604 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals.
  • the radio frequency circuit 604 communicates with a communication network and other communication devices through electromagnetic signals.
  • the radio frequency circuit 604 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals.
  • the radio frequency circuit 604 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so on.
  • the radio frequency circuit 604 can communicate with other terminals through at least one wireless communication protocol.
  • the wireless communication protocol includes, but is not limited to: metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks.
  • the radio frequency circuit 604 may also include a circuit related to NFC (Near Field Communication), which is not limited in the present disclosure.
  • the display screen 605 is used to display a UI (User Interface).
  • the UI can include graphics, text, icons, videos, and any combination thereof.
  • the display screen 605 also has the ability to collect touch signals on or above the surface of the display screen 605.
  • the touch signal can be input to the processor 601 as a control signal for processing.
  • the display screen 605 may also be used to provide virtual buttons and/or virtual keyboards, also called soft buttons and/or soft keyboards.
  • in some embodiments, there may be one display screen 605, provided on the front panel of the terminal 600; in other embodiments, there may be at least two display screens 605, respectively provided on different surfaces of the terminal 600 or in a folded design; in still other embodiments, the display screen 605 may be a flexible display screen disposed on a curved or folding surface of the terminal 600. Furthermore, the display screen 605 may even be set as a non-rectangular irregular figure, that is, a special-shaped screen.
  • the display screen 605 may be made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
  • the camera assembly 606 is used to capture images or videos.
  • the camera assembly 606 includes a front camera and a rear camera.
  • the front camera is set on the front panel of the terminal, and the rear camera is set on the back of the terminal.
  • the camera assembly 606 may also include a flash.
  • the flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.
  • the audio circuit 607 may include a microphone and a speaker.
  • the microphone is used to collect sound waves of the user and the environment, convert the sound waves into electrical signals, and input them to the processor 601 for processing, or input them to the radio frequency circuit 604 to implement voice communication. For the purpose of stereo collection or noise reduction, there may be multiple microphones, respectively arranged at different parts of the terminal 600.
  • the microphone can also be an array microphone or an omnidirectional collection microphone.
  • the speaker is used to convert the electrical signal from the processor 601 or the radio frequency circuit 604 into sound waves.
  • the speaker can be a traditional thin-film speaker or a piezoelectric ceramic speaker.
  • when the speaker is a piezoelectric ceramic speaker, it can convert electrical signals not only into sound waves audible to humans, but also into sound waves inaudible to humans for purposes such as distance measurement.
  • the audio circuit 607 may also include a headphone jack.
  • the positioning component 608 is used to locate the current geographic location of the terminal 600 to implement navigation or LBS (Location Based Service).
  • the positioning component 608 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
  • the power supply 609 is used to supply power to various components in the terminal 600.
  • the power source 609 may be alternating current, direct current, disposable batteries, or rechargeable batteries.
  • the rechargeable battery may support wired charging or wireless charging.
  • the rechargeable battery can also be used to support fast charging technology.
  • the terminal 600 further includes one or more sensors 610.
  • the one or more sensors 610 include, but are not limited to: an acceleration sensor 611, a gyroscope sensor 612, a pressure sensor 613, a fingerprint sensor 614, an optical sensor 615, and a proximity sensor 616.
  • the acceleration sensor 611 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established by the terminal 600.
  • the acceleration sensor 611 can be used to detect the components of the gravitational acceleration on three coordinate axes.
  • the processor 601 may control the display screen 605 to display the user interface in a horizontal view or a vertical view according to the gravity acceleration signal collected by the acceleration sensor 611.
  • the acceleration sensor 611 may also be used for the collection of game or user motion data.
  • the gyroscope sensor 612 can detect the body direction and rotation angle of the terminal 600, and the gyroscope sensor 612 can cooperate with the acceleration sensor 611 to collect the user's 3D actions on the terminal 600.
  • the processor 601 can implement the following functions according to the data collected by the gyroscope sensor 612: motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
  • the pressure sensor 613 may be disposed on the side frame of the terminal 600 and/or the lower layer of the display screen 605.
  • the processor 601 performs left-/right-hand recognition or quick operations according to the holding signal collected by the pressure sensor 613.
  • the processor 601 controls the operability controls on the UI according to the user's pressure operation on the display screen 605.
  • the operability control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
  • the fingerprint sensor 614 is used to collect the user's fingerprint.
  • the processor 601 can identify the user's identity based on the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 can identify the user's identity based on the collected fingerprint. When the user's identity is recognized as a trusted identity, the processor 601 authorizes the user to perform related sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings.
  • the fingerprint sensor 614 may be provided on the front, back or side of the terminal 600. When a physical button or a manufacturer logo is provided on the terminal 600, the fingerprint sensor 614 can be integrated with the physical button or the manufacturer logo.
  • the optical sensor 615 is used to collect the ambient light intensity.
  • the processor 601 may control the display brightness of the display screen 605 according to the ambient light intensity collected by the optical sensor 615. In some embodiments, when the ambient light intensity is high, the display brightness of the display screen 605 is increased; when the ambient light intensity is low, the display brightness of the display screen 605 is decreased. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 according to the ambient light intensity collected by the optical sensor 615.
  • the proximity sensor 616, also called a distance sensor, is usually arranged on the front panel of the terminal 600.
  • the proximity sensor 616 is used to collect the distance between the user and the front of the terminal 600.
  • when the proximity sensor 616 detects that the distance between the user and the front of the terminal 600 gradually decreases, the processor 601 controls the display screen 605 to switch from the bright-screen state to the rest-screen state; when the proximity sensor 616 detects that the distance between the user and the front of the terminal 600 gradually increases, the processor 601 controls the display screen 605 to switch from the rest-screen state to the bright-screen state.
  • the terminal includes one or more processors, and one or more memories for storing instructions executable by the one or more processors, wherein the one or more processors are configured to execute the instructions to implement the following operations:
  • the one or more processors are configured to execute the instructions to implement the following operations:
  • in response to a code rate selection instruction, the value carried in the code rate selection instruction is determined as the target code rate; or,
  • the target code rate is adjusted to the code rate corresponding to the current network bandwidth information.
  • the one or more processors are further configured to execute the instructions to implement the following operations:
  • the frame acquisition request further includes at least one of a first extended parameter or a second extended parameter, where the first extended parameter is used to indicate whether the media frame is an audio frame, and the second extended parameter is used to indicate that the media frames of the multimedia resource are transmitted starting from the target timestamp indicated by the second extended parameter.
  • when the second extended parameter is greater than zero, the target timestamp is later than the current moment;
  • when the second extended parameter is equal to zero, the target timestamp is the timestamp of the key frame or audio frame closest to the current moment;
  • when the second extended parameter is less than zero, the target timestamp is earlier than the current moment, and the media frames include the media frames already buffered starting from the target timestamp.
  • the media description file includes a version number and a media description set, where the version number includes at least one of the version number of the media description file or the version number of the resource transmission standard, and the media description set includes multiple pieces of media description meta-information; each piece of media description meta-information corresponds to the multimedia resource at one code rate, and includes the group-of-pictures length and attribute information of the multimedia resource at the code rate corresponding to that media description meta-information.
  • each attribute information includes the identification information of the multimedia resource, the encoding method of the multimedia resource, the code rate supported by the multimedia resource, and the address information of the multimedia resource of the code rate.
  • each attribute information further includes at least one of the quality type of the multimedia resource, the hidden option of the multimedia resource, the first adaptive function option, or the default playback function option, where the first adaptive function option is used to indicate whether the multimedia resource is visible to the adaptive function.
  • the media description file further includes at least one of a service type, a second adaptive function option, or a third adaptive function option, where the service type includes at least one of live streaming or video on demand, the second adaptive function option is used to indicate whether to enable the adaptive function, and the third adaptive function option is used to indicate whether to enable the adaptive function by default when playback starts.
  • a storage medium including at least one instruction is provided, for example, a memory including at least one instruction.
  • the foregoing at least one instruction may be executed by a processor in a terminal to complete the resource transmission method in the foregoing embodiment.
  • the aforementioned storage medium may be a non-transitory computer-readable storage medium.
  • the non-transitory computer-readable storage medium may include a ROM (Read-Only Memory), a RAM (Random-Access Memory), a CD-ROM (Compact Disc Read-Only Memory), a magnetic tape, a floppy disk, an optical data storage device, and the like.
  • when at least one instruction in the storage medium is executed by one or more processors of the terminal, the terminal is enabled to perform the following operations:
  • one or more processors of the terminal are used to perform the following operations:
  • in response to a code rate selection instruction, the value carried by the code rate selection instruction is determined as the target code rate; or,
  • the target code rate is adjusted to the code rate corresponding to the current network bandwidth information.
  • one or more processors of the terminal are further configured to perform the following operations:
  • the frame acquisition request further includes at least one of a first extended parameter or a second extended parameter, where the first extended parameter is used to indicate whether the media frame is an audio frame, and the second extended parameter is used to indicate that the media frames of the multimedia resource are transmitted starting from the target timestamp indicated by the second extended parameter.
  • when the second extended parameter is greater than zero, the target timestamp is later than the current moment;
  • when the second extended parameter is equal to zero, the target timestamp is the timestamp of the key frame or audio frame closest to the current moment;
  • when the second extended parameter is less than zero, the target timestamp is earlier than the current moment, and the media frames include the media frames already buffered starting from the target timestamp.
  • the media description file includes a version number and a media description set, where the version number includes at least one of the version number of the media description file or the version number of the resource transmission standard, and the media description set includes multiple pieces of media description meta-information; each piece of media description meta-information corresponds to the multimedia resource at one code rate, and includes the group-of-pictures length and attribute information of the multimedia resource at the code rate corresponding to that media description meta-information.
  • each attribute information includes the identification information of the multimedia resource, the encoding method of the multimedia resource, the code rate supported by the multimedia resource, and the address information of the multimedia resource of the code rate.
  • each attribute information further includes at least one of the quality type of the multimedia resource, the hidden option of the multimedia resource, the first adaptive function option, or the default playback function option, where the first adaptive function option is used to indicate whether the multimedia resource is visible to the adaptive function.
  • the media description file further includes at least one of a service type, a second adaptive function option, or a third adaptive function option, where the service type includes at least one of live streaming or video on demand, the second adaptive function option is used to indicate whether to enable the adaptive function, and the third adaptive function option is used to indicate whether to enable the adaptive function by default when playback starts.
  • a computer program product including one or more instructions, and the one or more instructions can be executed by the processor of the terminal to complete the resource transmission methods provided in the foregoing embodiments.


Abstract

The present disclosure relates to a resource transmission method and a terminal, and belongs to the field of communication technologies. In the present disclosure, target address information of a multimedia resource at a target code rate can be determined based on a media description file of the multimedia resource, and a frame acquisition request carrying the target address information is sent to a server; the frame acquisition request is used to instruct the server to return media frames of the multimedia resource at the target code rate, so that the multimedia resource does not need to be transmitted in segments.

Description

Resource transmission method and terminal
This application claims priority to Chinese Patent Application No. 202010054781.8, entitled "Resource transmission method, apparatus, terminal and storage medium" and filed on January 17, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of communication technologies, and in particular to a resource transmission method and a terminal.
Background
With the development of communication technologies, users can browse audio and video resources on a terminal anytime and anywhere. At present, when a server transmits audio and video resources to a terminal (commonly known as the "stream-pulling" stage), a segment-based media transmission mode may be adopted.
Segment-based media transmission modes include the common DASH (Dynamic Adaptive Streaming over HTTP, an HTTP-based adaptive streaming standard formulated by MPEG, where MPEG stands for Moving Picture Experts Group) and HLS (HTTP Live Streaming, an HTTP-based adaptive streaming standard formulated by Apple), among others. The server splits an audio/video resource into a series of audio/video segments, and each segment can be transcoded into different code rates. When playing the audio/video resource, the terminal accesses the URL of each of the segments into which the resource has been split; different segments may correspond to the same or different code rates, so that the terminal can conveniently switch among audio/video resources at different code rates. This process is also called adaptively adjusting the code rate based on the terminal's own bandwidth.
Summary
The present disclosure provides a resource transmission method and a terminal. The technical solutions of the present disclosure are as follows:
According to one aspect of the embodiments of the present disclosure, a resource transmission method is provided, including: determining, based on a media description file of a multimedia resource, target address information of the multimedia resource at a target code rate, where the media description file is used to provide address information of the multimedia resource at different code rates; and sending a frame acquisition request carrying the target address information to a server, where the frame acquisition request is used to instruct the server to return media frames of the multimedia resource at the target code rate.
According to another aspect of the embodiments of the present disclosure, a resource transmission apparatus is provided, including: a determining unit configured to determine, based on a media description file of a multimedia resource, target address information of the multimedia resource at a target code rate, where the media description file is used to provide address information of the multimedia resource at different code rates; and a sending unit configured to send a frame acquisition request carrying the target address information to a server, where the frame acquisition request is used to instruct the server to return media frames of the multimedia resource at the target code rate.
According to another aspect of the embodiments of the present disclosure, a terminal is provided, including: one or more processors; and one or more memories for storing instructions executable by the one or more processors; where the one or more processors are configured to perform the following operations: determining, based on a media description file of a multimedia resource, target address information of the multimedia resource at a target code rate, where the media description file is used to provide address information of the multimedia resource at different code rates; and sending a frame acquisition request carrying the target address information to a server, where the frame acquisition request is used to instruct the server to return media frames of the multimedia resource at the target code rate.
According to another aspect of the embodiments of the present disclosure, a storage medium is provided; when at least one instruction in the storage medium is executed by one or more processors of a terminal, the terminal is enabled to perform the following operations: determining, based on a media description file of a multimedia resource, target address information of the multimedia resource at a target code rate, where the media description file is used to provide address information of the multimedia resource at different code rates; and sending a frame acquisition request carrying the target address information to a server, where the frame acquisition request is used to instruct the server to return media frames of the multimedia resource at the target code rate.
According to another aspect of the embodiments of the present disclosure, a computer program product is provided, including one or more instructions that can be executed by one or more processors of a terminal, so that the terminal can perform the above resource transmission method.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of an implementation environment of a resource transmission method according to an embodiment;
Fig. 2 is a schematic diagram of the principle of a FAS framework provided by an embodiment of the present disclosure;
Fig. 3 is a flowchart of a resource transmission method according to an embodiment;
Fig. 4 is an interaction flowchart of a resource transmission method according to an embodiment;
Fig. 5 is a block diagram of the logical structure of a resource transmission apparatus according to an embodiment;
Fig. 6 is a structural block diagram of a terminal 600 provided by an embodiment of the present disclosure.
Detailed Description
The terms "first", "second", and the like in the specification, claims, and accompanying drawings of the present disclosure are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the present disclosure described herein can be implemented in an order other than those illustrated or described herein.
The user information involved in the present disclosure may be information authorized by users or fully authorized by all parties.
The terms involved in the present disclosure are explained below.
1. FLV (Flash Video)
FLV is a streaming media format that developed with the release of Flash MX (an animation authoring tool). Because the files it produces are extremely small and load extremely quickly, it makes watching video files over the network (i.e., browsing videos online) possible. Its emergence effectively solved the problem that SWF files (a dedicated Flash file format) exported after video files were imported into Flash were so bulky that they could not be used well on the network.
2. Streaming Media
Streaming media adopts a streaming transmission method, and refers to a technology and process in which a series of multimedia resources are compressed and then sent over the network as resource packets, so that the multimedia resources are transmitted over the network in real time for viewing. This technology enables resource packets to be sent like flowing water; without it, the entire media file would have to be downloaded before use, so the multimedia resource could only be watched offline. Streaming transmission can deliver live multimedia resources or multimedia resources pre-stored on a server; when viewer users watch these multimedia resources, the multimedia resources can be played by specific playback software once they arrive at the viewers' terminals.
3. FAS (FLV Adaptive Streaming, an FLV-based adaptive streaming standard)
FAS is a streaming resource transmission standard (or resource transmission protocol) proposed by the present disclosure. Unlike traditional segment-based media transmission modes, the FAS standard can achieve frame-level multimedia resource transmission: the server does not need to wait until a complete video segment arrives before sending resource packets to the terminal. Instead, after parsing the terminal's frame acquisition request, the server determines a target timestamp; if the target timestamp is less than zero, all media frames buffered from the target timestamp onward are packaged and sent to the terminal (without segmentation); thereafter, if the target timestamp is greater than or equal to zero, or if a real-time stream exists in addition to the buffered media frames, the media frames of the multimedia resource are sent to the terminal frame by frame. It should be noted that the frame acquisition request specifies a target code rate; when the terminal's own network bandwidth changes, the terminal can adaptively adjust the code rate to be switched to and resend a frame acquisition request corresponding to that code rate, thereby achieving adaptive adjustment of the code rate of the multimedia resource. The FAS standard enables frame-level transmission and reduces end-to-end latency; a new frame acquisition request needs to be sent only when the code rate is switched, which greatly reduces the number of requests and the communication overhead of the resource transmission process.
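The frame-dispatch rule of the FAS standard just described can be sketched in Python as follows. This is a simplified illustration under assumptions — in particular, a negative target timestamp is interpreted here as a relative offset in milliseconds from the current moment — not the normative server logic:

```python
def frames_to_send(buffered, live_tail, target_ts, now):
    """Decide which media frames the server returns first for a FAS
    frame acquisition request.

    `buffered` is a list of (timestamp_ms, frame) pairs cached by the
    server; `live_tail` holds the frames still arriving in real time.

    - target_ts < 0: every frame already buffered from the target
      timestamp onward is sent in one batch (no segmentation), then
      transmission continues frame by frame.
    - target_ts >= 0: nothing is pulled from the buffer; frames are
      forwarded one by one as they become available.
    """
    if target_ts < 0:
        start = now + target_ts  # e.g. -1500 ms => 1.5 s of backlog
        backlog = [f for ts, f in buffered if ts >= start]
    else:
        backlog = []
    return backlog + [f for _, f in live_tail]
```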
4. Live Streaming and Video on Demand
Live streaming: the multimedia resource is recorded in real time. The host user "pushes" a media stream (i.e., pushes it in a streaming manner) to the server through the host terminal. After a viewer user triggers entry into the host user's live-streaming interface on the viewer terminal, the media stream is "pulled" (i.e., fetched in a streaming manner) from the server to the viewer terminal, and the viewer terminal decodes and plays the multimedia resource, thereby playing the video in real time.
Video on demand: also called Video On Demand (VOD). Multimedia resources are pre-stored on the server, and the server can provide the multimedia resource specified by a viewer user upon request. In some embodiments, the viewer terminal sends an on-demand request to the server; after locating the multimedia resource specified by the on-demand request, the server sends the multimedia resource to the viewer terminal. In other words, a viewer user can selectively play a specific multimedia resource.
With video on demand, the playback progress can be controlled arbitrarily; this is not the case with live streaming, where the playback speed of the content depends on the host user's real-time live-streaming progress.
Fig. 1 is a schematic diagram of an implementation environment of a resource transmission method according to an embodiment. Referring to Fig. 1, the implementation environment may include at least one terminal 101 and a server 102, detailed as follows:
In some embodiments, the terminal 101 is used for multimedia resource transmission. A media codec component and a media playback component may be installed on each terminal; the media codec component is used to decode multimedia resources (for example, resource packets transmitted in segments, or media frames transmitted at frame level) after they are received, and the media playback component is used to play the multimedia resources after they are decoded.
According to user identity, terminals 101 can be divided into host terminals and viewer terminals; a host terminal corresponds to a host user, and a viewer terminal corresponds to a viewer user. It should be noted that the same terminal can be either a host terminal or a viewer terminal: for example, it is a host terminal when the user is recording a live stream, and a viewer terminal when the user is watching one.
The terminal 101 and the server 102 may be connected via a wired or wireless network.
In some embodiments, the server 102 is used to provide the multimedia resource to be transmitted, and may include at least one of a single server, multiple servers, a cloud computing platform, or a virtualization center. In some embodiments, the server 102 undertakes the primary computing work and the terminal 101 the secondary computing work; or the server 102 undertakes the secondary computing work and the terminal 101 the primary computing work; or the terminal 101 and the server 102 perform collaborative computing using a distributed computing architecture.
In some embodiments, the server 102 may be a clustered CDN (Content Delivery Network) server. A CDN server includes a central platform and edge servers deployed in various places; through functional modules of the central platform such as load balancing, content distribution, and scheduling, a user's terminal can obtain the required content (i.e., multimedia resources) nearby from a local edge server. The CDN server adds a caching mechanism between the terminal and the central platform, namely the edge servers (such as WEB servers) deployed in different geographical locations. For performance optimization, the central platform schedules, according to the distance between the terminal and the edge servers, the edge server closest to the terminal to serve it, so that content can be distributed to the terminal more effectively.
The multimedia resources involved in the embodiments of the present disclosure include, but are not limited to, at least one of video resources, audio resources, image resources, or text resources; the embodiments of the present disclosure do not specifically limit the type of the multimedia resource. For example, the multimedia resource is a live video stream of a network host, a historical on-demand video pre-stored on the server, a live audio stream of a radio host, or a historical on-demand audio pre-stored on the server.
In some embodiments, the device type of the terminal 101 includes, but is not limited to, at least one of a television, a smart phone, a smart speaker, an in-vehicle terminal, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, or a desktop computer. The following embodiments use a terminal 101 that is a smart phone as an example.
Those skilled in the art will appreciate that the number of terminals 101 may be only one, or several tens or hundreds, or more. The embodiments of the present disclosure do not limit the number or device type of the terminals 101.
Fig. 2 is a schematic diagram of the principle of a FAS framework provided by an embodiment of the present disclosure. Referring to Fig. 2, an embodiment of the present disclosure provides a FAS (streaming-based multi-code-rate adaptive) framework in which multimedia resources are transmitted between the terminal 101 and the server 102 via the FAS protocol.
Taking any terminal as an example, an application (also called a FAS client) may be installed on the terminal for browsing multimedia resources. For example, the application may be a short-video application, a live-streaming application, a video-on-demand application, a social application, a shopping application, or the like; the embodiments of the present disclosure do not specifically limit the type of the application.
A user can launch the application on the terminal to display a resource push interface (for example, the application's home page or a function page) that includes thumbnail information of at least one multimedia resource; the thumbnail information includes at least one of a title, a synopsis, a publisher, a poster, a trailer, or highlights. In response to the user's touch operation on the thumbnail information of any multimedia resource, the terminal can jump from the resource push interface to a resource playback interface that includes a playback option for the multimedia resource. In response to the user's touch operation on the playback option, the terminal downloads the media description file (Media Presentation Description, MPD) of the multimedia resource from the server, determines, based on the media description file, the target address information of the multimedia resource at the target code rate, and sends a frame acquisition request (also called a FAS request) carrying the target address information to the server, so that the server processes the frame acquisition request according to certain specifications (the processing specifications for FAS requests). After locating the media frames of the multimedia resource (consecutive media frames can form a media stream), the server returns the media frames of the multimedia resource to the terminal at the target code rate (i.e., returns the media stream to the terminal at the target code rate). After receiving the media stream, the terminal calls the media codec component to decode it, obtains the decoded media stream, and calls the media playback component to play the decoded media stream.
In some live-streaming scenarios, the media stream requested by the terminal is usually a live video stream pushed to the server in real time by the host user. In this case, after receiving the host user's live video stream, the server can transcode it to obtain live video streams at multiple code rates, assign different address information to the live video streams at different code rates, and record the address information in the media description file, so that for frame acquisition requests carrying different address information, the corresponding live video stream can be returned at the corresponding code rate.
Further, a mechanism for adaptively adjusting the code rate is provided: when the terminal's current network bandwidth changes, the code rate to be switched to is adaptively adjusted to match the current network bandwidth. For example, when the code rate needs to be switched, the terminal can disconnect the media stream transmission link at the current code rate, send the server a frame acquisition request carrying the to-be-switched address information corresponding to the to-be-switched code rate, and establish a media stream transmission link based on the to-be-switched code rate. Alternatively, the terminal may keep the media stream transmission link at the current code rate open, directly reissue a frame acquisition request carrying the to-be-switched address information, and establish a media stream transmission link based on the to-be-switched code rate (for transmitting the new media stream), keeping the original media stream as a backup stream; once the new media stream encounters a transmission anomaly, the backup stream can continue to be played.
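The second switching strategy described above — keeping the old media stream as a backup while the link at the new rate is established — can be sketched as follows. The connection objects are placeholders, and the class is an illustrative sketch rather than the disclosed implementation:

```python
class StreamSwitcher:
    """Keep the current-rate link as a backup while a link at the new
    code rate is established, and fall back to the backup if the new
    stream fails. A real client would hold sockets or HTTP sessions
    where strings are used here."""

    def __init__(self, current_link):
        self.active = current_link
        self.backup = None

    def switch(self, new_link):
        # Reissue the frame acquisition request at the new code rate
        # while retaining the old media stream as a backup.
        self.backup = self.active
        self.active = new_link

    def on_transport_error(self):
        # The new media stream failed: resume playing the backup.
        if self.backup is not None:
            self.active, self.backup = self.backup, None
        return self.active
```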
Fig. 3 is a flowchart of a resource transmission method according to an embodiment; the resource transmission method is applied to the terminal in the FAS framework involved in the above implementation environment.
In 301, the terminal determines, based on a media description file of a multimedia resource, target address information of the multimedia resource at a target code rate, where the media description file is used to provide address information of the multimedia resource at different code rates.
In 302, the terminal sends a frame acquisition request carrying the target address information to the server, where the frame acquisition request is used to instruct the server to return media frames of the multimedia resource at the target code rate.
In some embodiments, determining, based on the media description file of the multimedia resource, the target address information of the multimedia resource at the target code rate includes:
determining the target code rate;
obtaining the target address information from the address information of the multimedia resource at different code rates included in the media description file.
In some embodiments, determining the target code rate includes:
in response to a code rate selection instruction, determining the value carried by the code rate selection instruction as the target code rate; or,
adjusting the target code rate to the code rate corresponding to the current network bandwidth information.
In some embodiments, the method further includes:
in response to a code rate switching instruction, determining the code rate to be switched to;
obtaining, from the address information of the multimedia resource at different code rates included in the media description file, the to-be-switched address information of the multimedia resource at the to-be-switched code rate;
sending a frame acquisition request carrying the to-be-switched address information to the server.
In some embodiments, the frame acquisition request further includes at least one of a first extended parameter or a second extended parameter, where the first extended parameter is used to indicate whether the media frame is an audio frame, and the second extended parameter is used to indicate that the media frames of the multimedia resource are transmitted starting from the target timestamp indicated by the second extended parameter. For example, the frame acquisition request includes only the first extended parameter, or only the second extended parameter, or both the first and second extended parameters.
In some embodiments, based on the second extended parameter being greater than zero, the target timestamp is later than the current moment;
based on the second extended parameter being equal to zero, the target timestamp is the timestamp of the key frame or audio frame closest to the current moment;
based on the second extended parameter being less than zero, the target timestamp is earlier than the current moment, and the media frames include the media frames already buffered starting from the target timestamp.
In some embodiments, the media description file includes a version number and a media description set, where the version number includes at least one of the version number of the media description file or the version number of the resource transmission standard, and the media description set includes multiple pieces of media description meta-information; each piece of media description meta-information corresponds to the multimedia resource at one code rate, and includes the group-of-pictures length and attribute information of the multimedia resource at the code rate corresponding to that media description meta-information. For example, the version number includes only the version number of the media description file, or only the version number of the resource transmission standard, or both at the same time.
In some embodiments, each attribute information includes the identification information of the multimedia resource, the encoding method of the multimedia resource, the code rate supported by the multimedia resource, and the address information of the multimedia resource at that code rate.
In some embodiments, each attribute information further includes at least one of the quality type of the multimedia resource, the hidden option of the multimedia resource, the first adaptive function option, or the default playback function option, where the first adaptive function option is used to indicate whether the multimedia resource is visible to the adaptive function. For example, in addition to the identification information, encoding method, supported code rate, and address information described above, each attribute information further includes any one of, or a combination of at least two of, the quality type of the multimedia resource, the hidden option of the multimedia resource, the first adaptive function option, or the default playback function option.
In some embodiments, the media description file further includes at least one of a service type, a second adaptive function option, or a third adaptive function option, where the service type includes at least one of live streaming or video on demand, the second adaptive function option is used to indicate whether to enable the adaptive function, and the third adaptive function option is used to indicate whether to enable the adaptive function by default when playback starts. For example, in addition to the version number and the media description set, the media description file further includes any one of, or a combination of at least two of, the service type, the second adaptive function option, or the third adaptive function option.
图4是根据一实施例示出的一种资源传输方法的交互流程图,所述资源传输方法可以用于上述实施环境涉及的FAS框架中,该实施例包括以下内容。
在401中,终端显示资源播放界面,该资源播放界面中包括多媒体资源的播放选项。
其中,终端上可以安装有应用程序,该应用程序用于浏览多媒体资源,例如,该应用程序可以包括短视频应用、直播应用、视频点播应用、社交应用或者购物应用中至少一项,本公开实施例不对应用程序的类型进行具体限定。
本公开实施例所涉及的多媒体资源,包括但不限于:视频资源、音频资源、 图像资源或者文本资源中至少一项,本公开实施例不对多媒体资源的类型进行具体限定。比如,该多媒体资源为网络主播的直播视频流,或者为预存在服务器上的历史点播视频,或者为电台主播的直播音频流,或者为预存在服务器上的历史点播音频。
在上述过程中,用户可以在终端上启动应用程序,该应用程序显示资源推送界面,例如该资源推送界面可以是应用程序的首页或者功能界面,本公开实施例不对资源推送界面的类型进行具体限定。在该资源推送界面中可以包括至少一个多媒体资源的缩略信息,该缩略信息包括多媒体资源的标题、简介、海报、预告片或者精彩片段中至少一项。用户在浏览资源推送界面的过程中,可以点击感兴趣的多媒体资源的缩略信息,响应于用户对该多媒体资源的缩略信息的触控操作,终端可以从资源推送界面跳转至资源播放界面。
在该资源播放界面中可以包括播放区域和评论区域,在播放区域内可以包括该多媒体资源的播放选项,在评论区域内可以包括其他用户针对该多媒体资源的观看评论。
在一些实施例中,在播放区域内还可以包括该多媒体资源的详情信息,该详情信息可以包括该多媒体资源的标题、简介、关键词、发布者信息或者当前热度中至少一项,其中,该发布者信息可以包括发布者昵称、发布者头像、发布者粉丝量等,本公开实施例不对详情信息或者发布者信息的内容进行具体限定。
在一些实施例中,在播放区域内还可以包括弹幕输入区和弹幕设置选项,用户可以通过弹幕设置选项控制是否显示弹幕、弹幕移动速度、弹幕显示区域或者弹幕显示方式(透明度、字号大小等)中至少一项,用户还可以通过点击弹幕输入区,输入自己想要评论的内容,弹幕形式不限于文本或者表情图像,本公开实施例不对弹幕设置选项的内容或者用户输入的弹幕形式进行具体限定。
在一些实施例中,在播放区域内还可以包括收藏选项和关注选项,若用户点击收藏选项,可以触发终端向服务器发送收藏请求,服务器响应于该收藏请求,将该多媒体资源添加至该用户所对应的收藏夹内,若用户点击关注选项,可以触发终端向服务器发送关注请求,服务器响应于该关注请求,将该多媒体资源的发布者添加至该用户所对应的关注列表内。
在一些实施例中,在播放区域内还可以包括虚拟礼物的赠送选项,若用户 点击赠送选项,可以显示虚拟礼物的赠送类别以及赠送数量的选择栏,用户在选择好某一类别以及某一数量的虚拟礼物之后,可以通过点击确认按钮,触发终端向服务器发送虚拟礼物的赠送请求,服务器对该赠送请求进行结算,分别从用户的账户中扣除一定数值,并向主播的账户中发放一定数值,在结算完毕之后,终端可以在播放区域中以浮层的方式展示虚拟礼物的特效动画。
上述各种可能实施方式,提供了资源播放界面的不同布局,在实际应用中资源播放界面可以具有更多或者更少的布局方式,本公开实施例不对资源播放界面的布局方式进行具体限定。
在402中,终端响应于用户对该播放选项的触控操作,从服务器中下载该多媒体资源的媒体描述文件,该媒体描述文件用于提供多种码率的多媒体资源的地址信息。
也即是说,该媒体描述文件用于提供具备不同码率的多媒体资源的地址信息。
在上述过程中,由于在某个多媒体资源的播放过程中,用户可能会多次暂停播放,然后通过点击播放选项触发进行继续播放,那么终端可以仅在用户首次点击播放选项时,下载多媒体资源的媒体描述文件,当然,由于媒体描述文件可能会发生变化造成版本更迭,那么终端也可以每当用户点击播放选项时,均重新下载一次媒体描述文件,本公开实施例不对媒体描述文件的下载时机进行具体限定。
在一些实施例中,该媒体描述文件可以是JSON(JavaScript Object Notation,JS对象简谱)格式的文件,当然也可以是其他格式的文件,本公开实施例不对媒体描述文件的格式进行具体限定。该媒体描述文件可以包括版本号(@version)和媒体描述集合(@adaptationSet),下面进行详述:
在一些实施例中,由于媒体描述文件本身可能会由于转码方式的变换而产生不同的版本,而FAS标准也会随着技术的发展而进行版本更迭,因此该版本号可以包括该媒体描述文件的版本号或者资源传输标准(FAS标准)的版本号中至少一项,比如,该版本号可以仅包括FAS标准的版本号,或者仅包括媒体描述文件的版本号,或者该版本号还可以是媒体描述文件与FAS标准的版本号之间的组合。
在一些实施例中,该媒体描述集合用于表示多媒体资源的元信息,该媒体描述集合可以包括多个媒体描述元信息,每个媒体描述元信息对应于一种码率的多媒体资源,每个媒体描述元信息可以包括该媒体描述元信息所对应码率的多媒体资源的画面组长度(@gopDuration)以及属性信息(@representation)。
这里的画面组(Group Of Pictures,GOP)长度是指两个关键帧之间的距离,关键帧是指视频编码序列中的帧内编码图像帧(Intra-coded picture,也称为“I帧”),I帧的编解码不需要参考其他图像帧,仅利用本帧信息即可实现,而相对地,P帧(Predictive-coded picture,预测编码图像帧)和B帧(Bidirectionally predicted picture,双向预测编码图像帧)的编解码均需要参考其他图像帧,仅利用本帧信息无法完成编解码。
在一些实施例中,对每个媒体描述元信息所包括的属性信息(也即每个属性信息)而言,每个属性信息可以包括多媒体资源的标识信息、多媒体资源的编码方式、多媒体资源所支持的码率以及该码率的多媒体资源的地址信息。
标识信息(@id):指每个多媒体资源独一无二的标识符,标识信息可以由服务器进行分配。
编码方式(@codec):指多媒体资源遵从的编解码标准,例如H.263、H.264、H.265、MPEG等。
多媒体资源所支持的码率(@bitrate):指资源传输时单位时间内传送的数据位数,也称为比特率,以音频资源为例,码率越高,则音频资源被压缩的比例越小,音质损失越小,那么与音源的音质就越接近(音质越好),视频资源与音频资源同理,但由于视频资源由图像资源和音频资源组装而成,因此在计算码率时除了音频资源之外还要加上对应的图像资源。
某种码率的多媒体资源的地址信息(@url):指服务器在针对多媒体资源进行转码,得到该码率的多媒体资源之后,对外提供该码率的多媒体资源的URL(Uniform Resource Locator,统一资源定位符)或域名(Domain Name)。
在一些实施例中,每个属性信息还可以包括多媒体资源的质量类型、多媒体资源的隐藏选项、第一自适应功能选项或者默认播放功能选项中至少一项。例如,每个属性信息除了包括上述多媒体资源的标识信息、多媒体资源的编码方式、多媒体资源所支持的码率以及该码率的多媒体资源的地址信息之外,还包括多媒体资源的质量类型、多媒体资源的隐藏选项、第一自适应功能选项或者默认播放功能选项中任一项,或者,还包括多媒体资源的质量类型、多媒体资源的隐藏选项、第一自适应功能选项或者默认播放功能选项中至少两项的组合。
质量类型(@qualityType):包括多媒体资源的分辨率或者帧率等质量评价指标。
多媒体资源的隐藏选项(@hiden):用于表示多媒体资源是否外显,若设定为true,表示对应码率的多媒体资源不外显,此时用户无法手动选择对应码率的多媒体资源,只能通过自适应功能来选中该码率的多媒体资源,若设定为false,表示对应码率的多媒体资源外显,此时除了能够通过自适应功能选中该码率的多媒体资源之外,用户还能够手动选择对应码率的多媒体资源。需要说明的是,本申请所涉及的自适应功能,是指终端根据当前的网络带宽情况对所播放的媒体流进行动态码率调整的功能,后文不做赘述。
第一自适应功能选项(@enableAdaptive):用于表示多媒体资源是否相对于自适应功能可见,若设定为true,表示对应码率的多媒体资源对于自适应功能可见,对应码率的多媒体资源能够被自适应功能选中,若设定为false,表示对应码率的多媒体资源对于自适应功能不可见,对应码率的多媒体资源不能被自适应功能选中。
默认播放功能选项(@defaultSelect):用于表示是否在启播时默认播放对应码率的多媒体资源,若设定为true,表示启播时默认播放对应码率的多媒体资源,若设定为false,表示启播时不默认播放对应码率的多媒体资源,由于媒体播放组件无法默认播放两种码率的多媒体资源(存在播放冲突),因此,在所有媒体描述元信息的属性信息中,最多只能出现一个码率的多媒体资源的默认播放功能选项(@defaultSelect)为true。
在一些实施例中,除了版本号和媒体描述集合之外,该媒体描述文件还可以包括服务类型、第二自适应功能选项或者第三自适应功能选项中至少一项,例如,该媒体描述文件还包括服务类型、第二自适应功能选项或者第三自适应功能选项中任一项,或者,还包括服务类型、第二自适应功能选项或者第三自适应功能选项中至少两项的组合。
服务类型(@type):用于指定多媒体资源的业务类型,包括直播或者点播中至少一项,比如,设定为“dynamic”时表示直播,设定为“static”时表示点播,若不做规定时,可以将“dynamic”作为默认值。
第二自适应功能选项(@hideAuto):用于表示是否打开自适应功能,若设定为true,代表关闭自适应功能,且不显示自适应选项,若设定为false,代表开启自适应功能,且显示自适应选项,若不做规定时,可以将"false"作为默认值。
第三自适应功能选项(@autoDefaultSelect):用于表示是否在启播时默认打开自适应功能,若设定为true,代表在开始播放(启播)时默认基于自适应功能播放,若设定为false,代表在开始播放时默认不基于自适应功能播放,即启播时默认关闭自适应功能。需要说明的是,这里的第三自适应功能选项是上述默认播放功能选项的前提,也即是,只有在第三自适应功能选项设置为false(启播时默认关闭自适应功能)时,默认播放功能选项才会有效,这时在启播时会默认播放@defaultSelect设置为true所对应码率的多媒体资源,否则,若第三自适应功能选项设置为true,那么在启播时会根据自适应功能选中最适合当前网络带宽情况的码率的多媒体资源。
上述媒体描述文件,是由服务器基于业务需求提供给终端的数据文件,由服务器按照业务需求进行预先配置,用于向终端提供流媒体服务的一组数据的集合以及业务相关的描述,媒体描述文件包括已编码并可传输的媒体流以及相应的元信息描述,使得终端能够基于媒体描述文件来构建帧获取请求(FAS请求),从而由服务器根据FAS标准的处理规范来响应帧获取请求,向终端提供流媒体服务。
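为便于理解上述各字段,下面给出一段示意性的Python代码(非本申请原文内容):按上文介绍的@version、@adaptationSet、@gopDuration、@representation等字段构造一个假设的JSON格式媒体描述文件并进行解析,其中的URL与各项取值均为虚构。

```python
import json

# 假设性示例:按上文字段构造的媒体描述文件,URL 与取值均为虚构
MEDIA_DESCRIPTION = json.dumps({
    "version": "1.0",              # 版本号(@version)
    "type": "dynamic",             # 服务类型(@type):直播
    "hideAuto": False,             # 第二自适应功能选项(@hideAuto):开启自适应功能
    "autoDefaultSelect": False,    # 第三自适应功能选项(@autoDefaultSelect)
    "adaptationSet": [{            # 媒体描述集合(@adaptationSet)
        "gopDuration": 2000,       # 画面组长度(@gopDuration)
        "representation": [
            {"id": 1, "codec": "H.264", "bitrate": 500,
             "url": "http://example.com/video_500",
             "hiden": False, "enableAdaptive": True, "defaultSelect": False},
            {"id": 2, "codec": "H.264", "bitrate": 1000,
             "url": "http://example.com/video_1000",
             "hiden": False, "enableAdaptive": True, "defaultSelect": True},
        ],
    }],
})

def parse_media_description(text):
    """解析媒体描述文件,返回 (版本号, 属性信息列表)。"""
    doc = json.loads(text)
    representations = []
    for meta in doc["adaptationSet"]:
        representations.extend(meta["representation"])
    return doc["version"], representations

version, reps = parse_media_description(MEDIA_DESCRIPTION)
```

示例中@autoDefaultSelect设定为false,且仅有一个表示项的@defaultSelect为true,与上文"最多只能出现一个默认播放项"的约束保持一致。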
在403中,终端确定目标码率。
上述目标码率是指本次请求的多媒体资源的码率,根据业务场景的不同,若用户首次播放多媒体资源,那么目标码率是指启播时的码率,若用户在播放过程中选择切换码率或者终端基于自适应策略调整了码率,那么目标码率也可以是一种待切换码率。
在一些实施例中,终端可以向用户提供码率选择列表,用户在点击码率选择列表中任一数值时,触发生成携带该数值的码率选择指令,终端响应于码率选择指令,将该码率选择指令所携带的数值确定为目标码率。
在一些实施例中,终端还可以通过自适应功能,将目标码率调整为与当前的网络带宽信息对应的码率,在进行自适应调整的过程中,除了当前的网络带宽信息之外,还可以结合终端的播放状态信息,动态选择播放效果最佳的目标码率。
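上述"将目标码率调整为与当前的网络带宽信息对应的码率"的自适应过程,可以用如下极简的Python草图来理解(仅为假设性示意,安全系数0.8为虚构取值,真实自适应策略还会结合播放状态等信息动态决策):

```python
def select_target_bitrate(available_bitrates, bandwidth_kbps, safety=0.8):
    """自适应选择目标码率的极简示意(safety 为假设的带宽安全系数)。

    在可用码率中选出不超过 带宽*安全系数 的最大码率;
    若全部超出,则回退到最低码率,保证仍能播放。
    """
    budget = bandwidth_kbps * safety
    candidates = [b for b in sorted(available_bitrates) if b <= budget]
    return candidates[-1] if candidates else min(available_bitrates)
```

例如,当可用码率为500、1000、2000 kbps且带宽约为1600 kbps时,该草图会选中1000 kbps。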
在404中,终端从媒体描述文件包括的多种码率的多媒体资源的地址信息中,获取目标码率的多媒体资源的目标地址信息。
也即是说,终端从媒体描述文件包括的具备不同码率的多媒体资源的地址信息中,获取具备目标码率的多媒体资源的目标地址信息。
在上述过程中,终端在确定目标码率之后,可以在媒体描述文件中以目标码率为索引,查询得到与目标码率的多媒体资源对应的媒体描述元信息,在该媒体描述元信息的属性信息中提取出@url字段内所存储的目标地址信息。
在上述403-404中,终端相当于基于多媒体资源的媒体描述文件,确定具备目标码率的该多媒体资源的目标地址信息。
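上述403-404中"以目标码率为索引、从属性信息中提取@url字段"的查询过程可以示意如下(假设媒体描述文件已解析为属性信息列表,字段名沿用上文的@bitrate与@url,取值为虚构):

```python
def get_target_url(representations, target_bitrate):
    """以目标码率为索引,从属性信息列表中提取 @url 字段存储的目标地址信息。"""
    for rep in representations:
        if rep["bitrate"] == target_bitrate:
            return rep["url"]
    raise KeyError("no representation with bitrate %d" % target_bitrate)

# 假设媒体描述文件已解析出如下属性信息(取值为虚构)
representations = [
    {"bitrate": 500, "url": "http://example.com/video_500"},
    {"bitrate": 1000, "url": "http://example.com/video_1000"},
]
target_url = get_target_url(representations, 1000)
```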
在405中,终端向服务器发送携带该目标地址信息的帧获取请求,该帧获取请求用于指示该服务器以该目标码率返回该多媒体资源的媒体帧。
在上述过程中,终端在从媒体描述文件中获取目标地址信息之后,可以生成携带该目标地址信息的帧获取请求,进而向服务器发送该帧获取请求。
在一些实施例中,除了目标地址信息(@url)之外,该帧获取请求还可以包括扩展参数(@extParam),该扩展参数用于指定不同的请求方式,从而实现不同的功能,该扩展参数可以包括第一扩展参数或者第二扩展参数中至少一项,下面进行详述:
第一扩展参数(@onlyAudio)属于一种音频参数,用于表示该媒体帧是否为音频帧,若设定为true,表示终端拉取的媒体帧为音频帧,也即只拉取纯音频流,否则,若设定为false,表示终端拉取的媒体帧为音视频帧,也即拉取音频流和视频画面流,若不做规定时,可以将“false”作为默认值。
在一些实施例中,终端可以获取多媒体资源的类型,若多媒体资源的类型为视频,可以将第一扩展参数置为“false”或者默认值,若多媒体资源的类型为音频,可以将第一扩展参数置为“true”。
在一些实施例中,终端还可以检测应用程序的类型,若应用程序的类型为视频应用,可以将第一扩展参数置为“false”或者默认值,若应用程序的类型为音频应用,可以将第一扩展参数置为“true”。
第二扩展参数(@fasSpts)属于一种拉取位置参数,用于表示从该第二扩展参数所指示的目标时间戳开始传输该多媒体资源的媒体帧,在一些实施例中,该第二扩展参数的数据类型可以为int64_t类型,当然,也可以为其他数据类型,本公开实施例不对第二扩展参数的数据类型进行具体限定。在帧获取请求中可以指定第二扩展参数,若帧获取请求中未指定第二扩展参数,那么由服务器来配置第二扩展参数的默认值。
在一些实施例中,基于该第二扩展参数大于零(@fasSpts>0),此时该目标时间戳pts大于当前时刻,那么终端将从pts等于@fasSpts的媒体帧(未来的某个时刻)开始拉取媒体流;
在一些实施例中,基于该第二扩展参数等于零(@fasSpts=0),此时该目标时间戳pts为距离当前时刻最接近的关键帧或音频帧的时间戳,在一些实施例中,在拉取音频帧(纯音频模式)时,终端从最新的音频帧开始拉取媒体流,或者,在拉取音视频帧(非纯音频模式)时,终端从最新的视频I帧开始拉取媒体流;
在一些实施例中,基于该第二扩展参数小于零(@fasSpts<0),此时该目标时间戳小于当前时刻,且该媒体帧包括从该目标时间戳开始已缓存的媒体帧,也即是,终端拉取缓存长度为|@fasSpts|毫秒的媒体流。
在一些实施例中,终端可以根据媒体描述文件中的服务类型(@type)字段来确定第二扩展参数,若查询到服务类型为"dynamic"(直播)且用户未指定播放进度,终端可以将第二扩展参数置为0;若查询到服务类型为"dynamic"(直播)且用户指定了播放进度,终端可以将第二扩展参数置为播放进度所对应的时间戳(目标时间戳);若查询到服务类型为"static"(点播)且用户未指定播放进度,终端可以检测多媒体资源在上一次关闭时的历史播放进度,将第二扩展参数置为该历史播放进度所对应的时间戳(目标时间戳),需要说明的是,若用户首次观看多媒体资源,此时查询不到任何历史播放进度,终端可以将第二扩展参数置为首个媒体帧的时间戳(目标时间戳);若查询到服务类型为"static"(点播)且用户指定了播放进度,终端可以将第二扩展参数置为播放进度所对应的时间戳(目标时间戳)。
对帧获取请求而言,可以认为其格式为目标码率的多媒体资源的url地址加上扩展字段,可以形象地表示为“url&extParam”,在FAS标准中,服务器在接收到帧获取请求之后,能够按照FAS所规定的处理规范,对帧获取请求进行响应处理,请参考下述406。
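"url&extParam"格式的帧获取请求,可以按如下方式示意性地拼接(函数实现为假设性草图,@onlyAudio与@fasSpts的语义见上文):

```python
from urllib.parse import urlencode

def build_frame_request(url, only_audio=False, fas_spts=None):
    """按 "url&extParam" 格式拼接帧获取请求(示意性草图)。

    only_audio 对应第一扩展参数 @onlyAudio;
    fas_spts 对应第二扩展参数 @fasSpts(int64 语义:>0、=0、<0 的含义见上文),
    为 None 时不携带该参数,交由服务器配置默认值。
    """
    params = {"onlyAudio": "true" if only_audio else "false"}
    if fas_spts is not None:
        params["fasSpts"] = str(fas_spts)
    return url + "&" + urlencode(params)

# 例如:从已缓存 2000 毫秒处开始拉取(@fasSpts<0 的情形)
request = build_frame_request("http://example.com/video_1000", fas_spts=-2000)
```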
在406中,服务器响应于携带目标地址信息的帧获取请求,以目标码率向终端返回多媒体资源的媒体帧。
在上述过程中,服务器在接收到帧获取请求之后,可以解析该帧获取请求,得到目标地址信息,服务器基于目标地址信息,从资源库中定位到目标码率的多媒体资源的媒体帧。
在一些实施例中,若帧获取请求中还携带第一扩展参数或者第二扩展参数中至少一项,那么服务器还可以解析得到第一扩展参数或者第二扩展参数中至少一项,若未携带第一扩展参数或者第二扩展参数中至少一项,服务器可以配置第一扩展参数或者第二扩展参数中至少一项的默认值,不管是从帧获取请求中解析,还是自行配置默认值,服务器均能够确定第一扩展参数以及第二扩展参数,进一步地,服务器基于第一扩展参数确定是否发送音频形式的媒体帧,基于第二扩展参数确定从哪个时间戳开始拉取媒体帧,从而服务器能够从第二扩展参数所指示的目标时间戳开始,以目标码率向终端返回第一扩展参数所指示形式的媒体帧。
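服务器侧"从请求中解析扩展参数、缺省时配置默认值"的逻辑,可以示意如下(默认值取值为假设,且这里假设目标地址信息本身不含"&"):

```python
from urllib.parse import parse_qs

# 默认值为假设:实际默认值由服务器按业务需求配置
DEFAULT_ONLY_AUDIO = "false"
DEFAULT_FAS_SPTS = "0"

def parse_frame_request(request):
    """服务器侧解析 "url&extParam" 格式的帧获取请求。

    返回 (目标地址信息, 是否纯音频, 拉取位置参数);
    请求中未携带的扩展参数按默认值处理。
    """
    url, _, query = request.partition("&")
    params = parse_qs(query)
    only_audio = params.get("onlyAudio", [DEFAULT_ONLY_AUDIO])[0] == "true"
    fas_spts = int(params.get("fasSpts", [DEFAULT_FAS_SPTS])[0])
    return url, only_audio, fas_spts
```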
在一些实施例中,若服务器为CDN服务器,那么该目标地址信息可以是一个域名,终端可以向CDN服务器的中心平台发送帧获取请求,中心平台调用DNS(Domain Name System,域名系统,本质上是一个域名解析库)对域名进行解析,可以得到域名对应的CNAME(别名)记录,基于终端的地理位置信息对CNAME记录再次进行解析,可以得到一个距离终端最近的边缘服务器的IP(Internet Protocol,网际互连协议)地址,这时中心平台将帧获取请求导向至上述边缘服务器,由边缘服务器响应于帧获取请求,以目标码率向终端提供多媒体资源的媒体帧。
在407中,若终端接收到目标码率的多媒体资源的媒体帧,播放该目标码率的多媒体资源的媒体帧。
在上述过程中,若终端接收到目标码率的多媒体资源的媒体帧(连续接收到的媒体帧即可构成媒体流),终端可以将该媒体帧存入缓存区中,调用媒体编解码组件对媒体帧进行解码,得到解码后的媒体帧,调用媒体播放组件按照时间戳(pts)从小到大的顺序来对缓存区内的媒体帧进行播放。
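终端"将媒体帧存入缓存区并按时间戳(pts)从小到大播放"的缓存逻辑,可以用一个最小堆给出假设性草图:

```python
import heapq

class FrameBuffer:
    """按时间戳(pts)从小到大出队的简易缓存区示意。"""

    def __init__(self):
        self._heap = []  # 以 (pts, 序号, 帧) 入堆,序号避免 pts 相同时比较帧对象
        self._seq = 0

    def push(self, pts, frame):
        heapq.heappush(self._heap, (pts, self._seq, frame))
        self._seq += 1

    def pop(self):
        return heapq.heappop(self._heap)[2]

buf = FrameBuffer()
for pts, frame in [(3, "帧C"), (1, "帧A"), (2, "帧B")]:  # 乱序到达的媒体帧
    buf.push(pts, frame)
ordered = [buf.pop() for _ in range(3)]  # 按 pts 升序取出播放
```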
在解码过程中,终端可以从媒体描述文件的@codec字段中确定多媒体资源的编码方式,根据编码方式确定对应的解码方式,从而按照确定的解码方式对媒体帧进行解码。
由于帧获取请求采用了“url&extParam”的格式,通过定义扩展参数能够指定音频参数以及拉取位置参数,若网络状况改变而需要切换码率时,可以通过下述408-412来进行码率的无缝切换。
在408中,终端响应于码率切换指令,确定待切换码率。
上述码率切换指令,可以由用户在码率切换列表中进行触发,也可以由终端根据自适应策略自动地触发,本公开实施例不对码率切换指令的触发条件进行具体限定。
上述408参考上述403。
在409中,终端从该媒体描述文件包括的多种码率的该多媒体资源的地址信息中,获取该待切换码率的该多媒体资源的待切换地址信息。
也即是说,终端从该媒体描述文件包括的具备不同码率的该多媒体资源的地址信息中,获取具备该待切换码率的该多媒体资源的待切换地址信息。
上述409参考上述404。
在410中,终端向该服务器发送携带该待切换地址信息的帧获取请求。
其中,该帧获取请求用于指示服务器以待切换码率返回多媒体资源的媒体帧。
上述410参考上述405。
在411中,服务器响应于携带待切换地址信息的帧获取请求,以待切换码率向终端返回多媒体资源的媒体帧。
上述411参考上述406。
在412中,若终端接收到待切换码率的多媒体资源的媒体帧,播放该待切换码率的多媒体资源的媒体帧。
上述412参考上述407。
图5是根据一实施例示出的一种资源传输装置的逻辑结构框图,所述装置包括确定单元501和发送单元502:
确定单元501,被配置为执行基于多媒体资源的媒体描述文件,确定具备目标码率的该多媒体资源的目标地址信息,该媒体描述文件用于提供具备不同码率的该多媒体资源的地址信息;
发送单元502,被配置为执行向服务器发送携带该目标地址信息的帧获取请求,该帧获取请求用于指示该服务器以该目标码率返回该多媒体资源的媒体帧。
在一些实施例中,基于图5的装置组成,该确定单元501包括:
确定子单元,被配置为执行确定该目标码率;
获取子单元,被配置为执行从该媒体描述文件包括的具备不同码率的该多媒体资源的地址信息中,获取该目标地址信息。
在一些实施例中,该确定子单元被配置为执行:
响应于码率选择指令,将该码率选择指令所携带的数值确定为该目标码率; 或,
将该目标码率调整为与当前的网络带宽信息对应的码率。
在一些实施例中,该确定单元501还被配置为执行:响应于码率切换指令,确定待切换码率;从该媒体描述文件包括的具备不同码率的该多媒体资源的地址信息中,获取该待切换码率的该多媒体资源的待切换地址信息;
该发送单元502还被配置为执行:向该服务器发送携带该待切换地址信息的帧获取请求。
在一些实施例中,该帧获取请求还包括第一扩展参数或者第二扩展参数中至少一项,该第一扩展参数用于表示该媒体帧是否为音频帧,该第二扩展参数用于表示从该第二扩展参数所指示的目标时间戳开始传输该多媒体资源的媒体帧。
在一些实施例中,基于该第二扩展参数大于零,该目标时间戳大于当前时刻;
基于该第二扩展参数等于零,该目标时间戳为距离当前时刻最接近的关键帧或音频帧的时间戳;
基于该第二扩展参数小于零,该目标时间戳小于当前时刻,且该媒体帧包括从该目标时间戳开始已缓存的媒体帧。
在一些实施例中,该媒体描述文件包括版本号和媒体描述集合,其中,该版本号包括该媒体描述文件的版本号或者资源传输标准的版本号中至少一项,该媒体描述集合包括多个媒体描述元信息,每个媒体描述元信息对应于一种码率的多媒体资源,每个媒体描述元信息包括该媒体描述元信息所对应码率的多媒体资源的画面组长度以及属性信息。
在一些实施例中,每个属性信息包括多媒体资源的标识信息、多媒体资源的编码方式、多媒体资源所支持的码率以及该码率的多媒体资源的地址信息。
在一些实施例中,每个属性信息还包括多媒体资源的质量类型、多媒体资源的隐藏选项、第一自适应功能选项或者默认播放功能选项中至少一项,其中,该第一自适应功能选项用于表示多媒体资源是否相对于自适应功能可见。
在一些实施例中,该媒体描述文件还包括服务类型、第二自适应功能选项或者第三自适应功能选项中至少一项,其中,该服务类型包括直播或者点播中至少一项,该第二自适应功能选项用于表示是否打开自适应功能,该第三自适应功能选项用于表示是否在开始播放时默认打开自适应功能。
图6示出了本公开一个实施例提供的终端600的结构框图。该终端600可以是:智能手机、平板电脑、MP3播放器(Moving Picture Experts Group Audio Layer III,动态影像专家压缩标准音频层面3)、MP4(Moving Picture Experts Group Audio Layer IV,动态影像专家压缩标准音频层面4)播放器、笔记本电脑或台式电脑。终端600还可能被称为用户设备、便携式终端、膝上型终端、台式终端等其他名称。
通常,终端600包括有:处理器601和存储器602。
处理器601可以包括一个或多个处理核心,比如4核心处理器、8核心处理器等。处理器601可以采用DSP(Digital Signal Processing,数字信号处理)、FPGA(Field-Programmable Gate Array,现场可编程门阵列)、PLA(Programmable Logic Array,可编程逻辑阵列)中的至少一种硬件形式来实现。处理器601也可以包括主处理器和协处理器,主处理器是用于对在唤醒状态下的数据进行处理的处理器,也称CPU(Central Processing Unit,中央处理器);协处理器是用于对在待机状态下的数据进行处理的低功耗处理器。在一些实施例中,处理器601可以集成有GPU(Graphics Processing Unit,图像处理器),GPU用于负责显示屏所需要显示的内容的渲染和绘制。一些实施例中,处理器601还可以包括AI(Artificial Intelligence,人工智能)处理器,该AI处理器用于处理有关机器学习的计算操作。
存储器602可以包括一个或多个计算机可读存储介质,该计算机可读存储介质可以是非暂态的。存储器602还可包括高速随机存取存储器,以及非易失性存储器,比如一个或多个磁盘存储设备、闪存存储设备。在一些实施例中,存储器602中的非暂态的计算机可读存储介质用于存储至少一个指令,该至少一个指令用于被处理器601所执行以实现本公开中各个实施例提供的资源传输方法。
在一些实施例中,终端600还可选包括有:外围设备接口603和至少一个外围设备。处理器601、存储器602和外围设备接口603之间可以通过总线或信号线相连。各个外围设备可以通过总线、信号线或电路板与外围设备接口603相连。在一些实施例中,外围设备包括:射频电路604、显示屏605、摄像头组件606、音频电路607、定位组件608和电源609中的至少一种。
外围设备接口603可被用于将I/O(Input/Output,输入/输出)相关的至少 一个外围设备连接到处理器601和存储器602。在一些实施例中,处理器601、存储器602和外围设备接口603被集成在同一芯片或电路板上;在一些其他实施例中,处理器601、存储器602和外围设备接口603中的任意一个或两个可以在单独的芯片或电路板上实现,本公开实施例对此不加以限定。
射频电路604用于接收和发射RF(Radio Frequency,射频)信号,也称电磁信号。射频电路604通过电磁信号与通信网络以及其他通信设备进行通信。射频电路604将电信号转换为电磁信号进行发送,或者,将接收到的电磁信号转换为电信号。在一些实施例中,射频电路604包括:天线系统、RF收发器、一个或多个放大器、调谐器、振荡器、数字信号处理器、编解码芯片组、用户身份模块卡等等。射频电路604可以通过至少一种无线通信协议来与其它终端进行通信。该无线通信协议包括但不限于:城域网、各代移动通信网络(2G、3G、4G及5G)、无线局域网和/或WiFi(Wireless Fidelity,无线保真)网络。在一些实施例中,射频电路604还可以包括NFC(Near Field Communication,近距离无线通信)有关的电路,本公开对此不加以限定。
显示屏605用于显示UI(User Interface,用户界面)。该UI可以包括图形、文本、图标、视频及其它们的任意组合。当显示屏605是触摸显示屏时,显示屏605还具有采集在显示屏605的表面或表面上方的触摸信号的能力。该触摸信号可以作为控制信号输入至处理器601进行处理。此时,显示屏605还可以用于提供虚拟按钮和/或虚拟键盘,也称软按钮和/或软键盘。在一些实施例中,显示屏605可以为一个,设置在终端600的前面板;在另一些实施例中,显示屏605可以为至少两个,分别设置在终端600的不同表面或呈折叠设计;在再一些实施例中,显示屏605可以是柔性显示屏,设置在终端600的弯曲表面上或折叠面上。甚至,显示屏605还可以设置成非矩形的不规则图形,也即异形屏。显示屏605可以采用LCD(Liquid Crystal Display,液晶显示屏)、OLED(Organic Light-Emitting Diode,有机发光二极管)等材质制备。
摄像头组件606用于采集图像或视频。在一些实施例中,摄像头组件606包括前置摄像头和后置摄像头。通常,前置摄像头设置在终端的前面板,后置摄像头设置在终端的背面。在一些实施例中,后置摄像头为至少两个,分别为主摄像头、景深摄像头、广角摄像头、长焦摄像头中的任意一种,以实现主摄像头和景深摄像头融合实现背景虚化功能、主摄像头和广角摄像头融合实现全景拍摄以及VR(Virtual Reality,虚拟现实)拍摄功能或者其它融合拍摄功能。 在一些实施例中,摄像头组件606还可以包括闪光灯。闪光灯可以是单色温闪光灯,也可以是双色温闪光灯。双色温闪光灯是指暖光闪光灯和冷光闪光灯的组合,可以用于不同色温下的光线补偿。
音频电路607可以包括麦克风和扬声器。麦克风用于采集用户及环境的声波,并将声波转换为电信号输入至处理器601进行处理,或者输入至射频电路604以实现语音通信。出于立体声采集或降噪的目的,麦克风可以为多个,分别设置在终端600的不同部位。麦克风还可以是阵列麦克风或全向采集型麦克风。扬声器则用于将来自处理器601或射频电路604的电信号转换为声波。扬声器可以是传统的薄膜扬声器,也可以是压电陶瓷扬声器。当扬声器是压电陶瓷扬声器时,不仅可以将电信号转换为人类可听见的声波,也可以将电信号转换为人类听不见的声波以进行测距等用途。在一些实施例中,音频电路607还可以包括耳机插孔。
定位组件608用于定位终端600的当前地理位置,以实现导航或LBS(Location Based Service,基于位置的服务)。定位组件608可以是基于美国的GPS(Global Positioning System,全球定位系统)、中国的北斗系统、俄罗斯的格雷纳斯系统或欧盟的伽利略系统的定位组件。
电源609用于为终端600中的各个组件进行供电。电源609可以是交流电、直流电、一次性电池或可充电电池。当电源609包括可充电电池时,该可充电电池可以支持有线充电或无线充电。该可充电电池还可以用于支持快充技术。
在一些实施例中,终端600还包括有一个或多个传感器610。该一个或多个传感器610包括但不限于:加速度传感器611、陀螺仪传感器612、压力传感器613、指纹传感器614、光学传感器615以及接近传感器616。
加速度传感器611可以检测以终端600建立的坐标系的三个坐标轴上的加速度大小。比如,加速度传感器611可以用于检测重力加速度在三个坐标轴上的分量。处理器601可以根据加速度传感器611采集的重力加速度信号,控制显示屏605以横向视图或纵向视图进行用户界面的显示。加速度传感器611还可以用于游戏或者用户的运动数据的采集。
陀螺仪传感器612可以检测终端600的机体方向及转动角度,陀螺仪传感器612可以与加速度传感器611协同采集用户对终端600的3D动作。处理器601根据陀螺仪传感器612采集的数据,可以实现如下功能:动作感应(比如根据用户的倾斜操作来改变UI)、拍摄时的图像稳定、游戏控制以及惯性导航。
压力传感器613可以设置在终端600的侧边框和/或显示屏605的下层。当压力传感器613设置在终端600的侧边框时,可以检测用户对终端600的握持信号,由处理器601根据压力传感器613采集的握持信号进行左右手识别或快捷操作。当压力传感器613设置在显示屏605的下层时,由处理器601根据用户对显示屏605的压力操作,实现对UI界面上的可操作性控件进行控制。可操作性控件包括按钮控件、滚动条控件、图标控件、菜单控件中的至少一种。
指纹传感器614用于采集用户的指纹,由处理器601根据指纹传感器614采集到的指纹识别用户的身份,或者,由指纹传感器614根据采集到的指纹识别用户的身份。在识别出用户的身份为可信身份时,由处理器601授权该用户执行相关的敏感操作,该敏感操作包括解锁屏幕、查看加密信息、下载软件、支付及更改设置等。指纹传感器614可以被设置在终端600的正面、背面或侧面。当终端600上设置有物理按键或厂商Logo时,指纹传感器614可以与物理按键或厂商Logo集成在一起。
光学传感器615用于采集环境光强度。在一个实施例中,处理器601可以根据光学传感器615采集的环境光强度,控制显示屏605的显示亮度。在一些实施例中,当环境光强度较高时,调高显示屏605的显示亮度;当环境光强度较低时,调低显示屏605的显示亮度。在另一个实施例中,处理器601还可以根据光学传感器615采集的环境光强度,动态调整摄像头组件606的拍摄参数。
接近传感器616,也称距离传感器,通常设置在终端600的前面板。接近传感器616用于采集用户与终端600的正面之间的距离。在一个实施例中,当接近传感器616检测到用户与终端600的正面之间的距离逐渐变小时,由处理器601控制显示屏605从亮屏状态切换为息屏状态;当接近传感器616检测到用户与终端600的正面之间的距离逐渐变大时,由处理器601控制显示屏605从息屏状态切换为亮屏状态。
在一些实施例中,该终端包括一个或多个处理器,以及用于存储该一个或多个处理器可执行指令的一个或多个存储器,其中,该一个或多个处理器被配置为执行该指令,以实现如下操作:
基于多媒体资源的媒体描述文件,确定具备目标码率的该多媒体资源的目标地址信息,该媒体描述文件用于提供具备不同码率的该多媒体资源的地址信息;
向服务器发送携带该目标地址信息的帧获取请求,该帧获取请求用于指示 该服务器以该目标码率返回该多媒体资源的媒体帧。
在一些实施例中,该一个或多个处理器被配置为执行该指令,以实现如下操作:
确定该目标码率;
从该媒体描述文件包括的具备不同码率的该多媒体资源的地址信息中,获取该目标地址信息。
在一些实施例中,该一个或多个处理器被配置为执行该指令,以实现如下操作:
响应于码率选择指令,将该码率选择指令所携带的数值确定为该目标码率;或,
将该目标码率调整为与当前的网络带宽信息对应的码率。
在一些实施例中,该一个或多个处理器还被配置为执行该指令,以实现如下操作:
响应于码率切换指令,确定待切换码率;
从该媒体描述文件包括的具备不同码率的该多媒体资源的地址信息中,获取该待切换码率的该多媒体资源的待切换地址信息;
向该服务器发送携带该待切换地址信息的帧获取请求。
在一些实施例中,该帧获取请求还包括第一扩展参数或者第二扩展参数中至少一项,该第一扩展参数用于表示该媒体帧是否为音频帧,该第二扩展参数用于表示从该第二扩展参数所指示的目标时间戳开始传输该多媒体资源的媒体帧。
在一些实施例中,基于该第二扩展参数大于零,该目标时间戳大于当前时刻;
基于该第二扩展参数等于零,该目标时间戳为距离当前时刻最接近的关键帧或音频帧的时间戳;
基于该第二扩展参数小于零,该目标时间戳小于当前时刻,且该媒体帧包括从该目标时间戳开始已缓存的媒体帧。
在一些实施例中,该媒体描述文件包括版本号和媒体描述集合,其中,该版本号包括该媒体描述文件的版本号或者资源传输标准的版本号中至少一项,该媒体描述集合包括多个媒体描述元信息,每个媒体描述元信息对应于一种码率的多媒体资源,每个媒体描述元信息包括该媒体描述元信息所对应码率的多媒体资源的画面组长度以及属性信息。
在一些实施例中,每个属性信息包括多媒体资源的标识信息、多媒体资源的编码方式、多媒体资源所支持的码率以及该码率的多媒体资源的地址信息。
在一些实施例中,每个属性信息还包括多媒体资源的质量类型、多媒体资源的隐藏选项、第一自适应功能选项或者默认播放功能选项中至少一项,其中,该第一自适应功能选项用于表示多媒体资源是否相对于自适应功能可见。
在一些实施例中,该媒体描述文件还包括服务类型、第二自适应功能选项或者第三自适应功能选项中至少一项,其中,该服务类型包括直播或者点播中至少一项,该第二自适应功能选项用于表示是否打开自适应功能,该第三自适应功能选项用于表示是否在开始播放时默认打开自适应功能。
在一些实施例中,还提供了一种包括至少一条指令的存储介质,例如包括至少一条指令的存储器,上述至少一条指令可由终端中的处理器执行以完成上述实施例中资源传输方法。在一些实施例中,上述存储介质可以是非临时性计算机可读存储介质,例如,该非临时性计算机可读存储介质可以包括ROM(Read-Only Memory,只读存储器)、RAM(Random-Access Memory,随机存取存储器)、CD-ROM(Compact Disc Read-Only Memory,只读光盘)、磁带、软盘和光数据存储设备等。
在一些实施例中,当该存储介质中的至少一条指令由终端的一个或多个处理器执行时,使得终端能够执行如下操作:
基于多媒体资源的媒体描述文件,确定具备目标码率的该多媒体资源的目标地址信息,该媒体描述文件用于提供具备不同码率的该多媒体资源的地址信息;
向服务器发送携带该目标地址信息的帧获取请求,该帧获取请求用于指示该服务器以该目标码率返回该多媒体资源的媒体帧。
在一些实施例中,该终端的一个或多个处理器用于执行如下操作:
确定该目标码率;
从该媒体描述文件包括的具备不同码率的该多媒体资源的地址信息中,获取该目标地址信息。
在一些实施例中,该终端的一个或多个处理器用于执行如下操作:
响应于码率选择指令,将该码率选择指令所携带的数值确定为该目标码率; 或,
将该目标码率调整为与当前的网络带宽信息对应的码率。
在一些实施例中,该终端的一个或多个处理器还用于执行如下操作:
响应于码率切换指令,确定待切换码率;
从该媒体描述文件包括的具备不同码率的该多媒体资源的地址信息中,获取该待切换码率的该多媒体资源的待切换地址信息;
向该服务器发送携带该待切换地址信息的帧获取请求。
在一些实施例中,该帧获取请求还包括第一扩展参数或者第二扩展参数中至少一项,该第一扩展参数用于表示该媒体帧是否为音频帧,该第二扩展参数用于表示从该第二扩展参数所指示的目标时间戳开始传输该多媒体资源的媒体帧。
在一些实施例中,基于该第二扩展参数大于零,该目标时间戳大于当前时刻;
基于该第二扩展参数等于零,该目标时间戳为距离当前时刻最接近的关键帧或音频帧的时间戳;
基于该第二扩展参数小于零,该目标时间戳小于当前时刻,且该媒体帧包括从该目标时间戳开始已缓存的媒体帧。
在一些实施例中,该媒体描述文件包括版本号和媒体描述集合,其中,该版本号包括该媒体描述文件的版本号或者资源传输标准的版本号中至少一项,该媒体描述集合包括多个媒体描述元信息,每个媒体描述元信息对应于一种码率的多媒体资源,每个媒体描述元信息包括该媒体描述元信息所对应码率的多媒体资源的画面组长度以及属性信息。
在一些实施例中,每个属性信息包括多媒体资源的标识信息、多媒体资源的编码方式、多媒体资源所支持的码率以及该码率的多媒体资源的地址信息。
在一些实施例中,每个属性信息还包括多媒体资源的质量类型、多媒体资源的隐藏选项、第一自适应功能选项或者默认播放功能选项中至少一项,其中,该第一自适应功能选项用于表示多媒体资源是否相对于自适应功能可见。
在一些实施例中,该媒体描述文件还包括服务类型、第二自适应功能选项或者第三自适应功能选项中至少一项,其中,该服务类型包括直播或者点播中至少一项,该第二自适应功能选项用于表示是否打开自适应功能,该第三自适应功能选项用于表示是否在开始播放时默认打开自适应功能。
在一些实施例中,还提供了一种计算机程序产品,包括一条或多条指令,该一条或多条指令可以由终端的处理器执行,以完成上述各个实施例提供的资源传输方法。

Claims (30)

  1. 一种资源传输方法,包括:
    基于多媒体资源的媒体描述文件,确定具备目标码率的所述多媒体资源的目标地址信息,所述媒体描述文件用于提供具备不同码率的所述多媒体资源的地址信息;
    向服务器发送携带所述目标地址信息的帧获取请求,所述帧获取请求用于指示所述服务器以所述目标码率返回所述多媒体资源的媒体帧。
  2. 根据权利要求1所述的资源传输方法,所述基于多媒体资源的媒体描述文件,确定具备目标码率的所述多媒体资源的目标地址信息包括:
    确定所述目标码率;
    从所述媒体描述文件包括的具备不同码率的所述多媒体资源的地址信息中,获取所述目标地址信息。
  3. 根据权利要求2所述的资源传输方法,所述确定所述目标码率包括:
    响应于码率选择指令,将所述码率选择指令所携带的数值确定为所述目标码率;或,
    将所述目标码率调整为与当前的网络带宽信息对应的码率。
  4. 根据权利要求1所述的资源传输方法,所述方法还包括:
    响应于码率切换指令,确定待切换码率;
    从所述媒体描述文件包括的具备不同码率的所述多媒体资源的地址信息中,获取所述待切换码率的所述多媒体资源的待切换地址信息;
    向所述服务器发送携带所述待切换地址信息的帧获取请求。
  5. 根据权利要求1所述的资源传输方法,所述帧获取请求还包括第一扩展参数或者第二扩展参数中至少一项,所述第一扩展参数用于表示所述媒体帧是否为音频帧,所述第二扩展参数用于表示从所述第二扩展参数所指示的目标时间戳开始传输所述多媒体资源的媒体帧。
  6. 根据权利要求5所述的资源传输方法,基于所述第二扩展参数大于零,所述目标时间戳大于当前时刻;
    基于所述第二扩展参数等于零,所述目标时间戳为距离当前时刻最接近的关键帧或音频帧的时间戳;
    基于所述第二扩展参数小于零,所述目标时间戳小于当前时刻,且所述媒体帧包括从所述目标时间戳开始已缓存的媒体帧。
  7. 根据权利要求1所述的资源传输方法,所述媒体描述文件包括版本号和媒体描述集合,其中,所述版本号包括所述媒体描述文件的版本号或者资源传输标准的版本号中至少一项,所述媒体描述集合包括多个媒体描述元信息,每个媒体描述元信息对应于一种码率的多媒体资源,每个媒体描述元信息包括所述媒体描述元信息所对应码率的多媒体资源的画面组长度以及属性信息。
  8. 根据权利要求7所述的资源传输方法,每个属性信息包括多媒体资源的标识信息、多媒体资源的编码方式、多媒体资源所支持的码率以及所述码率的多媒体资源的地址信息。
  9. 根据权利要求8所述的资源传输方法,每个属性信息还包括多媒体资源的质量类型、多媒体资源的隐藏选项、第一自适应功能选项或者默认播放功能选项中至少一项,其中,所述第一自适应功能选项用于表示多媒体资源是否相对于自适应功能可见。
  10. 根据权利要求7所述的资源传输方法,所述媒体描述文件还包括服务类型、第二自适应功能选项或者第三自适应功能选项中至少一项,其中,所述服务类型包括直播或者点播中至少一项,所述第二自适应功能选项用于表示是否打开自适应功能,所述第三自适应功能选项用于表示是否在开始播放时默认打开自适应功能。
  11. 一种终端,包括:
    一个或多个处理器;
    用于存储所述一个或多个处理器可执行指令的一个或多个存储器;
    其中,所述一个或多个处理器被配置为执行所述指令,以实现如下操作:
    基于多媒体资源的媒体描述文件,确定具备目标码率的所述多媒体资源的目标地址信息,所述媒体描述文件用于提供具备不同码率的所述多媒体资源的地址信息;
    向服务器发送携带所述目标地址信息的帧获取请求,所述帧获取请求用于指示所述服务器以所述目标码率返回所述多媒体资源的媒体帧。
  12. 根据权利要求11所述的终端,所述一个或多个处理器被配置为执行所述指令,以实现如下操作:
    确定所述目标码率;
    从所述媒体描述文件包括的具备不同码率的所述多媒体资源的地址信息中,获取所述目标地址信息。
  13. 根据权利要求12所述的终端,所述一个或多个处理器被配置为执行所述指令,以实现如下操作:
    响应于码率选择指令,将所述码率选择指令所携带的数值确定为所述目标码率;或,
    将所述目标码率调整为与当前的网络带宽信息对应的码率。
  14. 根据权利要求11所述的终端,所述一个或多个处理器还被配置为执行所述指令,以实现如下操作:
    响应于码率切换指令,确定待切换码率;
    从所述媒体描述文件包括的具备不同码率的所述多媒体资源的地址信息中,获取所述待切换码率的所述多媒体资源的待切换地址信息;
    向所述服务器发送携带所述待切换地址信息的帧获取请求。
  15. 根据权利要求11所述的终端,所述帧获取请求还包括第一扩展参数或者第二扩展参数中至少一项,所述第一扩展参数用于表示所述媒体帧是否为音频帧,所述第二扩展参数用于表示从所述第二扩展参数所指示的目标时间戳开始传输所述多媒体资源的媒体帧。
  16. 根据权利要求15所述的终端,基于所述第二扩展参数大于零,所述目标时间戳大于当前时刻;
    基于所述第二扩展参数等于零,所述目标时间戳为距离当前时刻最接近的关键帧或音频帧的时间戳;
    基于所述第二扩展参数小于零,所述目标时间戳小于当前时刻,且所述媒体帧包括从所述目标时间戳开始已缓存的媒体帧。
  17. 根据权利要求11所述的终端,所述媒体描述文件包括版本号和媒体描述集合,其中,所述版本号包括所述媒体描述文件的版本号或者资源传输标准的版本号中至少一项,所述媒体描述集合包括多个媒体描述元信息,每个媒体描述元信息对应于一种码率的多媒体资源,每个媒体描述元信息包括所述媒体描述元信息所对应码率的多媒体资源的画面组长度以及属性信息。
  18. 根据权利要求17所述的终端,每个属性信息包括多媒体资源的标识信息、多媒体资源的编码方式、多媒体资源所支持的码率以及所述码率的多媒体资源的地址信息。
  19. 根据权利要求18所述的终端,每个属性信息还包括多媒体资源的质量类型、多媒体资源的隐藏选项、第一自适应功能选项或者默认播放功能选项中至少一项,其中,所述第一自适应功能选项用于表示多媒体资源是否相对于自适应功能可见。
  20. 根据权利要求17所述的终端,所述媒体描述文件还包括服务类型、第二自适应功能选项或者第三自适应功能选项中至少一项,其中,所述服务类型包括直播或者点播中至少一项,所述第二自适应功能选项用于表示是否打开自适应功能,所述第三自适应功能选项用于表示是否在开始播放时默认打开自适应功能。
  21. 一种存储介质,当所述存储介质中的至少一条指令由终端的一个或多个处理器执行时,使得终端能够执行如下操作:
    基于多媒体资源的媒体描述文件,确定具备目标码率的所述多媒体资源的目标地址信息,所述媒体描述文件用于提供具备不同码率的所述多媒体资源的地址信息;
    向服务器发送携带所述目标地址信息的帧获取请求,所述帧获取请求用于指示所述服务器以所述目标码率返回所述多媒体资源的媒体帧。
  22. 根据权利要求21所述的存储介质,所述终端的一个或多个处理器用于执行如下操作:
    确定所述目标码率;
    从所述媒体描述文件包括的具备不同码率的所述多媒体资源的地址信息中,获取所述目标地址信息。
  23. 根据权利要求22所述的存储介质,所述终端的一个或多个处理器用于执行如下操作:
    响应于码率选择指令,将所述码率选择指令所携带的数值确定为所述目标码率;或,
    将所述目标码率调整为与当前的网络带宽信息对应的码率。
  24. 根据权利要求21所述的存储介质,所述终端的一个或多个处理器还用于执行如下操作:
    响应于码率切换指令,确定待切换码率;
    从所述媒体描述文件包括的具备不同码率的所述多媒体资源的地址信息中,获取所述待切换码率的所述多媒体资源的待切换地址信息;
    向所述服务器发送携带所述待切换地址信息的帧获取请求。
  25. 根据权利要求21所述的存储介质,所述帧获取请求还包括第一扩展参数或者第二扩展参数中至少一项,所述第一扩展参数用于表示所述媒体帧是否为音频帧,所述第二扩展参数用于表示从所述第二扩展参数所指示的目标时间戳开始传输所述多媒体资源的媒体帧。
  26. 根据权利要求25所述的存储介质,基于所述第二扩展参数大于零,所述目标时间戳大于当前时刻;
    基于所述第二扩展参数等于零,所述目标时间戳为距离当前时刻最接近的关键帧或音频帧的时间戳;
    基于所述第二扩展参数小于零,所述目标时间戳小于当前时刻,且所述媒体帧包括从所述目标时间戳开始已缓存的媒体帧。
  27. 根据权利要求21所述的存储介质,所述媒体描述文件包括版本号和媒体描述集合,其中,所述版本号包括所述媒体描述文件的版本号或者资源传输标准的版本号中至少一项,所述媒体描述集合包括多个媒体描述元信息,每个媒体描述元信息对应于一种码率的多媒体资源,每个媒体描述元信息包括所述媒体描述元信息所对应码率的多媒体资源的画面组长度以及属性信息。
  28. 根据权利要求27所述的存储介质,每个属性信息包括多媒体资源的标识信息、多媒体资源的编码方式、多媒体资源所支持的码率以及所述码率的多媒体资源的地址信息。
  29. 根据权利要求28所述的存储介质,每个属性信息还包括多媒体资源的质量类型、多媒体资源的隐藏选项、第一自适应功能选项或者默认播放功能选项中至少一项,其中,所述第一自适应功能选项用于表示多媒体资源是否相对于自适应功能可见。
  30. 根据权利要求27所述的存储介质,所述媒体描述文件还包括服务类型、第二自适应功能选项或者第三自适应功能选项中至少一项,其中,所述服务类型包括直播或者点播中至少一项,所述第二自适应功能选项用于表示是否打开自适应功能,所述第三自适应功能选项用于表示是否在开始播放时默认打开自适应功能。
PCT/CN2020/131840 2020-01-17 2020-11-26 资源传输方法及终端 WO2021143362A1 (zh)
