WO2019170073A1 - Media playback - Google Patents

Media playback

Info

Publication number
WO2019170073A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
media
key frame
target time
frame
Prior art date
Application number
PCT/CN2019/076963
Other languages
English (en)
French (fr)
Inventor
刘吉昊
Original Assignee
青岛海信传媒网络技术有限公司
Priority date
Filing date
Publication date
Application filed by 青岛海信传媒网络技术有限公司 filed Critical 青岛海信传媒网络技术有限公司
Priority to EP19764165.7A priority Critical patent/EP3739895A4/en
Priority to US16/360,588 priority patent/US10705709B2/en
Publication of WO2019170073A1 publication Critical patent/WO2019170073A1/zh


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312 Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4396 Processing of audio elementary streams by muting the audio signal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440236 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by media transcoding, e.g. video is transformed into a slideshow of still pictures, audio is converted into text
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47217 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/65 Transmission of management data between client and server
    • H04N21/658 Transmission by the client directed to the server
    • H04N21/6587 Control parameters, e.g. trick play commands, viewpoint selection

Definitions

  • the present invention relates to the field of multimedia technologies, and more particularly to media playback.
  • Video viewers may skip unwatched portions of a video by fast-forwarding, or revisit already-played portions by rewinding. Usually the viewer clicks or drags the progress bar, or slides on the player's screen, to achieve fast-forward or rewind.
  • an embodiment of the present invention provides a method for playing media, including:
  • the play progress change instruction includes an adjustment manner, a start position, and a target position of the input;
  • the decoded media data starting from the target time is played in response to the system clock advancing to the target time.
  • an embodiment of the present invention provides a media playback device, including:
  • a memory for storing processor executable instructions
  • the executable instructions cause the processor to implement:
  • the play progress change instruction includes an adjustment manner, a start position, and a target position of the input
  • the decoded media data starting from the target time is played in response to the system clock advancing to the target time.
  • one or more computer-readable media having stored thereon instructions that, when executed by one or more processors, perform the method described in the first aspect above.
  • FIG. 1 is a schematic structural diagram of a media playing system according to an embodiment of the present disclosure
  • FIG. 2-A is an application scenario of media click rewind playback according to an embodiment of the present disclosure;
  • FIG. 2-B is an application scenario of media click fast-forward playback according to an embodiment of the present disclosure;
  • FIG. 2-C is an application scenario of media sliding rewind playback according to an embodiment of the present disclosure;
  • FIG. 2-D is an application scenario of media sliding fast-forward playback according to an embodiment of the present disclosure;
  • FIG. 3 is a flowchart of a method for playing media according to an embodiment of the present disclosure
  • FIG. 4 is a flowchart of another method for playing media according to an embodiment of the present disclosure.
  • FIG. 5 is a timing diagram of a media playing method according to an embodiment of the present disclosure.
  • FIG. 6 is another timing diagram of a media playing method according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic structural diagram of an apparatus for playing media according to an embodiment of the present disclosure.
  • FIG. 8 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure.
  • The embodiments of the present disclosure are applicable to any terminal device that can play a media file; the terminal device can be a smart TV, a smart phone, a personal computer (PC), a portable multimedia player, etc. The type of terminal device is not limited in this disclosure.
  • FIG. 1 is a structural diagram of a media playing system according to an embodiment of the present disclosure. As shown in FIG. 1, it may include a control module 101, an analysis layer 102, an audio-video separation layer 103, a decoding layer 104, a rendering layer 105, and a player 106.
  • the control module 101 can communicate with the analysis layer 102, the audio-video separation layer 103, the decoding layer 104, and the rendering layer 105, respectively, and the analysis layer 102 can communicate with the audio-video separation layer 103, and the audio-video separation layer 103 can be coupled to the decoding layer 104.
  • the decoding layer 104 can communicate with the rendering layer 105.
  • the player 106 can communicate with the control module 101, the parsing layer 102, the audio-video separation layer 103, the decoding layer 104, and the rendering layer 105, respectively.
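The layered hand-off just described can be pictured as a simple pipeline. The sketch below is a hypothetical illustration only; the Pipeline class and the stage functions are invented names, with each stage standing in for one of the layers 102 to 105:

```python
# Hypothetical sketch of the FIG. 1 data path: parse -> demux -> decode -> render.
# Names are illustrative, not taken from the patent.
class Pipeline:
    def __init__(self):
        self.stages = []                       # ordered (name, transform) pairs

    def add(self, name, fn):
        self.stages.append((name, fn))
        return self

    def push(self, data):
        for name, fn in self.stages:           # each layer transforms the data
            data = fn(data)
        return data

pipeline = (Pipeline()
            .add("parse",  lambda ts: {"stream": ts})          # parsing layer 102
            .add("demux",  lambda d: {**d, "a_v": "split"})    # separation layer 103
            .add("decode", lambda d: {**d, "decoded": True})   # decoding layer 104
            .add("render", lambda d: {**d, "rendered": True})) # rendering layer 105
```

In the real system the control module 101 and player 106 would drive and observe each stage rather than data flowing straight through.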
  • The application scenarios can be divided into the following four types: the click rewind operation shown in Figure 2-A, the click fast-forward operation shown in Figure 2-B, the sliding rewind operation shown in Figure 2-C, and the sliding fast-forward operation shown in Figure 2-D. A click operation corresponds to clicking a position on the progress bar;
  • a slide operation may correspond to dragging the progress bar, or in some scenarios to sliding on the player screen, which is not limited in this disclosure.
  • In Figure 2-A, the initial position of the progress bar is B; after the user clicks position A with a mouse or finger, the playback progress rewinds from B to A. In Figure 2-B, the initial position is A; after the user clicks position B, the playback progress fast-forwards from A to B. In Figure 2-C, the initial position is B, and the user slides the mouse or a finger to rewind the playback progress to position A. In Figure 2-D, the initial position is A, and the user slides the mouse or a finger to advance the playback progress to position B.
  • the application scenarios can be classified into the following types:
  • Type 1: playing a media stream through a web browser
  • Type 2: playing a media stream through a local player
  • Type 3: playing local media through a web browser
  • Type 4: playing local media through a local player
  • A media stream refers to audio/video content played over the network by streaming, such as video or multimedia files. It should be noted that the present disclosure is applicable to, but not limited to, the scene types described above. The following takes playing a media stream through a web browser as an example.
  • the data is compression encoded.
  • the most commonly used compression coding method is the IPB mode, where I represents an I frame, P represents a P frame, and B represents a B frame.
  • An I frame (also known as an Intra picture) is a key frame and belongs to intraframe compression.
  • A P frame is a predictive frame: decoding a P frame requires information from a preceding reference frame. A B frame is a bi-directional interpolated prediction frame: decoding it requires both a preceding frame and a following frame. P frames and B frames compress data relative to I frames.
  • IDR (Instantaneous Decoding Refresh) frame: upon receiving an IDR frame, the decoder can clear its reference buffer, and subsequent frames will not reference any frame before it.
  • An IDR frame is a special I frame. The difference from an ordinary I frame is that an IDR frame guarantees that no frame after it references any frame before it; in other words, wherever there is an IDR frame, decoding and playback can always start from that frame.
  • IDR frames are sometimes simply written as I frames. In the most commonly used H.264 encoding format, the default IDR interval is 250 frames.
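As a sketch of the entry-point property just described, the following hypothetical helper (the function name is invented) marks where a decoder may safely begin in a sequence of frame types:

```python
# Hypothetical sketch: in an IPB-coded sequence, a decoder can only start at a
# key frame. An IDR frame guarantees that no later frame references anything
# before it, so IDR positions are always safe decode entry points.
def decode_entry_points(frame_types):
    """Indices of frames at which decoding can safely begin."""
    return [i for i, t in enumerate(frame_types) if t == "IDR"]

# a short group of pictures; P and B frames depend on earlier (and later) frames
gop = ["IDR", "P", "B", "B", "P", "B", "B", "I", "P", "B"]
# decoding may begin only at index 0 in this sequence
```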
  • When playing a media stream, if the target position does not correspond to an I frame or an IDR frame, the decoder cannot start decoding directly from the target position. Therefore, according to the protocol specification, the browser actively finds the I frame or IDR frame that precedes the target time (the time corresponding to the target position) and is closest to it, and sends data to the player starting from that frame. In most scenarios, the player performs pause() to pause playback and flush() to clear its cache, then passively receives the data sent by the browser, and finally decodes and plays it. For different media sources, the interval between adjacent I frames or IDR frames varies from 0.1 seconds to 10 seconds, mostly between 0.5 and 2 seconds.
  • For example, when playback resumes, the video may start from the 8th second, resulting in a 1-second error relative to the user's target position at the 9th second. This gives the user the feeling that playback can never be positioned exactly at the target location.
  • Since the browser displays a progress bar while playing a media stream, the seek (fast-forward or rewind) described above manifests in the browser interface as the progress indicator jumping to the 9th-second position, then back to the 8th second, with playback starting from the 8th second, which degrades the user experience.
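The 1-second error described above can be reproduced with a short calculation. This sketch uses invented key-frame times (one key frame every 2 seconds) to show why a conventional seek lands before the target:

```python
# Hypothetical illustration of the seek error: the browser can only start
# sending data from the key frame closest before the target time.
def conventional_seek(key_frame_times, target_time):
    """Return where playback actually starts, and the resulting error."""
    start = max(t for t in key_frame_times if t <= target_time)
    return start, target_time - start

# key frames every 2 seconds; the user seeks to the 9th second
start, error = conventional_seek([0, 2, 4, 6, 8, 10], 9.0)
# playback starts at the 8th second, a 1.0 s error relative to the target
```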
  • FIG. 3 is a flowchart of a method for playing media according to an embodiment of the present disclosure, where the method is applied to a terminal device (for example, a television, a personal computer, a laptop, etc.) that can perform media playback.
  • a terminal device for example, a television, a personal computer, a laptop, etc.
  • the method includes the following steps:
  • S301 Receive an input of a user for triggering adjustment of a play progress of the media.
  • The operation includes the click rewind operation shown in FIG. 2-A, the click fast-forward operation shown in FIG. 2-B, the sliding rewind operation shown in FIG. 2-C, and the sliding fast-forward operation shown in FIG. 2-D.
  • S302 Generate a play progress change instruction based on the input, where the play progress change instruction includes the input adjustment mode, the start position, and the target position.
  • The adjustment mode includes a click operation and a sliding operation.
  • When the adjustment mode is a click operation, the target position corresponds to the position of the click;
  • when the adjustment mode is a sliding operation, the target position corresponds to the end position of the slide.
  • S303 Determine the target key frame that is located before the target time corresponding to the target position and has the shortest interval from the target time.
  • the target key frame is an I frame or an IDR frame.
  • Taking a TS (Transport Stream) media stream as an example, the PAT (Program Association Table) is parsed to obtain the PID (Packet Identifier) of the PMT (Program Map Table); using the information carried in the PMT, the TS packets containing a PES (Packetized Elementary Streams) header are located (i.e., TS packets whose payload_unit_start_indicator in the packet header is 1). After reading the picture_header information contained in such a TS packet (starting with the start code 0x00000100), skip the next 10 bits (temporal_reference occupies 10 bits) to reach the image coding type field picture_coding_type (occupying 3 bits): a picture_coding_type of 001 (binary) indicates an I frame, 010 a P frame, and 011 a B frame. In this way, the target key frame that precedes the target time corresponding to the target position and is closest to it can be determined.
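The field layout just described can be sketched in code. The following is a hypothetical parser for the picture_coding_type of the first picture header found in a TS packet payload, under the bit layout stated above (start code 0x00000100, 10-bit temporal_reference, 3-bit picture_coding_type):

```python
# Hypothetical sketch: classify the picture type of the first picture header
# found in a TS packet payload, per the bit layout described above.
PICTURE_START_CODE = b"\x00\x00\x01\x00"

def picture_coding_type(payload: bytes):
    """Return 'I', 'P' or 'B' for the first picture header in `payload`,
    or None if no picture start code is present."""
    idx = payload.find(PICTURE_START_CODE)
    if idx < 0 or idx + 6 > len(payload):
        return None
    # The two bytes after the start code hold temporal_reference (10 bits)
    # followed by picture_coding_type (3 bits).
    b1 = payload[idx + 5]
    pct = (b1 >> 3) & 0b111      # skip the low 2 bits of temporal_reference
    return {0b001: "I", 0b010: "P", 0b011: "B"}.get(pct)
```

A real implementation would first de-packetize the TS stream (skip the 4-byte TS header and any adaptation field) before scanning the payload; that bookkeeping is omitted here.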
  • S304 Decode the media data starting from the target key frame, while preventing presentation of the media data decoded from the target key frame.
  • Preventing presentation of the media data decoded from the target key frame includes:
  • if the media stream is a video-only media stream, occluding the decoded video to be played with a target fixed-frame image;
  • if the media stream is a mixed audio-video media stream, first separating it into an audio media stream and a video media stream with the audio-video splitter, then muting the decoded audio to be played and occluding the decoded video to be played with the target fixed-frame image.
  • The purpose of the fixed-frame image is to ensure visual continuity during the seek: the player keeps displaying the last frame shown before the seek. The image that the display is frozen on is called the fixed-frame image.
  • When the adjustment mode is a click operation, the image at the start position is acquired as the target fixed-frame image;
  • when the adjustment mode is a sliding operation, the image at the start position and then the key frame (i.e., I frame) images between the time corresponding to the start position and the target time are acquired and frozen in sequence. Correspondingly, the I frame image that lies between the start time and the target time and is closest to the target time is the one finally displayed on the screen, i.e., the target fixed-frame image.
  • S305 Play the decoded media data starting from the target time in response to the player system clock advancing to the target time.
  • Specifically, if the media stream is a video-only media stream, the freezing on the target fixed-frame image and the occlusion of the decoded video to be played are cancelled;
  • if the media stream is a mixed audio-video media stream, the muting of the decoded audio to be played is cancelled and, at the same time, the freezing on the target fixed-frame image and the occlusion of the decoded video to be played are cancelled.
  • In some embodiments, after receiving the user input for triggering adjustment of the play progress of the media, the terminal device:
  • performs steps S302 to S305 in response to the time interval between adjacent key frames being less than a preset value;
  • the preset value may be, for example, 1 second or 2 seconds.
  • In the embodiment of the present disclosure, a play progress change instruction is generated upon receiving the user input for triggering play progress adjustment; the target key frame that precedes the target time corresponding to the target position and is closest to it is determined; decoding starts from the target key frame while presentation of the media data decoded from the target key frame is prevented; and in response to the system clock advancing to the target time, the decoded media data starting from the target time is played. This solves the problem that, due to decoding dependencies, the media cannot be played exactly at the position selected on the progress bar, and improves the user experience.
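The overall method can be condensed into a short sketch. This is a simplified illustration under the assumption of a flat list of timestamped frames (the Frame class and function names are invented), not the actual player implementation:

```python
# Hypothetical sketch of the disclosed technique: decode from the nearest
# preceding key frame, but suppress presentation (mute + freeze) until the
# clock reaches the exact target time.
from dataclasses import dataclass

@dataclass
class Frame:
    pts: float       # presentation time, seconds
    is_key: bool     # True for I/IDR frames

def accurate_seek(frames, target_time):
    """Yield only the frames that should actually be presented."""
    start_pts = max(f.pts for f in frames if f.is_key and f.pts <= target_time)
    for f in frames:
        if f.pts < start_pts:
            continue              # data before the key frame is never sent
        # frames in [start_pts, target_time) are decoded but muted/occluded
        if f.pts >= target_time:
            yield f               # unmute/unfreeze: present from the target time

frames = [Frame(t, is_key=(t % 4 == 0)) for t in range(12)]
presented = [f.pts for f in accurate_seek(frames, 9.0)]
# presented == [9, 10, 11]: playback is exact despite decoding from 8 s
```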
  • FIG. 4 is a flowchart of a method for playing media according to an embodiment of the present disclosure.
  • The method is applied to a terminal device capable of playing media, and the terminal device can deploy the system framework shown in FIG. 1.
  • A media stream in which audio and video are mixed is taken as an example.
  • The embodiment shown in FIG. 3 is described in detail below with reference to FIG. 1.
  • the type of the media stream can be determined by the parsing layer 102.
  • the method includes the following steps:
  • S401 Receive an input of a user for triggering playback progress adjustment.
  • S402 Generate a play progress change instruction based on the input.
  • the play progress change instruction includes an adjustment manner, a start position, and a target position of the input.
  • S403 Determine a target key frame that has the shortest target time interval corresponding to the target location and is located before the target time.
  • Specifically, the browser invokes the pause() interface of the player to pause the current media stream and passes the target position to the player through the seek() interface; at the same time, the parsing layer 102 parses the media stream to determine the target key frame pos_I_frame that precedes the target time and is closest to it, and the browser requests data from the webpage.
  • S404 Perform audio and video separation on the audio and video mixed media stream.
  • Specifically, the mixed stream is separated into an audio media stream and a video media stream by the audio-video splitter in the audio-video separation layer 103, and the two streams are then sent separately to the video decoder and the audio decoder in the decoding layer 104.
  • S405 Acquire a fixed frame image and perform frame fixation.
  • Specifically, the player acquires the fixed-frame image, freezes it via the freeze() interface, and then waits for the browser to send the data.
  • S406 Decode the video media stream from the target key frame, and simultaneously decode the audio media stream at a corresponding frame of the audio media stream.
  • Specifically, the decoding layer 104 decodes the video media stream starting from the target key frame and decodes the audio media stream starting from its corresponding frame. Since the browser requests data from the webpage according to the target key frame pos_I_frame, the frame at which the audio media stream starts decoding corresponds to the target key frame. For convenience of expression, the text describes the audio media stream as being decoded from the target key frame; in other words, the audio stream is decoded starting from the point in time at which it is synchronized with the target key frame.
  • step S405 and step S406 are related to each other in some application scenarios. Specifically, the description will be made in conjunction with FIG. 2-A, FIG. 2-B, FIG. 2-C, and FIG. 2-D.
  • As shown in FIG. 2-A, the application scenario is rewind using the click mode.
  • The progress bar is rolled back from position B to position A.
  • Specifically, the start position B is determined from the play progress change instruction in step S402, and the image at position B is acquired as the fixed-frame image and frozen; then step S406 is performed, decoding from I frame 1 of the video media stream and the corresponding frame of the audio media stream.
  • As shown in FIG. 2-B, the application scenario is fast-forward using the click mode.
  • the progress bar advances from the A position to the B position.
  • Specifically, the start position A is determined from the play progress change instruction in step S402, and the image at position A is acquired as the fixed-frame image and frozen; then step S406 is performed, decoding from I frame 7 of the video media stream and the corresponding frame of the audio media stream.
  • As shown in FIG. 2-C, the application scenario is rewind using the sliding mode.
  • The playback progress is rolled back from position B to position A.
  • Specifically, the start position B is determined from the play progress change instruction in step S402, and the image at position B is acquired as the fixed-frame image and frozen; then, as the play progress changes, the images at the positions of I frame 7, I frame 6, I frame 5, I frame 4, I frame 3, and I frame 2 are sequentially acquired and frozen as fixed-frame images; finally, step S406 is performed, decoding from I frame 1 of the video media stream and the corresponding frame of the audio media stream.
  • Each time a fixed-frame image is acquired and frozen, it replaces the previous fixed-frame image. Therefore, the fixed-frame image in effect when step S406 is executed in this scenario is the image at the position of I frame 2, i.e., the target fixed-frame image in step S407.
  • As shown in FIG. 2-D, the application scenario is fast-forward using the sliding mode.
  • The playback progress advances from position A to position B.
  • Specifically, the start position A is determined from the play progress change instruction, and the image at position A is acquired as the fixed-frame image and frozen; then, as the play progress changes, the images at the positions of I frame 2, I frame 3, I frame 4, I frame 5, I frame 6, and I frame 7 are sequentially acquired and frozen as fixed-frame images; finally, step S406 is performed, decoding from I frame 7 of the video media stream and the corresponding frame of the audio media stream.
  • Each time a fixed-frame image is acquired and frozen, it replaces the previous fixed-frame image. Therefore, the fixed-frame image in effect when step S406 is executed in this scenario is the image at the position of I frame 7, i.e., the target fixed-frame image in step S407.
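The four scenarios reduce to one selection rule for the target fixed-frame image. The sketch below is a hypothetical condensation of that behavior (function and parameter names are invented):

```python
# Hypothetical sketch: which image stays frozen on screen while the
# background decode catches up (times are in seconds).
def target_fixed_frame(mode, start_time, target_time, i_frame_times):
    """Time whose image becomes the target fixed-frame image."""
    if mode == "click":
        return start_time        # FIG. 2-A / 2-B: freeze the start-position image
    # FIG. 2-C / 2-D: I-frame images between start and target are frozen in
    # turn; the last one shown is the I frame closest to the target time
    lo, hi = sorted((start_time, target_time))
    between = [t for t in i_frame_times if lo <= t <= hi]
    if not between:
        return start_time
    return min(between, key=lambda t: abs(target_time - t))

# sliding rewind as in FIG. 2-C: I frames at 1..7 s, slide from 9 s to 1.5 s
# -> the image frozen when decoding starts is the one at the 2 s I frame
```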
  • S407 Mute the decoded audio media stream to be played, and occlude the decoded video media stream to be played with the target fixed-frame image.
  • Specifically, the player calls mute() on the rendering layer 105, that is, on the audio renderer and the video renderer respectively, so that the audio output and the video output are each suppressed; playback thus proceeds in the background and no content is presented to the user.
  • Then, in response to the player system clock advancing to the target time, unmute() is applied to the rendered media, and unfreeze() is called to release the target fixed-frame image.
  • the method steps shown in FIG. 4 will be further described in conjunction with the timing diagram of the media playing method shown in FIG. 5.
  • the method includes:
  • The Browser calls the pause() interface of the Player to pause playback.
  • flushIfNeed() is called to clear the cached data, and the target position pos_target_seek included in the playback progress change instruction in step S402 is sent to the Player through the seek() interface.
  • Data is requested from the Webpage via requestData().
  • After receiving the seek() call, the Player freezes the current image through the freeze() interface, sends a pause-decoding instruction pauseDecode() to the Decoder, and then waits for the Webpage to send the data via Data().
  • After the Browser requests the data from the Webpage, the data is written to the Player through writeData().
  • The Player then sends a resume-decoding instruction resumeDecode() to the Decoder, and the Decoder starts decoding the data written to the Player via writeData() from the target key frame determined in step S403.
  • the data is decoded and sent to the renderer, while mute() is called to mute the rendered data.
  • In this way, playback can start accurately at the target position of the playback progress.
  • The effect is illustrated by an example: when the user has watched to the 3rd second and changes the playback progress to the 9th second (pos_target_seek) by clicking, the player freezes the 3rd-second picture, then decodes in the background from the 8th-second position (pos_I_frame, assuming the 8th second is a key frame position) and plays in the background.
  • When background playback reaches the 9th second, the player releases the 3rd-second frozen frame, displays the video output starting at the 9th second, and unmutes the audio, achieving the effect of playing directly from the 9th second.
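The FIG. 5 call sequence can be re-enacted with toy stubs. The interface names (pause(), flushIfNeed(), seek(), freeze(), mute(), writeData(), pauseDecode(), resumeDecode(), unfreeze(), unmute()) follow the description above, while the classes themselves are hypothetical:

```python
# A toy re-enactment of the FIG. 5 call sequence; Decoder and Player are
# invented stubs, not the actual implementation.
class Decoder:
    def __init__(self): self.paused = True
    def pauseDecode(self):  self.paused = True
    def resumeDecode(self): self.paused = False

class Player:
    def __init__(self, decoder):
        self.decoder, self.frozen, self.muted = decoder, False, False
    def pause(self): pass
    def flushIfNeed(self): pass                 # drop cached data
    def seek(self, pos_target_seek):
        self.freeze()                           # keep showing the last frame
        self.decoder.pauseDecode()
        self.target = pos_target_seek
    def freeze(self):   self.frozen = True
    def unfreeze(self): self.frozen = False
    def mute(self):     self.muted = True
    def unmute(self):   self.muted = False
    def writeData(self, pts):
        self.decoder.resumeDecode()
        self.mute()                             # background decode stays silent
        if pts >= self.target:                  # clock reached the target time
            self.unmute()
            self.unfreeze()

player = Player(Decoder())
player.pause(); player.flushIfNeed(); player.seek(9.0)
for pts in (8.0, 8.5, 9.0):                     # data arrives from the key frame
    player.writeData(pts)
# by 9.0 s the player is unmuted and unfrozen, i.e. presenting normally
```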
  • the embodiments of the present disclosure include corresponding hardware structures and/or software modules for performing the respective functions in order to implement the above functions.
  • the embodiments of the present disclosure can be implemented in a combination of hardware or hardware and computer software in combination with the units (means, devices) and algorithm steps of the various examples described in the embodiments of the present disclosure. Whether a function is implemented in hardware or computer software to drive hardware depends on the specific application and design constraints of the solution. A person skilled in the art can use different methods for implementing the described functions for each specific application, but such implementation should not be considered to be beyond the scope of the technical solutions of the embodiments of the present disclosure.
  • In the embodiments of the present disclosure, the apparatus performing the above method may be divided into functional units (devices) according to the above method examples.
  • Each function may be assigned its own functional unit, or two or more functions may be integrated into one processing unit.
  • The integrated unit can be implemented in the form of hardware or in the form of a software functional unit. It should be noted that the division of units in the embodiments of the present disclosure is schematic and is only a logical function division; other divisions are possible in actual implementation.
  • FIG. 7 shows a device for media playback provided by an embodiment of the present disclosure.
  • the apparatus includes a receiving module 701, a generating module 702, a first determining module 703, a first processing module 704, and a second processing module 705.
  • the receiving module 701 is configured to receive an input of the user for triggering a progress adjustment of the media
  • a generating module 702 configured to generate a play progress change instruction based on the input, where the play progress change instruction includes an adjustment manner, a start position, and a target position of the input;
  • a first determining module 703, configured to determine a target key frame that has a shortest target time interval corresponding to the target location and is located before the target time;
  • a first processing module 704 configured to start decoding from the target key frame, and prevent presentation of media data obtained by decoding from the target key frame;
  • the second processing module 705 is configured to play the decoded media data starting from the target time in response to the player system clock advancing to the target time.
  • The progress adjustment manner includes a click operation and a sliding operation:
  • when the adjustment manner is a click operation, the target position corresponds to the position of the click;
  • when the adjustment manner is a sliding operation, the target position corresponds to the end position of the slide.
  • the device further comprises:
  • the second determining module is configured to determine, after receiving the input of the user for triggering the adjustment of the playing progress of the media, whether the spacing between adjacent key frames is less than a preset value.
  • the device further includes:
  • a third processing module configured to: when the interval between adjacent key frames is greater than the preset value, determine the target key frame that precedes the target time and is closest to it, and start decoding from the target key frame for media playback.
  • FIG. 8 is a schematic structural diagram of a computer device 800 according to an embodiment of the present application, i.e., another structural diagram of the media playback device 700.
  • computer device 800 includes a processor 801 and a network interface 802.
  • the processor 801 can also be a controller.
  • the processor 801 is configured to support the media playback device 700 to perform the functions involved in Figures 3 through 5.
  • the network interface 802 is configured to perform the function of the media playback device 700 to send and receive messages.
  • Computer device 800 may also include a memory 803 coupled with the processor 801, which holds the program instructions and data necessary for the device.
  • the processor 801, the network interface 802, and the memory 803 are connected by an internal bus 804, where the memory 803 is used to store instructions, and the processor 801 is configured to execute the instructions stored in the memory 803 to control the network interface 802 to send and receive messages.
  • for the concepts, explanations, and details concerning the media playback device 700 and the computer device 800, refer to the descriptions in the foregoing methods or other embodiments; they are not repeated here.
  • the processor involved in the foregoing embodiments of the present application may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or carry out the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure.
  • the processor can also be a combination implementing computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
  • the memory may be integrated in the processor or may be separately provided from the processor.
  • the embodiment of the present application further provides a computer storage medium storing instructions that, when executed, cause the processor to perform any of the methods involving the foregoing media playback device.
  • the embodiment of the present application further provides a computer program product storing a computer program, which is used to execute the media playback method involved in the foregoing method embodiments.
  • embodiments of the present application can be provided as a method, a system, or a computer program product. Therefore, the embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
  • Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions.
  • These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • the computer program instructions can also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present disclosure provides a method and device for media playback. According to one embodiment, the method includes: generating a playback-progress change instruction upon receiving a user input for triggering adjustment of the playback progress of the media; determining the target key frame that precedes the target time corresponding to the target position of the adjustment and is closest to it; decoding from the target key frame while preventing presentation of the media data decoded from the target key frame; and, in response to a system clock advancing to the target time, playing the decoded media data starting from the target time.

Description

Media Playback
CROSS-REFERENCE TO RELATED APPLICATION
This patent application claims priority to Chinese Patent Application No. 201810180598.5, filed on March 5, 2018 and entitled "Method and Apparatus for Media Playback", which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
The present invention relates to the field of multimedia technology, and in particular to media playback.
BACKGROUND
During video playback, a viewer may fast-forward to skip unwatched portions of the video or rewind to review portions already played. The viewer typically does so by clicking or dragging the progress bar or by sliding on the player screen.
SUMMARY
In a first aspect, an embodiment of the present invention provides a method for media playback, including:
receiving a user input for triggering adjustment of the playback progress of the media;
generating a playback-progress change instruction based on the input, the instruction including the adjustment manner, start position, and target position of the input;
determining the target key frame that precedes the target time corresponding to the target position and is closest to it;
decoding the media data starting from the target key frame, while preventing presentation of the media data decoded from the target key frame;
in response to a system clock advancing to the target time, playing the decoded media data starting from the target time.
In a second aspect, an embodiment of the present invention provides a media playback device, including:
a processor; and
a memory for storing processor-executable instructions;
wherein the executable instructions cause the processor to:
receive a user input for triggering adjustment of the playback progress of the media;
generate a playback-progress change instruction based on the input, the instruction including the adjustment manner, start position, and target position of the input;
determine the target key frame that precedes the target time corresponding to the target position and is closest to it;
decode the media data starting from the target key frame, while preventing presentation of the media data decoded from the target key frame;
in response to a system clock advancing to the target time, play the decoded media data starting from the target time.
In a third aspect, one or more computer-readable media are provided, storing instructions that, when executed by one or more processors, perform the method of the first aspect.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an architecture diagram of a media playback system according to an embodiment of the present disclosure;
FIG. 2-A shows a click-to-rewind playback scenario according to an embodiment of the present disclosure;
FIG. 2-B shows a click-to-fast-forward playback scenario according to an embodiment of the present disclosure;
FIG. 2-C shows a slide-to-rewind playback scenario according to an embodiment of the present disclosure;
FIG. 2-D shows a slide-to-fast-forward playback scenario according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of a media playback method according to an embodiment of the present disclosure;
FIG. 4 is a flowchart of another media playback method according to an embodiment of the present disclosure;
FIG. 5 is a sequence diagram of the media playback method according to an embodiment of the present disclosure;
FIG. 6 is another sequence diagram of the media playback method according to an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a media playback apparatus according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure.
DETAILED DESCRIPTION
To make the objectives, technical solutions, and advantages of the present disclosure clearer, embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.
The embodiments of the present disclosure are applicable to any terminal device capable of playing media files, such as a smart TV, a smartphone, a PC (personal computer), or a portable multimedia player; it will be appreciated that the embodiments place no limitation on the specific terminal device.
For ease of understanding, before the embodiments are explained in detail, the system architecture and application scenarios involved in the embodiments are introduced.
First, the system architecture is introduced.
FIG. 1 is an architecture diagram of a media playback system according to an embodiment of the present disclosure. As shown in FIG. 1, the system may include a control module 101, a parsing layer 102, an audio/video demultiplexing layer 103, a decoding layer 104, a rendering layer 105, and a player 106. The control module 101 can communicate with the parsing layer 102, the demultiplexing layer 103, the decoding layer 104, and the rendering layer 105 respectively; the parsing layer 102 can communicate with the demultiplexing layer 103, the demultiplexing layer 103 with the decoding layer 104, and the decoding layer 104 with the rendering layer 105. The player 106 can communicate with each of the control module 101, the parsing layer 102, the demultiplexing layer 103, the decoding layer 104, and the rendering layer 105.
Next, the application scenarios are introduced.
Depending on the progress-adjustment manner, the application scenarios fall into four types: the click-to-rewind operation shown in FIG. 2-A, the click-to-fast-forward operation shown in FIG. 2-B, the slide-to-rewind operation shown in FIG. 2-C, and the slide-to-fast-forward operation shown in FIG. 2-D. A click operation corresponds to clicking the corresponding position on the progress bar, while a slide operation may correspond either to dragging the progress bar or, in some scenarios, to sliding on the player screen; the present disclosure does not limit this.
In FIG. 2-A, the progress bar starts at position B; after the user clicks at position A with a mouse or finger, playback rewinds from B to A. In FIG. 2-B, the progress bar starts at A; after the user clicks at B, playback fast-forwards from A to B. In FIG. 2-C, playback starts at B, and the user slides with a mouse or finger to rewind playback to A. In FIG. 2-D, playback starts at A, and the user slides with a mouse or finger to move playback to B.
Depending on the media type and playback manner, the application scenarios include the following types:
Type 1: playing a media stream in a web browser;
Type 2: playing a media stream in a web application;
Type 3: playing local media in a web browser;
Type 4: playing local media with a local player;
Type 5: playing media via DLNA (Digital Living Network Alliance).
A media stream refers to audio/video or other media, such as video or multimedia files, played over a network by streaming transmission. It should be noted that the present disclosure applies to, but is not limited to, the above scenario types. The following uses playing a media stream in a web browser as an example.
To improve video transmission and storage efficiency, the data is compression-encoded. The most common compression scheme is IPB coding, where I denotes I frames, P denotes P frames, and B denotes B frames. An I frame (intra picture) is a key frame using intra-frame compression: the frame alone suffices for decoding. A P frame (predictive frame) requires information from preceding reference frames to decode. A B frame (bi-directional interpolated prediction frame) requires both preceding frames and following to-be-decoded frames for decoding; both P and B frames compress data relative to I frames. In addition, there is the IDR (Instantaneous Decoding Refresh) frame, which may be regarded as a special I frame: when the decoder reaches an IDR frame it can flush its buffers entirely, because no subsequent frame references anything before it. IDR frames are a subset of I frames; unlike an ordinary I frame, an IDR frame guarantees that no frame after it references any frame before it, so in theory, wherever an IDR frame exists, the video can always be decoded and played starting from that frame. Because their functions are similar, IDR frames are sometimes simply written as I frames. In H.264, the most widely used coding format, the default IDR interval is 250 frames.
When playing a media stream, if the target position does not correspond to an I frame or IDR frame, the decoder cannot start decoding directly from the target position. Per the protocol convention, the browser therefore finds the I frame or IDR frame that precedes the target time corresponding to the target position and is closest to it, and delivers data to the player starting from that frame. In most solutions, the player calls pause() to pause playback and flush() to clear its buffers, then passively receives the data delivered by the browser, and finally decodes and plays it. The interval between adjacent I or IDR frames varies by media source, from 0.1 s to 10 s, most commonly between 0.5 s and 2 s. Within this interval the decoder cannot decode directly; it must find the preceding I frame closest to the target time before it can decode the data and resume playback. Playback therefore starts earlier than the target position, causing imprecise positioning. For example, suppose a user plays a stream with interval = 2 s and drags the progress bar to the 9-second mark, so the target time corresponding to the target position is 9 s. The I frames near 9 s are at the 8-second and 10-second positions, so when the player requests the data at 9 s, the browser looks for the closest preceding I frame, namely the I frame at 8 s. On resume, the video plays from 8 s, a 1-second error relative to the user's 9-second target. This gives the user the impression that the video can never be positioned exactly at the target. Moreover, because a progress bar is displayed while the browser plays the stream, the above seek manifests on the browser interface as the progress bar being clicked to the 9-second position, jumping back to 8 s by itself, and then playing from 8 s, which degrades the user experience.
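The key-frame lookup and the resulting 1-second error described above can be illustrated with a short Python sketch (the helper name and key-frame timestamps are hypothetical, not part of the disclosure):

```python
from bisect import bisect_right

def nearest_preceding_keyframe(keyframe_times, target_time):
    """Return the latest I/IDR frame time at or before target_time.

    Decoding must start here, because P and B frames cannot be
    decoded without their preceding key frame.
    """
    i = bisect_right(keyframe_times, target_time)
    if i == 0:
        raise ValueError("no key frame at or before the target")
    return keyframe_times[i - 1]

# I frames every 2 s (interval = 2 s); the user seeks to 9 s.
keyframes = [0, 2, 4, 6, 8, 10]
start = nearest_preceding_keyframe(keyframes, 9)
print(start)      # 8 -> where a conventional player resumes
print(9 - start)  # 1 -> the 1-second positioning error seen by the user
```

With a 10-second key-frame interval the error can approach 10 seconds, which is why the disclosure hides this catch-up phase instead of exposing it.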
In view of the above problems, the present disclosure provides a media playback method and device. Having introduced the system architecture and application scenarios involved in the embodiments, the embodiments are now explained in detail. FIG. 3 is a flowchart of a media playback method according to an embodiment of the present disclosure, applied to a terminal device capable of media playback (e.g., a television, a personal computer, or a laptop). Referring to FIG. 3, the method includes the following steps:
S301: Receive a user input for triggering adjustment of the playback progress of the media.
Optionally, the operation includes the click-to-rewind operation shown in FIG. 2-A, the click-to-fast-forward operation shown in FIG. 2-B, the slide-to-rewind operation shown in FIG. 2-C, and the slide-to-fast-forward operation shown in FIG. 2-D.
S302: Generate a playback-progress change instruction based on the input, the instruction including the adjustment manner, start position, and target position of the input.
In some implementations, the adjustment manner is a click operation; in this case the target position corresponds to the click position.
In some implementations, the adjustment manner is a slide operation; in this case the target position corresponds to the end position of the slide operation.
S303: Determine the target key frame that precedes the target time corresponding to the target position and is closest to it.
In some implementations, for a video media stream, the target key frame is an I frame or IDR frame. In some implementations, the PAT (Program Association Table) in the TS (Transport Stream) is located via PID (Packet Identifier) 0x00; the PMT (Program Map Table) is then located from the PMT information carried in the PAT. From the PMT, the TS packets containing PES (Packetized Elementary Stream) headers are obtained (i.e., TS packets whose header has payload_unit_start_indicator set to 1). After reading the picture_header information contained in such a TS packet (beginning with 0x00000100), skipping another 10 bits (temporal_reference occupies 10 bits) reaches the 3-bit picture coding type field picture_coding_type: 001 (binary) indicates an I frame, 010 a P frame, and 011 a B frame. In this way, the target key frame that precedes the target time corresponding to the target position and is closest to it can be determined.
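The last step above (reading picture_coding_type) can be sketched roughly as follows, assuming the MPEG-2 video payload has already been extracted from TS packets whose payload_unit_start_indicator is 1 (the function and constants are illustrative, not from the disclosure):

```python
PICTURE_START_CODE = b"\x00\x00\x01\x00"  # picture_header begins 0x00000100
FRAME_TYPES = {0b001: "I", 0b010: "P", 0b011: "B"}

def frame_type(payload: bytes):
    """Read picture_coding_type from an MPEG-2 picture header.

    After the 32-bit start code come 10 bits of temporal_reference,
    then the 3-bit picture_coding_type field.
    """
    pos = payload.find(PICTURE_START_CODE)
    if pos < 0:
        return None
    # The first byte after the start code holds 8 bits of temporal_reference;
    # the next byte holds its top 2 bits, then picture_coding_type.
    b1 = payload[pos + 5]
    return FRAME_TYPES.get((b1 >> 3) & 0b111)

print(frame_type(b"\x00\x00\x01\x00" + bytes([0x00, 0x08])))  # I
```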
S304: Decode the media data starting from the target key frame, while preventing presentation of the media data decoded from the target key frame.
In one embodiment, preventing presentation of the media data decoded from the target key frame includes:
when the media stream is a video stream, masking the decoded to-be-played video stream with a target freeze-frame image;
when the media stream is a mixed audio/video stream, first demultiplexing it into an audio stream and a video stream, then muting the decoded to-be-played audio stream and masking the decoded to-be-played video stream with the target freeze-frame image.
The freeze-frame image preserves visual continuity during the seek: the player keeps displaying the last image shown before the seek, and this held image is the freeze-frame image. Specifically, when the adjustment manner is a click operation, the image at the start position is taken as the target freeze-frame image and frozen. When the adjustment manner is a slide operation, the image at the start position and then the key-frame (I-frame) images at fixed intervals between the time corresponding to the start position and the target time are taken in turn as freeze-frame images; during the slide, the screen successively displays the start-position image and the key-frame (I-frame) images between the start time and the target time, and accordingly the screen finally displays the I-frame image that lies between the start time and the target time and is closest to the target time.
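The freeze-frame selection for the two adjustment manners might be sketched as follows (hypothetical helper; the key-frame times are assumed inputs):

```python
def freeze_frames(mode, start_time, target_time, keyframe_times):
    """Pick the images frozen on screen during a seek.

    Click: keep only the frame shown at the start position.
    Slide: step through the key frames between start and target, so the
    last frozen image is the key frame closest to the target time.
    """
    if mode == "click":
        return [start_time]
    lo, hi = sorted((start_time, target_time))
    between = [t for t in keyframe_times if lo < t < hi]
    if target_time < start_time:  # sliding backward, as in FIG. 2-C
        between.reverse()
    return [start_time] + between

print(freeze_frames("click", 3, 9, [0, 2, 4, 6, 8]))  # [3]
print(freeze_frames("slide", 1, 7, [0, 2, 4, 6, 8]))  # [1, 2, 4, 6]
```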
S305: In response to the player system clock advancing to the target time, play the decoded media data starting from the target time.
While steps S301-S304 are performed, the player's System Time Clock (STC) advances normally. When the player system clock advances to the target time, playing the decoded media data from the target time specifically includes:
if the media stream is a video stream, cancelling the freeze of the target freeze-frame image and the masking of the decoded to-be-played video stream;
if the media stream is a mixed audio/video stream, cancelling the muting of the decoded to-be-played audio stream while cancelling the freeze of the target freeze-frame image and the masking of the decoded to-be-played video stream.
In some implementations, after receiving the user input for triggering adjustment of the playback progress of the media, the terminal device:
determines whether the time interval between adjacent key frames is less than a preset value;
in response to the time interval between adjacent key frames being less than the preset value, performs steps S302 to S305;
in response to the time interval between adjacent key frames being not less than the preset value, determines the target key frame that precedes the target time corresponding to the target position and is closest to it, and decodes and plays from that target key frame.
In some implementations, the preset value may be 1 second or 2 seconds.
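The branch above amounts to a small policy decision; a sketch (the names and the 1-second threshold are illustrative):

```python
PRESET_INTERVAL = 1.0  # seconds; the disclosure suggests 1 s or 2 s

def choose_seek_strategy(keyframe_interval):
    """Pick a seek path based on key-frame density.

    Interval below the preset: the freeze/mute background decode
    (steps S302-S305) is used. Otherwise: decode from the preceding
    key frame and play immediately.
    """
    if keyframe_interval < PRESET_INTERVAL:
        return "freeze-and-background-decode"  # steps S302-S305
    return "plain-keyframe-seek"
```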
In summary: a playback-progress change instruction is generated upon receiving the user input for triggering adjustment of the playback progress of the media; the target key frame that precedes the target time corresponding to the target position and is closest to it is determined; decoding starts from that target key frame while presentation of the media data decoded from it is prevented; and in response to the system clock advancing to the target time, the decoded media data is played from the target time. This solves the problem that decoding dependencies prevent the media from playing exactly at the changed progress-bar position, improving the user experience.
FIG. 4 is a flowchart of a media playback method according to an embodiment of the present disclosure. The method is applied to a terminal device capable of media playback, on which the system framework shown in FIG. 1 may be deployed. Taking a mixed audio/video media stream as an example, the embodiment shown in FIG. 3 is described in detail below with reference to FIG. 1. Note that the type of the media stream can be determined by the parsing layer 102. Referring to FIG. 4, the method includes the following steps:
S401: Receive a user input for triggering adjustment of the playback progress.
S402: Generate a playback-progress change instruction based on the input.
The playback-progress change instruction includes the adjustment manner, start position, and target position of the input.
S403: Determine the target key frame that precedes the target time corresponding to the target position and is closest to it.
Specifically, after receiving the playback-progress change instruction, the browser calls the player's pause() interface to pause the current media stream and passes the target position to the player via the seek() interface; meanwhile, the parsing layer 102 parses the media stream and determines the target key frame pos_I_frame that precedes the target time corresponding to the target position and is closest to it, and the browser requests data from the web page.
S404: Demultiplex the mixed audio/video media stream.
Specifically, after the browser obtains the data from the web page, the demultiplexer in the demultiplexing layer 103 separates the media stream into an audio stream and a video stream, which are then sent to the video decoder and audio decoder of the decoding layer 104 respectively.
S405: Obtain the freeze-frame image and freeze it.
Specifically, after receiving the seek() call, the player obtains the freeze-frame image, freezes it via the freeze() interface, and then waits for the browser to deliver data.
S406: Decode the video media stream starting from the target key frame, and simultaneously decode the audio media stream starting from its corresponding frame.
Specifically, the decoding layer 104 decodes the video stream from the target key frame and decodes the audio stream from its corresponding frame. Because the browser requests data from the web page based on the target key frame pos_I_frame, the frame at which audio decoding starts corresponds to the target key frame; for ease of description this is referred to as decoding the audio stream from the target key frame, i.e., the audio stream starts decoding from the time point synchronized with the target key frame.
Note that steps S405 and S406 are interrelated in some application scenarios, as explained below with reference to FIGS. 2-A, 2-B, 2-C, and 2-D.
FIG. 2-A shows the click-to-rewind scenario, in which the progress bar rewinds from position B to position A. First, the start position B is determined from the playback-progress change instruction of step S402 and the image at position B is taken as the freeze-frame image and frozen; then step S406 decodes from I frame 1 of the video stream and from the corresponding frame of the audio stream.
FIG. 2-B shows the click-to-fast-forward scenario, in which the progress bar fast-forwards from position A to position B. First, the start position A is determined from the playback-progress change instruction of step S402 and the image at position A is taken as the freeze-frame image and frozen; then step S406 decodes from I frame 7 of the video stream and from the corresponding frame of the audio stream.
FIG. 2-C shows the slide-to-rewind scenario. In some implementations, playback rewinds from position B to position A. First, the start position B is determined from the instruction of step S402 and the image at B is frozen as the freeze-frame image; then, as the playback progress changes, the images at I frame 7, I frame 6, I frame 5, I frame 4, I frame 3, and I frame 2 are successively taken as freeze-frame images and frozen; finally, step S406 decodes from I frame 1 of the video stream and the corresponding audio frame. Each newly obtained freeze-frame image replaces the previous one, so when step S406 is performed in this scenario the freeze-frame image is the image at I frame 2, which corresponds to the target freeze-frame image of step S407.
FIG. 2-D shows the slide-to-fast-forward scenario. In some implementations, playback fast-forwards from position A to position B. First, the start position A is determined from the instruction of step S402 and the image at A is frozen as the freeze-frame image; then, as the playback progress changes, the images at I frame 2, I frame 3, I frame 4, I frame 5, I frame 6, and I frame 7 are successively taken as freeze-frame images and frozen; finally, step S406 decodes from I frame 7 of the video stream and the corresponding audio frame. Each newly obtained freeze-frame image replaces the previous one, so when step S406 is performed in this scenario the freeze-frame image is the image at I frame 7, which is the target freeze-frame image of S407.
S407: Mute the decoded to-be-played audio stream and mask the decoded to-be-played video stream with the target freeze-frame image.
In this embodiment, after the decoded media stream enters the rendering layer 105, the player calls mute() on the rendering layer 105, specifically on its audio renderer and video renderer respectively, silencing the audio output and video output so that the media plays in the background without any content being presented to the user.
S408: In response to the player system clock advancing to the target time, cancel the muting of the to-be-played audio stream, the freeze of the target freeze-frame image, and the masking of the to-be-played video stream.
When the STC advances to the target time corresponding to the target position pos_target_seek, unmute() is called on the rendered media file and unfreeze() is called to release the target freeze-frame image.
S409: Play the media stream.
To clarify the ordering of the steps shown in FIG. 4, they are further explained with the sequence diagram of the media playback method shown in FIG. 5. Referring to FIG. 5, after steps S401 to S403 are performed:
The browser (Browser) calls the player's (Player's) pause() interface to pause playback; optionally, it calls flushIfNeed() to clear cached data; it sends the target position pos_target_seek contained in the playback-progress change instruction of step S402 to the Player via the seek() interface, while requesting data from the web page (Webpage) via requestData().
After receiving the seek() call, the Player freezes the image via the freeze() interface, sends a pause-decoding instruction pauseDecode() to the decoder (Decoder), and then waits for the Webpage to deliver Data().
After the Browser obtains the data from the Webpage, it writes the data into the Player via writeData(); on receiving the data, the Player sends a resume-decoding instruction ResumeDecode() to the Decoder, and the Decoder decodes the data written via writeData() starting from the target key frame determined in step S403.
The decoded data is sent to the renderer, while mute() is called to silence the rendered data.
When the Player's system clock STC advances to the target time corresponding to pos_target_seek, unmute() un-silences the rendered data while unfreeze() releases the target freeze-frame image.
In summary, with the above processing the player can play precisely at the target playback position. Using the earlier example: when the user, having watched to the 3-second mark, clicks to change the playback progress to the 9-second mark (pos_target_seek), the player freezes the picture at 3 s, then decodes and plays in the background from the 8-second position (pos_I_frame, assuming a key frame at the 8-second position). When background playback reaches 9 s, the player unfreezes the 3-second picture, displays the video output from 9 s, and plays the audio, achieving the effect of playing directly from 9 s.
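This worked example can be condensed into a small timeline (illustrative helper reusing the nearest-preceding-key-frame idea; the field names are hypothetical):

```python
from bisect import bisect_right

def seek_timeline(watch_pos, target, keyframe_times):
    """Summarize the example: freeze, hidden decode, resume at target."""
    start = keyframe_times[bisect_right(keyframe_times, target) - 1]
    return {
        "frozen_frame": watch_pos,  # picture held on screen during the seek
        "decode_from": start,       # hidden, muted background decode start
        "resume_at": target,        # unfreeze/unmute when the STC gets here
    }

print(seek_timeline(3, 9, [0, 2, 4, 6, 8, 10]))
# {'frozen_frame': 3, 'decode_from': 8, 'resume_at': 9}
```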
Furthermore, in this solution, as shown in FIG. 6, after receiving the progress-change instruction and while playing the video between pos_I_frame and pos_target_seek, the player does not return the steadily advancing STC time to the browser; it returns pos_target_seek instead, and only resumes returning the STC time once the STC has advanced to the time corresponding to pos_target_seek. Thus, while the video between pos_I_frame and pos_target_seek is playing, the progress bar does not jump to the pos_I_frame position but stays at the target position. This solution therefore also eliminates the backward jump of the progress bar after a seek.
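The progress value reported to the browser during this catch-up phase can be sketched as follows (hypothetical function; pos_I_frame and pos_target_seek as in FIG. 6):

```python
def reported_progress(stc, pos_i_frame, pos_target_seek):
    """Progress value the player returns during the hidden catch-up.

    While the decoder works through [pos_i_frame, pos_target_seek) the
    player reports the target position, so the progress bar never jumps
    back to the key frame; afterwards the real clock is reported again.
    """
    if pos_i_frame <= stc < pos_target_seek:
        return pos_target_seek
    return stc

print(reported_progress(8.4, 8, 9))  # 9 -> the bar stays at the target
print(reported_progress(9.5, 8, 9))  # 9.5 -> normal reporting resumes
```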
It will be appreciated that, to implement the above functions, the embodiments of the present disclosure include hardware structures and/or software modules corresponding to each function. Combining the units (components, devices) and algorithm steps of the examples described with the embodiments of the present disclosure, the embodiments can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the technical solutions of the embodiments.
The apparatus performing the above methods may be divided into functional units (components, devices) according to the above method examples: each function may be assigned its own functional unit, or two or more functions may be integrated into one processing unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. Note that the division of units in the embodiments is illustrative, being merely a logical functional division; other divisions are possible in actual implementation.
For the case of integrated units, FIG. 7 shows an apparatus for media playback provided by an embodiment of the present disclosure. Referring to FIG. 7, the apparatus includes a receiving module 701, a generating module 702, a first determining module 703, a first processing module 704, and a second processing module 705.
The receiving module 701 is configured to receive a user input for triggering adjustment of the playback progress of the media;
the generating module 702 is configured to generate a playback-progress change instruction based on the input, the instruction including the adjustment manner, start position, and target position of the input;
the first determining module 703 is configured to determine the target key frame that precedes the target time corresponding to the target position and is closest to it;
the first processing module 704 is configured to start decoding from the target key frame and prevent presentation of the media data decoded from the target key frame;
the second processing module 705 is configured to play the decoded media data starting from the target time in response to the player system clock advancing to the target time.
Optionally, the progress-adjustment manner includes a click operation and a slide operation;
when the adjustment manner is a click operation, the target position corresponds to the click position;
when the adjustment manner is a slide operation, the target position corresponds to the end position of the slide operation.
Optionally, the apparatus further includes:
a second determining module configured to determine, after the user input for triggering adjustment of the playback progress of the media is received, whether the interval between adjacent key frames is less than a preset value.
Optionally, the apparatus further includes:
a third processing module configured to, when the interval between adjacent key frames is greater than the preset value, determine the target key frame that precedes the target time corresponding to the target position and is closest to it, and start decoding from that target key frame for media playback.
FIG. 8 shows a schematic structural diagram of a computer device 800 according to an embodiment of the present application, i.e., another structural diagram of the media playback device 700. Referring to FIG. 8, the computer device 800 includes a processor 801 and a network interface 802, where the processor 801 may also be a controller. The processor 801 is configured to support the media playback apparatus 700 in performing the functions involved in FIGS. 3 to 5. The network interface 802 is configured to perform the message sending and receiving functions of the media playback apparatus 700. The computer device 800 may also include a memory 803 coupled with the processor 801, which holds the program instructions and data necessary for the device. The processor 801, the network interface 802, and the memory 803 are connected by an internal bus 804; the memory 803 stores instructions, and the processor 801 executes the instructions stored in the memory 803 to control the network interface 802 to send and receive messages, completing the steps by which the media playback apparatus 700 performs the corresponding functions in the above methods.
For the concepts, explanations, details, and other steps related to the technical solutions of the embodiments that concern the media playback apparatus 700 and the computer device 800, refer to the descriptions of these matters in the foregoing methods or other embodiments; they are not repeated here.
It should be noted that the processor involved in the above embodiments of the present application may be a central processing unit (CPU), a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and may implement or carry out the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor may also be a combination implementing computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The memory may be integrated in the processor or provided separately from the processor.
An embodiment of the present application further provides a computer storage medium storing instructions that, when executed, cause a processor to perform any of the methods involving the foregoing media playback apparatus.
An embodiment of the present application further provides a computer program product storing a computer program, the computer program being used to execute the media playback method involved in the foregoing method embodiments.
Those skilled in the art will appreciate that the embodiments of the present application may be provided as a method, a system, or a computer program product. The embodiments may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments. It should be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Claims (18)

  1. A method for media playback, comprising:
    receiving a user input for triggering adjustment of the playback progress of the media;
    generating a playback-progress change instruction based on the input, the instruction including the target position, adjustment manner, and start position of the input;
    determining the target key frame that precedes the target time corresponding to the target position and is closest to it;
    decoding the media data starting from the target key frame, and preventing presentation of the media data decoded from the target key frame;
    in response to a system clock advancing to the target time, playing the decoded media data starting from the target time.
  2. The method according to claim 1, wherein the adjustment manner comprises a click operation, and the target position corresponds to the click position.
  3. The method according to claim 1, wherein the adjustment manner comprises a slide operation, and the target position corresponds to the end position of the slide operation.
  4. The method according to claim 2, wherein preventing presentation of the media data decoded from the target key frame comprises:
    freezing, on the playback interface of the media data, a first picture being displayed when the operation is received.
  5. The method according to claim 3, wherein preventing presentation of the media data decoded from the target key frame comprises:
    displaying in sequence, on the playback interface of the media data, the first picture being displayed when the operation is received and one or more key frames between the time corresponding to the first picture and the target time, and finally freezing a second picture corresponding to the key frame that lies between the time corresponding to the first picture and the target time and has the smallest interval to the target time.
  6. The method according to claim 4 or 5, further comprising:
    muting the output of the audio data in the media data from the target key frame to the target time.
  7. The method according to claim 4, further comprising:
    in response to the system clock advancing to the target time, cancelling the freezing of the first picture.
  8. The method according to claim 5, further comprising:
    in response to the system clock advancing to the target time, cancelling the freezing of the second picture.
  9. The method according to claim 7 or 8, further comprising:
    cancelling the muting of the audio output.
  10. The method according to claim 1, comprising, after the operation input by the user is received:
    determining whether the time interval between adjacent key frames is less than a preset value;
    in response to the time interval between adjacent key frames being not less than the preset value, determining the target key frame that precedes the target time corresponding to the target position information and is closest to it; and
    decoding from the target key frame for media playback.
  11. A media playback device, comprising:
    a processor; and
    a memory for storing processor-executable instructions;
    wherein the executable instructions cause the processor to:
    receive a user input for triggering adjustment of the playback progress of the media;
    generate a playback-progress change instruction based on the input, the instruction including the target position, adjustment manner, and start position of the input;
    determine the target key frame that precedes the target time corresponding to the target position and is closest to it;
    decode the media data starting from the target key frame, and prevent presentation of the media data decoded from the target key frame;
    in response to a system clock advancing to the target time, play the decoded media data starting from the target time.
  12. The device according to claim 11, wherein, when presentation of the media data decoded from the target key frame is prevented, the executable instructions cause the processor to:
    freeze, on the playback interface of the media data, a first picture being displayed when the operation is received.
  13. The device according to claim 11, wherein, when presentation of the media data decoded from the target key frame is prevented, the executable instructions cause the processor to:
    display in sequence, on the playback interface of the media data, the first picture being displayed when the operation is received and one or more key frames between the time corresponding to the first picture and the target time, and finally freeze a second picture corresponding to the key frame that lies between the time corresponding to the first picture and the target time and has the smallest interval to the target time.
  14. The device according to claim 12 or 13, wherein the executable instructions further cause the processor to:
    mute the output of the audio data in the media data from the target key frame to the target time.
  15. The device according to claim 12, wherein the executable instructions further cause the processor to:
    in response to the system clock advancing to the target time, cancel the freezing of the first picture.
  16. The device according to claim 13, wherein the executable instructions further cause the processor to:
    in response to the system clock advancing to the target time, cancel the freezing of the second picture.
  17. The device according to claim 11, wherein, after the operation input by the user is received, the executable instructions cause the processor to:
    determine whether the time interval between adjacent key frames is less than a preset value; and
    in response to the time interval between adjacent key frames being not less than the preset value, determine the target key frame that precedes the target time corresponding to the target position information and is closest to it, and decode from the target key frame for media playback.
  18. A computer-readable non-volatile storage medium storing instructions that, when executed by a processor, implement the method according to any one of claims 1-10.
PCT/CN2019/076963 2018-03-05 2019-03-05 Media playback WO2019170073A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19764165.7A EP3739895A4 (en) 2018-03-05 2019-03-05 PLAYING MULTIMEDIA CONTENT
US16/360,588 US10705709B2 (en) 2018-03-05 2019-03-21 Playing media

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810180598.5A 2018-03-05 2018-03-05 Method and apparatus for media playback
CN201810180598.5 2018-03-05

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/360,588 Continuation US10705709B2 (en) 2018-03-05 2019-03-21 Playing media

Publications (1)

Publication Number Publication Date
WO2019170073A1 true WO2019170073A1 (zh) 2019-09-12

Family

ID=67846848

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/076963 WO2019170073A1 (zh) 2019-03-05 2019-09-12 Media playback

Country Status (3)

Country Link
EP (1) EP3739895A4 (zh)
CN (1) CN110234031A (zh)
WO (1) WO2019170073A1 (zh)


Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110913272A * 2019-12-03 2020-03-24 Tencent Technology (Shenzhen) Co., Ltd. Video playback method and apparatus, computer-readable storage medium, and computer device
CN111970486B * 2020-07-15 2022-04-19 Zhejiang Dahua Technology Co., Ltd. Video masking method, device, and storage medium
CN112601127B * 2020-11-30 2023-03-24 Oppo (Chongqing) Intelligent Technology Co., Ltd. Video display method and apparatus, electronic device, and computer-readable storage medium
CN113395581B * 2021-06-15 2023-07-25 Beijing Zitiao Network Technology Co., Ltd. Audio playback method and apparatus, electronic device, and storage medium
CN113726778A * 2021-08-30 2021-11-30 MIGU Video Technology Co., Ltd. Streaming media seek method, apparatus, computing device, and computer storage medium
CN115243063B * 2022-07-13 2024-04-19 Guangzhou Boguan Information Technology Co., Ltd. Video stream processing method, processing apparatus, and processing system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104618794A * 2014-04-29 2015-05-13 Tencent Technology (Beijing) Co., Ltd. Method and apparatus for playing video
US20150179224A1 * 2013-12-24 2015-06-25 JBF Interlude 2009 LTD - ISRAEL Methods and systems for seeking to non-key frames
CN104918120A * 2014-03-12 2015-09-16 Lenovo (Beijing) Co., Ltd. Playback progress adjustment method and electronic device
CN105208463A * 2015-08-31 2015-12-30 Beijing Baofeng Technology Co., Ltd. Method and system for frame determination for m3u8 files
CN107566918A * 2017-09-21 2018-01-09 The 28th Research Institute of China Electronics Technology Group Corporation Low-latency instant-start stream-fetching method in a video distribution scenario
CN107801100A * 2017-09-27 2018-03-13 Beijing Panda Mutual Entertainment Technology Co., Ltd. Video positioning playback method and apparatus

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5959690A (en) * 1996-02-20 1999-09-28 Sas Institute, Inc. Method and apparatus for transitions and other special effects in digital motion video
CN103596059A * 2013-11-21 2014-02-19 Leshi Zhixin Electronic Technology (Tianjin) Co., Ltd. Smart TV media player, playback progress adjustment method thereof, and smart TV
CN104661083A * 2015-02-06 2015-05-27 Nanjing Chuanchang Software Technology Co., Ltd. Video playback method and system, and streaming media playback method, apparatus, and system
US10142707B2 (en) * 2016-02-25 2018-11-27 Cyberlink Corp. Systems and methods for video streaming based on conversion of a target key frame


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3739895A4 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115225970A * 2021-04-16 2022-10-21 Hisense Visual Technology Co., Ltd. Display device and video jump method for display device
CN114786065A * 2022-03-29 2022-07-22 Guangzhou Maiduidui Technology Co., Ltd. FFmpeg-based method for precise switching of HLS video playback progress
CN115174924A * 2022-07-20 2022-10-11 Tianyi Digital Life Technology Co., Ltd. Set-top box, and video start-play delay calculation method, system, device, and medium
CN115174924B * 2022-07-20 2024-05-28 Tianyi Digital Life Technology Co., Ltd. Set-top box, and video start-play delay calculation method, system, device, and medium

Also Published As

Publication number Publication date
CN110234031A (zh) 2019-09-13
EP3739895A4 (en) 2021-02-17
EP3739895A1 (en) 2020-11-18

Similar Documents

Publication Publication Date Title
WO2019170073A1 (zh) Media playback
US10705709B2 (en) Playing media
US20210183408A1 (en) Gapless video looping
US8321905B1 (en) Fast switching of media streams
US7870281B2 (en) Content playback device, content playback method, computer-readable storage medium, and content playback system
EP3560205B1 (en) Synchronizing processing between streams
US8244897B2 (en) Content reproduction apparatus, content reproduction method, and program
US9379990B2 (en) System and method for streaming a media file from a server to a client device
US11356739B2 (en) Video playback method, terminal apparatus, and storage medium
WO2017005098A1 (zh) 一种实现视频流快进或快退的方法及装置
US9872054B2 (en) Presentation of a multi-frame segment of video content
JP2009065451A (ja) コンテンツ再生装置、コンテンツ再生方法、プログラム、およびコンテンツ再生システム
US11388474B2 (en) Server-side scene change content stitching
US11589119B2 (en) Pseudo seamless switching method, device and media for web playing different video sources
US11799943B2 (en) Method and apparatus for supporting preroll and midroll during media streaming and playback
US20230224557A1 (en) Auxiliary mpds for mpeg dash to support prerolls, midrolls and endrolls with stacking properties
CN117834963A (zh) 一种显示设备及流媒体的播放方法
EP4229854A1 (en) Method and apparatus for mpeg dash to support preroll and midroll content during media playback
CN117834982A (zh) 视频播放处理方法、装置、电子设备及存储介质
US20150030070A1 (en) Adaptive decoding of a video frame in accordance with initiation of non-sequential playback of video data associated therewith

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19764165

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019764165

Country of ref document: EP

Effective date: 20200813

NENP Non-entry into the national phase

Ref country code: DE