WO2022247014A1 - Audio and video frame synchronization method, apparatus and computer device based on an ink screen device - Google Patents

Audio and video frame synchronization method, apparatus and computer device based on an ink screen device

Info

Publication number
WO2022247014A1
WO2022247014A1 (PCT/CN2021/111592; CN2021111592W)
Authority
WO
WIPO (PCT)
Prior art keywords
audio
frames
video frames
data
key video
Prior art date
Application number
PCT/CN2021/111592
Other languages
English (en)
French (fr)
Inventor
邵清
郑勇
袁健
戴志涛
Original Assignee
深圳市沃特沃德信息有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市沃特沃德信息有限公司
Publication of WO2022247014A1

Links

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09FDISPLAYING; ADVERTISING; SIGNS; LABELS OR NAME-PLATES; SEALS
    • G09F27/00Combined visual and audible advertising or displaying, e.g. for public address
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09FDISPLAYING; ADVERTISING; SIGNS; LABELS OR NAME-PLATES; SEALS
    • G09F9/00Indicating arrangements for variable information in which the information is built-up on a support by selection or combination of individual elements
    • G09F9/30Indicating arrangements for variable information in which the information is built-up on a support by selection or combination of individual elements in which the desired character or characters are formed by combining individual elements
    • G09F9/37Indicating arrangements for variable information in which the information is built-up on a support by selection or combination of individual elements in which the desired character or characters are formed by combining individual elements being movable elements
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09GARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G3/00Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes
    • G09G3/20Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix no fixed position being assigned to or needed to be assigned to the individual characters or partial characters
    • G09G3/34Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix no fixed position being assigned to or needed to be assigned to the individual characters or partial characters by control of light from an independent source
    • G09G3/3433Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix no fixed position being assigned to or needed to be assigned to the individual characters or partial characters by control of light from an independent source using light modulating elements actuated by an electric field and being other than liquid crystal devices and electrochromic devices
    • G09G3/344Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix no fixed position being assigned to or needed to be assigned to the individual characters or partial characters by control of light from an independent source using light modulating elements actuated by an electric field and being other than liquid crystal devices and electrochromic devices based on particles moving in a fluid or in a gas, e.g. electrophoretic devices
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed

Definitions

  • the present application relates to the technical field of media playing, and in particular to an audio and video frame synchronization method, device and computer equipment based on an ink screen device.
  • An ink screen, also known as an electronic paper display, is an innovative way of presenting information. One of its major advantages over conventional displays is readability: its display medium, electronic ink, looks more like printed text and is therefore easier on the user's eyes. However, because the display refresh rate of an ink screen is low, audio and video frames fall out of sync when users watch videos or live streams on an ink screen device, which degrades the viewing experience.
  • the main purpose of this application is to provide an audio and video frame synchronization method, apparatus and computer device based on an ink screen device, aiming to solve the problem that audio and video frames are out of sync when existing ink screen devices are used to watch videos or live streams.
  • the present application provides an audio and video frame synchronization method based on an ink screen device, including:
  • buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data includes a plurality of audio frames with a first timestamp, and the video data includes a plurality of video frames with a second timestamp;
  • screening out several key video frames from the video frames, where the key video frames represent video frames with preset characteristics;
  • setting the playback duration of each key video frame according to the number of key video frames and the total playback duration of the audio data;
  • playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
  • the present application also provides an audio and video frame synchronization device based on an ink screen device, including:
  • a buffering module, configured to buffer audio data and video data, where the audio data and the video data originate from the same media data, the audio data includes a plurality of audio frames with a first timestamp, and the video data includes a plurality of video frames with a second timestamp;
  • a screening module, configured to screen out several key video frames from the video frames, where the key video frames represent video frames with preset characteristics;
  • a first setting module, configured to set the playback duration of each key video frame according to the number of key video frames and the total playback duration of the audio data;
  • a first synchronization module, configured to play the audio frames sequentially according to their corresponding first timestamps, and at the same time control the key video frames to be played sequentially according to their corresponding second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
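  • By way of a non-limiting illustration only (this sketch is not part of the original disclosure), the cooperation of the four modules listed above can be sketched in Python; every type and function name below is hypothetical:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class AudioFrame:
    pts: float          # first timestamp, in seconds

@dataclass
class VideoFrame:
    pts: float          # second timestamp, in seconds
    is_intra: bool      # preset characteristic: intra-coded (I) frame

def build_key_frame_schedule(video: List[VideoFrame],
                             audio_total_s: float) -> List[Tuple[VideoFrame, float]]:
    # Screening module: keep only the frames with the preset characteristic.
    key_frames = sorted((f for f in video if f.is_intra), key=lambda f: f.pts)
    # First setting module: total audio playback duration / number of key frames.
    duration = audio_total_s / len(key_frames)
    # The first synchronization module would then play the audio frames in
    # first-timestamp order while holding each key frame for `duration`.
    return [(f, duration) for f in key_frames]
```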
  • the present application also provides a computer device, including a memory and a processor, where a computer program is stored in the memory, and when the processor executes the computer program, an audio and video frame synchronization method based on an ink screen device is implemented;
  • the audio and video frame synchronization method based on the ink screen device includes:
  • buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data includes a plurality of audio frames with a first timestamp, and the video data includes a plurality of video frames with a second timestamp;
  • screening out several key video frames from the video frames, where the key video frames represent video frames with preset characteristics;
  • setting the playback duration of each key video frame according to the number of key video frames and the total playback duration of the audio data;
  • playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
  • the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, an audio and video frame synchronization method based on an ink screen device is implemented, and the audio and video frame synchronization method based on the ink screen device includes the following steps:
  • buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data includes a plurality of audio frames with a first timestamp, and the video data includes a plurality of video frames with a second timestamp;
  • screening out several key video frames from the video frames, where the key video frames represent video frames with preset characteristics;
  • setting the playback duration of each key video frame according to the number of key video frames and the total playback duration of the audio data;
  • playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
  • the system first buffers audio data and video data, wherein the audio data and video data originate from the same media data, the audio data includes a plurality of audio frames with a first timestamp, and the video data includes a plurality of video frames with a second timestamp.
  • the system screens out several key video frames from each video frame, and the key video frames represent video frames with preset characteristics; then, according to the number of key video frames and the total playing time of audio data, the playing time of each key video frame is set.
  • the system plays each audio frame in sequence according to its corresponding first time stamp, and controls each key video frame to play sequentially according to its corresponding second time stamp and playback duration, so as to realize synchronous playback of audio frames and video frames.
  • the system separates the media data into audio data and video data, screens out the key video frames from the video frames of the video data, sets the playback duration of each key video frame according to the total playback duration, and finally plays the key video frames simultaneously with the audio frames, realizing synchronization of the video frames and audio frames and improving the user experience.
  • Fig. 1 is a schematic diagram of steps of an audio and video frame synchronization method based on an ink screen device in an embodiment of the present application;
  • FIG. 2 is a block diagram of the overall structure of an audio and video frame synchronization device based on an ink screen device in an embodiment of the present application;
  • Fig. 3 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • an embodiment of the present application provides an audio and video frame synchronization method based on an ink screen device, including:
  • S1: buffering audio data and video data, where the audio data and the video data originate from the same media data, the audio data includes a plurality of audio frames with a first timestamp, and the video data includes a plurality of video frames with a second timestamp;
  • S2: screening out several key video frames from the video frames, where the key video frames represent video frames with preset characteristics;
  • S3: setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
  • S4: playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, realizing synchronized playback of the audio frames and the video frames.
  • the key video frame is an intra-coded frame.
  • the control system of the ink screen device (hereinafter referred to as the system), after receiving media data such as a live online class, first buffers the media data in a pre-built data buffer area. Then, by demultiplexing the media data, it separates the media data into audio data and video data and buffers them separately. The system then decodes the audio data and the video data respectively to obtain the audio frames contained in the audio data together with the first timestamp corresponding to each audio frame, and the video frames contained in the video data together with the second timestamp corresponding to each video frame.
  • the system screens the video frames to obtain a number of key video frames, where a key video frame represents a video frame with preset characteristics; preferably, the key video frame is an intra-coded frame, an independent frame that carries all of its own information, best expresses the behavior information in the video, can be decoded without reference to other pictures and can reconstruct a complete image on its own; it can simply be understood as a static picture.
  • taking the total playback duration of the audio data as the reference, the system performs an averaging calculation over the number of key video frames screened out of the video data to obtain the playback duration of each key video frame.
  • the system plays the audio frames in order according to their corresponding first timestamps; at the same time, it controls the key video frames to be played in order according to their corresponding second timestamps, each key video frame being displayed for the playback duration obtained from the above calculation, so that the audio frames and video frames are played synchronously when the ink screen device outputs the media data.
  • the system separates the media data into audio data and video data, then screens out key video frames from the video frames of the video data, sets the playback duration of each key video frame according to the total playback duration, and finally plays them simultaneously with the audio frames, realizing synchronization of video frames and audio frames and improving the user experience.
  • the step of setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data includes:
  • S301 Divide the total playing time by the number of key video frames to obtain the playing time of each key video frame.
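  • The calculation of step S301 can be illustrated with a short Python sketch (an illustration only, not part of the original disclosure; the function name is hypothetical):

```python
def key_frame_duration(total_playback_s: float, num_key_frames: int) -> float:
    # S301: divide the total playback duration of the audio data by the
    # number of key video frames.
    if num_key_frames <= 0:
        raise ValueError("at least one key video frame is required")
    return total_playback_s / num_key_frames

# Worked example from the description: 10 s of audio and 30 key frames -> 1/3 s each.
assert abs(key_frame_duration(10.0, 30) - 1 / 3) < 1e-9
```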
  • video data is made up of three kinds of frames: I-frames, P-frames and B-frames. The video compression encoder compresses and sends data packets at 25 video frames per second, and after decoding at the decoder side the video frames likewise consist of I-frames, P-frames and B-frames.
  • the I-frame best expresses the behavior information in the video and can reconstruct a complete image on its own.
  • the encoder is constrained so that the number of frames between two I-frames cannot exceed 12 to 15 frames, and a data stream starts with an I-frame and ends with an I-frame; correspondingly, at the decoder-side video frame rate, the first frame, which serves as the leading frame, must be an I-frame.
  • based on the amount of video information and the frame rate of the ink screen (only 3 frames per second), a total of 3 I-frames, that is, 3 key video frames, can be screened out of these 25 video frames.
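  • A minimal Python sketch of this screening step, assuming a purely illustrative GOP layout with I-frames at positions 0, 12 and 24 of a 25-frame second (the exact frame pattern is hypothetical, not taken from the disclosure):

```python
# One second of decoded video at 25 fps; I-frames at positions 0, 12 and 24,
# the remaining positions holding P/B frames (pattern is illustrative only).
frame_types = ["I" if i in (0, 12, 24) else ("P" if i % 2 else "B") for i in range(25)]

key_positions = [i for i, t in enumerate(frame_types) if t == "I"]
assert len(key_positions) == 3   # 3 key video frames per second of video
```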
  • taking the playback time of the audio data as the reference, the system divides the total playback duration of the audio data (which is the same as the total playback duration of the video data, since both originate from the same media data) by the number of key video frames to calculate the playback duration of each key video frame. For example, if the total playback duration of the audio data is 10 s and, as above, 3 key video frames can be screened out of each second of video data, then a total of 30 key video frames can be screened out of the 10 s of video data; averaging the total playback duration over the key video frames gives a playback duration of 1/3 s per key video frame.
  • the system can also calculate the playback duration of each key video frame from the frame rate of the ink screen: since the frame rate of the ink screen is 3 frames per second and each second of video data corresponds to 3 key video frames, the same averaging calculation gives a playback duration of 1/3 s per key video frame.
  • by correlating the total playback duration of the audio data with the number of key video frames in this way, the key video frames can be kept synchronized with the audio frames when the media data is played on the ink screen, improving the user experience.
  • the step of buffering audio data and video data includes:
  • S101 Receive the media data through a wireless network, and cache it in a preset buffer area;
  • S102 Demultiplexing the media data to obtain the audio data and the video data
  • S103 Perform decoding processing on the audio data and the video data respectively, to obtain each of the audio frames and their corresponding first time stamps, and each of the video frames and their corresponding second time stamps.
  • the system is provided with a first-level data buffer area and a second-level data buffer area.
  • when a user watches media data such as a live online class or a video on the ink screen device, the system buffers the media data received through the wireless network in the first-level data buffer area (i.e. the preset buffer area).
  • the system then demultiplexes the media data to obtain the audio data and the video data; it decodes the audio data and the video data respectively to obtain the audio frames contained in the audio data with their corresponding first timestamps and the video frames contained in the video data with their corresponding second timestamps, and buffers the decoded data in the second-level data buffer area.
  • at this point, the decoded audio data and video data are independent of each other and can each be played separately, and every frame of data carries its corresponding time information, which facilitates separate processing and subsequent synchronization.
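  • A minimal demultiplexing/decoding sketch, assuming the third-party PyAV library ("pip install av") is available; only its documented open/demux/decode calls and the pts, time_base and key_frame attributes are relied on, and everything else here is a hypothetical illustration rather than the patented implementation:

```python
import av  # third-party PyAV library, assumed available

def decode_media(path: str):
    # Demultiplex the buffered media data and decode it into audio frames with
    # their first timestamps and video frames with their second timestamps (seconds).
    audio_frames, video_frames = [], []
    with av.open(path) as container:
        for packet in container.demux():            # demultiplexing step
            for frame in packet.decode():           # decoding step
                if frame.pts is None:
                    continue
                ts = float(frame.pts * frame.time_base)
                if packet.stream.type == "audio":
                    audio_frames.append((ts, frame))
                elif packet.stream.type == "video":
                    video_frames.append((ts, bool(frame.key_frame), frame))
    return audio_frames, video_frames
```

  • In this sketch the packet.stream.type check routes decoded frames into two separate buffers, mirroring the separate audio and video caching described above.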
  • the ink screen device includes an ink display screen and a microphone, and the step of playing the audio frames sequentially according to their corresponding first timestamps while controlling the key video frames to be played sequentially according to their corresponding second timestamps and playback durations, realizing synchronized playback of the audio frames and the video frames, includes:
  • S401: taking the first timestamp corresponding to the first audio frame in the audio data as the start timestamp, outputting the audio frames to the microphone for playback in the order of their corresponding first timestamps, and at the same time outputting the key video frames to the ink display screen for display in the order of their corresponding second timestamps and playback durations.
  • the ink screen device includes an ink display screen and a microphone.
  • the reason why the ink screen device affects the playback effect of media data is that the frame rate of the ink screen itself is low, while the playback of audio data in the media data is not affected. Therefore, the system takes the playback start time of the audio data as the benchmark, and takes the first timestamp corresponding to the audio frame at the top of the audio data (the audio frames are sorted according to their respective first timestamps) as the start timestamp. Each audio frame is output to the microphone according to the order of the corresponding first time stamp for playback.
  • at the same time, each key video frame is output to the ink display screen for display in sequence according to its corresponding second timestamp and playback duration; that is, the playback order of the key video frames is determined by their corresponding second timestamps, the frame with the earlier second timestamp is played first, and each key video frame is displayed for the calculated playback duration. For example, if there are 3 key video frames which, arranged in the order of their corresponding second timestamps, are key video frames A, B and C, then within one second the system first outputs key video frame A and holds it for 1/3 s, then outputs key video frame B and holds it for 1/3 s, and finally outputs key video frame C and holds it for 1/3 s, completing the synchronized playback of the key video frames and the audio frames.
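  • A simplified scheduling sketch of this step (an illustration only; the callbacks stand in for the microphone output and the ink display refresh, and all names are hypothetical):

```python
from typing import Callable, List, Tuple

def play_synchronized(audio: List[Tuple[float, object]],
                      key_frames: List[Tuple[float, object]],
                      frame_duration: float,
                      play_audio: Callable[[object], None],
                      show_frame: Callable[[object], None]) -> None:
    # Audio events keep their first timestamps; the n-th key frame (ordered by
    # its second timestamp) is shown at n * frame_duration and held until the
    # next key frame replaces it on the ink display.
    events = [(ts, 0, f) for ts, f in sorted(audio, key=lambda x: x[0])]
    events += [(i * frame_duration, 1, f)
               for i, (_, f) in enumerate(sorted(key_frames, key=lambda x: x[0]))]
    for when, kind, payload in sorted(events, key=lambda e: (e[0], e[1])):
        # A real player would wait until `when`, relative to the shared start
        # time, before emitting the event.
        (play_audio if kind == 0 else show_frame)(payload)
```

  • With frame_duration = 1/3 s, key video frames A, B and C from the example above would be emitted at 0 s, 1/3 s and 2/3 s while the audio frames follow their own timestamps.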
  • after the step of setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data, the method includes:
  • S5: resetting the third timestamp corresponding to each key video frame according to its corresponding second timestamp, the playback duration and the total playback duration, where the third timestamp includes the start timestamp and the end timestamp of the key video frame.
  • the system determines the order in which the key video frames are played according to their corresponding second timestamps, and then, according to the total playback duration of the media data (the total playback durations of the audio data, the video data and the media data are all the same) and the playback duration calculated for each key video frame, resets the start timestamp and end timestamp of each key video frame for when the corresponding audio frames are played, thereby forming the third timestamp corresponding to each key video frame during playback.
  • for example, the total playback duration of the media data is 10 s and, as above, there are 30 screened key video frames; assume that, arranged in the order of their corresponding second timestamps, they are key video frame 1, key video frame 2, key video frame 3, ..., key video frame 30, and the playback duration of a single key video frame is 1/3 s.
  • the third timestamps set for the key video frames according to the above rule are then: key video frame 1 (0, 1/3), key video frame 2 (1/3, 2/3), key video frame 3 (2/3, 1), ..., key video frame 29 (28/3, 29/3), key video frame 30 (29/3, 10).
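  • The third timestamps above follow directly from the playback duration; a small Python sketch (illustrative only, hypothetical function name) reproduces the worked example:

```python
from typing import List, Tuple

def third_timestamps(num_key_frames: int, total_s: float) -> List[Tuple[float, float]]:
    # n-th key frame (1-based, ordered by its second timestamp):
    # start = (n - 1) * d, end = n * d, with d = total_s / num_key_frames.
    d = total_s / num_key_frames
    return [((n - 1) * d, n * d) for n in range(1, num_key_frames + 1)]

stamps = third_timestamps(30, 10.0)
assert stamps[0][0] == 0.0 and abs(stamps[0][1] - 1 / 3) < 1e-9    # frame 1: (0, 1/3)
assert abs(stamps[28][0] - 28 / 3) < 1e-9                          # frame 29 starts at 28/3
assert abs(stamps[-1][1] - 10.0) < 1e-9                            # frame 30 ends at 10 s
```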
  • after the step of resetting the third timestamp corresponding to each key video frame according to its corresponding second timestamp, the playback duration and the total playback duration, the method includes:
  • S6: playing the audio frames sequentially according to their corresponding first timestamps, and at the same time controlling the key video frames to be played sequentially according to their corresponding third timestamps, realizing synchronized playback of the audio frames and the video frames.
  • the system uses the playback start time of the audio data as the reference and plays each audio frame contained in the audio data sequentially according to its corresponding first timestamp. At the same time, the system controls the screened key video frames to be played sequentially according to their corresponding third timestamps. Since the audio frames and the key video frames share the same playback start time, and the third timestamps of the key video frames correspond to the total playback duration of the audio data, the audio frames and video frames of the media data can be perfectly synchronized when the media data is played through the ink screen device, without affecting the user's viewing experience.
  • an embodiment of the present application also provides an audio and video frame synchronization device based on an ink screen device, including:
  • a buffering module 1, configured to buffer audio data and video data, where the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
  • a screening module 2 configured to screen out several key video frames from each of the video frames, the key video frames representing video frames with preset features;
  • the first setting module 3 is used to set the playback duration of each of the key video frames according to the number of the key video frames and the total playback duration of the audio data;
  • a first synchronization module 4, configured to play each audio frame sequentially according to its corresponding first timestamp, and at the same time control each key video frame to be played sequentially according to its corresponding second timestamp and playback duration, so as to realize synchronized playback of the audio frames and the video frames.
  • the first setting module 3 includes:
  • a calculation unit configured to divide the total playing time by the number of key video frames to obtain the playing time of each key video frame.
  • the cache module 1 includes:
  • a cache unit configured to receive the media data through the wireless network, and cache the media data in a preset cache area
  • a demultiplexing unit configured to demultiplex the media data to obtain the audio data and the video data
  • the decoding unit is configured to respectively decode the audio data and the video data to obtain each of the audio frames and their corresponding first time stamps, and each of the video frames and their corresponding second time stamps.
  • the ink screen device includes an ink display screen and a microphone
  • the first synchronization module 4 includes:
  • a synchronization unit configured to use the first time stamp corresponding to the first audio frame in the audio data as the start time stamp, output each of the audio frames to the microphone in order of their corresponding first time stamps for playback, At the same time, the key video frames are output to the ink display screen for display according to their corresponding second time stamps and playing time in sequence.
  • the synchronization device also includes:
  • a second setting module 5, configured to reset the third timestamp corresponding to each key video frame according to the second timestamp, the playback duration and the total playback duration corresponding to each key video frame, where the third timestamp includes the start timestamp and the end timestamp of the key video frame.
  • the synchronization device also includes:
  • the second synchronizing module 6 is used for sequentially playing each of the audio frames according to their corresponding first time stamps, and simultaneously controlling each of the key video frames to be sequentially played according to their respective corresponding third time stamps to realize the audio Frame and synchronous playback of the video frame.
  • each module and unit of the synchronization device is used to correspondingly execute each step in the above-mentioned audio and video frame synchronization method based on the ink screen device, and its specific implementation process will not be described in detail here.
  • This embodiment provides an audio and video frame synchronization device based on an ink screen device.
  • the system first buffers audio data and video data, wherein the audio data and video data originate from the same media data, the audio data includes a plurality of audio frames with a first timestamp, and the video data includes a plurality of video frames with a second timestamp.
  • the system screens out several key video frames from each video frame, and the key video frames represent video frames with preset characteristics; then, according to the number of key video frames and the total playing time of audio data, the playing time of each key video frame is set.
  • the system plays each audio frame in sequence according to its corresponding first time stamp, and controls each key video frame to play sequentially according to its corresponding second time stamp and playback duration, so as to realize synchronous playback of audio frames and video frames.
  • the system separates the media data into audio data and video data, screens out the key video frames from the video frames of the video data, sets the playback duration of each key video frame according to the total playback duration, and finally plays the key video frames simultaneously with the audio frames, realizing synchronization of the video frames and audio frames and improving the user experience.
  • an embodiment of the present application further provides a computer device, which may be a server, and its internal structure may be as shown in FIG. 3 .
  • the computer device includes a processor, a memory, a network interface and a database connected through a system bus, where the processor of the computer device is configured to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer programs and databases.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the database of the computer device is used to store data such as audio data.
  • the network interface of the computer device is used to communicate with an external terminal via a network connection.
  • the above-mentioned processor executes the steps of the above-mentioned audio and video frame synchronization method based on the ink screen device:
  • S1: buffering audio data and video data, where the audio data and the video data originate from the same media data, the audio data includes a plurality of audio frames with a first timestamp, and the video data includes a plurality of video frames with a second timestamp;
  • S2: screening out several key video frames from the video frames, where the key video frames represent video frames with preset characteristics;
  • S3: setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
  • S4: playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, realizing synchronized playback of the audio frames and the video frames.
  • An embodiment of the present application also provides a computer-readable storage medium.
  • the storage medium may be a non-volatile storage medium or a volatile storage medium on which a computer program is stored; when the computer program is executed by a processor, the audio and video frame synchronization method based on the ink screen device is implemented, the method being specifically:
  • S1: buffering audio data and video data, where the audio data and the video data originate from the same media data, the audio data includes a plurality of audio frames with a first timestamp, and the video data includes a plurality of video frames with a second timestamp;
  • S2: screening out several key video frames from the video frames, where the key video frames represent video frames with preset characteristics;
  • S3: setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
  • S4: playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, realizing synchronized playback of the audio frames and the video frames.
  • Nonvolatile memory can include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • By way of illustration and not limitation, RAM is available in many forms, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (SSRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The present application provides an audio and video frame synchronization method, apparatus and computer device based on an ink screen device. Key video frames are screened out from the video frames of the video data, the playback duration of each key video frame is set according to the total playback duration, and the key video frames are finally played simultaneously with the audio frames, realizing synchronization of the video frames with the audio frames and improving the user experience.

Description

Audio and video frame synchronization method, apparatus and computer device based on an ink screen device
Technical Field
The present application relates to the technical field of media playback, and in particular to an audio and video frame synchronization method, apparatus and computer device based on an ink screen device.
Background Art
An ink screen, also known as an electronic paper display, is an innovative way of presenting information. One of its major advantages over conventional displays is readability: its display medium, electronic ink, looks more like printed text and is therefore easier on the user's eyes. However, because the display refresh rate of an ink screen is low, audio and video frames fall out of sync when a user watches videos or live streams on an ink screen device, which affects the viewing experience.
Technical Problem
The main purpose of the present application is to provide an audio and video frame synchronization method, apparatus and computer device based on an ink screen device, aiming to solve the problem that audio and video frames are out of sync when existing ink screen devices are used to watch videos or live streams.
Technical Solution
To achieve the above purpose, in a first aspect, the present application provides an audio and video frame synchronization method based on an ink screen device, comprising:
buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
screening out several key video frames from the video frames, the key video frames representing video frames with preset characteristics;
setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
In a second aspect, the present application further provides an audio and video frame synchronization apparatus based on an ink screen device, comprising:
a buffering module, configured to buffer audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
a screening module, configured to screen out several key video frames from the video frames, the key video frames representing video frames with preset characteristics;
a first setting module, configured to set the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
a first synchronization module, configured to play the audio frames sequentially according to their respective first timestamps, and at the same time control the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
In a third aspect, the present application further provides a computer device, comprising a memory and a processor, wherein a computer program is stored in the memory, and when the processor executes the computer program, an audio and video frame synchronization method based on an ink screen device is implemented;
wherein the audio and video frame synchronization method based on the ink screen device comprises:
buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
screening out several key video frames from the video frames, the key video frames representing video frames with preset characteristics;
setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
In a fourth aspect, the present application further provides a computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, an audio and video frame synchronization method based on an ink screen device is implemented, the method comprising the following steps:
buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
screening out several key video frames from the video frames, the key video frames representing video frames with preset characteristics;
setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
Beneficial Effects
In the audio and video frame synchronization method, apparatus and computer device based on an ink screen device provided in the present application, the system first buffers audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp. The system screens out several key video frames from the video frames, the key video frames representing video frames with preset characteristics, and then sets the playback duration of each key video frame according to the number of key video frames and the total playback duration of the audio data. The system plays the audio frames sequentially according to their respective first timestamps, and at the same time controls the key video frames to be played sequentially according to their respective second timestamps and playback durations, realizing synchronized playback of the audio frames and the video frames. In the present application, the system separates the media data into audio data and video data, screens out key video frames from the video frames of the video data, sets the playback duration of each key video frame according to the total playback duration, and finally plays the key video frames simultaneously with the audio frames, thereby synchronizing the video frames with the audio frames and improving the user experience.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the steps of an audio and video frame synchronization method based on an ink screen device in an embodiment of the present application;
Fig. 2 is a block diagram of the overall structure of an audio and video frame synchronization apparatus based on an ink screen device in an embodiment of the present application;
Fig. 3 is a schematic block diagram of the structure of a computer device according to an embodiment of the present application.
The realization of the purpose, functional features and advantages of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Best Mode for Carrying Out the Invention
In order to make the purpose, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present application and are not intended to limit it.
Referring to Fig. 1, an embodiment of the present application provides an audio and video frame synchronization method based on an ink screen device, comprising:
S1: buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
S2: screening out several key video frames from the video frames, the key video frames representing video frames with preset characteristics;
S3: setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
S4: playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
Preferably, the key video frames are intra-coded frames.
In this embodiment, after receiving media data such as a live online class, the control system of the ink screen device (hereinafter referred to as the system) first buffers the media data in a pre-built data buffer area. Then, by demultiplexing the media data, the system separates it into audio data and video data and buffers them separately. The system then decodes the audio data and the video data respectively to obtain the audio frames contained in the audio data together with the first timestamp corresponding to each audio frame, and the video frames contained in the video data together with the second timestamp corresponding to each video frame. The system screens the video frames to obtain several key video frames, the key video frames representing video frames with preset characteristics; preferably, a key video frame is an intra-coded frame, an independent frame that carries all of its own information, best expresses the behavior information in the video, can be decoded without reference to other pictures and can reconstruct a complete image on its own; it can simply be understood as a static picture. Taking the total playback duration of the audio data as the reference, the system performs an averaging calculation over the number of key video frames screened out of the video data to obtain the playback duration of each key video frame. The system plays the audio frames sequentially according to their respective first timestamps; at the same time, it controls the key video frames to be played sequentially according to their respective second timestamps, each key video frame being displayed for the playback duration obtained from the above calculation, so that the audio frames and the video frames are played synchronously when the ink screen device outputs the media data.
In this embodiment, the system separates the media data into audio data and video data, screens out key video frames from the video frames of the video data, sets the playback duration of each key video frame according to the total playback duration, and finally plays the key video frames simultaneously with the audio frames, thereby synchronizing the video frames with the audio frames and improving the user experience.
Further, the step of setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data comprises:
S301: dividing the total playback duration by the number of the key video frames to obtain the playback duration of each key video frame.
In this embodiment, video data is composed of three kinds of frames: I-frames, P-frames and B-frames. The video compression encoder compresses and sends data packets at 25 video frames per second, and after decoding at the decoder side the video frames likewise consist of I-frames, P-frames and B-frames. Among them, the I-frame best expresses the behavior information in the video and can reconstruct a complete image on its own. The encoder is constrained so that the number of frames between two I-frames cannot exceed 12 to 15 frames, and a data stream starts with an I-frame and ends with an I-frame; correspondingly, at the video frame rate of the decoder side, the first frame, which serves as the leading frame, must be an I-frame. Based on the amount of video information and the frame rate of the ink screen (the frame rate of the ink screen is only 3 frames per second), a total of 3 I-frames, that is, 3 key video frames, can be screened out of these 25 video frames. Taking the playback time of the audio data as the reference, the system divides the total playback duration of the audio data (the total playback duration of the audio data is the same as that of the video data, since both originate from the same media data) by the number of key video frames to calculate the playback duration of each key video frame. For example, if the total playback duration of the audio data is 10 s and, as described above, 3 key video frames can be screened out of each second of video data, then a total of 30 key video frames can be screened out of the 10 s of video data; averaging the total playback duration over the key video frames gives a playback duration of 1/3 s for each key video frame. The system may also calculate the playback duration of each key video frame from the frame rate of the ink screen: since the frame rate of the ink screen is 3 frames per second and each second of video data corresponds to 3 key video frames, the same averaging calculation gives a playback duration of 1/3 s per key video frame. By correlating the total playback duration of the audio data with the number of key video frames in this way, this embodiment enables the key video frames to stay synchronized with the audio frames when the media data is played on the ink screen, improving the user experience.
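The relationship between the ink screen frame rate, the number of key video frames and the per-frame playback duration in this example can be checked with a short Python sketch (added for illustration only, not part of the original disclosure; the function name is hypothetical):

```python
def expected_key_frames(total_s: float, ink_fps: float = 3.0) -> int:
    # Key video frames implied by the ink screen frame rate: about 3 per
    # second of media, e.g. 30 key frames for 10 s of audio/video.
    return round(total_s * ink_fps)

assert expected_key_frames(10.0) == 30
assert abs(10.0 / expected_key_frames(10.0) - 1 / 3) < 1e-9   # 1/3 s per key frame
```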
Further, the step of buffering audio data and video data comprises:
S101: receiving the media data through a wireless network and buffering it in a preset buffer area;
S102: demultiplexing the media data to obtain the audio data and the video data;
S103: decoding the audio data and the video data respectively to obtain the audio frames and their respective first timestamps, as well as the video frames and their respective second timestamps.
In this embodiment, the system is provided with a first-level data buffer area and a second-level data buffer area. When a user watches media data such as a live online class or a video on the ink screen device, the system buffers the media data received through the wireless network in the first-level data buffer area (i.e. the preset buffer area). The system then demultiplexes the media data to obtain the audio data and the video data, decodes the audio data and the video data respectively to obtain the audio frames contained in the audio data with their corresponding first timestamps and the video frames contained in the video data with their corresponding second timestamps, and buffers the decoded data in the second-level data buffer area. At this point, the decoded audio data and video data are independent of each other and can each be played separately, and every frame of data carries its corresponding time information, which facilitates separate processing and subsequent synchronization.
Further, the ink screen device comprises an ink display screen and a microphone, and the step of playing the audio frames sequentially according to their respective first timestamps while controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames, comprises:
S401: taking the first timestamp corresponding to the first audio frame in the audio data as the start timestamp, outputting the audio frames to the microphone for playback in the order of their respective first timestamps, and at the same time outputting the key video frames to the ink display screen for display in the order of their respective second timestamps and playback durations.
In this embodiment, the ink screen device comprises an ink display screen and a microphone. The reason the ink screen device affects the playback of media data is that the frame rate of the ink display screen itself is low, while the playback of the audio data in the media data is unaffected. Therefore, taking the playback start time of the audio data as the reference, the system takes the first timestamp corresponding to the first audio frame in the audio data (the audio frames are ordered according to their respective first timestamps) as the start timestamp and outputs the audio frames to the microphone for playback in the order of their respective first timestamps. At the same time, the key video frames are output to the ink display screen for display in the order of their respective second timestamps and playback durations; that is, the playback order of the key video frames is determined by their respective second timestamps, the frame with the earlier second timestamp is played first, and each key video frame is displayed for its playback duration. For example, if there are 3 key video frames which, arranged in the order of their respective second timestamps, are key video frames A, B and C, then within one second the system first outputs key video frame A and holds it for 1/3 s, then outputs key video frame B and holds it for 1/3 s, and finally outputs key video frame C and holds it for 1/3 s, completing the synchronized playback of the key video frames and the audio frames.
Further, after the step of setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data, the method comprises:
S5: resetting, for each key video frame, a corresponding third timestamp according to its corresponding second timestamp, the playback duration and the total playback duration, the third timestamp comprising a start timestamp and an end timestamp of the key video frame.
In this embodiment, the system determines the playback order of the key video frames according to their respective second timestamps, and then, according to the total playback duration of the media data (the total playback durations of the audio data, the video data and the media data are all the same) and the playback duration calculated for each key video frame, resets the start timestamp and end timestamp of each key video frame for when the corresponding audio frames are played, thereby forming the third timestamp corresponding to each key video frame during playback. For example, the total playback duration of the media data is 10 s and, as described above, there are 30 screened key video frames; assume that, arranged in the order of their respective second timestamps, they are key video frame 1, key video frame 2, key video frame 3, ..., key video frame 30, and the playback duration of a single key video frame is 1/3 s. The third timestamps set for the key video frames according to the above rule are then: key video frame 1 (0, 1/3), key video frame 2 (1/3, 2/3), key video frame 3 (2/3, 1), ..., key video frame 29 (28/3, 29/3), key video frame 30 (29/3, 10).
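A minimal sketch of step S5, using hypothetical Python data structures (an illustration only, not the patented implementation), shows how the third timestamps can be attached to the key video frames in second-timestamp order:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class KeyFrame:
    second_ts: float                                  # original second timestamp
    third_ts: Optional[Tuple[float, float]] = None    # (start, end) set in S5

def reset_third_timestamps(frames: List[KeyFrame], total_s: float) -> None:
    # S5: order the key frames by their second timestamps and assign the n-th
    # frame the interval ((n - 1) * d, n * d), where d = total_s / len(frames).
    frames.sort(key=lambda f: f.second_ts)
    d = total_s / len(frames)
    for n, f in enumerate(frames, start=1):
        f.third_ts = ((n - 1) * d, n * d)
```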
Further, after the step of resetting the third timestamp corresponding to each key video frame according to its corresponding second timestamp, the playback duration and the total playback duration, the method comprises:
S6: playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective third timestamps, so as to realize synchronized playback of the audio frames and the video frames.
In this embodiment, the system takes the playback start time of the audio data as the reference and plays the audio frames contained in the audio data sequentially according to their respective first timestamps. At the same time, the system controls the screened key video frames to be played sequentially according to their respective third timestamps. Since the audio frames and the key video frames share the same playback start time, and the third timestamps of the key video frames correspond to the total playback duration of the audio data, the audio frames and the video frames of the media data can be perfectly synchronized when the media data is played through the ink screen device, without affecting the user's viewing experience.
Referring to Fig. 2, an embodiment of the present application further provides an audio and video frame synchronization apparatus based on an ink screen device, comprising:
a buffering module 1, configured to buffer audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
a screening module 2, configured to screen out several key video frames from the video frames, the key video frames representing video frames with preset characteristics;
a first setting module 3, configured to set the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
a first synchronization module 4, configured to play the audio frames sequentially according to their respective first timestamps, and at the same time control the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
Further, the first setting module 3 comprises:
a calculation unit, configured to divide the total playback duration by the number of the key video frames to obtain the playback duration of each key video frame.
Further, the buffering module 1 comprises:
a buffering unit, configured to receive the media data through a wireless network and buffer it in a preset buffer area;
a demultiplexing unit, configured to demultiplex the media data to obtain the audio data and the video data;
a decoding unit, configured to decode the audio data and the video data respectively to obtain the audio frames and their respective first timestamps, as well as the video frames and their respective second timestamps.
Further, the ink screen device comprises an ink display screen and a microphone, and the first synchronization module 4 comprises:
a synchronization unit, configured to take the first timestamp corresponding to the first audio frame in the audio data as the start timestamp, output the audio frames to the microphone for playback in the order of their respective first timestamps, and at the same time output the key video frames to the ink display screen for display in the order of their respective second timestamps and playback durations.
Further, the synchronization apparatus further comprises:
a second setting module 5, configured to reset, for each key video frame, a corresponding third timestamp according to its corresponding second timestamp, the playback duration and the total playback duration, the third timestamp comprising a start timestamp and an end timestamp of the key video frame.
Further, the synchronization apparatus further comprises:
a second synchronization module 6, configured to play the audio frames sequentially according to their respective first timestamps, and at the same time control the key video frames to be played sequentially according to their respective third timestamps, so as to realize synchronized playback of the audio frames and the video frames.
In this embodiment, the modules and units of the synchronization apparatus are used to correspondingly execute the steps of the above audio and video frame synchronization method based on an ink screen device, and their specific implementation process will not be described in detail here.
This embodiment provides an audio and video frame synchronization apparatus based on an ink screen device. The system first buffers audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp. The system screens out several key video frames from the video frames, the key video frames representing video frames with preset characteristics, and then sets the playback duration of each key video frame according to the number of key video frames and the total playback duration of the audio data. The system plays the audio frames sequentially according to their respective first timestamps, and at the same time controls the key video frames to be played sequentially according to their respective second timestamps and playback durations, realizing synchronized playback of the audio frames and the video frames. In the present application, the system separates the media data into audio data and video data, screens out key video frames from the video frames of the video data, sets the playback duration of each key video frame according to the total playback duration, and finally plays the key video frames simultaneously with the audio frames, thereby synchronizing the video frames with the audio frames and improving the user experience.
Referring to Fig. 3, an embodiment of the present application further provides a computer device, which may be a server, and whose internal structure may be as shown in Fig. 3. The computer device comprises a processor, a memory, a network interface and a database connected through a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store data such as audio data. The network interface of the computer device is used to communicate with an external terminal through a network connection. When the computer program is executed by the processor, the functions of the audio and video frame synchronization method based on an ink screen device of any of the above embodiments are realized.
The above processor executes the steps of the above audio and video frame synchronization method based on an ink screen device:
S1: buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
S2: screening out several key video frames from the video frames, the key video frames representing video frames with preset characteristics;
S3: setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
S4: playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
An embodiment of the present application further provides a computer-readable storage medium, which may be a non-volatile storage medium or a volatile storage medium, on which a computer program is stored; when the computer program is executed by a processor, the audio and video frame synchronization method based on an ink screen device of any of the above embodiments is implemented, the method specifically being:
S1: buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
S2: screening out several key video frames from the video frames, the key video frames representing video frames with preset characteristics;
S3: setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
S4: playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing the relevant hardware through a computer program, and the computer program can be stored in a non-volatile computer-readable storage medium; when executed, the computer program may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media provided in the present application and used in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (SSRSDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), etc.
It should be noted that, in this document, the terms "comprise", "include" or any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, apparatus, article or method that includes a series of elements includes not only those elements, but also other elements not explicitly listed, or also includes elements inherent to such a process, apparatus, article or method. Without further limitation, an element defined by the statement "comprising a ..." does not exclude the existence of other identical elements in the process, apparatus, article or method that includes that element.
The above are only preferred embodiments of the present application and are not intended thereby to limit the patent scope of the present application. Any equivalent structural or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application thereof in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (20)

  1. An audio and video frame synchronization method based on an ink screen device, characterized by comprising:
    buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
    screening out several key video frames from the video frames, the key video frames representing video frames with preset characteristics;
    setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
    playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
  2. The audio and video frame synchronization method based on an ink screen device according to claim 1, characterized in that the step of setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data comprises:
    dividing the total playback duration by the number of the key video frames to obtain the playback duration of each key video frame.
  3. The audio and video frame synchronization method based on an ink screen device according to claim 1, characterized in that the step of buffering audio data and video data comprises:
    receiving the media data through a wireless network and buffering it in a preset buffer area;
    demultiplexing the media data to obtain the audio data and the video data;
    decoding the audio data and the video data respectively to obtain the audio frames and their respective first timestamps, as well as the video frames and their respective second timestamps.
  4. The audio and video frame synchronization method based on an ink screen device according to claim 1, characterized in that the ink screen device comprises an ink display screen and a microphone, and the step of playing the audio frames sequentially according to their respective first timestamps while controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames, comprises:
    taking the first timestamp corresponding to the first audio frame in the audio data as the start timestamp, outputting the audio frames to the microphone for playback in the order of their respective first timestamps, and at the same time outputting the key video frames to the ink display screen for display in the order of their respective second timestamps and playback durations.
  5. The audio and video frame synchronization method based on an ink screen device according to claim 1, characterized in that, after the step of setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data, the method comprises:
    resetting, for each key video frame, a corresponding third timestamp according to its corresponding second timestamp, the playback duration and the total playback duration, the third timestamp comprising a start timestamp and an end timestamp of the key video frame.
  6. The audio and video frame synchronization method based on an ink screen device according to claim 5, characterized in that, after the step of resetting the third timestamp corresponding to each key video frame according to its corresponding second timestamp, the playback duration and the total playback duration, the method comprises:
    playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective third timestamps, so as to realize synchronized playback of the audio frames and the video frames.
  7. The audio and video frame synchronization method based on an ink screen device according to claim 1, characterized in that the key video frames are intra-coded frames.
  8. A computer device, comprising a memory and a processor, wherein a computer program is stored in the memory, and when the processor executes the computer program, an audio and video frame synchronization method based on an ink screen device is implemented;
    wherein the audio and video frame synchronization method based on the ink screen device comprises:
    buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
    screening out several key video frames from the video frames, the key video frames representing video frames with preset characteristics;
    setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
    playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
  9. The computer device according to claim 8, wherein the step of setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data comprises:
    dividing the total playback duration by the number of the key video frames to obtain the playback duration of each key video frame.
  10. The computer device according to claim 8, wherein the step of buffering audio data and video data comprises:
    receiving the media data through a wireless network and buffering it in a preset buffer area;
    demultiplexing the media data to obtain the audio data and the video data;
    decoding the audio data and the video data respectively to obtain the audio frames and their respective first timestamps, as well as the video frames and their respective second timestamps.
  11. The computer device according to claim 8, wherein the ink screen device comprises an ink display screen and a microphone, and the step of playing the audio frames sequentially according to their respective first timestamps while controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames, comprises:
    taking the first timestamp corresponding to the first audio frame in the audio data as the start timestamp, outputting the audio frames to the microphone for playback in the order of their respective first timestamps, and at the same time outputting the key video frames to the ink display screen for display in the order of their respective second timestamps and playback durations.
  12. The computer device according to claim 8, wherein, after the step of setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data, the method comprises:
    resetting, for each key video frame, a corresponding third timestamp according to its corresponding second timestamp, the playback duration and the total playback duration, the third timestamp comprising a start timestamp and an end timestamp of the key video frame.
  13. The computer device according to claim 12, wherein, after the step of resetting the third timestamp corresponding to each key video frame according to its corresponding second timestamp, the playback duration and the total playback duration, the method comprises:
    playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective third timestamps, so as to realize synchronized playback of the audio frames and the video frames.
  14. The computer device according to claim 8, wherein the key video frames are intra-coded frames.
  15. A computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, an audio and video frame synchronization method based on an ink screen device is implemented, the audio and video frame synchronization method based on the ink screen device comprising the following steps:
    buffering audio data and video data, wherein the audio data and the video data originate from the same media data, the audio data comprises a plurality of audio frames with a first timestamp, and the video data comprises a plurality of video frames with a second timestamp;
    screening out several key video frames from the video frames, the key video frames representing video frames with preset characteristics;
    setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data;
    playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames.
  16. The computer-readable storage medium according to claim 15, characterized in that the step of setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data comprises:
    dividing the total playback duration by the number of the key video frames to obtain the playback duration of each key video frame.
  17. The computer-readable storage medium according to claim 15, characterized in that the step of buffering audio data and video data comprises:
    receiving the media data through a wireless network and buffering it in a preset buffer area;
    demultiplexing the media data to obtain the audio data and the video data;
    decoding the audio data and the video data respectively to obtain the audio frames and their respective first timestamps, as well as the video frames and their respective second timestamps.
  18. The computer-readable storage medium according to claim 15, characterized in that the ink screen device comprises an ink display screen and a microphone, and the step of playing the audio frames sequentially according to their respective first timestamps while controlling the key video frames to be played sequentially according to their respective second timestamps and playback durations, so as to realize synchronized playback of the audio frames and the video frames, comprises:
    taking the first timestamp corresponding to the first audio frame in the audio data as the start timestamp, outputting the audio frames to the microphone for playback in the order of their respective first timestamps, and at the same time outputting the key video frames to the ink display screen for display in the order of their respective second timestamps and playback durations.
  19. The computer-readable storage medium according to claim 15, characterized in that, after the step of setting the playback duration of each key video frame according to the number of the key video frames and the total playback duration of the audio data, the method comprises:
    resetting, for each key video frame, a corresponding third timestamp according to its corresponding second timestamp, the playback duration and the total playback duration, the third timestamp comprising a start timestamp and an end timestamp of the key video frame.
  20. The computer-readable storage medium according to claim 19, characterized in that, after the step of resetting the third timestamp corresponding to each key video frame according to its corresponding second timestamp, the playback duration and the total playback duration, the method comprises:
    playing the audio frames sequentially according to their respective first timestamps, and at the same time controlling the key video frames to be played sequentially according to their respective third timestamps, so as to realize synchronized playback of the audio frames and the video frames.
PCT/CN2021/111592 2021-05-26 2021-08-09 Audio and video frame synchronization method, apparatus and computer device based on an ink screen device WO2022247014A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110578862.2 2021-05-26
CN202110578862.2A CN113316012B (zh) 2021-05-26 Audio and video frame synchronization method, apparatus and computer device based on an ink screen device

Publications (1)

Publication Number Publication Date
WO2022247014A1 true WO2022247014A1 (zh) 2022-12-01

Family

ID=77375197

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/111592 WO2022247014A1 (zh) 2021-05-26 2021-08-09 Audio and video frame synchronization method, apparatus and computer device based on an ink screen device

Country Status (2)

Country Link
CN (1) CN113316012B (zh)
WO (1) WO2022247014A1 (zh)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101466044A (zh) * 2007-12-19 2009-06-24 康佳集团股份有限公司 一种流媒体音视频同步播放的方法及其系统
WO2012032537A2 (en) * 2010-09-06 2012-03-15 Indian Institute Of Technology A method and system for providing a content adaptive and legibility retentive display of a lecture video on a miniature video device
US8798438B1 (en) * 2012-12-07 2014-08-05 Google Inc. Automatic video generation for music playlists
CN106162293A (zh) * 2015-04-22 2016-11-23 无锡天脉聚源传媒科技有限公司 一种视频声音与图像同步的方法及装置
CN106713855A (zh) * 2016-12-13 2017-05-24 深圳英飞拓科技股份有限公司 一种视频播放方法及装置
CN106792154A (zh) * 2016-12-02 2017-05-31 广东赛特斯信息科技有限公司 视频播放器的跳帧同步系统及其控制方法
CN107295284A (zh) * 2017-08-03 2017-10-24 浙江大学 一种由音频和图片组成的视频文件的生成和检索播放方法、装置
CN108174269A (zh) * 2017-12-28 2018-06-15 优酷网络技术(北京)有限公司 可视化音频播放方法及装置
CN110944225A (zh) * 2019-11-20 2020-03-31 武汉长江通信产业集团股份有限公司 一种基于html5的不同帧率音视频的同步方法及装置
CN111641858A (zh) * 2020-04-29 2020-09-08 上海推乐信息技术服务有限公司 一种音视频同步方法及系统

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003009588A1 (en) * 2001-07-17 2003-01-30 Yesvideo, Inc. Digital visual recording content indexing and packaging
EP1527602A2 (en) * 2002-07-30 2005-05-04 Koninklijke Philips Electronics N.V. Trick play behavior controlled by a user
US7673238B2 (en) * 2006-01-05 2010-03-02 Apple Inc. Portable media device with video acceleration capabilities
CN104021152B (zh) * 2014-05-19 2017-09-05 广州酷狗计算机科技有限公司 基于音频文件播放的图片显示方法和装置
CN106162182B (zh) * 2015-03-25 2019-08-30 杭州海康威视数字技术股份有限公司 一种视频编码码流的播放控制方法及系统
CN106816055B (zh) * 2017-04-05 2019-02-01 杭州恒生数字设备科技有限公司 一种可交互的低功耗教学直播录播系统及方法

Also Published As

Publication number Publication date
CN113316012A (zh) 2021-08-27
CN113316012B (zh) 2022-03-11

Similar Documents

Publication Publication Date Title
CN109714634B (zh) 一种直播数据流的解码同步方法、装置及设备
US11330311B2 (en) Transmission device, transmission method, receiving device, and receiving method for rendering a multi-image-arrangement distribution service
JP6610555B2 (ja) 受信装置、送信装置、およびデータ処理方法
CN104885473B (zh) 用于经由http的动态自适应流式传输(dash)的实况定时方法
CN103931204B (zh) 媒体数据的网络流
CN113225598B (zh) 移动端音视频同步的方法、装置、设备及存储介质
JP7271856B2 (ja) ネットワーク遅延がある環境での遠隔クラウドベースのビデオ制作システム
JP6043089B2 (ja) 放送通信連携受信装置
JP2013009361A (ja) 放送通信連携受信装置およびアプリケーションサーバ
KR20130138213A (ko) 멀티미디어 흐름 처리 방법 및 대응하는 장치
JP4511952B2 (ja) メディア再生装置
US8769562B2 (en) Digital broadcast method, data receiving device, and data transmitting device
JP5997500B2 (ja) 放送通信連携受信装置
CN105812961B (zh) 自适应流媒体处理方法及装置
WO2022247014A1 (zh) 基于墨水屏设备的音视频帧同步方法、装置和计算机设备
KR101700626B1 (ko) 멀티 앵글 뷰 처리장치
JP5854208B2 (ja) 多段高速再生のための映像コンテンツ生成方法
JP2022095777A (ja) 放送サービス通信ネットワーク配信装置および方法
WO2010134479A1 (ja) 動画表示装置
CN114827747B (zh) 一种流媒体数据切换方法、装置、设备及存储介质
JP4414467B2 (ja) ストリーミング配信方法
JP2002176609A (ja) データ受信再生方法及びデータ受信再生装置
US10531132B2 (en) Methods and techniques for reducing latency in changing channels in a digital video environment
JP2008011430A (ja) コンテンツ再送信方法、コンテンツ受信方法及びコンテンツ受信装置
JP2024508911A (ja) 時間同期されたマルチストリームデータ伝送を提供する方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21942583

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE