WO2018072098A1 - Audio and video synchronization method and apparatus - Google Patents

Audio and video synchronization method and apparatus

Info

Publication number
WO2018072098A1
WO2018072098A1 (PCT/CN2016/102442)
Authority
WO
WIPO (PCT)
Prior art keywords
frame
video
current
audio
corrected
Prior art date
Application number
PCT/CN2016/102442
Other languages
English (en)
French (fr)
Inventor
洪伟焕
Original Assignee
深圳市福斯康姆智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市福斯康姆智能科技有限公司 filed Critical 深圳市福斯康姆智能科技有限公司
Priority to PCT/CN2016/102442 priority Critical patent/WO2018072098A1/zh
Publication of WO2018072098A1 publication Critical patent/WO2018072098A1/zh

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • H04N7/15Conference systems

Definitions

  • the present invention relates to the field of media playback, and in particular to an audio and video synchronization method and apparatus.
  • During playback, the player must ensure that the corresponding audio or video frame is played at the specified time. To this end, the player maintains a local clock, and the audio or video renderer compares the PTS with the local clock. If they match, i.e. the PTS falls within a certain acceptance window of the local clock, the corresponding audio or video frame is played; if the PTS is earlier than the acceptance window, the data frame is discarded; if the PTS is later than the acceptance window, the data frame is retained and its presentation is delayed until the corresponding time.
  • the playback control logic of existing players is to read the audio and video data and play the audio and the video according to their PTS values, without analysing or correcting the timestamps of the audio and video data.
  • the audio and video data are obtained by the playback software over the network, and the quality of the network has a great influence on the audio-video synchronization effect. Therefore, the result obtained with existing audio and video synchronization methods is not accurate.
  • a first aspect of the present invention provides an audio and video synchronization method, including:
  • a second aspect of the present invention provides an audio and video synchronization apparatus, including:
  • a video frame presentation timestamp correction module, configured to correct a presentation timestamp of a current video frame;
  • an audio frame presentation timestamp correction module, configured to correct a presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame;
  • an audio frame index correction module, configured to correct a frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame;
  • a playback control module, configured to control the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that the playback of the audio and the video is synchronized.
  • parameters such as the presentation timestamps of the audio and video and the audio frame index are corrected, so accurate presentation timestamps of the audio and video frames can be obtained.
  • FIG. 1 is a schematic flowchart of an audio and video synchronization method according to Embodiment 1 of the present invention;
  • FIG. 2 is a schematic flowchart of correcting the presentation timestamp of a current video frame according to Embodiment 2 of the present invention;
  • FIG. 3 is a schematic flowchart of correcting the presentation timestamp and the frame index of a current audio frame according to Embodiment 3 of the present invention;
  • FIG. 4 is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 4 of the present invention;
  • FIG. 5 is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 5 of the present invention;
  • FIG. 6 is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 6 of the present invention;
  • FIG. 7 is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 7 of the present invention;
  • FIG. 8-a is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 8 of the present invention;
  • FIG. 8-b is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 9 of the present invention;
  • FIG. 8-c is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 10 of the present invention;
  • FIG. 8-d is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 11 of the present invention;
  • FIG. 9 is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 12 of the present invention.
  • the present invention provides an audio and video synchronization method, the method including: correcting the presentation timestamp of a current video frame; correcting the presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame; correcting the frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame; and controlling the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that audio and video playback is synchronized.
  • the present invention also provides a corresponding audio and video synchronization device. The details are described below separately.
  • FIG. 1 is a schematic flowchart of an implementation process of an audio and video synchronization method according to Embodiment 1 of the present invention, which mainly includes the following steps S101 to S104, which are described in detail as follows:
  • correcting the presentation timestamp of the current video frame may be implemented by the following steps S1011 to S1013:
  • S1011: Read the presentation timestamp lastVpts of the previous video frame of the current video frame, the time orgVtime at which the network layer actually received the current video frame, and the time lastVtime at which the network layer actually received the previous video frame of the current video frame.
  • the presentation timestamp lastVpts of the previous video frame is saved after that frame's timestamp correction; orgVtime and lastVtime can be obtained by parsing each video frame when it is received.
  • S1012: Obtain the cumulative video time-jump value sumIntervalTime.
  • the cumulative video time-jump value sumIntervalTime is the accumulated difference between the time orgVtime at which the network layer actually received the current video frame and the time at which it received the previous video frame, and can be expressed as sumIntervalTime = Σ(orgVtime − lastMediaTime), where lastMediaTime denotes the time at which the network layer actually received any previous video frame of the current video frame, and the summation symbol "Σ" denotes the accumulation of the time intervals.
  • S1013: Calculate the corrected presentation timestamp as adjVpts = lastVpts + (orgVtime − lastVtime) − sumIntervalTime, where adjVpts is the corrected presentation timestamp of the current video frame.
  • Correcting the presentation timestamp of the current video frame is shown in more detail in FIG. 2 and mainly includes steps S201 to S208, as follows:
  • S201: Save the original parameter values: when the network layer receives a video frame, save the actual time at which the video frame was received as the original receiving time orgVtime, save its presentation timestamp as the original video frame presentation timestamp orgVpts, and save its frame index as the original video frame index orgVIndex.
  • S202: Determine whether the current video frame is the first video frame.
  • If the current video frame is the first video frame, the flow proceeds to step S203; otherwise, step S204 is performed.
  • S203: Set parameters such as the actual receiving time, the frame index and the presentation timestamp of the previous video frame of the current video frame, and the cumulative video time-jump value.
  • this includes: saving orgVtime as the previous media frame time lastMediaTime, setting the frame index interval intervalVI between the current video frame and the previous video frame to 1, setting the cumulative video time-jump value to 0, setting the receive-time interval intervalVtime between the current video frame and the previous video frame to 0, taking the presentation timestamp orgVpts of the current video frame as the presentation timestamp lastVpts of the previous video frame of the first video frame, taking the time orgVtime at which the network layer actually received the current video frame as the time lastVtime at which it received the previous video frame, taking the frame index orgVIndex of the current video frame as the frame index lastVIndex of the previous video frame, and saving the presentation timestamp orgVpts of the current video frame as the original presentation timestamp lastorgVpts of the previous video frame, and so on.
  • S204: Set the actual receiving time, the frame index and the presentation timestamp of the previous video frame of the current video frame, calculate the interval between the network layer's actual reception of the current video frame and of the previous video frame, and calculate parameters such as the cumulative video time-jump value: compute the frame index difference intervalVI = orgVIndex − lastVIndex and the receive-time interval intervalVtime = orgVtime − lastVtime, then save orgVtime as the time lastVtime of the previous video frame, save the frame index orgVIndex of the current video frame as the frame index lastVIndex of the previous video frame, and save the presentation timestamp orgVpts of the current video frame as the original presentation timestamp lastorgVpts of the previous video frame, and so on.
  • S205: Determine whether the frame index difference between the current video frame and its previous video frame is equal to 1; if intervalVI is equal to 1, the flow proceeds to step S208, otherwise the flow proceeds to step S206.
  • S206: Determine whether the interval between the network layer's actual reception of the current video frame and of the previous video frame is greater than a preset threshold.
  • If the interval intervalVtime between the network layer's actual reception of the current video frame and of the previous video frame is greater than a preset threshold, for example 3 seconds, the flow proceeds to step S207; otherwise, the flow proceeds to step S208.
  • S207: Calculate sumIntervalTime = Σ(orgVtime − lastMediaTime), where lastMediaTime denotes the time at which the network layer actually received any previous video frame of the current video frame, and the summation symbol "Σ" denotes the accumulation of the time intervals.
  • the presentation timestamp of the current audio frame may be corrected by the following steps S1021 and S1022:
  • the frame index of the current audio frame may be corrected according to the corrected presentation timestamp of the current audio frame by the following steps S1031 and S1032:
  • S1031: Read the frame index lastAIndex of the previous audio frame of the current audio frame and the corrected presentation timestamp lastadjApts of the previous audio frame of the current audio frame.
  • when the network layer receives an audio frame, the actual time of receiving the audio frame is saved as the original receiving time orgAtime, the presentation timestamp of the audio frame is saved as the original audio frame presentation timestamp orgApts, and the frame index of the audio frame is saved as the original audio frame index orgAIndex.
  • S302: Determine whether the current audio frame is the first audio frame.
  • If the current audio frame is the first audio frame, the flow proceeds to step S303; otherwise, step S304 is performed.
  • S303: Set parameters such as the actual receiving time, the frame index and the presentation timestamp of the previous audio frame of the current audio frame.
  • this includes: saving orgAtime as the previous media frame time lastMediaTime, setting the frame index interval intervalVI between the current audio frame and the previous audio frame to 1, and setting the interval between the network layer's actual reception of the current audio frame and of the previous audio frame to 0;
  • the time orgAtime at which the network layer actually received the current audio frame is saved as the time lastAtime at which the network layer actually received the previous audio frame of the current audio frame;
  • the frame index orgAIndex of the current audio frame is saved as the frame index lastAIndex of the previous audio frame of the current audio frame;
  • the presentation timestamp orgApts of the current audio frame is saved as the corrected presentation timestamp lastadjApts of the previous audio frame of the current audio frame, and so on.
  • S304: Set the actual receiving time of the previous audio frame of the current audio frame, calculate the interval between the network layer's actual reception of the current audio frame and of the previous audio frame, and calculate parameters such as the frame index difference between the current audio frame and the previous audio frame.
  • S306: Determine whether the interval between the network layer's actual reception of the current audio frame and of the previous audio frame is greater than a preset threshold.
  • If the interval intervalAtime between the network layer's actual reception of the current audio frame and of the previous audio frame is greater than a preset threshold, for example 3 seconds, the flow proceeds to step S307; otherwise, the flow proceeds to step S308.
  • sumIntervalTime = Σ(orgAtime − lastMediaTime), where lastMediaTime denotes the time at which the network layer actually received any previous audio frame of the current audio frame, and the summation symbol "Σ" denotes the accumulation of the time intervals.
  • S309: Determine whether the current audio frame is the first audio frame.
  • If the current audio frame is the first audio frame, the flow proceeds to step S310; otherwise, step S311 is performed.
  • S310: Save the corrected presentation timestamp adjApts of the audio frame as the corrected presentation timestamp lastadjApts of the previous audio frame.
  • S311: Calculate the corrected frame index adjAIndex of the audio frame and save it as the frame index lastAIndex of the previous audio frame of the current audio frame, and save the time lastAtime at which the network layer actually received the previous audio frame of the current audio frame;
  • the corrected frame index adjAIndex is saved as the frame index lastAIndex of the previous audio frame, and the time orgAtime at which the network layer actually received the current audio frame is saved as the time lastAtime at which it received the previous audio frame of the current audio frame, and so on.
  • S104: Control the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that audio and video playback is synchronized.
  • controlling the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that the playback of the audio and the video is synchronized, may be as follows: if it is determined from the corrected presentation timestamps that the video is playing faster than the audio, the pause time of the video playback is increased until the audio and the video play at the same speed; if it is determined from the corrected presentation timestamps that the video is playing slower than the audio, the pause time of the video playback is reduced until the audio and the video play at the same speed.
  • parameters such as the presentation timestamps of the audio and video and the audio frame index are corrected, so accurate presentation timestamps of the audio and video frames can be obtained, which lays a good foundation for audio-video synchronization;
  • the accurate corrected presentation timestamps make it possible to control the playback speed of the video precisely, so that audio and video playback can ultimately be synchronized accurately.
  • FIG. 4 is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 4 of the present invention.
  • FIG. 4 shows only parts related to the embodiment of the present invention.
  • the audio-video synchronization device exemplified in Fig. 4 may be the execution body of the audio-video synchronization method exemplified in Fig. 1.
  • the audio and video synchronization apparatus of the example of FIG. 4 mainly includes a video frame presentation timestamp correction module 401, an audio frame presentation timestamp correction module 402, an audio frame index correction module 403 and a playback control module 404, wherein:
  • the video frame presentation timestamp correction module 401 is configured to correct the presentation timestamp of the current video frame;
  • the audio frame presentation timestamp correction module 402 is configured to correct the presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame;
  • the audio frame index correction module 403 is configured to correct the frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame;
  • the playback control module 404 is configured to control the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that the playback of the audio and the video is synchronized.
  • the division into functional modules is merely an example; in practical applications the above functions may be allocated to different functional modules as needed, for example according to the configuration requirements of the corresponding hardware or for convenience of software implementation, i.e. the internal structure of the audio and video synchronization apparatus is divided into different functional modules to perform all or part of the functions described above. Moreover, in practical applications, the corresponding functional modules in this embodiment may be implemented by corresponding hardware, or by corresponding hardware executing corresponding software.
  • the aforementioned video frame presentation timestamp correction module may be hardware that performs the aforementioned correction of the presentation timestamp of the current video frame, such as a video frame presentation timestamp corrector, or a general-purpose processor or other hardware device capable of executing a corresponding computer program to perform the aforementioned function;
  • the audio frame presentation timestamp correction module may be hardware that corrects the presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame, such as an audio frame presentation timestamp corrector, or a general-purpose processor or other hardware device capable of executing a corresponding computer program to perform the aforementioned functions (the described principle applies to the embodiments provided in this specification).
  • the video frame presentation timestamp correction module 401 illustrated in FIG. 4 may include a first reading unit 501, an obtaining unit 502 and a first calculating unit 503, as in the audio and video synchronization apparatus according to Embodiment 5 of the present invention shown in FIG. 5, wherein:
  • the first reading unit 501 is configured to read the presentation timestamp lastVpts of the previous video frame of the current video frame, the time orgVtime at which the network layer actually received the current video frame, and the time lastVtime at which the network layer actually received the previous video frame;
  • the obtaining unit 502 is configured to obtain the cumulative video time-jump value sumIntervalTime;
  • the audio frame presentation timestamp correction module 402 illustrated in FIG. 5 may include a second reading unit 601 and a second calculating unit 602, as in the audio and video synchronization apparatus according to Embodiment 6 of the present invention shown in FIG. 6, wherein:
  • the second reading unit 601 is configured to read the presentation timestamp lastVpts of the previous video frame of the current video frame, the original presentation timestamp orgVpts of the video frame actually received by the network layer and the original presentation timestamp orgApts of the audio frame actually received by the network layer;
  • the audio frame index correction module 403 illustrated in FIG. 6 may include a third reading unit 701 and a third calculating unit 702, as in the audio and video synchronization apparatus according to Embodiment 7 of the present invention shown in FIG. 7, wherein:
  • the third reading unit 701 is configured to read the frame index lastAIndex of the previous audio frame of the current audio frame and the corrected presentation timestamp lastadjApts of the previous audio frame of the current audio frame;
  • the playback control module 404 of any of FIGS. 4 to 7 may include a deceleration unit 801 and an acceleration unit 802, as in the audio and video synchronization apparatuses according to Embodiments 8 to 11 of the present invention shown in FIGS. 8-a to 8-d, wherein:
  • the deceleration unit 801 is configured to increase the pause time of the video playback, until the audio and the video play at the same speed, if it is determined from the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that the video playback speed is faster than the audio playback speed;
  • the acceleration unit 802 is configured to reduce the pause time of the video playback, until the audio and the video play at the same speed, if it is determined from the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that the video playback speed is slower than the audio playback speed.
  • the present invention provides a schematic diagram of an audio and video synchronization device.
  • the audio and video synchronization device may be a functional unit in a computer device or a computer device.
  • the specific embodiment of the present invention does not limit the specific implementation of the audio and video synchronization device.
  • the audio and video synchronization device includes:
  • a processor 901, a communications interface 902, a memory 903 and a bus 904.
  • the processor 901, the communication interface 902, and the memory 903 complete communication with each other via the bus 904.
  • the communication interface 902 is configured to communicate with an external device, such as a personal computer, a server, or the like.
  • the processor 901 is configured to execute the program 905.
  • the program 905 can include program code, the program code including computer operating instructions.
  • the processor 901 may be a central processing unit CPU, or an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • CPU central processing unit
  • ASIC application specific integrated circuit
  • the memory 903 is configured to store the program 905.
  • the memory 903 may include a high-speed RAM memory, and may also include a non-volatile memory, for example at least one disk memory.
  • the program 905 may specifically include: correcting the presentation timestamp of the current video frame; correcting the presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame; correcting the frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame; and controlling the playback speed of the video according to the corrected presentation timestamps, so that audio and video playback is synchronized.
  • each module in the program 905 refers to the corresponding modules in the embodiment shown in FIG. 4, and details are not described herein.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division into units is only a logical functional division; in actual implementation there may be other divisions, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some communication interface, device or unit, and may be in electrical, mechanical or other form.
  • the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e. they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium.
  • the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention.
  • the foregoing storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

An audio and video synchronization method, which solves the problem in the prior art that audio and video cannot be well synchronized when a media file is played, includes: correcting the presentation timestamp of a current video frame; correcting the presentation timestamp of a current audio frame according to the corrected presentation timestamp of the current video frame; correcting the frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame; and controlling the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that audio and video playback is synchronized. The method enables audio and video to be accurately synchronized when a media file is played.

Description

Title of Invention: Audio and video synchronization method and apparatus
Technical Field
[0001] The present invention relates to the field of media playback, and in particular to an audio and video synchronization method and apparatus.
Background
[0002] Usually, the audio data and the video data in a digital media file or media stream are stored and processed separately. Therefore, when playing audio and video, the player must synchronize the audio with the video; otherwise, the picture the viewer sees and the sound the viewer hears will not be in step. To achieve synchronization, the media source has a clock, and every frame of audio or video data is stamped with a so-called Presentation Time Stamp (PTS). The PTS indicates the relative time at which the player should play each audio or video frame, measured against the time base of the media source clock.
[0003] During playback, the player must ensure that the corresponding audio or video frame is played at the specified time. To this end, the player maintains a local clock, and the audio or video renderer compares the PTS with the local clock. If they match, i.e. the PTS falls within a certain acceptance window of the local clock, the corresponding audio or video frame is played; if the PTS is earlier than the acceptance window, the data frame is discarded; if the PTS is later than the acceptance window, the data frame is retained and its presentation is delayed until the corresponding time.
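By way of illustration only (this sketch is not part of the claimed method), the renderer behaviour described above can be expressed as a small decision function. The Python form, the function name and the size of the acceptance window are assumptions introduced here:

```python
def renderer_decision(pts, local_clock, window=0.040):
    """Classify a frame by comparing its PTS with the local clock.

    All values are in seconds; `window` is the acceptance interval around the
    local clock (an assumed example value of 40 ms).
    """
    if abs(pts - local_clock) <= window:
        return "play"    # PTS falls within the acceptance window: play the frame
    if pts < local_clock - window:
        return "drop"    # PTS is earlier than the window: discard the frame
    return "delay"       # PTS is later than the window: keep the frame and present it later
```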
[0004] The playback control logic of existing players is to read the audio and video data and to play the audio and the video separately according to the PTS of the audio and video data, without analysing or correcting the timestamps of the audio and video data. However, the audio and video data are obtained by the playback software over the network, and the condition of the network has a very large influence on the audio-video synchronization effect. Therefore, the result obtained with existing audio and video synchronization methods is not accurate.
Technical Problem
[0005] The object of the present invention is to provide an audio and video synchronization method and apparatus, so that audio and video can be accurately synchronized when a media file is played.
Solution to the Problem
Technical Solution
[0006] A first aspect of the present invention provides an audio and video synchronization method, including:
[0007] correcting a presentation timestamp of a current video frame; [0008] correcting a presentation timestamp of a current audio frame according to the corrected presentation timestamp of the current video frame; [0009] correcting a frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame; [0010] controlling a playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that the playback of the audio and the video is synchronized.
[0011] A second aspect of the present invention provides an audio and video synchronization apparatus, including:
[0012] a video frame presentation timestamp correction module, configured to correct a presentation timestamp of a current video frame;
[0013] an audio frame presentation timestamp correction module, configured to correct a presentation timestamp of a current audio frame according to the corrected presentation timestamp of the current video frame;
[0014] an audio frame index correction module, configured to correct a frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame;
[0015] a playback control module, configured to control a playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that the playback of the audio and the video is synchronized.
Advantageous Effects of the Invention
Advantageous Effects
[0016] It can be seen from the technical solutions provided by the present invention that, on the one hand, before the audio-video media file is played, parameters such as the presentation timestamps of the audio and video and the audio frame index are corrected, so that accurate presentation timestamps of the audio and video frames can be obtained, which lays a good foundation for audio-video synchronization; on the other hand, the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that have already been obtained, i.e. accurate presentation timestamps, make it possible to control the playback speed of the video precisely, so that audio and video playback can ultimately be synchronized accurately.
Brief Description of the Drawings
Description of the Drawings
[0017] FIG. 1 is a schematic flowchart of an audio and video synchronization method according to Embodiment 1 of the present invention;
[0018] FIG. 2 is a schematic flowchart of correcting the presentation timestamp of a current video frame according to Embodiment 2 of the present invention;
[0019] FIG. 3 is a schematic flowchart of correcting the presentation timestamp and the frame index of a current audio frame according to Embodiment 3 of the present invention;
[0020] FIG. 4 is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 4 of the present invention;
[0021] FIG. 5 is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 5 of the present invention;
[0022] FIG. 6 is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 6 of the present invention;
[0023] FIG. 7 is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 7 of the present invention;
[0024] FIG. 8-a is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 8 of the present invention;
[0025] FIG. 8-b is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 9 of the present invention;
[0026] FIG. 8-c is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 10 of the present invention;
[0027] FIG. 8-d is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 11 of the present invention;
[0028] FIG. 9 is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 12 of the present invention.
Description of Embodiments
[0029] In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention and are not intended to limit it.
[0030] The present invention provides an audio and video synchronization method, the method including: correcting the presentation timestamp of a current video frame; correcting the presentation timestamp of a current audio frame according to the corrected presentation timestamp of the current video frame; correcting the frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame; and controlling the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that audio and video playback is synchronized. The present invention also provides a corresponding audio and video synchronization apparatus. They are described in detail below.
[0031] Referring to FIG. 1, which is a schematic flowchart of an audio and video synchronization method according to Embodiment 1 of the present invention, the method mainly includes the following steps S101 to S104, described in detail as follows:
[0032] S101: Correct the presentation timestamp of the current video frame.
[0033] As an embodiment of the present invention, correcting the presentation timestamp of the current video frame may be implemented by the following steps S1011 to S1013:
[0034] S1011: Read the presentation timestamp lastVpts of the previous video frame of the current video frame, the time orgVtime at which the network layer actually received the current video frame, and the time lastVtime at which the network layer actually received the previous video frame of the current video frame.
[0035] It should be noted that the presentation timestamp lastVpts of the previous video frame of the current video frame is saved after the presentation timestamp of that previous frame has been corrected, while orgVtime and lastVtime can both be obtained by parsing each video frame when it is received.
[0036] S1012: Obtain the cumulative video time-jump value sumIntervalTime.
[0037] It should be noted that the cumulative video time-jump value sumIntervalTime is the accumulated difference between the time orgVtime at which the network layer actually received the current video frame and the time at which the network layer actually received the previous video frame, and can be expressed as sumIntervalTime = Σ(orgVtime − lastMediaTime), where lastMediaTime denotes the time at which the network layer actually received any previous video frame of the current video frame, and the summation symbol "Σ" denotes the accumulation of the time intervals.
[0038] S1013: Calculate the corrected presentation timestamp of the video frame according to the formula adjVpts = lastVpts + (orgVtime − lastVtime) − sumIntervalTime, where adjVpts is the corrected presentation timestamp of the current video frame.
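As a minimal sketch (for illustration only; the function name and the assumption that all timestamps share one unit, e.g. milliseconds, are introduced here), step S1013 can be written as:

```python
def correct_video_pts(last_vpts, org_vtime, last_vtime, sum_interval_time):
    """Corrected presentation timestamp of the current video frame (S1013).

    last_vpts         : presentation timestamp saved for the previous video frame (lastVpts)
    org_vtime         : time the network layer actually received the current frame (orgVtime)
    last_vtime        : time the network layer actually received the previous frame (lastVtime)
    sum_interval_time : cumulative video time-jump value (sumIntervalTime)
    """
    return last_vpts + (org_vtime - last_vtime) - sum_interval_time
```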
[0039] A more detailed procedure for correcting the presentation timestamp of the current video frame is shown in FIG. 2 and mainly includes steps S201 to S208, described as follows:
[0040] S201: Save the original parameter values.
[0041] Specifically, this includes: when the network layer receives a video frame, saving the actual time at which the video frame was received as the original receiving time orgVtime, saving the presentation timestamp of the video frame as the original video frame presentation timestamp orgVpts, and saving the frame index of the video frame as the original video frame index orgVIndex.
[0042] S202: Determine whether the current video frame is the first video frame.
[0043] If the current video frame is the first video frame, the flow proceeds to step S203; otherwise, step S204 is performed.
[0044] S203: Set parameters such as the actual receiving time, the frame index and the presentation timestamp of the previous video frame of the current video frame, and the cumulative video time-jump value.
[0045] Specifically, this includes: saving orgVtime as the previous media frame time lastMediaTime; setting the frame index interval intervalVI between the current video frame and the previous video frame to 1; setting the cumulative video time-jump value to 0; setting the interval intervalVtime between the network layer's actual reception of the current video frame and of the previous video frame to 0; taking the presentation timestamp orgVpts of the current video frame as the presentation timestamp lastVpts of the previous video frame of the first video frame; taking the time orgVtime at which the network layer actually received the current video frame as the time lastVtime at which it received the previous video frame; taking the frame index orgVIndex of the current video frame as the frame index lastVIndex of the previous video frame; and saving the presentation timestamp orgVpts of the current video frame as the original presentation timestamp lastorgVpts of the previous video frame, and so on.
[0046] S204: Set the actual receiving time, the frame index and the presentation timestamp of the previous video frame of the current video frame, calculate the interval between the network layer's actual reception of the current video frame and of the previous video frame, and calculate parameters such as the cumulative video time-jump value.
[0047] Specifically, this includes: saving orgVtime as the previous media frame time lastMediaTime; reading the frame index orgVIndex of the current video frame and the frame index lastVIndex of the previous video frame and calculating the frame index difference between them according to the formula intervalVI = orgVIndex − lastVIndex, where intervalVI is the frame index difference between the current video frame and its previous video frame; reading the time orgVtime at which the network layer actually received the current video frame and the time lastVtime at which it actually received the previous video frame and calculating the interval between them according to the formula intervalVtime = orgVtime − lastVtime, where intervalVtime is the interval between the network layer's actual reception of the current video frame and of the previous video frame; saving orgVtime as the time lastVtime at which the network layer actually received the previous video frame; saving the frame index orgVIndex of the current video frame as the frame index lastVIndex of the previous video frame; saving the presentation timestamp orgVpts of the current video frame as the original presentation timestamp lastorgVpts of the previous video frame; and so on.
[0048] S205: Determine whether the frame index difference between the current video frame and its previous video frame is equal to 1.
[0049] If intervalVI is equal to 1, the flow proceeds to step S208; otherwise, the flow proceeds to step S206.
[0050] S206: Determine whether the interval between the network layer's actual reception of the current video frame and of the previous video frame is greater than a preset threshold.
[0051] If the interval intervalVtime between the network layer's actual reception of the current video frame and of the previous video frame is greater than a preset threshold, for example 3 seconds, the flow proceeds to step S207; otherwise, the flow proceeds to step S208.
[0052] S207: Calculate the cumulative video time-jump value sumIntervalTime.
[0053] In this embodiment of the present invention, the calculation follows the formula sumIntervalTime = Σ(orgVtime − lastMediaTime), where lastMediaTime denotes the time at which the network layer actually received any previous video frame of the current video frame, and the summation symbol "Σ" denotes the accumulation of the time intervals.
[0054] S208: Save the presentation timestamp lastVpts of the previous video frame of the current video frame and the previous media frame time lastMediaTime, and calculate the corrected presentation timestamp adjVpts of the current video frame.
[0055] Specifically, the presentation timestamp orgVpts of the current video frame plus intervalVtime is saved as the presentation timestamp lastVpts of the previous video frame of the current video frame, the time orgVtime at which the network layer actually received the current video frame is saved as the previous media frame time lastMediaTime, and the corrected presentation timestamp of the video frame is calculated according to the formula adjVpts = lastVpts + (orgVtime − lastVtime) − sumIntervalTime, where adjVpts is the corrected presentation timestamp of the current video frame, and so on.
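Taken together, steps S201 to S208 can be read roughly as the per-frame routine below. This is a hedged sketch of one possible reading: the patent does not fully specify the order in which the saved "previous frame" values are updated relative to the adjVpts calculation, so the ordering below, the dictionary layout and the millisecond threshold value are interpretations and assumptions made here, not a verbatim transcription.

```python
JUMP_THRESHOLD = 3000  # preset threshold of S206; 3 seconds, assuming millisecond timestamps

def process_video_frame(state, org_vtime, org_vpts, org_vindex):
    """One reading of the FIG. 2 flow (S201-S208) for a single received video frame.

    `state` is None for the first video frame; afterwards it carries the saved
    previous-frame values. Returns (adjVpts, state).
    """
    if state is None:
        # S202/S203: first video frame -- seed the "previous frame" state from it
        state = {
            "lastMediaTime": org_vtime, "sumIntervalTime": 0,
            "lastVpts": org_vpts, "lastVtime": org_vtime,
            "lastVIndex": org_vindex, "lastorgVpts": org_vpts,
        }
        return org_vpts, state  # with zero intervals the correction degenerates to orgVpts

    # S204: frame-index difference and receive-time interval against the saved frame
    interval_vi = org_vindex - state["lastVIndex"]
    interval_vtime = org_vtime - state["lastVtime"]

    # S205-S207: accumulate the time jump only when frames were skipped
    # (index gap != 1) and the receive-time gap exceeds the threshold
    if interval_vi != 1 and interval_vtime > JUMP_THRESHOLD:
        state["sumIntervalTime"] += org_vtime - state["lastMediaTime"]

    # S208: corrected presentation timestamp of the current video frame
    adj_vpts = state["lastVpts"] + (org_vtime - state["lastVtime"]) - state["sumIntervalTime"]

    # S204/S208: save the current frame as the "previous frame" for the next call
    state["lastVpts"] = org_vpts + interval_vtime   # per paragraph [0055]
    state["lastMediaTime"] = org_vtime
    state["lastVtime"] = org_vtime
    state["lastVIndex"] = org_vindex
    state["lastorgVpts"] = org_vpts
    return adj_vpts, state
```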
[0056] S102: Correct the presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame.
[0057] As an embodiment of the present invention, correcting the presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame may be implemented by the following steps S1021 and S1022:
[0058] S1021: Read the presentation timestamp lastVpts of the previous video frame of the current video frame, the original presentation timestamp orgVpts of the video frame actually received by the network layer, and the original presentation timestamp orgApts of the audio frame actually received by the network layer.
[0059] S1022: Calculate the corrected presentation timestamp of the audio frame according to the formula adjApts = lastVpts + (orgApts − orgVpts), where adjApts is the corrected presentation timestamp of the current audio frame.
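A one-line sketch of S1022 (the function name and the shared time unit are assumptions made here):

```python
def correct_audio_pts(last_vpts, org_apts, org_vpts):
    """Corrected presentation timestamp of the current audio frame (S1022).

    last_vpts : presentation timestamp saved for the previous video frame (lastVpts)
    org_apts  : original PTS of the audio frame received by the network layer (orgApts)
    org_vpts  : original PTS of the video frame received by the network layer (orgVpts)
    """
    return last_vpts + (org_apts - org_vpts)
```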
[0060] S103: Correct the frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame.
[0061] As an embodiment of the present invention, correcting the frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame may be implemented by the following steps S1031 and S1032:
[0062] S1031: Read the frame index lastAIndex of the previous audio frame of the current audio frame and the corrected presentation timestamp lastadjApts of the previous audio frame of the current audio frame.
[0063] S1032: Calculate the corrected frame index of the audio frame according to the formula adjAIndex = lastAIndex + (adjApts − lastadjApts) / 60, where adjAIndex is the corrected frame index of the current audio frame.
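And S1032 as a sketch; the divisor 60 is the per-audio-frame duration in PTS units used by the formula above and is kept as given, while the function name and the optional parameter are assumptions:

```python
def correct_audio_index(last_aindex, adj_apts, last_adj_apts, frame_duration=60):
    """Corrected frame index of the current audio frame (S1032).

    last_aindex   : frame index of the previous audio frame (lastAIndex)
    adj_apts      : corrected PTS of the current audio frame (adjApts)
    last_adj_apts : corrected PTS of the previous audio frame (lastadjApts)
    """
    return last_aindex + (adj_apts - last_adj_apts) / frame_duration
```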
[0064] A more detailed procedure for correcting the presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame, and for correcting the frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame, is shown in FIG. 3 and mainly includes steps S301 to S311, described as follows:
[0065] S301: Save the original parameter values.
[0066] Specifically, this includes: when the network layer receives an audio frame, saving the actual time at which the audio frame was received as the original receiving time orgAtime, saving the presentation timestamp of the audio frame as the original audio frame presentation timestamp orgApts, and saving the frame index of the audio frame as the original audio frame index orgAIndex.
[0067] S302: Determine whether the current audio frame is the first audio frame.
[0068] If the current audio frame is the first audio frame, the flow proceeds to step S303; otherwise, step S304 is performed.
[0069] S303: Set parameters such as the actual receiving time, the frame index and the presentation timestamp of the previous audio frame of the current audio frame.
[0070] Specifically, this includes: saving orgAtime as the previous media frame time lastMediaTime; setting the frame index interval intervalVI between the current audio frame and the previous audio frame to 1; setting the interval intervalVtime between the network layer's actual reception of the current audio frame and of the previous audio frame to 0; calculating the presentation timestamp lastApts of the previous audio frame of the first audio frame according to the formula lastApts = orgApts − lastorgVpts + lastVpts; taking the time orgAtime at which the network layer actually received the current audio frame as the time lastAtime at which it received the previous audio frame; taking the frame index orgAIndex of the current audio frame as the frame index lastAIndex of the previous audio frame; and saving the presentation timestamp orgApts of the current audio frame as the corrected presentation timestamp lastadjApts of the previous audio frame, and so on.
[0071] S304: Set the actual receiving time of the previous audio frame of the current audio frame, calculate the interval between the network layer's actual reception of the current audio frame and of the previous audio frame, and calculate parameters such as the frame index difference between the current audio frame and the previous audio frame.
[0072] Specifically, this includes: saving orgAtime as the previous media frame time lastMediaTime; reading the frame index orgAIndex of the current audio frame and the frame index lastAIndex of the previous audio frame and calculating the frame index difference between them according to the formula intervalAI = orgAIndex − lastAIndex, where intervalAI is the frame index difference between the current audio frame and its previous audio frame; reading the time orgAtime at which the network layer actually received the current audio frame and the time lastAtime at which it actually received the previous audio frame and calculating the interval between them according to the formula intervalAtime = orgAtime − lastAtime, where intervalAtime is the interval between the network layer's actual reception of the current audio frame and of the previous audio frame; and so on.
[0073] S305: Determine whether the frame index difference between the current audio frame and its previous audio frame is equal to 1. [0074] If intervalAI is equal to 1, the flow proceeds to step S308; otherwise, the flow proceeds to step S306.
[0075] S306: Determine whether the interval between the network layer's actual reception of the current audio frame and of the previous audio frame is greater than a preset threshold.
[0076] If the interval intervalAtime between the network layer's actual reception of the current audio frame and of the previous audio frame is greater than a preset threshold, for example 3 seconds, the flow proceeds to step S307; otherwise, the flow proceeds to step S308.
[0077] S307: Calculate the cumulative time-interval value sumIntervalTime.
[0078] In this embodiment of the present invention, the calculation follows the formula sumIntervalTime = Σ(orgAtime − lastMediaTime), where lastMediaTime denotes the time at which the network layer actually received any previous audio frame of the current audio frame, and the summation symbol "Σ" denotes the accumulation of the time intervals.
[0079] S308: Calculate the presentation timestamp lastApts of the previous audio frame of the current audio frame and the corrected presentation timestamp adjApts of the current audio frame.
[0080] Specifically, the presentation timestamp lastApts of the previous audio frame of the current audio frame is calculated according to the formula lastApts = orgApts − lastorgVpts + lastVpts, and the corrected presentation timestamp adjApts of the audio frame is calculated according to the formula adjApts = orgApts − lastorgVpts + lastVpts − sumIntervalTime, and so on.
[0081] S309: Determine whether the current audio frame is the first audio frame.
[0082] If the current audio frame is the first audio frame, the flow proceeds to step S310; otherwise, step S311 is performed.
[0083] S310: Save the corrected presentation timestamp adjApts of the audio frame as the corrected presentation timestamp lastadjApts of the previous audio frame.
[0084] S311: Calculate the corrected frame index adjAIndex of the audio frame and save it as the frame index lastAIndex of the previous audio frame of the current audio frame, and save the time lastAtime at which the network layer actually received the previous audio frame of the current audio frame.
[0085] Specifically, the corrected frame index adjAIndex of the audio frame is calculated according to the formula adjAIndex = lastAIndex + (adjApts − lastadjApts) / 60, the corrected frame index adjAIndex is saved as the frame index lastAIndex of the previous audio frame of the current audio frame, and at the same time the time orgAtime at which the network layer actually received the current audio frame is saved as the time lastAtime at which the network layer actually received the previous audio frame, and so on.
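Combined, steps S301 to S311 can be read roughly as the routine below. As with the video flow, this is a hedged interpretation rather than a verbatim transcription: the exact order of the state updates, the dictionary layout and the millisecond threshold are assumptions made here, and the video-side values lastVpts and lastorgVpts are taken from whatever state the video correction flow maintains.

```python
JUMP_THRESHOLD = 3000   # preset threshold of S306; 3 seconds, assuming millisecond timestamps
AUDIO_FRAME_PTS = 60    # PTS units per audio frame, as used in the adjAIndex formula

def process_audio_frame(astate, vstate, org_atime, org_apts, org_aindex):
    """One reading of the FIG. 3 flow (S301-S311) for a single received audio frame.

    `vstate` carries the saved video values lastVpts and lastorgVpts; `astate`
    is None for the first audio frame. Returns (adjApts, adjAIndex, astate).
    """
    last_vpts = vstate["lastVpts"]
    last_org_vpts = vstate["lastorgVpts"]

    if astate is None:
        # S302/S303: first audio frame -- seed the "previous frame" state from it
        astate = {
            "lastMediaTime": org_atime, "sumIntervalTime": 0,
            "lastAtime": org_atime, "lastAIndex": org_aindex,
            "lastadjApts": org_apts,
        }
        first = True
    else:
        # S304-S307: intervals against the saved frame, with conditional accumulation
        interval_ai = org_aindex - astate["lastAIndex"]
        interval_atime = org_atime - astate["lastAtime"]
        if interval_ai != 1 and interval_atime > JUMP_THRESHOLD:
            astate["sumIntervalTime"] += org_atime - astate["lastMediaTime"]
        first = False

    # S308: corrected presentation timestamp of the current audio frame
    adj_apts = org_apts - last_org_vpts + last_vpts - astate["sumIntervalTime"]

    if first:
        # S309/S310: remember the corrected PTS; no index correction yet
        astate["lastadjApts"] = adj_apts
        adj_aindex = astate["lastAIndex"]
    else:
        # S311: corrected frame index, then roll the state forward
        adj_aindex = astate["lastAIndex"] + (adj_apts - astate["lastadjApts"]) / AUDIO_FRAME_PTS
        astate["lastAIndex"] = adj_aindex
        astate["lastadjApts"] = adj_apts

    astate["lastMediaTime"] = org_atime
    astate["lastAtime"] = org_atime
    return adj_apts, adj_aindex, astate
```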
[0086] S104: Control the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that the playback of the audio and the video is synchronized.
[0087] As an embodiment of the present invention, controlling the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that audio and video playback is synchronized, may be as follows: if it is determined from the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that the video is playing faster than the audio, the pause time of the video playback is increased until the audio and the video play at the same speed; if it is determined from the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that the video is playing slower than the audio, the pause time of the video playback is reduced until the audio and the video play at the same speed.
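A minimal sketch of this control rule; the patent only specifies the direction of the adjustment, so the step size, the initial pause value and the non-negative clamp are assumptions introduced here:

```python
def adjust_video_pause(pause_ms, adj_vpts, adj_apts, step_ms=5):
    """Adjust the per-frame pause of video playback (S104).

    If the corrected video PTS runs ahead of the corrected audio PTS, the video
    is playing too fast and the pause is lengthened; if it lags behind, the
    pause is shortened. `step_ms` is an assumed adjustment step.
    """
    if adj_vpts > adj_apts:                  # video faster than audio
        pause_ms += step_ms
    elif adj_vpts < adj_apts:                # video slower than audio
        pause_ms = max(0, pause_ms - step_ms)
    return pause_ms
```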
[0088] It can be seen from the audio and video synchronization method illustrated in FIG. 1 that, on the one hand, before the audio-video media file is played, parameters such as the presentation timestamps of the audio and video and the audio frame index are corrected, so that accurate presentation timestamps of the audio and video frames can be obtained, which lays a good foundation for audio-video synchronization; on the other hand, the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that have already been obtained, i.e. accurate presentation timestamps, make it possible to control the playback speed of the video precisely, so that audio and video playback can ultimately be synchronized accurately.
[0089] Referring to FIG. 4, which is a schematic structural diagram of an audio and video synchronization apparatus according to Embodiment 4 of the present invention. For ease of description, FIG. 4 shows only the parts related to this embodiment of the present invention. The audio and video synchronization apparatus illustrated in FIG. 4 may be the execution body of the audio and video synchronization method illustrated in FIG. 1. The apparatus illustrated in FIG. 4 mainly includes a video frame presentation timestamp correction module 401, an audio frame presentation timestamp correction module 402, an audio frame index correction module 403 and a playback control module 404, where:
[0090] the video frame presentation timestamp correction module 401 is configured to correct the presentation timestamp of the current video frame;
[0091] the audio frame presentation timestamp correction module 402 is configured to correct the presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame;
[0092] the audio frame index correction module 403 is configured to correct the frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame;
[0093] the playback control module 404 is configured to control the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that the playback of the audio and the video is synchronized.
[0094] It should be noted that, in the implementation of the audio and video synchronization apparatus illustrated in FIG. 4 above, the division into functional modules is only an example; in practical applications the above functions may be allocated to different functional modules as needed, for example according to the configuration requirements of the corresponding hardware or for convenience of software implementation, i.e. the internal structure of the audio and video synchronization apparatus is divided into different functional modules to perform all or part of the functions described above. Moreover, in practical applications, the corresponding functional modules in this embodiment may be implemented by corresponding hardware, or by corresponding hardware executing corresponding software. For example, the aforementioned video frame presentation timestamp correction module may be hardware that performs the aforementioned correction of the presentation timestamp of the current video frame, such as a video frame presentation timestamp corrector, or may be a general-purpose processor or other hardware device capable of executing a corresponding computer program to perform the aforementioned function; likewise, the aforementioned audio frame presentation timestamp correction module may be hardware that corrects the presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame, such as an audio frame presentation timestamp corrector, or may be a general-purpose processor or other hardware device capable of executing a corresponding computer program to perform the aforementioned function (the above principle applies to each embodiment provided in this specification).
[0095] The video frame presentation timestamp correction module 401 illustrated in FIG. 4 may include a first reading unit 501, an obtaining unit 502 and a first calculating unit 503, as in the audio and video synchronization apparatus according to Embodiment 5 of the present invention shown in FIG. 5, where:
[0096] the first reading unit 501 is configured to read the presentation timestamp lastVpts of the previous video frame of the current video frame, the time orgVtime at which the network layer actually received the current video frame, and the time lastVtime at which the network layer actually received the previous video frame of the current video frame;
[0097] the obtaining unit 502 is configured to obtain the cumulative video time-jump value sumIntervalTime;
[0098] the first calculating unit 503 is configured to calculate the corrected presentation timestamp of the video frame according to the formula adjVpts = lastVpts + (orgVtime − lastVtime) − sumIntervalTime, where adjVpts is the corrected presentation timestamp of the current video frame.
[0099] The audio frame presentation timestamp correction module 402 illustrated in FIG. 5 may include a second reading unit 601 and a second calculating unit 602, as in the audio and video synchronization apparatus according to Embodiment 6 of the present invention shown in FIG. 6, where: [0100] the second reading unit 601 is configured to read the presentation timestamp lastVpts of the previous video frame of the current video frame, the original presentation timestamp orgVpts of the video frame actually received by the network layer, and the original presentation timestamp orgApts of the audio frame actually received by the network layer;
[0101] the second calculating unit 602 is configured to calculate the corrected presentation timestamp of the audio frame according to the formula adjApts = lastVpts + (orgApts − orgVpts), where adjApts is the corrected presentation timestamp of the current audio frame. [0102] The audio frame index correction module 403 illustrated in FIG. 6 may include a third reading unit 701 and a third calculating unit 702, as in the audio and video synchronization apparatus according to Embodiment 7 of the present invention shown in FIG. 7, where:
[0103] the third reading unit 701 is configured to read the frame index lastAIndex of the previous audio frame of the current audio frame and the corrected presentation timestamp lastadjApts of the previous audio frame of the current audio frame;
[0104] the third calculating unit 702 is configured to calculate the corrected frame index of the audio frame according to the formula adjAIndex = lastAIndex + (adjApts − lastadjApts) / 60, where adjAIndex is the corrected frame index of the current audio frame.
[0105] The playback control module 404 of any of FIGS. 4 to 7 may include a deceleration unit 801 and an acceleration unit 802, as in the audio and video synchronization apparatuses according to Embodiments 8 to 11 of the present invention shown in FIGS. 8-a to 8-d, where:
[0106] the deceleration unit 801 is configured to increase the pause time of the video playback, until the audio and the video play at the same speed, if it is determined from the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that the video playback speed is faster than the audio playback speed;
[0107] the acceleration unit 802 is configured to reduce the pause time of the video playback, until the audio and the video play at the same speed, if it is determined from the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that the video playback speed is slower than the audio playback speed.
[0108] Referring to FIG. 9, the present invention provides a schematic diagram of an audio and video synchronization apparatus. The audio and video synchronization apparatus may be a computer device or a functional unit in a computer device; the specific embodiments of the present invention do not limit the specific implementation of the audio and video synchronization apparatus. The audio and video synchronization apparatus includes:
[0109] a processor 901, a communications interface 902, a memory 903 and a bus 904.
[0110] The processor 901, the communications interface 902 and the memory 903 communicate with one another via the bus 904.
[0111] The communications interface 902 is configured to communicate with external devices, for example a personal computer, a server, and the like.
[0112] The processor 901 is configured to execute a program 905.
[0113] Specifically, the program 905 may include program code, and the program code includes computer operation instructions.
[0114] The processor 901 may be a central processing unit (CPU), or an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
[0115] The memory 903 is configured to store the program 905. The memory 903 may include a high-speed RAM memory, and may also include a non-volatile memory, for example at least one disk memory. The program 905 may specifically include:
[0116] correcting the presentation timestamp of the current video frame;
[0117] correcting the presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame; [0118] correcting the frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame; [0119] controlling the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that the playback of the audio and the video is synchronized.
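Read together, the four items of program 905 amount to one small per-frame step. The sketch below strings the simple formulas of S1013, S1022 and S1032 together in that order; the function name, the argument list and the pause-adjustment step are assumptions made for illustration and are not part of the patent.

```python
def sync_step(last_vpts, org_vtime, last_vtime, sum_interval_time,
              org_vpts, org_apts, last_aindex, last_adj_apts,
              pause_ms, step_ms=5):
    """One pass over the four steps stored in program 905."""
    # correct the presentation timestamp of the current video frame (S1013)
    adj_vpts = last_vpts + (org_vtime - last_vtime) - sum_interval_time
    # correct the presentation timestamp of the current audio frame (S1022)
    adj_apts = last_vpts + (org_apts - org_vpts)
    # correct the frame index of the current audio frame (S1032)
    adj_aindex = last_aindex + (adj_apts - last_adj_apts) / 60
    # control the video playback speed so that audio and video stay in step (S104)
    if adj_vpts > adj_apts:
        pause_ms += step_ms                       # video ahead of audio: pause longer
    elif adj_vpts < adj_apts:
        pause_ms = max(0, pause_ms - step_ms)     # video behind audio: pause less
    return adj_vpts, adj_apts, adj_aindex, pause_ms
```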
[0120] For the specific implementation of each module in the program 905, refer to the corresponding modules in the embodiment shown in FIG. 4; details are not repeated here.
[0121] It can be clearly understood by a person skilled in the art that, for convenience and brevity of description, for the specific working processes of the apparatus and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.
[0122] In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for example, the division into units is only a logical functional division, and there may be other divisions in actual implementation, for example multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
[0123] The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e. they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
[0124] In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit.
[0125] If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, or any other medium that can store program code.
The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims

Claims
[Claim 1] An audio and video synchronization method, characterized in that the method comprises:
correcting a presentation timestamp of a current video frame;
correcting a presentation timestamp of a current audio frame according to the corrected presentation timestamp of the current video frame;
correcting a frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame;
controlling a playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that playback of the audio and the video is synchronized.
[Claim 2] The method according to claim 1, characterized in that correcting the presentation timestamp of the current video frame comprises:
reading a presentation timestamp lastVpts of a previous video frame of the current video frame, a time orgVtime at which a network layer actually received the current video frame, and a time lastVtime at which the network layer actually received the previous video frame of the current video frame;
obtaining a cumulative video time-jump value sumIntervalTime;
calculating the corrected presentation timestamp of the video frame according to the formula adjVpts = lastVpts + (orgVtime − lastVtime) − sumIntervalTime, wherein adjVpts is the corrected presentation timestamp of the current video frame.
[Claim 3] The method according to claim 2, characterized in that correcting the presentation timestamp of the current audio frame according to the corrected presentation timestamp of the current video frame comprises:
reading the presentation timestamp lastVpts of the previous video frame of the current video frame, an original presentation timestamp orgVpts of the video frame actually received by the network layer, and an original presentation timestamp orgApts of the audio frame actually received by the network layer;
calculating the corrected presentation timestamp of the audio frame according to the formula adjApts = lastVpts + (orgApts − orgVpts), wherein adjApts is the corrected presentation timestamp of the current audio frame.
[Claim 4] The method according to claim 3, characterized in that correcting the frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame comprises:
reading a frame index lastAIndex of a previous audio frame of the current audio frame and a corrected presentation timestamp lastadjApts of the previous audio frame of the current audio frame;
calculating the corrected frame index of the audio frame according to the formula adjAIndex = lastAIndex + (adjApts − lastadjApts) / 60, wherein adjAIndex is the corrected frame index of the current audio frame.
[Claim 5] The method according to any one of claims 1 to 4, characterized in that controlling the playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that playback of the audio and the video is synchronized, comprises:
if it is determined from the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that the video playback speed is faster than the audio playback speed, increasing the pause time of the video playback until the audio and the video play at the same speed;
if it is determined from the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that the video playback speed is slower than the audio playback speed, reducing the pause time of the video playback until the audio and the video play at the same speed.
[Claim 6] An audio and video synchronization apparatus, characterized in that the apparatus comprises:
a video frame presentation timestamp correction module, configured to correct a presentation timestamp of a current video frame;
an audio frame presentation timestamp correction module, configured to correct a presentation timestamp of a current audio frame according to the corrected presentation timestamp of the current video frame;
an audio frame index correction module, configured to correct a frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame;
a playback control module, configured to control a playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that playback of the audio and the video is synchronized.
[Claim 7] The apparatus according to claim 6, characterized in that the video frame presentation timestamp correction module comprises:
a first reading unit, configured to read a presentation timestamp lastVpts of a previous video frame of the current video frame, a time orgVtime at which a network layer actually received the current video frame, and a time lastVtime at which the network layer actually received the previous video frame of the current video frame;
an obtaining unit, configured to obtain a cumulative video time-jump value sumIntervalTime;
a first calculating unit, configured to calculate the corrected presentation timestamp of the video frame according to the formula adjVpts = lastVpts + (orgVtime − lastVtime) − sumIntervalTime, wherein adjVpts is the corrected presentation timestamp of the current video frame.
[Claim 8] The apparatus according to claim 7, characterized in that the audio frame presentation timestamp correction module comprises:
a second reading unit, configured to read the presentation timestamp lastVpts of the previous video frame of the current video frame, an original presentation timestamp orgVpts of the video frame actually received by the network layer, and an original presentation timestamp orgApts of the audio frame actually received by the network layer;
a second calculating unit, configured to calculate the corrected presentation timestamp of the audio frame according to the formula adjApts = lastVpts + (orgApts − orgVpts), wherein adjApts is the corrected presentation timestamp of the current audio frame.
[Claim 9] The apparatus according to claim 8, characterized in that the audio frame index correction module comprises:
a third reading unit, configured to read a frame index lastAIndex of a previous audio frame of the current audio frame and a corrected presentation timestamp lastadjApts of the previous audio frame of the current audio frame;
a third calculating unit, configured to calculate the corrected frame index of the audio frame according to the formula adjAIndex = lastAIndex + (adjApts − lastadjApts) / 60, wherein adjAIndex is the corrected frame index of the current audio frame.
[Claim 10] The apparatus according to any one of claims 6 to 9, characterized in that the playback control module comprises:
a deceleration unit, configured to increase the pause time of the video playback, until the audio and the video play at the same speed, if it is determined from the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that the video playback speed is faster than the audio playback speed;
an acceleration unit, configured to reduce the pause time of the video playback, until the audio and the video play at the same speed, if it is determined from the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame that the video playback speed is slower than the audio playback speed.
[Claim 11] An audio and video synchronization apparatus, characterized in that the apparatus comprises: a processor, a communications interface, a memory and a bus, wherein the processor, the communications interface and the memory communicate with one another via the bus;
the communications interface is configured to communicate with external devices;
the processor is configured to execute a program;
the memory is configured to store the program;
the program comprises:
correcting a presentation timestamp of a current video frame;
correcting a presentation timestamp of a current audio frame according to the corrected presentation timestamp of the current video frame;
correcting a frame index of the current audio frame according to the corrected presentation timestamp of the current audio frame;
controlling a playback speed of the video according to the corrected presentation timestamp of the current video frame and the corrected presentation timestamp of the current audio frame, so that playback of the audio and the video is synchronized.
PCT/CN2016/102442 2016-10-18 2016-10-18 一种音视频同步方法和装置 WO2018072098A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/102442 WO2018072098A1 (zh) 2016-10-18 2016-10-18 一种音视频同步方法和装置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/102442 WO2018072098A1 (zh) 2016-10-18 2016-10-18 一种音视频同步方法和装置

Publications (1)

Publication Number Publication Date
WO2018072098A1 true WO2018072098A1 (zh) 2018-04-26

Family

ID=62018162

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/102442 WO2018072098A1 (zh) 2016-10-18 2016-10-18 一种音视频同步方法和装置

Country Status (1)

Country Link
WO (1) WO2018072098A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111464256A (zh) * 2020-04-14 2020-07-28 北京百度网讯科技有限公司 时间戳的校正方法、装置、电子设备和存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7313313B2 (en) * 2002-07-25 2007-12-25 Microsoft Corporation Audio/video synchronization with no clean points
CN101303880A (zh) * 2008-06-30 2008-11-12 北京中星微电子有限公司 录制、播放音视频文件的方法及装置
CN102780929A (zh) * 2012-05-31 2012-11-14 新奥特(北京)视频技术有限公司 一种通过处理时码跳变以使视音频同步的方法
CN103945166A (zh) * 2012-11-01 2014-07-23 波利康公司 用于在媒体中继会议中同步音频和视频流的方法和系统
CN105898500A (zh) * 2015-12-22 2016-08-24 乐视云计算有限公司 网络视频播放方法及装置


Similar Documents

Publication Publication Date Title
CN106658133B (zh) 一种音视频同步播放的方法及终端
US20170034263A1 (en) Synchronized Playback of Streamed Audio Content by Multiple Internet-Capable Portable Devices
US11178451B2 (en) Dynamic playout of transition frames while transitioning between playout of media streams
US9832507B2 (en) System and method for synchronizing media output devices connected on a network
EP2752023B1 (en) Method to match input and output timestamps in a video encoder and advertisement inserter
TWI716018B (zh) 動態縮減替換內容之播放以幫助對齊替換內容之結束與已替換內容之結束
WO2017067489A1 (zh) 机顶盒音视频同步的方法及装置、存储介质
US20140376873A1 (en) Video-audio processing device and video-audio processing method
US20130219444A1 (en) Receiving apparatus and subtitle processing method
US9549027B2 (en) Network-synchronized media playback
WO2017107516A1 (zh) 网络视频播放方法及装置
JP5178375B2 (ja) デジタル放送再生装置およびデジタル放送再生方法
EP2538689A1 (en) Adaptive media delay matching
CN110519627B (zh) 一种音频数据的同步方法和装置
JP6809225B2 (ja) 送信装置、送信方法、受信装置および受信方法
JP2023508945A (ja) 無線オーディオのビデオとの同期
WO2018072098A1 (zh) 一种音视频同步方法和装置
CN115767158A (zh) 同步播放方法、终端设备及存储介质
TWI587697B (zh) 多媒體同步系統與方法
CN115914708A (zh) 媒体的音视频同步方法及系统、电子设备
JP6275906B1 (ja) 動画コンテンツを再生するためのプログラム及び方法、並びに、動画コンテンツを配信及び再生するためのシステム
JP2009212696A (ja) データ処理装置、データ処理方法、およびプログラム
WO2023273601A1 (zh) 一种音频同步方法及音频播放设备、音频源、存储介质
CN110463209B (zh) 用于在多媒体系统中发送和接收信号的设备和方法
JP6433321B2 (ja) 同期制御システム、方法及びプログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16919013

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16919013

Country of ref document: EP

Kind code of ref document: A1