CN106612452B - method and device for synchronizing audio and video of set top box - Google Patents


Info

Publication number
CN106612452B
Authority
CN
China
Prior art keywords
frame
video
audio
output
rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510691001.XA
Other languages
Chinese (zh)
Other versions
CN106612452A (en)
Inventor
陈斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen ZTE Microelectronics Technology Co Ltd
Original Assignee
Shenzhen ZTE Microelectronics Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen ZTE Microelectronics Technology Co Ltd filed Critical Shenzhen ZTE Microelectronics Technology Co Ltd
Priority to CN201510691001.XA priority Critical patent/CN106612452B/en
Priority to PCT/CN2016/102775 priority patent/WO2017067489A1/en
Publication of CN106612452A publication Critical patent/CN106612452A/en
Application granted granted Critical
Publication of CN106612452B publication Critical patent/CN106612452B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/434Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8547Content authoring involving timestamps for synchronizing content
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/24Systems for the transmission of television signals using pulse code modulation
    • H04N7/52Systems for transmission of a pulse code modulated video signal with one or more other pulse code modulated signals, e.g. an audio signal or a synchronizing signal
    • H04N7/54Systems for transmission of a pulse code modulated video signal with one or more other pulse code modulated signals, e.g. an audio signal or a synchronizing signal the signals being synchronous

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Television Receiver Circuits (AREA)

Abstract

The embodiment of the invention provides a method for synchronizing audio and video of a set-top box, which comprises the following steps: when it is determined that the first frame of audio and video has been output in the current synchronization queue, acquiring the time stamps of the video frame sequence in the current synchronization queue, and repairing and predicting the time stamps, wherein the first frame of audio and video comprises a first audio frame and a first video frame; calculating an ideal frame rate and a compensation frame rate of the video frame sequence; acquiring a repetition frame rate of the video frame sequence according to the ideal frame rate and the compensation frame rate; and synchronizing and outputting the audio frames and the video frames in the current synchronization queue according to the repetition frame rate. In this way, the error in the acquired time stamps of the audio and video frames can be reduced and the audio and video synchronization effect is improved. The embodiment of the invention also provides a device for synchronizing the audio and video of the set-top box.

Description

Method and device for synchronizing audio and video of set top box
Technical Field
The invention relates to the technical field of intelligent set-top boxes, and in particular to a method and a device for synchronizing audio and video of a set-top box.
Background
With the popularization of broadband technology, digital set-top boxes have become a focus of smart home research. Digital set-top boxes are already widely deployed, so the quality of their audio and video synchronization directly affects the user experience.
Because high-definition video is demanding and Central Processing Unit (CPU) resources are limited, existing set-top boxes usually decode in hardware and must synchronize the decoded audio and video frames before sending them to the display module; that is, they control the frame rate of the frames currently sent for display so that the playback time difference between audio and video stays small. However, with hardware decoding the time stamps acquired after the audio and video frames are decoded contain large errors, so the audio and video synchronization effect is poor.
Disclosure of Invention
In view of this, the embodiments of the present invention are intended to provide a method and an apparatus for synchronizing audio and video of a set-top box, so as to improve the effect of audio and video synchronization.
The technical scheme of the embodiment of the invention is realized as follows:
A set-top box audio and video synchronization method comprises the following steps:
When it is determined that the first frame of audio and video has been output in the current synchronization queue, acquiring the time stamps of the video frame sequence in the current synchronization queue, and repairing and predicting the time stamps; the first frame of audio and video comprises a first audio frame and a first video frame;
Calculating an ideal frame rate and a compensation frame rate of the video frame sequence;
Acquiring a repetition frame rate of the video frame sequence according to the ideal frame rate and the compensation frame rate;
And synchronizing and outputting the audio frames and the video frames in the current synchronization queue according to the repetition frame rate.
In the above method, the calculating of the ideal frame rate and the compensation frame rate of the video frame sequence includes:
Calculating the ideal frame rate of the video frame sequence through a low-frame-rate frame interpolation algorithm and a high-frame-rate frame dropping algorithm according to the refresh rate of the current display module and the time stamps of the video frame sequence;
And calculating the compensation frame rate of the video frame sequence according to the difference between the time stamps of the video frame sequence and the time stamps of the audio frame sequence.
In the above method, the acquiring of the repetition frame rate of the video frame sequence according to the ideal frame rate and the compensation frame rate includes:
Adding the ideal frame rate and the compensation frame rate to obtain the repetition frame rate of the video frame sequence.
In the above method, the synchronizing and outputting of the audio frames and the video frames in the current synchronization queue according to the repetition frame rate includes:
If the repetition frame rate is greater than zero, the current head node of the video post-processing module is input into the current synchronization queue, and the number of repeated frames is equal to the repetition frame rate;
If the repetition frame rate is less than or equal to zero, discarding the current video frame, and synchronously outputting the audio frames and the video frames other than the current video frame in the current synchronization queue;
If the repetition frame rate is equal to the ideal frame rate, synchronously outputting the audio frames and the video frames in the current synchronization queue at the ideal frame rate;
And if the repetition frame rate is greater than the ideal frame rate, inserting repeated frames amounting to the compensation frame rate into the video frames so as to synchronously output the audio frames and the video frames in the current synchronization queue.
In the above method, before determining that the first frame of audio and video has been output in the current synchronization queue, the method further includes:
Judging whether the first frame of audio and video has been output in the current synchronization queue according to a first-frame audio/video output identification bit;
If the first-frame audio/video output identification bit is "no", determining that the first frame of audio and video has not been output;
If the first-frame audio/video output identification bit is "yes", determining that the first frame of audio and video has been output.
In the above method, the method further comprises:
When it is determined that the first frame of audio and video has not been output in the current synchronization queue, outputting the first frame of audio and video according to a preset first-frame audio/video output synchronization scheme; the preset first-frame audio/video output synchronization scheme is a slow synchronization scheme or a fast synchronization scheme;
If the preset first-frame audio/video output synchronization scheme is the slow synchronization scheme, outputting the first video frame first, and then synchronously outputting the other video frames and audio frames in the synchronization queue;
And if the preset first-frame audio/video output synchronization scheme is the fast synchronization scheme, outputting the first audio frame and the first video frame simultaneously.
In the above method, the outputting of the first audio frame and the first video frame simultaneously includes:
If it is detected that the first audio frame has not arrived, the first video frame waits and is not output;
If it is detected that the first audio frame has arrived, judging whether the first audio frame has been output;
If the first audio frame has not been output and the difference between the time stamp of the first audio frame and the time stamp of the first video frame is within the preset time difference range, outputting the first frame of audio and video and setting the first-frame audio/video output identification bit to "no";
If the first audio frame has not been output and the difference between the time stamp of the first audio frame and the time stamp of the first video frame is not within the preset time difference range, the first video frame waits when it is faster than the first audio frame, and the first video frame is discarded when it is slower than the first audio frame;
And if the first audio frame has been output, outputting the first video frame and setting the first-frame audio/video output identification bit to "no".
In the above method, before the judging of whether the first frame of audio and video has been output in the current synchronization queue according to the first-frame audio/video output identification bit, the method further includes:
Judging whether the number of input frames in the current synchronization queue meets a preset number of frames, and judging whether the display queue overflows;
And when the number of input frames in the current synchronization queue meets the preset number of frames and the display queue does not overflow, executing the operation of judging whether the first frame of audio and video has been output in the current synchronization queue.
A set-top box audio-video synchronization apparatus, the apparatus comprising:
The determining module is used for determining that the first frame of audio and video is output in the current synchronization queue;
The acquisition module is used for acquiring the time stamp of the video frame sequence in the current synchronization queue and repairing and predicting the time stamp when the determination module determines that the first frame of audio and video is output in the current synchronization queue; the first frame of audio and video comprises a first frame of audio frame and a first frame of video frame;
The computing module is used for computing the ideal frame rate and the compensation frame rate of the video frame sequence;
The obtaining module is used for obtaining the repeated frame rate of the video frame sequence according to the ideal frame rate and the compensation frame rate; and according to the repetition frame rate, carrying out synchronous processing on the audio frame and the video frame in the current synchronous queue and outputting the audio frame and the video frame.
In the above apparatus, the calculation module is specifically configured to: calculate the ideal frame rate of the video frame sequence through a low-frame-rate frame interpolation algorithm and a high-frame-rate frame dropping algorithm according to the refresh rate of the current display module and the time stamps of the video frame sequence; and calculate the compensation frame rate of the video frame sequence according to the difference between the time stamps of the video frame sequence and the time stamps of the audio frame sequence.
In the above apparatus, the obtaining module is specifically configured to:
Add the ideal frame rate and the compensation frame rate to obtain the repetition frame rate of the video frame sequence.
In the above apparatus, the obtaining module is specifically configured to:
If the repetition frame rate is greater than zero, the current head node of the video post-processing module is input into the current synchronization queue, and the number of repeated frames is equal to the repetition frame rate;
If the repetition frame rate is less than or equal to zero, discard the current video frame, and synchronously output the audio frames and the video frames other than the current video frame in the current synchronization queue;
If the repetition frame rate is equal to the ideal frame rate, synchronously output the audio frames and the video frames in the current synchronization queue at the ideal frame rate;
And if the repetition frame rate is greater than the ideal frame rate, insert repeated frames amounting to the compensation frame rate into the video frames so as to synchronously output the audio frames and the video frames in the current synchronization queue.
In the above apparatus, the apparatus further comprises: a judging module, used for judging whether the first frame of audio and video has been output in the current synchronization queue according to a first-frame audio/video output identification bit; if the first-frame audio/video output identification bit is "no", it is determined that the first frame of audio and video has not been output; if the first-frame audio/video output identification bit is "yes", it is determined that the first frame of audio and video has been output.
In the above apparatus, the obtaining module is further configured to: when the determining module determines that the first frame of audio and video has not been output in the current synchronization queue, output the first frame of audio and video according to a preset first-frame audio/video output synchronization scheme; the preset first-frame audio/video output synchronization scheme is a slow synchronization scheme or a fast synchronization scheme;
If the preset first frame audio and video output synchronization scheme is a slow synchronization scheme, the acquisition module outputs the first frame video frame first and then synchronously outputs other video frames and audio frames in the synchronization queue;
If the preset first frame audio/video output synchronization scheme is a fast synchronization scheme, the acquisition module simultaneously outputs the first frame audio frame and the first frame video frame.
In the above apparatus, the obtaining module is specifically configured to:
If it is detected that the first audio frame has not arrived, the first video frame waits and is not output;
If it is detected that the first audio frame has arrived, judge whether the first audio frame has been output;
If the first audio frame has not been output and the difference between the time stamp of the first audio frame and the time stamp of the first video frame is within the preset time difference range, output the first frame of audio and video and set the first-frame audio/video output identification bit to "no";
If the first audio frame has not been output and the difference between the time stamp of the first audio frame and the time stamp of the first video frame is not within the preset time difference range, the first video frame waits when it is faster than the first audio frame, and the first video frame is discarded when it is slower than the first audio frame;
And if the first audio frame has been output, output the first video frame and set the first-frame audio/video output identification bit to "no".
In the above apparatus, the determining module is further configured to:
Judge whether the number of input frames in the current synchronization queue meets a preset number of frames, and judge whether the display queue overflows;
When the judging module judges that the number of input frames in the current synchronization queue meets the preset number of frames and the display queue does not overflow, the processing module executes the operation of judging whether the first frame of audio and video has been output in the current synchronization queue.
According to the method and the device for synchronizing the audio and video of the set top box, provided by the embodiment of the invention, the time stamp of the video frame sequence in the current synchronization queue is obtained by determining that the first frame of audio and video is output in the current synchronization queue, and the time stamp is repaired and predicted; the first frame of audio and video comprises a first frame of audio frame and a first frame of video frame; calculating an ideal frame rate and a compensation frame rate of the video frame sequence; acquiring a repeated frame rate of the video frame sequence according to the ideal frame rate and the compensation frame rate; according to the repeated frame rate, synchronously processing and outputting the audio frame and the video frame in the current synchronous queue; therefore, the error of obtaining the time stamp of the audio and video frame can be reduced, and the synchronization effect of the audio and video is improved.
Drawings
Fig. 1 is a flowchart of a method for synchronizing audio and video of a set-top box according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a set-top box audio and video synchronization device provided in an embodiment of the present invention.
Detailed Description
In each embodiment of the invention, the time stamps of the audio frames serve as the time reference. During synchronization, the frame rate of the video frames is adjusted mainly according to the difference between the audio and video time stamps, so that the audio frame is waited for or discarded and the video frames and the audio frames are synchronized. This reduces the error in the acquired time stamps of the audio and video frames and improves the audio and video synchronization effect.
Fig. 1 is a flowchart of a method for synchronizing audio and video of a set-top box according to an embodiment of the present invention. As shown in Fig. 1, the method for synchronizing audio and video of a set-top box provided by this embodiment may specifically include:
Step 101, when it is determined that the first frame of audio and video has been output in the current synchronization queue, acquiring the time stamps of the video frame sequence in the current synchronization queue, and repairing and predicting the time stamps; the first frame of audio and video comprises a first audio frame and a first video frame.
If the current synchronization queue outputs the first frame of audio and video, namely the first frame of audio and video is already output, the time stamp of the video frame sequence needs to be repaired and predicted.
Specifically, during playback a normal pts (presentation time stamp) sequence increases monotonically, i.e. pts1 < pts2 < pts3 < pts4; any pts sequence that does not satisfy this condition is judged to be abnormal. Three abnormal forms are described in this embodiment.
Abnormal form 1: pts1 < pts2 and pts2 > pts4 > pts3. This occurs when a live channel loops (the source is broadcast cyclically and the splice point joins the tail to the head). In this case some internal variables of the synchronization module need to be cleared to indicate that playback has restarted, and pts does not need to be repaired.
Abnormal form 2: pts1 < pts2 < pts3 and pts3 > pts4 > pts2. In this form pts3 needs to be repaired as pts3 = pts2 + delay, where delay represents the ideal pts difference between frames.
Abnormal form 3: pts3 < pts1 < pts2 < pts4. In this form pts3 also needs to be repaired as pts3 = pts2 + delay.
The ideal delay is obtained by collecting the distribution of pts differences between consecutive frames over the first 100 video frames and taking the difference with the highest probability.
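The repair rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function names estimate_ideal_delay and repair_pts, the returned labels, and the integer pts units are all hypothetical.

```python
from collections import Counter


def estimate_ideal_delay(pts_list, window=100):
    """Return the most frequent gap between consecutive pts values.

    Only the first `window` frames are inspected, mirroring the
    "first 100 frames" statistic described in the text.
    """
    head = pts_list[:window]
    diffs = [b - a for a, b in zip(head, head[1:])]
    if not diffs:
        raise ValueError("need at least two timestamps")
    return Counter(diffs).most_common(1)[0][0]


def repair_pts(pts1, pts2, pts3, pts4, delay):
    """Classify four consecutive pts values and repair pts3 if required.

    Returns (form, pts3_out, reset). `reset` is True when the internal
    state of the synchronization module should be cleared (form 1).
    The labels are hypothetical names chosen for this sketch.
    """
    if pts1 < pts2 < pts3 < pts4:
        return "normal", pts3, False
    if pts1 < pts2 and pts2 > pts4 > pts3:
        return "form1", pts3, True           # stream looped: reset, no repair
    if pts1 < pts2 < pts3 and pts3 > pts4 > pts2:
        return "form2", pts2 + delay, False  # pts3 jumped too far ahead
    if pts3 < pts1 < pts2 < pts4:
        return "form3", pts2 + delay, False  # pts3 fell behind
    return "other", pts3, False              # not covered by the text
```

For instance, with a delay of 40, the sequence (0, 40, 500, 120) matches abnormal form 2 and pts3 is repaired to 80.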
For pts prediction: if interlaced video is de-interlaced, the frame rate is doubled; that is, after interpolation each original field becomes a frame, so each original frame becomes two frames. One of the two frames uses the pts of the original frame, and the other uses the mean of the pts of the current original frame and the next original frame as its predicted pts.
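A short Python sketch of this prediction rule is given below. The function name predict_deinterlaced_pts is hypothetical, pts values are assumed to be integers, and the handling of the final frame (which has no successor to average with) is an assumption, since the text does not cover it.

```python
def predict_deinterlaced_pts(frame_pts):
    """Expand per-frame pts values into per-output-frame pts values.

    Each original frame yields two output frames: the first keeps the
    original pts, the second gets the mean of the current and next pts.
    """
    out = []
    for cur, nxt in zip(frame_pts, frame_pts[1:]):
        out.append(cur)
        out.append((cur + nxt) // 2)  # predicted pts for the second frame
    if frame_pts:
        # Assumption: the last frame has no successor, so only its own
        # pts is emitted here.
        out.append(frame_pts[-1])
    return out
```

With original pts values 0, 3600 and 7200, the sketch returns 0, 1800, 3600, 5400, 7200.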
In this embodiment, before determining that the first frame of audio and video has been output in the current synchronization queue, it is further necessary to judge whether the first frame of audio and video has been output in the current synchronization queue according to a first-frame audio/video output identification bit; if the identification bit is "no", it is determined that the first frame of audio and video has not been output; if the identification bit is "yes", it is determined that the first frame of audio and video has been output.
For example, the first-frame audio/video output identification bit may be set to 1 to indicate that the first frame of audio and video has been output and to 0 to indicate that it has not; alternatively, 0 may indicate that it has been output and 1 that it has not. This embodiment does not specifically limit this.
Before judging whether the first frame of audio and video has been output in the current synchronization queue according to the first-frame audio/video output identification bit, it is judged whether the number of input frames in the current synchronization queue meets a preset number of frames and whether the display queue overflows; when the number of input frames in the current synchronization queue meets the preset number of frames and the display queue does not overflow, the operation of judging whether the first frame of audio and video has been output in the current synchronization queue is executed.
As for whether the number of input frames in the current synchronization queue meets the preset number of frames: in this embodiment, when an interlaced source is output progressively, the required number of frames is greater than or equal to 3; otherwise, the required number of frames is greater than or equal to 2.
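The precondition check described above can be expressed as a small Python sketch. The function name ready_to_sync and its parameters are hypothetical and only illustrate the stated thresholds.

```python
def ready_to_sync(input_frames, interlaced_to_progressive, display_overflow):
    """Return True when the first-frame check may proceed.

    The threshold is 3 frames for an interlaced source with progressive
    output and 2 frames otherwise, as stated in the text.
    """
    required = 3 if interlaced_to_progressive else 2
    return input_frames >= required and not display_overflow
```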
Further, when it is determined that the first frame of audio/video is not output in the current synchronization queue, the first frame of audio/video is output according to a preset first frame of audio/video output synchronization scheme, and the processing flow is ended.
It should be noted that the preset first-frame audio/video output synchronization scheme is set in advance according to the specific requirements of the customer; when it is determined that the first frame of audio and video has not been output in the current synchronization queue, the first frame of audio and video is output according to this preset scheme.
Specifically, the preset first-frame audio/video output synchronization scheme is a slow synchronization scheme or a fast synchronization scheme. If it is the slow synchronization scheme, the first video frame is output first, and then the other video frames and audio frames in the synchronization queue are output synchronously; that is, the slow synchronization scheme requires the first video frame to be output immediately, after which the video frames and audio frames are synchronized by the synchronization scheme. If it is the fast synchronization scheme, the first audio frame and the first video frame are output simultaneously; that is, the fast synchronization scheme requires that audio and video be output at the same time and already be synchronized when output.
In the fast synchronization scheme, the outputting of the first audio frame and the first video frame simultaneously may specifically include: if it is detected that the first audio frame has not arrived, the first video frame waits and is not output; if it is detected that the first audio frame has arrived, judging whether the first audio frame has been output; if the first audio frame has not been output and the difference between the time stamp of the first audio frame and the time stamp of the first video frame is within the preset time difference range, outputting the first frame of audio and video and setting the first-frame audio/video output identification bit to "no"; if the first audio frame has not been output and the difference between the two time stamps is not within the preset time difference range, the first video frame waits when it is faster than the first audio frame and is discarded when it is slower than the first audio frame; and if the first audio frame has been output, outputting the first video frame and setting the first-frame audio/video output identification bit to "no".
It should be noted that if the audio and video remain unsynchronized, or the audio frame has not arrived, for more than 1.5 s, the first video frame is forcibly output.
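The fast synchronization decision for the first frame can be sketched as a pure function in Python. This is only an illustration under assumptions: the Action names, the function signature, and the reading of "faster" as "the video time stamp is ahead of the audio time stamp" are not taken from the source, and the handling of the identification bit is left out.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"
    OUTPUT_BOTH = "output_both"
    OUTPUT_VIDEO = "output_video"
    DROP_VIDEO = "drop_video"


def fast_sync_first_frame(audio_pts, audio_output, video_pts,
                          max_diff, waited_s, timeout_s=1.5):
    """Decide what to do with the first video frame (fast scheme).

    audio_pts is None while the first audio frame has not arrived.
    waited_s is how long the first video frame has been held back.
    """
    if waited_s > timeout_s:
        return Action.OUTPUT_VIDEO   # forced output after the 1.5 s limit
    if audio_pts is None:
        return Action.WAIT           # audio not here yet: hold the video
    if audio_output:
        return Action.OUTPUT_VIDEO   # audio already playing
    diff = video_pts - audio_pts
    if abs(diff) <= max_diff:
        return Action.OUTPUT_BOTH    # within the preset time difference
    if diff > 0:
        return Action.WAIT           # assumption: video ahead means "faster"
    return Action.DROP_VIDEO         # video behind the audio: discard it
```

Keeping the decision free of side effects makes each branch of the description easy to check in isolation; a real player would call it once per polling cycle and act on the returned value.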
Step 102, calculating an ideal frame rate and a compensation frame rate of the video frame sequence.
In the step, according to the refresh rate of the current display module and the time stamps of the video frame sequence, calculating the ideal frame rate ideal _ repeat of the video frame sequence through a low frame rate frame interpolation algorithm and a high frame rate frame dropping algorithm; calculating a compensated frame rate extra _ repeat of the sequence of video frames from a difference between timestamps of the sequence of video frames and timestamps of the sequence of audio frames.
specifically, the high frame rate frame dropping algorithm adopted in this embodiment is as follows: three frames are lost in two periods, taking the display module delay of 20ms as an example: acquiring three frames pts sequentially coming out of a synchronous input queue (video post-processing), and recording the three frames pts as pts0, pts1 and pts2, wherein the three frames pts respectively correspond to a tail frame pts of a synchronous output queue, a head frame pts of the synchronous input queue and a second frame pts of the synchronous input queue; if (pts2-pts0) >2 × 20ms, preparing to drop the video frame corresponding to pts 1; otherwise, the low frame rate chip source is considered, and the algorithm is exited; if the number of frames of the display queue medium pressure is not enough N frames (experience value), the video frame is not allowed to be dropped, the effect of audio waiting for the video is realized by inserting a small amount of audio data at the moment, otherwise, the current video frame is dropped.
Specifically, the low frame rate frame interpolation algorithm adopted in this embodiment must ensure that the rate at which frames are sent for display does not exceed the refresh rate of the display module, and it inserts repeated frames so that each valid frame stays on the screen for an appropriate time.
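One possible way to obtain a per-frame repeat count that respects the display refresh rate is to count how many display refreshes fall within the interval during which a frame is valid. The sketch below uses this refresh-counting technique as a stand-in; it is not the low frame rate interpolation algorithm or the high frame rate dropping algorithm of the embodiment. The function names and the use of integer timestamp ticks are assumptions.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for b > 0."""
    return -(-a // b)


def refreshes_in_frame(pts: int, next_pts: int, period: int) -> int:
    """Count display refreshes at multiples of `period` within [pts, next_pts).

    All arguments are integer ticks in the same unit (for example ms or
    90 kHz PTS ticks). A result of 0 means the frame would never reach the
    screen (a dropped frame); a result above 1 means the frame is repeated.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    return ceil_div(next_pts, period) - ceil_div(pts, period)
```

With a 20 ms refresh period, a 25 fps source (40 ms per frame) gives 2 refreshes per frame, while a 100 fps source (10 ms per frame) alternates between 1 and 0, so every second frame is dropped. In 90 kHz ticks, a 24 fps source (3750 ticks) on a 60 Hz display (1500 ticks) alternates between 3 and 2.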
Because the ideal frame rate only accounts for the refresh rate of the display module, so that the video can be decoded and played smoothly, and does not consider synchronization between the audio and the video, the video is synchronized to the audio by inserting or dropping video frames according to the playback speed of the audio; this adjustment is called the compensation frame rate extra_repeat.
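As a hedged sketch, a compensation count could be derived by dividing the audio/video timestamp difference by the display refresh period. The embodiment only states that the compensation value is calculated from the timestamp difference; the truncating division, the sign convention (positive when the video timestamp is ahead of the audio timestamp), the 20 ms default period and the function name are assumptions.

```python
def compensation_repeat(video_pts_ms: int, audio_pts_ms: int, period_ms: int = 20) -> int:
    """Whole display periods by which video leads (+) or lags (-) audio."""
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    diff_ms = video_pts_ms - audio_pts_ms
    # int() truncates toward zero, so offsets smaller than one period
    # produce no compensation in either direction.
    return int(diff_ms / period_ms)
```

A positive result corresponds to inserting repeated frames so the audio can catch up, and a negative result corresponds to dropping frames so the video can catch up.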
Step 103, obtaining a repetition frame rate of the video frame sequence according to the ideal frame rate and the compensation frame rate.
The ideal frame rate and the compensation frame rate are added to obtain the repetition frame rate all_repeat of the video frame sequence; specifically, all_repeat = ideal_repeat + extra_repeat.
Step 104, synchronizing and outputting the audio frames and the video frames in the current synchronization queue according to the repetition frame rate.
If the repetition frame rate is greater than zero, the current head node of the video post-processing module is put into the current synchronization queue, with the number of repeated frames equal to the repetition frame rate; if the repetition frame rate is less than or equal to zero, the current video frame is discarded, and the audio frames and the video frames other than the current video frame in the current synchronization queue are output synchronously; if the repetition frame rate is equal to the ideal frame rate, the audio frames and the video frames in the current synchronization queue are output synchronously at the ideal frame rate; and if the repetition frame rate is greater than the ideal frame rate, repeated frames numbering the compensation frame rate are inserted into the video frames so that the audio frames and the video frames in the current synchronization queue are output synchronously.
If all_repeat > 0, the current head node of the video post-processing module is put into the synchronization queue, and the number of repeated frames is all_repeat. If all_repeat = ideal_repeat, the video is played at the ideal frame rate. If all_repeat > ideal_repeat, the video is faster than the audio, and extra_repeat repeated frames beyond the ideal frame rate need to be inserted as compensation so that the audio and the video can be synchronized. If all_repeat < 0, the video is slower than the audio and the current frame needs to be dropped. Multiple repeated frames can be inserted for the current frame, but at most one frame can be dropped.
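The branching on the repetition value described above can be sketched as a small Python function. The function name `plan_frame` and the string labels are hypothetical; the `<= 0` drop test follows the "less than or equal to zero" wording of the method steps, and the `"reduced"` label covers the case of a positive value below the ideal one, which the text does not describe and is therefore an assumption.

```python
from typing import Tuple


def plan_frame(ideal_repeat: int, extra_repeat: int) -> Tuple[int, str]:
    """Return (number of queue entries for the current frame, reason)."""
    all_repeat = ideal_repeat + extra_repeat
    if all_repeat <= 0:
        # Video is behind audio: the current frame is dropped.
        return 0, "drop"
    if all_repeat == ideal_repeat:
        # No compensation needed: play at the ideal rate.
        return all_repeat, "ideal"
    if all_repeat > ideal_repeat:
        # Video is ahead of audio: extra repeated frames are inserted.
        return all_repeat, "compensate"
    # Assumption: fewer repeats than ideal, but the frame is still shown.
    return all_repeat, "reduced"
```

Returning the count and the reason together keeps the queueing code separate from the decision, which makes the limits applied in practice easier to layer on top.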
In practical applications, to prevent the picture from freezing, when extra_repeat is less than 0 (that is, a frame drop is indicated) but the display queue of the display module holds insufficient data, extra_repeat is set to 1; to prevent picture stuttering, at most 3 consecutive frames are dropped; to prevent excessive repeated frames, at most 3 repeated frames are inserted into the synchronization queue for one frame node; and to ensure smooth playback of the first few frames, extra_repeat inserts at most 1 repeated frame of compensation for each of the first few frames after playback starts.
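These practical limits can be sketched as a stateful limiter in Python. The constants 3, 3 and 1 come from the text; the value `STARTUP_FRAMES = 5` for "the first few frames", the choice to cap the total queue entries per node at 3, the choice to show a frame once when a fourth consecutive drop is requested, and all names are assumptions made for illustration.

```python
from dataclasses import dataclass

MAX_CONSECUTIVE_DROPS = 3   # from the text: at most 3 frames dropped in a row
MAX_ENTRIES_PER_NODE = 3    # from the text: at most 3 repeated frames per node
MAX_STARTUP_EXTRA = 1       # from the text: at most 1 compensation frame at start
STARTUP_FRAMES = 5          # hypothetical size of "the first few frames"


@dataclass
class LimiterState:
    consecutive_drops: int = 0
    frames_seen: int = 0


def limited_repeat(ideal_repeat: int, extra_repeat: int,
                   display_queue_low: bool, state: LimiterState) -> int:
    """Return the number of queue entries for the current frame (0 = drop)."""
    # Avoid freezing: do not drop while the display queue is short of data.
    if extra_repeat < 0 and display_queue_low:
        extra_repeat = 1
    # Keep the start of playback smooth.
    if state.frames_seen < STARTUP_FRAMES:
        extra_repeat = min(extra_repeat, MAX_STARTUP_EXTRA)
    state.frames_seen += 1

    all_repeat = ideal_repeat + extra_repeat
    if all_repeat <= 0:
        if state.consecutive_drops < MAX_CONSECUTIVE_DROPS:
            state.consecutive_drops += 1
            return 0
        # Assumption: after 3 consecutive drops, show the frame once.
        all_repeat = 1
    state.consecutive_drops = 0
    return min(all_repeat, MAX_ENTRIES_PER_NODE)
```

Keeping the counters in an explicit state object means one limiter instance can be kept per playback session and reset on a channel change.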
According to the technical scheme of this embodiment, errors in the acquired timestamps of the audio and video frames can be reduced, and the audio and video synchronization effect is improved.
Fig. 2 is a schematic structural diagram of a set-top box audio and video synchronization device provided in an embodiment of the present invention. As shown in Fig. 2, the apparatus provided in this embodiment may include:
The determining module 11 is configured to determine that the first frame of audio and video is output in the current synchronization queue;
An obtaining module 12, configured to obtain the timestamps of the video frame sequence in the current synchronization queue when the determining module determines that the first frame of audio and video has been output in the current synchronization queue, and to repair and predict the timestamps; the first frame of audio and video comprises a first audio frame and a first video frame;
A calculating module 13, configured to calculate an ideal frame rate and a compensation frame rate of the video frame sequence;
The obtaining module 12 is configured to obtain a repetition frame rate of the video frame sequence according to the ideal frame rate and the compensation frame rate; and according to the repetition frame rate, carrying out synchronous processing on the audio frame and the video frame in the current synchronous queue and outputting the audio frame and the video frame.
The calculating module 13 is specifically configured to: calculate the ideal frame rate of the video frame sequence through a low frame rate frame interpolation algorithm and a high frame rate frame dropping algorithm according to the refresh rate of the current display module and the timestamps of the video frame sequence; and calculate the compensation frame rate of the video frame sequence according to the difference between the timestamps of the video frame sequence and the timestamps of the audio frame sequence.
The obtaining module 12 is specifically configured to: add the ideal frame rate and the compensation frame rate to obtain the repetition frame rate of the video frame sequence; if the repetition frame rate is greater than zero, put the current head node of the video post-processing module into the current synchronization queue, with the number of repeated frames equal to the repetition frame rate; if the repetition frame rate is less than or equal to zero, discard the current video frame and synchronously output the audio frames and the video frames other than the current video frame in the current synchronization queue; if the repetition frame rate is equal to the ideal frame rate, synchronously output the audio frames and the video frames in the current synchronization queue at the ideal frame rate; and if the repetition frame rate is greater than the ideal frame rate, insert repeated frames numbering the compensation frame rate into the video frames so that the audio frames and the video frames in the current synchronization queue are output synchronously.
Furthermore, the device may also comprise a judging module, configured to judge whether the first frame of audio and video has been output in the current synchronization queue according to a first-frame audio/video output identification bit; if the first-frame audio/video output identification bit is no, judging whether the output is the first frame of audio and video; if the first-frame audio/video output identification bit is yes, judging that the first frame of audio and video has been output.
Further, the obtaining module 12 may be further configured to: when the determining module 11 determines that the first frame of audio and video has not been output in the current synchronization queue, output the first frame of audio and video according to a preset first-frame audio/video output synchronization scheme; the preset first-frame audio/video output synchronization scheme is a slow synchronization scheme or a fast synchronization scheme; if the preset scheme is the slow synchronization scheme, the obtaining module 12 outputs the first video frame first, and then synchronously outputs the other video frames and audio frames in the synchronization queue; if the preset scheme is the fast synchronization scheme, the obtaining module 12 outputs the first audio frame and the first video frame at the same time.
Specifically, the obtaining module 12 may be configured as follows: if the first audio frame is detected not to have arrived, the first video frame waits and is not output; if the first audio frame is detected to have arrived, judging whether the first audio frame has been output; if the first audio frame has not been output and the difference between the timestamp of the first audio frame and the timestamp of the first video frame is within the preset time difference range, outputting the first audio and video frames and setting the first-frame audio/video output identification bit to output; if the first audio frame has not been output and the difference between the two timestamps is not within the preset time difference range, the first video frame waits when it is faster than the first audio frame, and the first video frame is discarded when it is slower than the first audio frame; and if the first audio frame has been output, outputting the first video frame and setting the first-frame audio/video output identification bit to output.
The judging module may be further configured to: judge whether the number of input frames in the current synchronization queue meets a preset number of frames, and judge whether the display queue overflows; when the judging module judges that the number of input frames in the current synchronization queue meets the preset number of frames and the display queue does not overflow, the processing module executes the operation of judging whether the first frame of audio and video in the current synchronization queue has been output.
The set-top box audio and video synchronization apparatus of this embodiment may be configured to execute the technical solution of the method embodiment shown in fig. 1, and the implementation principle and the technical effect are similar, which are not described herein again.
In practical applications, the determining module 11, the obtaining module 12 and the calculating module 13 may be implemented by a Central Processing Unit (CPU), a Microprocessor (MPU), a Digital Signal Processor (DSP) or a Field Programmable Gate Array (FPGA) on the user terminal.
as will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
these computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims (10)

1. A set-top box audio and video synchronization method, characterized by comprising the following steps:
when determining that a first frame of audio and video is output in a current synchronization queue, acquiring a time stamp of a video frame sequence in the current synchronization queue, and repairing and predicting the time stamp; the first frame of audio and video comprises a first frame of audio frame and a first frame of video frame;
calculating the ideal frame rate of the video frame sequence through a low frame rate frame interpolation algorithm and a high frame rate frame dropping algorithm according to the refresh rate of the current display module and the time stamps of the video frame sequence;
Calculating a compensated frame rate of the sequence of video frames according to a difference between the time stamps of the sequence of video frames and the time stamps of the sequence of audio frames;
Adding the ideal frame rate and the compensation frame rate to obtain a repeated frame rate of the video frame sequence;
if the repetition frame rate is greater than zero, the current head node of the video post-processing module is input into the current synchronization queue, and the number of the repetition frames is equal to the repetition frame rate;
or if the repetition frame rate is less than or equal to zero, discarding the current video frame, and synchronously outputting the audio frame and the video frames except the current video frame in the current synchronous queue;
or, if the repetition frame rate is equal to the ideal frame rate, synchronously outputting the audio frames and the video frames in the current synchronization queue at the ideal frame rate;
Or, if the repetition frame rate is greater than the ideal frame rate, inserting the repetition frames with the compensated frame rate into the video frame, so that the audio frame and the video frame in the current synchronization queue are synchronously output.
2. The method of claim 1, wherein before determining that the first frame of audio and video is output in the current synchronization queue, the method further comprises:
judging whether a first frame of audio and video is output in the current synchronization queue or not according to a first frame of audio and video output identification bit;
If the first-frame audio/video output identification bit is no, judging whether the output is the first frame of audio and video;
If the first-frame audio/video output identification bit is yes, judging that the first frame of audio and video has been output.
3. The method of claim 1, further comprising:
When the first frame of audio and video is determined not to be output in the current synchronization queue, outputting the first frame of audio and video according to a preset first frame of audio and video output synchronization scheme; the preset first frame audio/video output synchronization scheme is a slow synchronization scheme or a fast synchronization scheme;
If the preset first frame audio and video output synchronization scheme is a slow synchronization scheme, outputting the first frame video frame, and then synchronously outputting other video frames and audio frames in the synchronization queue;
and if the preset first frame audio and video output synchronization scheme is a fast synchronization scheme, simultaneously outputting the first frame audio frame and the first frame video frame.
4. The method of claim 3, wherein said outputting the first frame of audio frames and the first frame of video frames simultaneously comprises:
if the first frame audio frame is detected not to arrive, the first frame video frame waits and is not output;
If the first frame of audio frame is detected to arrive, judging whether the first frame of audio frame is output or not;
if the first frame audio frame is not output and the difference value between the time stamp of the first frame audio frame and the time stamp of the first frame video frame is within the preset time difference range, outputting the first frame audio and video and setting the first frame audio/video output identification bit to no;
if the first frame audio frame is not output and the difference value between the time stamp of the first frame audio frame and the time stamp of the first frame video frame is not within the preset time difference range, waiting for the first frame video frame when the first frame video frame is faster than the first frame audio frame, and discarding the first frame video frame when the first frame video frame is slower than the first frame audio frame;
And if the first frame audio frame is output, outputting the first frame video frame, and setting the first frame audio/video output identification bit to no.
5. The method according to claim 2, wherein before the judging whether the first frame of audio and video is output in the current synchronization queue according to the first frame audio/video output identification bit, the method further comprises:
Judging whether the input frame number in the current synchronization queue meets a preset frame number or not, and judging whether a display queue overflows or not;
And when the input frame number in the current synchronization queue meets the preset frame number and the display queue does not overflow, executing the operation of judging whether the first frame of audio and video in the current synchronization queue is output or not.
6. A set-top box audio and video synchronization device is characterized by comprising:
the determining module is used for determining that the first frame of audio and video is output in the current synchronization queue;
the acquisition module is used for acquiring the time stamp of the video frame sequence in the current synchronization queue and repairing and predicting the time stamp when the determination module determines that the first frame of audio and video is output in the current synchronization queue; the first frame of audio and video comprises a first frame of audio frame and a first frame of video frame;
The computing module is used for computing the ideal frame rate of the video frame sequence through a low frame rate frame insertion algorithm and a high frame rate frame dropping algorithm according to the refresh rate of the current display module and the time stamps of the video frame sequence; calculating a compensated frame rate of the sequence of video frames according to a difference between the time stamps of the sequence of video frames and the time stamps of the sequence of audio frames;
the obtaining module is used for adding the ideal frame rate and the compensation frame rate to obtain a repeated frame rate of the video frame sequence; if the repetition frame rate is greater than zero, the current head node of the video post-processing module is input into the current synchronization queue, and the number of the repetition frames is equal to the repetition frame rate; or if the repetition frame rate is less than or equal to zero, discarding the current video frame, and synchronously outputting the audio frame and the video frames except the current video frame in the current synchronous queue; or, if the repetition frame rate is equal to the ideal frame rate, synchronously outputting the audio frames and the video frames in the current synchronization queue at the ideal frame rate; or, if the repetition frame rate is greater than the ideal frame rate, inserting the repetition frames with the compensated frame rate into the video frame, so that the audio frame and the video frame in the current synchronization queue are synchronously output.
7. The apparatus of claim 6, further comprising: a judging module, configured to judge whether a first frame of audio and video is output in the current synchronization queue according to a first frame audio/video output identification bit; if the first frame audio/video output identification bit is no, judging whether the output is the first frame of audio and video; if the first frame audio/video output identification bit is yes, judging that the first frame of audio and video has been output.
8. The apparatus of claim 6, wherein the obtaining module is further configured to: when the determining module determines that the first frame of audio and video is not output in the current synchronization queue, outputting the first frame of audio and video according to a preset first frame audio and video output synchronization scheme; the preset first frame audio/video output synchronization scheme is a slow synchronization scheme or a fast synchronization scheme;
if the preset first frame audio and video output synchronization scheme is a slow synchronization scheme, the acquisition module outputs the first frame video frame first and then synchronously outputs other video frames and audio frames in the synchronization queue;
if the preset first frame audio/video output synchronization scheme is a fast synchronization scheme, the acquisition module simultaneously outputs the first frame audio frame and the first frame video frame.
9. The apparatus of claim 8, wherein the obtaining module is specifically configured to:
if the first frame audio frame is detected not to arrive, the first frame video frame waits and is not output;
If the first frame of audio frame is detected to arrive, judging whether the first frame of audio frame is output or not;
if the first frame audio frame is not output and the difference value between the time stamp of the first frame audio frame and the time stamp of the first frame video frame is within the preset time difference range, outputting the first frame audio and video and setting the first frame audio/video output identification bit to no;
If the first frame audio frame is not output and the difference value between the time stamp of the first frame audio frame and the time stamp of the first frame video frame is not within the preset time difference range, waiting for the first frame video frame when the first frame video frame is faster than the first frame audio frame, and discarding the first frame video frame when the first frame video frame is slower than the first frame audio frame;
And if the first frame audio frame is output, outputting the first frame video frame, and setting the first frame audio/video output identification bit to no.
10. The apparatus of claim 7, wherein the judging module is further configured to:
judging whether the input frame number in the current synchronization queue meets a preset frame number or not, and judging whether a display queue overflows or not;
When the judging module judges that the input frame number in the current synchronous queue meets the preset frame number and the display queue does not overflow, the processing module executes the operation of judging whether the first frame of audio and video in the current synchronous queue is output or not.
CN201510691001.XA 2015-10-22 2015-10-22 method and device for synchronizing audio and video of set top box Active CN106612452B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201510691001.XA CN106612452B (en) 2015-10-22 2015-10-22 method and device for synchronizing audio and video of set top box
PCT/CN2016/102775 WO2017067489A1 (en) 2015-10-22 2016-10-20 Set-top box audio-visual synchronization method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510691001.XA CN106612452B (en) 2015-10-22 2015-10-22 method and device for synchronizing audio and video of set top box

Publications (2)

Publication Number Publication Date
CN106612452A CN106612452A (en) 2017-05-03
CN106612452B true CN106612452B (en) 2019-12-13

Family

ID=58556676

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510691001.XA Active CN106612452B (en) 2015-10-22 2015-10-22 method and device for synchronizing audio and video of set top box

Country Status (2)

Country Link
CN (1) CN106612452B (en)
WO (1) WO2017067489A1 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109711421A (en) * 2017-10-25 2019-05-03 腾讯科技(深圳)有限公司 A kind of data processing method and device
CN108379832B (en) * 2018-01-29 2021-03-30 珠海金山网络游戏科技有限公司 Game synchronization method and device
CN108495164B (en) * 2018-04-09 2021-01-29 珠海全志科技股份有限公司 Audio and video synchronization processing method and device, computer device and storage medium
CN111355989B (en) * 2018-12-21 2023-08-08 深圳市中兴微电子技术有限公司 Frame rate control method and related equipment
CN112825562B (en) * 2019-11-21 2022-11-01 杭州海康威视数字技术股份有限公司 Video frame compensation method and device and video processing chip
CN113596549B (en) * 2020-10-13 2023-09-22 杭州涂鸦信息技术有限公司 Audio and video synchronization method and device based on different reference clocks and computer equipment
CN113157228B (en) * 2021-02-01 2022-07-05 中国船舶重工集团公司第七0九研究所 Display control device and method for multi-source frame rate interactive high frame rate
CN113573119B (en) * 2021-06-15 2022-11-29 荣耀终端有限公司 Method and device for generating time stamp of multimedia data
CN113949898B (en) * 2021-10-13 2024-03-08 北京奇艺世纪科技有限公司 Multimedia processing method, device, equipment and storage medium
CN114268830B (en) * 2021-12-06 2024-05-24 咪咕文化科技有限公司 Cloud guide synchronization method, device, equipment and storage medium
CN114302169B (en) * 2021-12-24 2023-03-07 威创集团股份有限公司 Picture synchronous recording method, device, system and computer storage medium
CN114512139B (en) * 2022-04-18 2022-09-20 杭州星犀科技有限公司 Processing method and system for multi-channel audio mixing, mixing processor and storage medium
CN115334344B (en) * 2022-08-08 2023-08-18 青岛海信宽带多媒体技术有限公司 Channel switching method and device applied to intelligent set top box
CN116233472B (en) * 2023-05-08 2023-07-18 湖南马栏山视频先进技术研究院有限公司 Audio and video synchronization method and cloud processing system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101207822A (en) * 2006-12-22 2008-06-25 上海晨兴电子科技有限公司 Method for synchronization of audio frequency and video frequency of stream media terminal
CN101771869A (en) * 2008-12-30 2010-07-07 深圳市万兴软件有限公司 AV (audio/video) encoding and decoding device and method
EP2334049A2 (en) * 2009-12-14 2011-06-15 QNX Software Systems GmbH & Co. KG Synchronization of video presentation by video cadence modification
CN102685437A (en) * 2012-02-03 2012-09-19 深圳市创维群欣安防科技有限公司 Method and monitor for compensating video image
CN103581730A (en) * 2013-10-28 2014-02-12 南京熊猫电子股份有限公司 Method for achieving synchronization of audio and video on digital set top box

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103167342B (en) * 2013-03-29 2016-07-13 天脉聚源(北京)传媒科技有限公司 A kind of audio-visual synchronization processing means and method
US20150062353A1 (en) * 2013-08-30 2015-03-05 Microsoft Corporation Audio video playback synchronization for encoded media
CN104378675B (en) * 2014-12-08 2019-07-30 厦门雅迅网络股份有限公司 A kind of multi-channel sound audio video synchronization play handling method
CN104618786B (en) * 2014-12-22 2018-01-05 深圳市腾讯计算机系统有限公司 Audio and video synchronization method and device

Also Published As

Publication number Publication date
WO2017067489A1 (en) 2017-04-27
CN106612452A (en) 2017-05-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant