CN103888813A - Audio and video synchronization realization method and system - Google Patents

Publication number
CN103888813A
Authority
CN
China
Prior art keywords
video
clock
audio
diff
delay
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210564355.4A
Other languages
Chinese (zh)
Inventor
刘丽霞
张琍
孙昆
王文超
王子亨
穆森
常青
赵倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING AEROSPACE AIWEI ELECTRONIC TECHNOLOGY Co Ltd
Beijing Institute of Computer Technology and Applications
Original Assignee
BEIJING AEROSPACE AIWEI ELECTRONIC TECHNOLOGY Co Ltd
Beijing Institute of Computer Technology and Applications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING AEROSPACE AIWEI ELECTRONIC TECHNOLOGY Co Ltd and Beijing Institute of Computer Technology and Applications
Priority to CN201210564355.4A
Publication of CN103888813A
Legal status: Pending

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention provides a method and system for implementing audio-video synchronization. The method comprises: step 1, a memory management module stores transmitted video data packets and audio data packets in memory; step 2, a decoding module extracts the video data packets and audio data packets and parses them to obtain audio frames, a current audio playback clock, video frames, and a current video playback clock, after which steps 3 and 4 are performed; step 3, an audio playback module plays the corresponding audio frames according to the current audio playback clock; step 4, a first threshold and a second threshold are obtained according to the ITU-R BT.1359-1 standard, and a clock control module uses the difference Diff between the current audio playback clock and the current video playback clock to judge whether audio and video are synchronized: if Diff lies within the range between the two thresholds, they are synchronized; otherwise they are out of sync, the current audio playback clock serves as the synchronization clock, and the speed of the current video playback clock is controlled so that it synchronizes to the current audio playback clock, after which step 5 is performed; step 5, a video playback module plays the corresponding video frames according to the synchronized video playback clock.

Description

Method and system for implementing audio-video synchronization
Technical field
The present invention relates to the field of multimedia technology, and in particular to a method and system for implementing audio-video synchronization.
Background art
Audio-video synchronization is the degree to which sound and video signals that were captured together remain aligned during playback.
With the development of networking and related technologies, audio and video transmission has moved from coaxial cable to network transport. The captured raw audio and video data must be compressed and encoded, then transmitted over the network to the receiving end, where it is played in real time or stored. Whether the receiving end plays live video or stored historical video, the compressed audio and video data must be decoded to recover the raw data. If synchronously captured raw audio and video are encoded, transmitted, and stored independently, the playback end must apply an audio-video synchronization technique to guarantee synchronized playback.
The audio-video synchronization techniques in common use today are:
Multiplexing synchronization: the data of multiple media streams is multiplexed into a single data stream or message. This simplifies synchronization between the media streams and needs no additional control channel or synchronization clock. However, it wastes bandwidth, the multiplexing algorithm is relatively complex, and it is unsuitable when the media streams originate from different nodes.
Timestamp synchronization: media data is timestamped in chronological order, and data carrying the same timestamp is played at the same time. It needs no extra synchronization information and does not alter the original data streams. However, the overhead of accessing timestamps is high.
Synchronization channel: the media are transmitted over separate channels, and the synchronization information is carried separately by a synchronization signal. Its advantages are that it supports complex synchronization relationships, can be used with directly connected devices, and requires no clock synchronization. However, the synchronization information may be lost.
Feedback-based synchronization: synchronization control is performed at the sending or receiving end according to out-of-sync detection information from the receiving end. Although suited to network environments, it introduces some lag and its real-time performance is poor.
Technically, the best solution to the audio-video synchronization problem is the timestamp. The usual approach is to select a reference clock first. During playback, the timestamp in each data block is read and playback is scheduled against the current time on the reference clock. If the start time of a data block is later than the current reference-clock time, the block is held back until the reference clock reaches its start time. If the start time is earlier than the current reference-clock time, the block is either played "as soon as possible" or simply "dropped" so that playback catches up with the reference clock.
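As an illustration only, the reference-clock rule described above can be sketched in Python (a language the source does not use). The function name `schedule_block`, the `Action` enum, and the `drop_after_ms` cutoff that separates "play as soon as possible" from "drop" are hypothetical; the source does not say how that choice is made.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"   # block is early: hold it until the reference clock catches up
    PLAY = "play"   # block is on time or slightly late: play it as soon as possible
    DROP = "drop"   # block is far too late: discard it so playback can catch up


def schedule_block(block_start_ms: int, reference_clock_ms: int,
                   drop_after_ms: int = 100) -> Action:
    """Decide what to do with a timestamped data block.

    drop_after_ms is an assumed cutoff, not a value taken from the source.
    """
    if block_start_ms > reference_clock_ms:
        return Action.WAIT
    lateness_ms = reference_clock_ms - block_start_ms
    if lateness_ms > drop_after_ms:
        return Action.DROP
    return Action.PLAY
```

A player loop would call `schedule_block` once per decoded block and sleep, render, or discard accordingly.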
Summary of the invention
The object of the present invention is to provide a method and system for implementing audio-video synchronization that uses the audio playback clock as the synchronization clock and controls the video playback speed so that it synchronizes to the audio playback clock, ensuring smooth playback of the audio and video data and keeping audio and video synchronized without lag or delay.
To achieve this object, the invention provides a method for implementing audio-video synchronization. The method comprises:
Step 1: a memory management module stores the transmitted video data packets and audio data packets in memory;
Step 2: a decoding module reads the video data packets and audio data packets and parses each of them to obtain audio frames and a current audio playback clock, and video frames and a current video playback clock; steps 3 and 4 are then performed;
Step 3: an audio playback module plays the corresponding audio frames according to the current audio playback clock;
Step 4: a first threshold threshold1 and a second threshold threshold2 are obtained according to the ITU-R BT.1359-1 standard; a clock control module then uses the difference Diff between the current audio playback clock and the current video playback clock to judge whether audio and video are synchronized. If Diff lies within the range from threshold1 to threshold2, audio and video are synchronized; otherwise they are out of sync, in which case the current audio playback clock is taken as the synchronization clock and the speed of the current video playback clock is controlled so that it synchronizes to the current audio playback clock; step 5 is then performed;
Step 5: a video playback module plays the corresponding video frames according to the synchronized video playback clock.
When Diff > threshold2, audio playback is ahead of video display, and the following processing is performed:
If the absolute value of Diff is less than the playback duration of the current frame, the current video frame is played, reading of audio and video packets continues, and the process loops back to the step of calculating Diff;
If the absolute value of Diff is greater than the playback duration of the next video frame, that video frame is skipped without being played, reading of video packets continues to obtain the next-frame video playback clock, and whether audio and video are synchronized is judged from the difference between the current audio playback clock and the next-frame video playback clock.
When audio playback starts, a total video delay is set up and initialized. When Diff < threshold1, audio playback lags behind video display, and the current video playback clock must be delayed to synchronize audio and video.
The current video playback clock is delayed as follows:
A delay difference delay_Diff is calculated. If delay_Diff < threshold1, the delay is doubled and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2;
If delay_Diff > threshold2, the delay is halved and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2;
If threshold1 < delay_Diff < threshold2, the audio playback clock is updated and step 4 is repeated.
Here the first threshold is -90 ms and the second threshold is 20 ms, where ms denotes milliseconds. A positive value means the sound leads the video, a negative value means the sound lags the video, and the range between the two thresholds is the range in which the offset is imperceptible.
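The synchronization test of step 4 can be sketched as follows, purely as an illustration. The threshold values and the sign convention (Diff is the audio clock minus the video clock, positive when the sound leads) come from the text above; the function name `classify_sync`, the returned labels, and the choice to treat a Diff exactly equal to a threshold as synchronized are assumptions of this sketch.

```python
THRESHOLD1_MS = -90  # sound lags the video by 90 ms
THRESHOLD2_MS = 20   # sound leads the video by 20 ms


def classify_sync(audio_clock_ms: float, video_clock_ms: float) -> str:
    """Classify the audio/video offset against the two thresholds."""
    diff = audio_clock_ms - video_clock_ms  # positive: sound leads the video
    if diff > THRESHOLD2_MS:
        return "audio_ahead"    # handled by playing or skipping video frames
    if diff < THRESHOLD1_MS:
        return "audio_behind"   # handled by delaying the video playback clock
    return "in_sync"
```

For example, an audio clock of 1030 ms against a video clock of 1000 ms gives Diff = 30 ms, which exceeds threshold2 and falls in the audio-ahead case.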
To achieve the above object, the invention also provides a system for implementing audio-video synchronization. The system comprises:
A memory management module, for storing the transmitted video data packets and audio data packets in memory;
A decoding module, for extracting the video data packets and audio data packets and parsing each of them to obtain audio frames and a current audio playback clock, and video frames and a current video playback clock, and then invoking the audio playback module and the clock control module respectively;
An audio playback module, for playing the corresponding audio frames according to the current audio playback clock;
A clock control module, for obtaining a first threshold threshold1 and a second threshold threshold2 according to the ITU-R BT.1359-1 standard and then judging, from the difference Diff between the current audio playback clock and the current video playback clock, whether audio and video are synchronized. If Diff lies within the range from threshold1 to threshold2, audio and video are synchronized; otherwise they are out of sync, in which case the current audio playback clock is taken as the synchronization clock and the speed of the current video playback clock is controlled so that it synchronizes to the current audio playback clock; the video playback module is then invoked;
A video playback module, for playing the corresponding video frames according to the synchronized video playback clock.
When Diff > threshold2, audio playback is ahead of video display, and the following processing is performed:
If the absolute value of Diff is less than the playback duration of the current frame, the current video frame is played, reading of audio and video packets continues, and the process loops back to the step of calculating Diff;
If the absolute value of Diff is greater than the playback duration of the next video frame, that video frame is skipped without being played, reading of video packets continues to obtain the next-frame video playback clock, and whether audio and video are synchronized is judged from the difference between the current audio playback clock and the next-frame video playback clock.
When audio playback starts, a total video delay is set up and initialized. When Diff < threshold1, audio playback lags behind video display, and the current video playback clock must be delayed to synchronize audio and video.
The current video playback clock is delayed as follows:
A delay difference delay_Diff is calculated. If delay_Diff < threshold1, the delay is doubled and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2;
If delay_Diff > threshold2, the delay is halved and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2;
If threshold1 < delay_Diff < threshold2, the audio playback clock is updated and step 4 is repeated.
Here the first threshold is -90 ms and the second threshold is 20 ms, where ms denotes milliseconds. A positive value means the sound leads the video, a negative value means the sound lags the video, and the range between the two thresholds is the range in which the offset is imperceptible.
The beneficial effect of the invention is that, for the case where audio and video data are captured simultaneously but encoded and stored independently, it provides a method and system for implementing audio-video synchronization that uses the audio playback clock as the synchronization clock and applies timestamp technology to play back historical audio and video in sync. FFMPEG is used to decode the historical audio and video files separately, the audio playback clock computed after decoding serves as the synchronization clock, and the video playback speed is controlled so that it synchronizes to the audio playback clock. This ensures smooth playback of the audio and video data, synchronized without lag or delay.
The invention is described in detail below with reference to the drawings and specific embodiments, which are not intended to limit the invention.
Brief description of the drawings
Fig. 1 is a flow chart of the audio-video synchronization method of the invention;
Fig. 2 is a schematic diagram of the audio-video synchronization system of the invention;
Fig. 3 is a playback flow chart of the audio-video synchronization of the invention.
Detailed description
Fig. 1 is a flow chart of the audio-video synchronization method of the invention. As shown in Fig. 1, the method comprises:
Step 1: a memory management module stores the transmitted video data packets and audio data packets in memory;
Step 2: a decoding module extracts the video data packets and audio data packets and parses each of them to obtain audio frames and a current audio playback clock, and video frames and a current video playback clock; steps 3 and 4 are then performed respectively;
Step 3: an audio playback module plays the corresponding audio frames according to the current audio playback clock;
Step 4: a first threshold threshold1 and a second threshold threshold2 are obtained according to the ITU-R BT.1359-1 standard; a clock control module then uses the difference Diff between the current audio playback clock and the current video playback clock to judge whether audio and video are synchronized. If Diff lies within the range from threshold1 to threshold2, audio and video are synchronized; otherwise they are out of sync, in which case the current audio playback clock is taken as the synchronization clock and the speed of the current video playback clock is controlled so that it synchronizes to the current audio playback clock; step 5 is then performed;
Step 5: a video playback module plays the corresponding video frames according to the synchronized video playback clock.
When Diff > threshold2, audio playback is ahead of video display, and the following processing is performed:
If the absolute value of Diff is less than the playback duration of the current frame, the current video frame is played, reading of audio and video packets continues, and the process loops back to the step of calculating Diff;
If the absolute value of Diff is greater than the playback duration of the next video frame, that video frame is skipped without being played, reading of video packets continues to obtain the next-frame video playback clock, and whether audio and video are synchronized is judged from the difference between the current audio playback clock and the next-frame video playback clock.
When audio playback starts, a total video delay is set up and initialized. When Diff < threshold1, audio playback lags behind video display, and the current video playback clock must be delayed to synchronize audio and video.
The current video playback clock is delayed as follows:
A delay difference delay_Diff is calculated. If delay_Diff < threshold1, the delay is doubled and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2;
If delay_Diff > threshold2, the delay is halved and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2;
If threshold1 < delay_Diff < threshold2, the audio playback clock is updated and step 4 is repeated.
Here the first threshold is -90 ms and the second threshold is 20 ms, where ms denotes milliseconds. A positive value means the sound leads the video, a negative value means the sound lags the video, and the range between the two thresholds is the range in which the offset is imperceptible.
The method is now described in detail with reference to Fig. 3, which is a playback flow chart of the audio-video synchronization of the invention. As shown in Fig. 3:
Memory management module: the memory management module writes the transmitted media data (real-time or historical audio and video data streams) into memory. This data consists of a series of packets.
Decoding module: in the decoding module, the audio packets in memory are parsed and then decoded with the open-source decoding library FFMPEG into audio frames and the corresponding audio presentation timestamp (PTS); likewise, the video packets are parsed and then decoded with FFMPEG into video frames and the corresponding video PTS. Because the PTS is a relative time, the audio PTS is converted into the current audio playback clock audio_clock so that the audio playback clock uses the same units as the sound card playback clock; in the same way, the current video playback clock video_clock is obtained from the video PTS. In addition, the next-frame video playback clock next_video_clock is estimated from the current video playback clock and information in the current video packet.
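For illustration, the conversion from a PTS to a playback clock might look like the sketch below. The source states only that the PTS is converted so that its units match the sound card clock. FFmpeg expresses a PTS in units of the stream time base, so multiplying by the time base is a natural reading, but the example time base of 1/90000, the millisecond unit, the frame-rate-based estimate of next_video_clock, and both function names are assumptions of this sketch.

```python
from fractions import Fraction


def pts_to_clock_ms(pts: int, time_base: Fraction) -> float:
    """Convert a PTS counted in time_base units into milliseconds."""
    return float(pts * time_base * 1000)


def estimate_next_video_clock_ms(video_clock_ms: float,
                                 frame_rate: Fraction) -> float:
    """Estimate next_video_clock by adding one nominal frame duration.

    Using the nominal frame rate is an assumption; the source says only that
    the estimate uses the current clock and information in the video packet.
    """
    frame_duration_ms = float(1000 / frame_rate)
    return video_clock_ms + frame_duration_ms


# A PTS of 180000 in a 1/90000 time base is 2 seconds into the stream.
video_clock = pts_to_clock_ms(180000, Fraction(1, 90000))
next_video_clock = estimate_next_video_clock_ms(video_clock, Fraction(25))
```

Exact rational arithmetic via `Fraction` avoids rounding drift when the time base is not a power of ten; a real player would read the time base from the stream rather than hard-code it.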
Audio playback module: the current audio frame obtained by the decoding module is placed in the sound card playback buffer, and the sound card driver plays the corresponding audio frame according to the current audio playback clock audio_clock.
Next, video playback is determined. It depends on synchronizing the current audio playback clock audio_clock with the current video playback clock video_clock. Based on research into human auditory and visual perception, the invention sets the synchronization thresholds threshold1 and threshold2 (according to the recommendation of the ITU-R BT.1359-1 standard, when the sound is anywhere from 90 ms behind the video to 20 ms ahead of it, viewers can hardly notice any change in audio-visual quality; this is the imperceptible range). The thresholds are set to threshold1 = -90 ms and threshold2 = 20 ms. Audio-video synchronization is expressed as the difference between the times at which sound and picture occur, in ms; sound leading is positive and sound lagging is negative.
When the difference Diff between audio_clock and video_clock lies within the range from threshold1 to threshold2, audio and video are considered essentially synchronized; otherwise they are out of sync. When they are synchronized, the corresponding video frame is sent to the video playback buffer and DirectX is used to play the current video frame. Audio and video data then continue to be read and decoded, the current audio_clock and video_clock are updated, the difference Diff is calculated, and the whole process repeats.
When the difference Diff between the audio playback time audio_clock and the video playback time video_clock falls outside the range from threshold1 to threshold2, audio and video are out of sync. There are two out-of-sync cases:
1. Audio playback is ahead of video display
When Diff > threshold2, audio playback is ahead of video display. If the absolute value of Diff is less than the playback duration now_duration of the current frame, the current video frame is played, reading of audio and video packets continues, the audio playback time and video playback time are updated, and the process loops back to the step of calculating Diff. If the absolute value of Diff is greater than the playback duration now_duration of the next video frame, that video frame is skipped without being played and the video playback clock is updated to video_clock = next_video_clock; video packets are then read and decoded to obtain next_video_clock, and the process jumps to the step of calculating the difference Diff between the current audio_clock and video_clock, as shown in Fig. 1.
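The play-or-skip decision for the audio-ahead case can be sketched as below, as an illustration under stated assumptions. The variable names follow the text (now_duration, video_clock, next_video_clock); the function name `handle_audio_ahead`, the tuple return value, and the handling of the case where |Diff| exactly equals now_duration (treated here as "play") are assumptions, since the source leaves that case unspecified.

```python
def handle_audio_ahead(diff_ms: float, now_duration_ms: float,
                       video_clock_ms: float,
                       next_video_clock_ms: float) -> tuple[str, float]:
    """Return (action, updated video clock) when Diff > threshold2."""
    if abs(diff_ms) > now_duration_ms:
        # The video is more than one frame behind: skip this frame and
        # advance the video clock to the estimated next-frame clock.
        return ("skip", next_video_clock_ms)
    # The gap is within one frame duration: show the frame and keep going.
    return ("play", video_clock_ms)
```

After either outcome the caller would read and decode the next packets and recompute Diff, so repeated "skip" results let the video catch up one frame at a time.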
2. Audio playback lags behind video display
When audio playback starts, a total video delay total_delay is set up and initialized to total_delay = 0. When Diff < threshold1, audio playback lags behind video display. Video playback must be delayed, and the delay time is provisionally set to delay:
delay = last_duration = video_clock - last_video_clock.
Let delay_Diff = audio_clock + total_delay + delay - video_clock.
If delay_Diff < threshold1, the delay is too short, so it is doubled (delay = 2*delay), and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2.
If delay_Diff > threshold2, the delay is too long, so it is shortened (delay = delay/2), and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2.
If threshold1 < delay_Diff < threshold2, video playback is delayed by delay, the audio playback clock audio_clock is updated, and the total delay is updated to total_delay = total_delay + delay. The process then returns to the Diff calculation step of Fig. 1.
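The doubling and halving search for the delay can be sketched as follows. The formulas for delay and delay_Diff are taken from the text; the function name `find_video_delay_ms`, the inclusive treatment of the threshold boundaries, and the `max_iterations` cap are assumptions. The cap is added because the source does not say what happens when doubling overshoots threshold2 and halving then undershoots threshold1, in which case the search as described would alternate without ending.

```python
from typing import Optional


def find_video_delay_ms(audio_clock_ms: float, video_clock_ms: float,
                        last_video_clock_ms: float,
                        total_delay_ms: float = 0.0,
                        threshold1_ms: float = -90.0,
                        threshold2_ms: float = 20.0,
                        max_iterations: int = 32) -> Optional[float]:
    """Search for a delay that brings delay_Diff inside the threshold range.

    Returns the delay in ms, or None if no delay is found within
    max_iterations (an assumed safeguard, not part of the source).
    """
    delay = video_clock_ms - last_video_clock_ms  # initial guess: last frame duration
    for _ in range(max_iterations):
        delay_diff = audio_clock_ms + total_delay_ms + delay - video_clock_ms
        if delay_diff < threshold1_ms:
            delay *= 2   # delay too short: double it
        elif delay_diff > threshold2_ms:
            delay /= 2   # delay too long: halve it
        else:
            return delay
    return None
```

With audio_clock = 1000, video_clock = 1200, and last_video_clock = 1160, the delay grows from 40 to 80 to 160 ms, at which point delay_Diff = -40 ms lies inside the range. The caller would then add the returned delay to total_delay, as the text describes.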
Fig. 2 is a schematic diagram of the audio-video synchronization system of the invention. As shown in Fig. 2, the system comprises:
A memory management module 100, for storing the transmitted video data packets and audio data packets in memory;
A decoding module 200, for extracting the video data packets and audio data packets and parsing each of them to obtain audio frames and a current audio playback clock, and video frames and a current video playback clock, and then invoking the audio playback module 300 and the clock control module 400 respectively;
An audio playback module 300, for playing the corresponding audio frames according to the current audio playback clock;
A clock control module 400, for obtaining a first threshold threshold1 and a second threshold threshold2 according to the ITU-R BT.1359-1 standard and then judging, from the difference Diff between the current audio playback clock and the current video playback clock, whether audio and video are synchronized. If Diff lies within the range from threshold1 to threshold2, audio and video are synchronized; otherwise they are out of sync, in which case the current audio playback clock is taken as the synchronization clock and the speed of the current video playback clock is controlled so that it synchronizes to the current audio playback clock; the video playback module 500 is then invoked;
A video playback module 500, for playing the corresponding video frames according to the synchronized video playback clock.
When Diff > threshold2, audio playback is ahead of video display, and the following processing is performed:
If the absolute value of Diff is less than the playback duration of the current frame, the current video frame is played, reading of audio and video packets continues, and the process loops back to the step of calculating Diff;
If the absolute value of Diff is greater than the playback duration of the next video frame, that video frame is skipped without being played, reading of video packets continues to obtain the next-frame video playback clock, and whether audio and video are synchronized is judged from the difference between the current audio playback clock and the next-frame video playback clock.
When audio playback starts, a total video delay is set up and initialized. When Diff < threshold1, audio playback lags behind video display, and the current video playback clock must be delayed to synchronize audio and video.
The current video playback clock is delayed as follows:
A delay difference delay_Diff is calculated. If delay_Diff < threshold1, the delay is doubled and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2;
If delay_Diff > threshold2, the delay is halved and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2;
If threshold1 < delay_Diff < threshold2, the audio playback clock is updated and step 4 is repeated.
Here the first threshold is -90 ms and the second threshold is 20 ms, where ms denotes milliseconds. A positive value means the sound leads the video, a negative value means the sound lags the video, and the range between the two thresholds is the range in which the offset is imperceptible.
The system is now described in detail with reference to Fig. 3, which is a playback flow chart of the audio-video synchronization of the invention. As shown in Fig. 3:
Memory management module: the memory management module writes the transmitted media data (real-time or historical audio and video data streams) into memory. This data consists of a series of packets.
Decoding module: in the decoding module, the audio packets in memory are parsed and then decoded with the open-source decoding library FFMPEG into audio frames and the corresponding audio presentation timestamp (PTS); likewise, the video packets are parsed and then decoded with FFMPEG into video frames and the corresponding video PTS. Because the PTS is a relative time, the audio PTS is converted into the current audio playback clock audio_clock so that the audio playback clock uses the same units as the sound card playback clock; in the same way, the current video playback clock video_clock is obtained from the video PTS. In addition, the next-frame video playback clock next_video_clock is estimated from the current video playback clock and information in the current video packet.
Audio playback module: the current audio frame obtained by the decoding module is placed in the sound card playback buffer, and the sound card driver plays the corresponding audio frame according to the current audio playback clock audio_clock.
Next, video playback is determined. It depends on synchronizing the current audio playback clock audio_clock with the current video playback clock video_clock. Based on research into human auditory and visual perception, the invention sets the synchronization thresholds threshold1 and threshold2 (according to the recommendation of the ITU-R BT.1359-1 standard, when the sound is anywhere from 90 ms behind the video to 20 ms ahead of it, viewers can hardly notice any change in audio-visual quality; this is the imperceptible range). The thresholds are set to threshold1 = -90 ms and threshold2 = 20 ms. Audio-video synchronization is expressed as the difference between the times at which sound and picture occur, in ms; sound leading is positive and sound lagging is negative.
When the difference Diff between audio_clock and video_clock lies within the range from threshold1 to threshold2, audio and video are considered essentially synchronized; otherwise they are out of sync. When they are synchronized, the corresponding video frame is sent to the video playback buffer and DirectX is used to play the current video frame. Audio and video data then continue to be read and decoded, the current audio_clock and video_clock are updated, the difference Diff is calculated, and the whole process repeats.
When the difference Diff between the audio playback time audio_clock and the video playback time video_clock falls outside the range from threshold1 to threshold2, audio and video are out of sync. There are two out-of-sync cases:
1. Audio playback is ahead of video display
When Diff > threshold2, audio playback is ahead of video display. If the absolute value of Diff is less than the playback duration now_duration of the current frame, the current video frame is played, reading of audio and video packets continues, the audio playback time and video playback time are updated, and the process loops back to the step of calculating Diff. If the absolute value of Diff is greater than the playback duration now_duration of the next video frame, that video frame is skipped without being played and the video playback clock is updated to video_clock = next_video_clock; video packets are then read and decoded to obtain next_video_clock, and the process jumps to the step of calculating the difference Diff between the current audio_clock and video_clock, as shown in Fig. 1.
2. Audio playback lags behind video display
When audio playback starts, a total video delay total_delay is set up and initialized to total_delay = 0. When Diff < threshold1, audio playback lags behind video display. Video playback must be delayed, and the delay time is provisionally set to delay:
delay = last_duration = video_clock - last_video_clock.
Let delay_Diff = audio_clock + total_delay + delay - video_clock.
If delay_Diff < threshold1, the delay is too short, so it is doubled (delay = 2*delay), and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2.
If delay_Diff > threshold2, the delay is too long, so it is shortened (delay = delay/2), and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff lies within the range from threshold1 to threshold2.
If threshold1 < delay_Diff < threshold2, video playback is delayed by delay, the audio playback clock audio_clock is updated, and the total delay is updated to total_delay = total_delay + delay. The process then returns to the Diff calculation step of Fig. 1.
Of course, the present invention may also have various other embodiments. Without departing from the spirit and essence of the present invention, those of ordinary skill in the art can make various corresponding changes and modifications according to the present invention, but all such changes and modifications shall fall within the protection scope of the appended claims of the present invention.

Claims (10)

1. A method for implementing audio and video synchronization, characterized by comprising:
Step 1: a memory management module stores the transmitted video data packets and audio data packets in memory;
Step 2: a decoding module extracts the video data packets and audio data packets and parses each of them to obtain audio frames and a current audio playback clock, and video frames and a current video playback clock, after which step 3 and step 4 are performed respectively;
Step 3: an audio playback module plays the corresponding audio frames according to the current audio playback clock;
Step 4: a time control module obtains a first threshold threshold1 and a second threshold threshold2 according to the ITU-R BT.1359-1 standard, and then judges whether the audio and video are synchronized according to the difference Diff between the current audio playback clock and the current video playback clock; if Diff is within the range between the first threshold threshold1 and the second threshold threshold2, the audio and video are synchronized; otherwise the audio and video are out of synchronization, the current audio playback clock is used as the synchronization clock, the speed of the current video playback clock is controlled so that it is synchronized to the current audio playback clock, and step 5 is performed;
Step 5: a video playback module plays the corresponding video frames according to the synchronized video playback clock.
2. The method for implementing audio and video synchronization as claimed in claim 1, characterized in that, when Diff > threshold2, audio playback is ahead of video display and the following processing is performed:
If the absolute value of Diff is less than the duration of the current frame, the current video frame is played, audio and video packets continue to be read, and processing loops back to the step of calculating Diff;
If the absolute value of Diff is greater than the duration of playing the next video frame, this video frame is skipped without being played, video packets continue to be read to obtain the playback clock of the next video frame, and whether the audio and video are synchronized is judged according to the difference between the current audio playback clock and the playback clock of the next video frame.
3. The method for implementing audio and video synchronization as claimed in claim 1, characterized in that, when audio playback starts, a total video delay is set and initialized; when Diff < threshold1, audio playback lags behind video display, and the current video playback clock needs to be delayed so that the audio and video are synchronized.
4. The method for implementing audio and video synchronization as claimed in claim 3, characterized in that delaying the current video playback clock comprises:
Calculating a delay difference delay_Diff; if delay_Diff < threshold1, the delay is doubled, and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff is within the range between threshold1 and threshold2;
If delay_Diff > threshold2, the delay is halved, and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff is within the range between threshold1 and threshold2;
If threshold1 < delay_Diff < threshold2, the audio playback clock is updated and step 4 is repeated.
5. The method for implementing audio and video synchronization as claimed in claim 1, characterized in that the first threshold is -90 ms and the second threshold is 20 ms, where ms denotes milliseconds; a positive value indicates that the audio signal is ahead, a negative value indicates that the audio signal lags, and the range between the two thresholds is the range in which the offset is not perceptible.
6. A system for implementing audio and video synchronization, characterized by comprising:
A memory management module, for storing the transmitted video data packets and audio data packets in memory;
A decoding module, for extracting the video data packets and audio data packets and parsing each of them to obtain audio frames and a current audio playback clock, and video frames and a current video playback clock, and then invoking the audio playback module and the time control module respectively;
An audio playback module, for playing the corresponding audio frames according to the current audio playback clock;
A time control module, for obtaining a first threshold threshold1 and a second threshold threshold2 according to the ITU-R BT.1359-1 standard, and then judging whether the audio and video are synchronized according to the difference Diff between the current audio playback clock and the current video playback clock; if Diff is within the range between the first threshold threshold1 and the second threshold threshold2, the audio and video are synchronized; otherwise the audio and video are out of synchronization, the current audio playback clock is used as the synchronization clock, the speed of the current video playback clock is controlled so that it is synchronized to the current audio playback clock, and the video playback module is invoked;
A video playback module, for playing the corresponding video frames according to the synchronized video playback clock.
7. The system for implementing audio and video synchronization as claimed in claim 6, characterized in that, when Diff > threshold2, audio playback is ahead of video display and the following processing is performed:
If the absolute value of Diff is less than the duration of the current frame, the current video frame is played, audio and video packets continue to be read, and processing loops back to the step of calculating Diff;
If the absolute value of Diff is greater than the duration of playing the next video frame, this video frame is skipped without being played, video packets continue to be read to obtain the playback clock of the next video frame, and whether the audio and video are synchronized is judged according to the difference between the current audio playback clock and the playback clock of the next video frame.
8. The system for implementing audio and video synchronization as claimed in claim 6, characterized in that, when audio playback starts, a total video delay is set and initialized; when Diff < threshold1, audio playback lags behind video display, and the current video playback clock needs to be delayed so that the audio and video are synchronized.
9. The system for implementing audio and video synchronization as claimed in claim 8, characterized in that delaying the current video playback clock comprises:
Calculating a delay difference delay_Diff; if delay_Diff < threshold1, the delay is doubled, and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff is within the range between threshold1 and threshold2;
If delay_Diff > threshold2, the delay is halved, and delay_Diff is recalculated and compared with threshold1 and threshold2, until delay_Diff is within the range between threshold1 and threshold2;
If threshold1 < delay_Diff < threshold2, the audio playback clock is updated and step 4 is repeated.
10. The system for implementing audio and video synchronization as claimed in claim 6, characterized in that the first threshold is -90 ms and the second threshold is 20 ms, where ms denotes milliseconds; a positive value indicates that the audio signal is ahead, a negative value indicates that the audio signal lags, and the range between the two thresholds is the range in which the offset is not perceptible.
CN201210564355.4A 2012-12-21 2012-12-21 Audio and video synchronization realization method and system Pending CN103888813A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210564355.4A CN103888813A (en) 2012-12-21 2012-12-21 Audio and video synchronization realization method and system

Publications (1)

Publication Number Publication Date
CN103888813A true CN103888813A (en) 2014-06-25

Family

ID=50957501

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210564355.4A Pending CN103888813A (en) 2012-12-21 2012-12-21 Audio and video synchronization realization method and system

Country Status (1)

Country Link
CN (1) CN103888813A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010021966A1 (en) * 2008-08-21 2010-02-25 Dolby Laboratories Licensing Corporation Feature optimization and reliability estimation for audio and video signature generation and detection
CN101778269A (en) * 2009-01-14 2010-07-14 扬智电子(上海)有限公司 Synchronization method of audio/video frames of set top box
CN102421035A (en) * 2011-12-31 2012-04-18 青岛海信宽带多媒体技术有限公司 Method and device for synchronizing audio and video of digital television

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106488288A (en) * 2015-08-27 2017-03-08 宏达国际电子股份有限公司 Virtual reality system and its audio/video synchronization method
CN106488288B (en) * 2015-08-27 2019-08-06 宏达国际电子股份有限公司 Virtual reality system and its audio/video synchronization method
CN105245976B (en) * 2015-09-30 2016-11-23 合一网络技术(北京)有限公司 Voice & Video synchronizes the method and system play
CN105245976A (en) * 2015-09-30 2016-01-13 合一网络技术(北京)有限公司 Method and system for synchronously playing audio and video
CN105898505A (en) * 2016-04-27 2016-08-24 北京小米移动软件有限公司 Method, device and system for testing audio and video synchronization in video instant messaging
CN105898505B (en) * 2016-04-27 2019-02-19 北京小米移动软件有限公司 The method, apparatus and system of audio-visual synchronization are tested in video instant communication
CN108632645A (en) * 2017-03-17 2018-10-09 北京京东尚科信息技术有限公司 Information demonstrating method and device
US11146611B2 (en) 2017-03-23 2021-10-12 Huawei Technologies Co., Ltd. Lip synchronization of audio and video signals for broadcast transmission
CN107124641A (en) * 2017-06-02 2017-09-01 广东暨通信息发展有限公司 The control method that a kind of audio-visual synchronization is played
CN107613356A (en) * 2017-08-30 2018-01-19 瑞声科技(新加坡)有限公司 Media and vibrations synchronous broadcast method and device, electronic equipment and storage medium
CN107509100A (en) * 2017-09-15 2017-12-22 深圳国微技术有限公司 Audio and video synchronization method, system, computer installation and computer-readable recording medium
CN108055566A (en) * 2017-12-26 2018-05-18 郑州云海信息技术有限公司 Method, apparatus, equipment and the computer readable storage medium of audio-visual synchronization
CN108966028A (en) * 2018-08-17 2018-12-07 上海悠络客电子科技股份有限公司 A kind of anti-shaking method based on Network status dynamic regulation broadcasting speed
CN108966028B (en) * 2018-08-17 2021-04-30 上海悠络客电子科技股份有限公司 Anti-jitter method for dynamically adjusting play speed based on network condition
CN109275008A (en) * 2018-09-17 2019-01-25 青岛海信电器股份有限公司 A kind of method and apparatus of audio-visual synchronization
CN109194975B (en) * 2018-11-02 2021-04-20 深圳市云威物联科技有限公司 Audio and video live broadcast stream following method and device
CN109194975A (en) * 2018-11-02 2019-01-11 深圳市云威物联科技有限公司 Audio-video live streaming chases after stream method and device
CN109348247B (en) * 2018-11-23 2021-03-30 广州酷狗计算机科技有限公司 Method and device for determining audio and video playing time stamp and storage medium
CN109348247A (en) * 2018-11-23 2019-02-15 广州酷狗计算机科技有限公司 Determine the method, apparatus and storage medium of audio and video playing timestamp
CN109828742A (en) * 2019-02-01 2019-05-31 珠海全志科技股份有限公司 Voice-frequency-multichannel synchronism output method, computer installation and computer readable storage medium
CN109828742B (en) * 2019-02-01 2022-02-18 珠海全志科技股份有限公司 Audio multi-channel synchronous output method, computer device and computer readable storage medium
CN112291607A (en) * 2020-10-29 2021-01-29 成都极米科技股份有限公司 Video and audio data synchronous output method and system thereof
CN113055711A (en) * 2021-02-22 2021-06-29 迅雷计算机(深圳)有限公司 Audio and video synchronization detection method and detection system thereof
CN114710687A (en) * 2022-03-22 2022-07-05 阿里巴巴(中国)有限公司 Audio and video synchronization method, device, equipment and storage medium
CN114710687B (en) * 2022-03-22 2024-03-19 阿里巴巴(中国)有限公司 Audio and video synchronization method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN103888813A (en) Audio and video synchronization realization method and system
CN109168078B (en) Video definition switching method and device
CN103167320B (en) The live client of audio and video synchronization method, system and mobile phone
CN101303880B (en) Method and apparatus for recording and playing audio-video document
CN101076121B (en) Stream generating apparatus, imaging apparatus, data processing apparatus and stream generating method
CN106612452B (en) method and device for synchronizing audio and video of set top box
CN107113462A (en) Sending method, method of reseptance, dispensing device and reception device
CN101917613B (en) Acquiring and coding service system of streaming media
US20090116814A1 (en) Reproducer, portable telephone, and reproducing method
CN104618786A (en) Audio/video synchronization method and device
CN107431845A (en) Sending method, method of reseptance, dispensing device and reception device
CN101710997A (en) MPEG-2 (Moving Picture Experts Group-2) system based method and system for realizing video and audio synchronization
CN102640511A (en) Method and system for playing video information, and video information content
KR20030078354A (en) Apparatus and method for injecting synchronized data for digital data broadcasting
CN101902649A (en) Audio-video synchronization control method based on H.264 standard
CN103888815A (en) Method and system for real-time separation treatment and synchronization of audio and video streams
CN108141636A (en) Reception device and method of reseptance
CN105493509A (en) Transmission apparatus, transmission method, reception apparatus, and reception method
CN109194974B (en) Media low-delay communication method and system for network video live broadcast
CN104780422A (en) Streaming media playing method and streaming media player
CN103718563A (en) Receiving apparatus and receiving method thereof
CN102752669A (en) Transfer processing method and system for multi-channel real-time streaming media file and receiving device
CN101483782B (en) Digital broadcast receiver and digital broadcast receiving method
CN107566889A (en) Audio stream flow rate error processing method, device, computer installation and computer-readable recording medium
CN106576188A (en) Transmission method, reception method, transmission device, and reception device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140625
