CN104125493A - Audio-video synchronization system and method - Google Patents
- Publication number
- CN104125493A (application CN201310145089.6A)
- Authority
- CN
- China
- Prior art keywords
- video
- audio
- package
- decoded
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The invention relates to an audio-video synchronization system and method. The system implements the following steps: decoded video data is stored to a first buffer; when the time stamp of a video packet meets a preset requirement, the video data in the first buffer is output to a display device; while video decoding is carried out, decoded audio data is stored to a second buffer and the time stamp of the audio packet is passed to the video packet; the decoded audio data in the second buffer is moved to a specified queue, read from that queue at preset intervals, and sent to a third buffer; and the decoded audio data is read from the third buffer in sequence and output to the display device. With the system and the method, the audio and video data can be synchronized.
Description
Technical field
The present invention relates to coding systems and methods, and in particular to an audio-video synchronization system and method.
Background technology
In general, synchronization of image (video packets) and sound (audio packets) is carried out by attaching a time stamp (the multimedia time, abbreviated MM Time) to each data segment. Image and sound both reference the same MM Time, and the audio packets are responsible for updating it; according to the MM Time of each image, the program decides whether to present the picture immediately, present it later, or discard it once it has expired.
Many multimedia video applications transmitted over a network (such as video conferencing, network video telephony, and remote-desktop audio-video playback) compress the image portion, for example with H.264, to reduce bandwidth consumption. Because of transmission constraints, the compression end (the encoder, which transmits the bitstream) must first hold the image data (slice data) in a buffer and therefore cannot output it to the decoding end (the decoder, which receives the bitstream) immediately over the network; the decoding end, in turn, must wait until a picture (frame) is no longer referenced or used in decoding before outputting it from its buffer. This causes the picture and the sound to fall out of synchronization.
If the audio packets are still used to update MM Time in this situation, there is a high probability that such a picture will be considered expired and dropped (frame drop); when frame dropping is severe, the audio-video playback comes to resemble a slide show.
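The conventional scheme described above can be sketched as follows. This is a minimal illustration, not code from the patent: audio packets drive a shared MM Time clock, and each video frame is shown, delayed, or dropped by comparing its own timestamp against that clock. The function name and the `tolerance_ms` threshold are assumed for illustration.

```python
def decide_frame(frame_mm_time, clock_mm_time, tolerance_ms=40):
    """Return 'show', 'wait', or 'drop' for one video frame, given the
    frame's own MM Time and the audio-driven MM Time clock."""
    delta = frame_mm_time - clock_mm_time
    if delta > tolerance_ms:
        return "wait"   # frame is early: present it later
    if delta < -tolerance_ms:
        return "drop"   # frame has expired: discard it (frame drop)
    return "show"       # close enough to the audio clock: present now
```

When the decoder's buffering delays video frames well behind the audio clock, almost every frame lands in the `drop` branch, which is the failure mode the background describes.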
Summary of the invention
In view of the above, it is necessary to provide an audio-video synchronization system and method that uses a queue as a buffer for the audio packets and synchronizes the multimedia time (MM Time) of the video packets and audio packets, so as to achieve audio-video synchronization.
An audio-video synchronization system, applied to an electronic device, comprises: a video decoding module for decoding received video packets and storing the decoded video data to a first buffer, and for reading decoded video data from the first buffer in sequence; a video output module for outputting the read video data to a display device of the electronic device when the time stamp of the video packet meets a preset requirement; an audio decoding module for decoding received audio packets while video decoding is carried out, storing the decoded audio data to a second buffer, and passing the time stamp of the audio packet to the video packet; a generation module for reading decoded audio data from the second buffer, moving it to a specified queue, and generating a consumption module at every preset interval; a consumption module for reading decoded audio data from the specified queue and sending it to a third buffer; and an audio output module for reading decoded audio data from the third buffer in sequence and outputting the read audio data on the display device.
An audio-video synchronization method, applied to an electronic device, comprises: a video decoding step 1 of decoding received video packets and storing the decoded video data to a first buffer; a video decoding step 2 of reading decoded video data from the first buffer in sequence; a video output step of outputting the read video data to a display device of the electronic device when the time stamp of the video packet meets a preset requirement; an audio decoding step of decoding received audio packets while video decoding is carried out, storing the decoded audio data to a second buffer, and passing the time stamp of the audio packet to the video packet; a generation step of reading decoded audio data from the second buffer, moving it to a specified queue, and generating a consumption step at every preset interval; a consumption step of reading decoded audio data from the specified queue and sending it to a third buffer; and an audio output step of reading decoded audio data from the third buffer in sequence and outputting the read audio data on the display device.
Compared with the prior art, the described audio-video synchronization system and method use a queue as a buffer for the audio packets and synchronize the multimedia time (MM Time) of the video packets and audio packets to achieve audio-video synchronization, without modifying the program code of the server end (i.e., the compression end).
Brief description of the drawings
Fig. 1 is a schematic diagram of the operating environment of the audio-video synchronization system of the present invention.
Fig. 2 is a functional block diagram of the audio-video synchronization system of the present invention.
Fig. 3 is a flowchart of the audio-video synchronization method of the present invention.
Fig. 4 is a schematic diagram depicting Fig. 3 in another way.
Description of main element symbols
Electronic device | 2 |
Display device | 20 |
Input device | 22 |
Memory | 23 |
Audio-video synchronization system | 24 |
Processor | 25 |
Video decoding module | 240 |
Audio decoding module | 241 |
Generation module | 242 |
Consumption module | 243 |
Video output module | 244 |
Audio output module | 245 |
Embodiment
As shown in Fig. 1, which is a schematic diagram of the operating environment of the audio-video synchronization system of the present invention, the audio-video synchronization system 24 runs in an electronic device 2. The electronic device 2 further comprises an input device 22, a memory 23, and a processor 25 connected by a data bus. The electronic device 2 may be a computer, a mobile phone, a PDA (Personal Digital Assistant), or the like.
The memory 23 stores data such as the program code of the audio-video synchronization system 24 and image data. The input device 22 is used for entering various data set by the user and may be, for example, a keyboard or a mouse. In a specific embodiment, the electronic device 2 may comprise a display device 20 connected to the data bus; the display device 20 displays data such as the image data and may be, for example, the liquid-crystal screen of a computer or the touch screen of a mobile phone.
In the present embodiment, the audio-video synchronization system 24 can be divided into one or more modules, which are stored in the memory 23 and configured to be executed by one or more processors (in this embodiment, one processor 25) to carry out the present invention. For example, as shown in Fig. 2, the audio-video synchronization system 24 is divided into a video decoding module 240, an audio decoding module 241, a generation module 242, a consumption module 243, a video output module 244, and an audio output module 245. A module in the sense of the present invention is a program segment that performs a specific function, and is better suited than a whole program for describing how software executes in the electronic device 2. The concrete function of each module is described below with reference to Fig. 3 and Fig. 4.
Fig. 3 is a flowchart of the audio-video synchronization method of the present invention.
In the following description, video decoding steps S10-S13 are carried out in parallel with audio decoding steps S20-S23. When a user plays a film or uses audio-video software on a virtual machine, the server end establishes a video stream channel and an audio stream channel with the client (e.g., the electronic device 2), which are used for transmitting video packets (image packets) and audio packets (sound packets). The electronic device 2 continuously receives video packets and audio packets via these two channels.
In step S10, the video decoding module 240 receives video packets from the server end through the video stream channel.
In step S11, the video decoding module 240 decodes the video packet and stores the decoded video data (bit data, i.e., raw data) to a first buffer, e.g., the Frame buffer in Fig. 4. In the present embodiment, the video decoding module 240 decodes the video packet with the decoding algorithm corresponding to its encoding algorithm. For example, if the video packet is encoded with H.264, the video decoding module 240 decodes it with an H.264 decoder.
Other embodiments may further include: the video decoding module 240 performs color gamut conversion on the decoded video data according to the operating system type of the electronic device 2. For example, suppose the operating system of the client (e.g., the electronic device 2) is Windows. Windows displays in the RGBA gamut (or RGB32, the RGB series), whereas the image coding (e.g., H.264) at the server end uses a YUV gamut (e.g., YUV420, YUV440, YUV444), so the pictures (frames) the video decoding module 240 initially decodes are in the YUV gamut. The video decoding module 240 then converts the decoded video data to the RGB gamut, so that the decoded video data can be displayed in the best way at the client.
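The YUV-to-RGB step can be illustrated with the commonly used BT.601-style conversion below. This is a sketch under assumptions: the patent gives no coefficients, so the standard approximate ones are used, and a single pixel is converted rather than a full YUV420 plane; the function name is invented for illustration.

```python
def yuv_to_rgb(y, u, v):
    """Convert one 8-bit YUV pixel to RGB using approximate BT.601
    coefficients, clamping each channel to the 0-255 range."""
    r = y + 1.402 * (v - 128)
    g = y - 0.344 * (u - 128) - 0.714 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda x: max(0, min(255, int(round(x))))
    return clamp(r), clamp(g), clamp(b)
```

A real implementation would run this (or a SIMD/GPU equivalent) over every pixel of each decoded frame before handing it to the Windows display path.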
In step S12, the video decoding module 240 reads decoded video data from the first buffer in sequence, for example one image frame at a time.
In step S13, the video decoding module 240 determines whether the time stamp of this video packet meets a preset requirement. In the present embodiment the time stamp is described using the multimedia time (MM Time), and the MM Time of the video packet is obtained from the audio packets.
If the MM Time of this video packet is consistent with (e.g., equal to) the current time of the electronic device 2, the video decoding module 240 determines that the time stamp of this video packet meets the preset requirement, and in the video output step the video output module 244 outputs the read video data to the display device 20. The current time of the electronic device is the current time recorded by its operating system.
If the MM Time of this video packet is not consistent with the current time of the electronic device 2, the video decoding module 240 determines that the time stamp does not meet the preset requirement; the flow returns to step S12 and the video decoding module 240 reads the next image frame.
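Steps S12-S13 can be sketched as a small loop over the first buffer. This is an illustrative simplification of the text above, not the patent's code: frames whose MM Time does not match the device's current time are simply skipped, standing in for the "return to step S12 and read the next frame" flow, and all names are invented for illustration.

```python
import collections

def drain_frame_buffer(frame_buffer, current_time):
    """Pop (mm_time, frame) pairs from the first buffer in sequence until
    one whose MM Time matches the current time is found; mismatched frames
    are skipped, mirroring the return to step S12."""
    while frame_buffer:
        mm_time, frame = frame_buffer.popleft()
        if mm_time == current_time:  # preset requirement: timestamps agree
            return frame             # hand this frame to the video output module
    return None                      # nothing displayable yet
```

Because the MM Time attached to each frame comes from the audio packets, this loop is what ties video presentation to the audio clock.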
In step S20, while the video decoding module 240 is receiving and decoding video packets, the audio decoding module 241 receives audio packets from the server end through the audio stream channel.
In step S21, the audio decoding module 241 decodes the audio packet and stores the decoded audio data (bit data, i.e., raw data) to a second buffer, e.g., the PCM (Pulse Code Modulation) buffer in Fig. 4. Meanwhile, the audio decoding module 241 passes the time stamp of the audio packet (e.g., its MM Time) to the video packet, and the video packet is synchronized with reference to the MM Time of the audio packet (see step S13).
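The timestamp handoff in step S21 — the audio side publishing MM Time for the video side to reference — can be sketched as a small thread-safe clock, since the two decoding paths run in parallel. The class and names are illustrative assumptions, not from the patent.

```python
import threading

class MMClock:
    """Shared multimedia clock: the audio decoding step publishes each
    audio packet's MM Time; the video decoding step reads it when
    checking a frame's time stamp in step S13."""
    def __init__(self):
        self._lock = threading.Lock()
        self._mm_time = 0

    def update(self, mm_time):
        """Called by the audio decoding step for each decoded packet."""
        with self._lock:
            self._mm_time = mm_time

    def read(self):
        """Called by the video decoding step when comparing timestamps."""
        with self._lock:
            return self._mm_time

clock = MMClock()
clock.update(5000)               # audio side publishes MM Time 5000
video_side_mm_time = clock.read()  # video side references the same value
```

The lock is a conservative choice; a single integer store is atomic in CPython, but the lock makes the intent explicit for ports to other languages.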
In the present embodiment, the audio decoding module 241 decodes the audio packet with the decoding algorithm corresponding to its encoding algorithm. For example, if the audio packet is encoded with PCM, the audio decoding module 241 decodes it with a PCM decoder.
In step S22, the generation module 242 reads decoded audio data from the second buffer and moves it to a specified queue, e.g., the PCM Ring in Fig. 4. In the present embodiment, the generation module 242 is a thread, for example a Producer thread.
In step S23, the generation module 242 generates a consumption module 243 at every preset interval. The consumption module 243 then reads decoded audio data from the specified queue and sends it to a third buffer, e.g., the Wave Ring in Fig. 4. In the present embodiment, the consumption module 243 is a thread, for example a Consumer thread, and it terminates on its own after sending the decoded audio data to the third buffer.
In the present embodiment, the preset interval is the time difference between decoding the first audio data and decoding the first video data. That is, in the present invention, decoded audio data is not sent to the third buffer and output immediately; it is first deposited in a specified queue, and only after the video packets have yielded the first decoded picture does the generation module 242 begin generating consumption modules 243 to consume the data in the queue, so that sound and picture become synchronized.
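The producer/consumer arrangement of steps S22-S23 can be sketched with Python's thread-safe `queue.Queue` standing in for the PCM Ring and a plain list for the Wave Ring. This is an assumption-laden sketch, not the patent's implementation: in the patent, the producer waits the preset interval (the gap between the first audio and the first video decode) before spawning each short-lived consumer; here that delay is only noted in a comment, and a `None` sentinel (an invented convention) marks the end of the audio data.

```python
import queue
import threading

def producer(pcm_buffer, pcm_ring):
    """Generation module sketch: move decoded audio chunks out of the
    second (PCM) buffer into the specified queue (PCM Ring)."""
    for chunk in pcm_buffer:
        pcm_ring.put(chunk)
    pcm_ring.put(None)  # sentinel: no more audio (illustrative convention)

def consumer(pcm_ring, wave_ring):
    """Consumption module sketch: a short-lived thread that drains the
    queue into the third buffer (Wave Ring), then ends on its own."""
    while True:
        chunk = pcm_ring.get()
        if chunk is None:
            break
        wave_ring.append(chunk)

pcm_ring, wave_ring = queue.Queue(), []
producer([b"a", b"b", b"c"], pcm_ring)
# In the patent's scheme the producer would wait the preset interval
# (first-audio-decode minus first-video-decode) before spawning this thread.
t = threading.Thread(target=consumer, args=(pcm_ring, wave_ring))
t.start()
t.join()
```

Delaying the consumer rather than the decoder is the key design choice: audio keeps decoding at full speed, and only its hand-off to the output buffer is held back until the video side has caught up.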
In step S24, the audio output module 245 reads decoded audio data from the third buffer in sequence and outputs the read audio data on the display device 20.
The present invention can be applied to remote desktops, video conferencing, network video telephony, and so on. Taking a remote-desktop application as an example, the following steps may be taken:
(1) Install the client program in the electronic device 2 and connect it to the remote desktop.
(2) On the remote desktop, choose audio-video playback software or an application with multimedia playback functions, where the image portion is H.264-encoded.
(3) The client program plays the video data and the audio file synchronously.
Finally, it should be noted that the above embodiments merely illustrate, and do not limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solution of the present invention can be modified or equivalently substituted without departing from its spirit and scope.
Claims (10)
1. An audio-video synchronization system, applied to an electronic device, characterized in that the system comprises:
a video decoding module for decoding received video packets and storing the decoded video data to a first buffer;
the video decoding module also being for reading decoded video data from the first buffer in sequence;
a video output module for outputting the read video data to a display device of the electronic device when the time stamp of the video packet meets a preset requirement;
an audio decoding module for decoding received audio packets while video decoding is carried out, storing the decoded audio data to a second buffer, and passing the time stamp of the audio packet to the video packet;
a generation module for reading decoded audio data from the second buffer, moving the decoded audio data to a specified queue, and generating a consumption module at every preset interval;
a consumption module for reading decoded audio data from the specified queue and sending the decoded audio data to a third buffer; and
an audio output module for reading decoded audio data from the third buffer in sequence and outputting the read audio data on the display device.
2. The audio-video synchronization system as claimed in claim 1, characterized in that the video packets are received from a server end through a video stream channel, and the audio packets are received from the server end through an audio stream channel.
3. The audio-video synchronization system as claimed in claim 1, characterized in that the video decoding module is also for performing color gamut conversion on the decoded video data according to the operating system type of the electronic device.
4. The audio-video synchronization system as claimed in claim 1, characterized in that the video decoding module is also for determining that the time stamp of a video packet meets the preset requirement if it is consistent with the current time of the electronic device.
5. The audio-video synchronization system as claimed in claim 1, characterized in that the preset interval is the time difference between decoding the first audio data and decoding the first video data.
6. An audio-video synchronization method, applied to an electronic device, characterized in that the method comprises:
a video decoding step 1 of decoding received video packets and storing the decoded video data to a first buffer;
a video decoding step 2 of reading decoded video data from the first buffer in sequence;
a video output step of outputting the read video data to a display device of the electronic device when the time stamp of the video packet meets a preset requirement;
an audio decoding step of decoding received audio packets while video decoding is carried out, storing the decoded audio data to a second buffer, and passing the time stamp of the audio packet to the video packet;
a generation step of reading decoded audio data from the second buffer, moving the decoded audio data to a specified queue, and generating a consumption step at every preset interval;
a consumption step of reading decoded audio data from the specified queue and sending the decoded audio data to a third buffer; and
an audio output step of reading decoded audio data from the third buffer in sequence and outputting the read audio data on the display device.
7. The audio-video synchronization method as claimed in claim 6, characterized in that the video packets are received from a server end through a video stream channel, and the audio packets are received from the server end through an audio stream channel.
8. The audio-video synchronization method as claimed in claim 6, characterized in that video decoding step 1 further comprises:
performing color gamut conversion on the decoded video data according to the operating system type of the electronic device.
9. The audio-video synchronization method as claimed in claim 6, characterized in that video decoding step 2 further comprises:
determining that the time stamp of a video packet meets the preset requirement if it is consistent with the current time of the electronic device.
10. The audio-video synchronization method as claimed in claim 6, characterized in that the preset interval is the time difference between decoding the first audio data and decoding the first video data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310145089.6A CN104125493A (en) | 2013-04-24 | 2013-04-24 | Audio-video synchronization system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104125493A (en) | 2014-10-29
Family
ID=51770736
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310145089.6A Pending CN104125493A (en) | 2013-04-24 | 2013-04-24 | Audio-video synchronization system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104125493A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107959821A (en) * | 2017-11-27 | 2018-04-24 | 安徽威斯贝尔智能科技有限公司 | A kind of meeting visible dialogue system based on cloud |
CN108124183A (en) * | 2016-11-29 | 2018-06-05 | 达升企业股份有限公司 | With it is synchronous obtain it is audio-visual to carry out the method for one-to-many video stream |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030235407A1 (en) * | 2002-06-21 | 2003-12-25 | Koninklijke Philips Electronics N.V. | System and method for queuing and presenting audio messages |
CN1838771A (en) * | 2005-03-22 | 2006-09-27 | 联发科技股份有限公司 | Systems and methods for stream format conversion and digital TV recording device |
CN1901656A (en) * | 2005-07-19 | 2007-01-24 | 日本电气视象技术株式会社 | Video and audio reproducing apparatus and video and audio reproducing method, output time changing apparatus and method |
CN101808202A (en) * | 2009-02-18 | 2010-08-18 | 联想(北京)有限公司 | Method, system and computer for realizing sound-and-caption synchronization in video file |
CN101873498A (en) * | 2010-06-22 | 2010-10-27 | 深圳市融创天下科技发展有限公司 | Video decoding method, video decoding device and video/audio play system |
CN103024517A (en) * | 2012-12-17 | 2013-04-03 | 四川九洲电器集团有限责任公司 | Method for synchronously playing streaming media audios and videos based on parallel processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20141029