CN101483748A - Audio and video synchronization method and apparatus oriented to real-time video call application on 3G circuit switching network - Google Patents

Audio and video synchronization method and apparatus oriented to real-time video call application on 3G circuit switching network

Info

Publication number
CN101483748A
Authority
CN
China
Prior art keywords
audio
state
datagram
buffer
state machine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2008100556835A
Other languages
Chinese (zh)
Inventor
高成伟
严佳
陈航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WUDI YITONG (BEIJING) TECHNOLOGY Co Ltd
Original Assignee
WUDI YITONG (BEIJING) TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WUDI YITONG (BEIJING) TECHNOLOGY Co Ltd
Priority to CNA2008100556835A
Publication of CN101483748A
Legal status: Pending

Landscapes

  • Telephonic Communication Services (AREA)
  • Synchronisation In Digital Transmission Systems (AREA)

Abstract

An audio and video synchronization method and device for real-time video call applications on a 3G circuit-switched network improve the service quality of video calls by using a state machine to monitor the audio-video synchronization state. Because the datagrams carry no timestamp information, the synchronization technique works from the queuing condition of audio datagrams in an audio buffer: by checking the number of datagrams queued in the audio buffer, the invention decides whether to discard datagrams from the buffer in order to restore audio-video synchronization. The number of states in the state machine and the audio buffer capacity of each state can be chosen to balance synchronization accuracy against service quality. The technique is simple to apply and maintains audio-video synchronization effectively. It can be applied to various portable devices, for instance 3G mobile phones.

Description

An audio and video synchronization method and device for real-time video calls on a 3G circuit-switched network
Technical field
The present invention relates to a method and apparatus for audio-video synchronization that can improve the service quality of video calls on a 3G circuit-switched network.
According to an embodiment of the invention, the audio and video signals are compressed by separate audio and video encoding engines. On a 3G circuit-switched network the bandwidth is guaranteed. However, because audio and video have independent capture and encoding devices and run on an operating system that is not strictly real-time, they will inevitably drift out of synchronization. Once audio-video synchronization is lost, the service quality of a video call drops significantly. Because audio-video synchronization is a key issue in video telephony, a synchronization technique is needed to guarantee service quality.
Video telephony applications on a 3G circuit-switched data network have two characteristics:
- The audio and video datagrams contain no timestamps that could be used for audio-video synchronization.
- The offset between audio and video processing times is relatively constant; during a video call session it varies only within a small range.
The synchronization method of the invention buffers the received audio data and the received video data separately, as shown in Fig. 3. Because the timing of audio sampling and encoding is more accurate than that of video, the invention uses the audio data as the time reference. If the amount of buffered audio data exceeds a predefined threshold and no complete video frame can be found in the video buffer, the audio data in the buffer is discarded to maintain audio-video synchronization.
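The discard rule in the preceding paragraph can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the use of a deque as the audio buffer, and the choice to clear the whole buffer at once are assumptions, not details given in the patent text.

```python
from collections import deque


def should_drop_audio(queued: int, threshold: int, video_frame_ready: bool) -> bool:
    """Return True when buffered audio exceeds the threshold and no
    complete video frame is available (hypothetical helper)."""
    return queued > threshold and not video_frame_ready


def on_audio_datagram(audio_buf: deque, datagram: bytes,
                      threshold: int, video_frame_ready: bool) -> int:
    """Queue one audio datagram and apply the discard rule.

    Returns the number of datagrams discarded (0 if none).
    Clearing the whole buffer is an assumption made for this sketch.
    """
    audio_buf.append(datagram)
    if should_drop_audio(len(audio_buf), threshold, video_frame_ready):
        dropped = len(audio_buf)
        audio_buf.clear()
        return dropped
    return 0
```

With a threshold of 3 and no complete video frame available, the fourth queued datagram triggers the discard; while video frames keep arriving, the audio buffer is left alone.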
In principle, the audio-video synchronization engine provided by the method and device can be used in many different types of electronic equipment, such as mobile phones and PDAs.
Background art
Audio-video synchronization is an important benchmark of video telephony service quality. On packet-switched networks such as the Internet, synchronization is controlled by timestamps in the datagrams, for example Real-time Transport Protocol (RTP) datagrams. The audio and video signals recover their relative timing from the timestamps in their datagram headers. This method cannot be applied to video telephony on circuit-switched networks, however, because datagrams of the H.223 data transport protocol carry no timestamp information.
On a circuit-switched network, audio-video synchronization relies on the dedicated data channel between the two communicating terminals. Before being sent to the network, the audio and video signals are encoded independently and then packed by H.223 into multiplexed datagrams. Therefore, if the audio and video signals are encoded at a fixed frame rate and a fixed bit rate and are transmitted at a fixed bit rate, they should remain synchronized. In practice, neither encoding at a fixed bit rate and frame rate nor transmission at a fixed bit rate can be guaranteed, and this causes audio and video to lose synchronization.
The present invention seeks to improve the service quality of video telephony on 3G circuit-switched networks, namely the ability to recover audio-video synchronization. To be practical on a 3G mobile network, a synchronization method or device should be as simple as possible, because the application runs on a variety of portable terminals such as 3G mobile phones. For real-time video calls on a 3G circuit-switched network, audio-video synchronization is very important because it is a key indicator of service quality. No such method or device exists at present.
Summary of the invention
The first object of the invention is to provide an audio-video synchronization method and device that can maintain audio-video synchronization in video telephony on a 3G circuit-switched network without noticeably degrading audio quality.
The second object of the invention is to provide an audio-video synchronization method and device with low computational complexity.
According to the principles of an embodiment of the invention, in its broadest form, an audio-video synchronization engine is provided that maintains a state machine to monitor the synchronization state and that decides how to move between the states of the state machine and how to carry out synchronization.
An embodiment of the invention uses a state machine to monitor the audio-video synchronization state. The state machine has one parameter, an audio data buffer, which compensates for uncertainty in network transmission and in audio and video processing timing. If the audio buffer is full and no video frame is decodable, the buffer space provided by the current state is insufficient to maintain synchronization. When this happens, the state machine moves to another state whose audio buffer capacity is larger but does not exceed the upper limit, to compensate for the larger time difference between audio and video. If the audio queue keeps growing, the state machine eventually reaches the state with the maximum audio buffer capacity. At that point the earliest-arrived audio datagrams are discarded, the audio buffer keeps the N most recent audio datagrams, and the state machine returns to the initial state. If no buffer overflow occurs for a certain period, the state machine moves to another state whose audio buffer capacity is smaller but not below the lower limit. By adjusting the audio buffer capacity, the invention seeks a balance between audio quality and audio-video synchronization.
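The state machine described above might be sketched as follows. This is one possible reading, not the patent's implementation: the class name, the `quiet_limit` counter used to model "a certain period" without overflow, and the decision to purge when the buffer of the final state fills (rather than immediately on entering that state) are all assumptions made for illustration.

```python
from collections import deque


class AudioSyncStateMachine:
    """Hypothetical sketch: each state is an audio buffer capacity."""

    def __init__(self, capacities, quiet_limit=50):
        # capacities must be ascending; capacities[0] is N, the capacity
        # of the initial state, and capacities[-1] is the upper limit.
        if not capacities or list(capacities) != sorted(capacities):
            raise ValueError("capacities must be a non-empty ascending list")
        self.capacities = list(capacities)
        self.quiet_limit = quiet_limit  # pushes without overflow before shrinking
        self.state = 0
        self.quiet = 0
        self.buffer = deque()

    def push(self, datagram, video_frame_decodable):
        """Queue one audio datagram; return how many datagrams were purged."""
        self.buffer.append(datagram)
        full = len(self.buffer) >= self.capacities[self.state]
        if not (full and not video_frame_decodable):
            # No overflow: after a quiet period, step down one state.
            self.quiet += 1
            if self.quiet >= self.quiet_limit and self.state > 0:
                self.state -= 1
                self.quiet = 0
            return 0
        self.quiet = 0
        if self.state < len(self.capacities) - 1:
            self.state += 1  # move to the next, larger buffer
            return 0
        # Final state overflowed: keep the N newest datagrams and reset.
        purged = 0
        while len(self.buffer) > self.capacities[0]:
            self.buffer.popleft()
            purged += 1
        self.state = 0
        return purged

    def pop(self):
        """Hand the oldest queued datagram to the audio decoder, if any."""
        return self.buffer.popleft() if self.buffer else None
```

For example, with capacities `[2, 4, 6]` and no decodable video frame, six consecutive pushes walk through all three states, and the sixth push purges the four oldest datagrams and returns the machine to the initial state.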
The effectiveness of the synchronization technique is tuned by choosing the number of states in the state machine. The more states the state machine has, the more precisely the technique can synchronize audio and video, but the more complex its implementation becomes. The invention lets the user choose the number of states for a given system, to balance the effectiveness of the technique against the use of system resources.
Description of drawings
Fig. 1 is a diagram of the audio-video synchronization state machine;
Fig. 2 is a flow chart of the state transitions of the state machine;
Fig. 3 is a schematic diagram of the audio-video synchronization technique.
Embodiment
As shown in Fig. 1, the invention is implemented by a state machine that manages multiple states. Each state is defined by an audio buffer of a specific size. The number of states and the audio buffer size associated with each state can be designed and implemented with system resources and application requirements in mind.
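As a rough illustration of how a designer might derive the per-state buffer sizes, the sketch below spaces the capacities evenly between an initial and a maximum value. The patent does not prescribe any spacing; the even step, the function name `build_capacities` and its parameters are assumptions made for this example.

```python
def build_capacities(initial: int, maximum: int, num_states: int) -> list:
    """Return ascending audio buffer capacities, one per state.

    initial    -- capacity (in datagrams) of the initial state
    maximum    -- capacity of the final state (the upper limit)
    num_states -- number of states, at least 2

    Even spacing with integer arithmetic is an assumption of this sketch.
    """
    if num_states < 2:
        raise ValueError("need at least an initial and a final state")
    if not 1 <= initial < maximum:
        raise ValueError("require 1 <= initial < maximum")
    span = maximum - initial
    return [initial + (i * span) // (num_states - 1) for i in range(num_states)]
```

Calling `build_capacities(3, 15, 4)` gives four states with capacities 3, 7, 11 and 15 datagrams; with only two states the table collapses to the initial and final capacities, which is simpler but coarser.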
The audio-video synchronization engine 100 of the invention consists of multiple states and their corresponding audio buffers. For video telephony on a 3G circuit-switched network, the two terminals exchange the time difference between audio and video through an H.245 indication message, SkewIndication. On receiving this message, a terminal learns the time difference between the audio and video sent by the other terminal, and this information can be used to choose the initial state. For example, if the time offset between audio and video is N*20 ms, the receiving terminal can set the audio buffer capacity to N audio datagrams and make that the initial state. N audio datagrams correspond to N*20 ms of audio, because each audio datagram carries 20 ms of audio data. As the video call proceeds, the synchronization state machine monitors the state of the audio buffer, as shown in Fig. 2. If the amount of queued audio data increases, the state machine moves to the next state, which has a larger audio buffer capacity. If the audio queue keeps growing, the state machine eventually reaches the state with the maximum audio buffer capacity. At that point the earliest-arrived audio datagrams are discarded, the audio buffer keeps the N most recent audio datagrams, and the state machine returns to the initial state. If the audio queue shrinks, the state machine moves to the previous state, which has a smaller audio buffer capacity, until it reaches the initial state. In this way, audio-video synchronization is achieved by discarding some audio data.
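The initial-state calculation above (a skew of N*20 ms maps to a buffer of N audio datagrams) can be written as a small helper. The sketch assumes the skew value has already been extracted from the H.245 message; the function name, rounding up for skews that are not a multiple of 20 ms, and the minimum capacity of one datagram are assumptions, not details from the patent.

```python
def initial_capacity_from_skew(skew_ms: int, datagram_ms: int = 20) -> int:
    """Return N, the audio buffer capacity of the initial state.

    skew_ms     -- audio/video time difference reported by the far end
    datagram_ms -- audio carried per datagram (20 ms in the text above)
    """
    if skew_ms < 0 or datagram_ms <= 0:
        raise ValueError("skew must be >= 0 and datagram duration > 0")
    # Ceiling division: round partial datagrams up (assumption).
    n = -(-skew_ms // datagram_ms)
    # Keep at least one datagram of buffering (assumption).
    return max(1, n)
```

A reported skew of 100 ms therefore yields an initial capacity of 5 datagrams, matching the N*20 ms example in the text.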
Discarding audio data degrades audio quality, but different ways of discarding it affect quality differently. If the audio datagram buffer capacity of the final state, that is, the purge state, is set very large, part of the audio content is lost at each purge, but this happens only occasionally. If the buffer capacity of the final state is set very small, non-smooth audio will be heard frequently. The buffer capacity of the final purge state should be chosen according to system requirements.
Conventional audio-video synchronization techniques control synchronization with timestamps in the datagrams of the audio and video media types. They target packet-switched networks such as the Internet. For data applications on a 3G circuit-switched network, such as video telephony, timestamps are not available in the datagrams, so conventional techniques cannot be applied. The invention provides a technique for maintaining audio-video synchronization on a 3G circuit-switched network.
Those skilled in the art will recognize that the primary purpose of the synchronization technique is to maintain audio-video synchronization within the 3G circuit-switched network infrastructure without noticeably degrading voice quality.
Because the synchronization embodiment of the invention needs no special hardware support and can be implemented in software alone (although a dedicated hardware implementation is not excluded), the technique can easily be applied to any portable terminal, such as a 3G mobile phone. The invention can also be applied to other network architectures, such as packet-switched networks.
The technique of the invention has been described above in detail so that those skilled in the art can understand and use it. It should be noted, however, that changes and improvements can be made without departing from the essence of the invention, and that the invention is not limited by the above description or the accompanying drawings but is defined by the claims.

Claims (6)

1. A method of maintaining audio-video synchronization for video calls on a 3G circuit-switched network, comprising the following steps:
a. designing and using a state machine, and setting the current state to the initial state;
b. monitoring the number of audio datagrams in the audio buffer in the current state;
c. if the number of audio datagrams in the audio buffer increases, moving the state machine to the next state, whose audio buffer has a larger datagram capacity;
d. if the state machine moves to the final state, which has the largest audio buffer, discarding audio datagrams from the audio buffer and returning to the initial state;
e. if the number of audio datagrams in the audio buffer decreases, moving the state machine to the previous state, whose audio buffer has a smaller datagram capacity, until it reaches the initial state.
2. The method of claim 1, wherein step a comprises a plurality of steps of determining the number of states in the state machine, the buffer capacity of each state, and the step size by which the buffer capacity increases or decreases between adjacent states.
3. The method of claim 2, wherein the number of states in the state machine and the buffer capacity of each state are determined so as to balance the audio-video synchronization effect against the reduction in audio quality.
4. A device for maintaining audio-video synchronization for video calls on a 3G circuit-switched network, comprising:
a. a unit that designs and uses a state machine and sets the current state to the initial state;
b. a unit that monitors the number of audio datagrams in the audio buffer in the current state;
c. a unit that, if the number of audio datagrams in the audio buffer increases, moves the state machine to the next state, whose audio buffer has a larger datagram capacity;
d. a unit that, if the state machine moves to the final state, which has the largest audio buffer, discards audio datagrams from the audio buffer and returns to the initial state;
e. a unit that, if the number of audio datagrams in the audio buffer decreases, moves the state machine to the previous state, whose audio buffer has a smaller datagram capacity, until it reaches the initial state.
5. The method of claim 1, wherein unit a comprises a plurality of units for determining the number of states in the state machine, the buffer capacity of each state, and the step size by which the buffer capacity increases or decreases between adjacent states.
6. The device of claim 4, wherein the number of states in the state machine and the buffer capacity of each state are determined so as to balance the audio-video synchronization effect against the reduction in audio quality.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2008100556835A CN101483748A (en) 2008-01-07 2008-01-07 Audio and video synchronization method and apparatus oriented to real-time video call application on 3G circuit switching network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA2008100556835A CN101483748A (en) 2008-01-07 2008-01-07 Audio and video synchronization method and apparatus oriented to real-time video call application on 3G circuit switching network

Publications (1)

Publication Number Publication Date
CN101483748A true CN101483748A (en) 2009-07-15

Family

ID=40880658

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2008100556835A Pending CN101483748A (en) 2008-01-07 2008-01-07 Audio and video synchronization method and apparatus oriented to real-time video call application on 3G circuit switching network

Country Status (1)

Country Link
CN (1) CN101483748A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101742548B (en) * 2009-12-22 2012-08-29 武汉虹信通信技术有限责任公司 H.324M protocol-based 3G video telephone audio and video synchronization device and method thereof
CN102075818B (en) * 2009-11-20 2013-04-03 上海艾麒信息科技有限公司 Method for processing intelligent personal television (IPTV) network television data of global system for mobile communication (GSM) mobile phone
CN112511885A (en) * 2020-11-20 2021-03-16 深圳乐播科技有限公司 Audio and video synchronization method and device and storage medium
CN112888062A (en) * 2021-03-16 2021-06-01 芯原微电子(成都)有限公司 Data synchronization method and device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20090715