KR19980027646A

KR19980027646A - Video and Audio Synchronization Method Using Timestamp Compensation and MPEG-2 Encoder Device Using It

Info

Publication number: KR19980027646A
Application number: KR1019960046485A
Authority: KR
Inventors: 김재동; 김시중; 고종석; 권순홍
Original assignee: 이준; 한국전기통신공사
Priority date: 1996-10-17
Filing date: 1996-10-17
Publication date: 1998-07-15
Also published as: KR100233937B1

Abstract

The present invention relates to a video and audio synchronization method using timestamp compensation and to an MPEG-2 encoder device using it. More specifically, the difference between the video and audio decoding timestamps in an MPEG-2 encoder is compensated in the audio timestamps so that video and audio are synchronized at the decoder.

The invention aims to provide a method, and an encoder device using it, in which the encoder inserts timestamps that account for coding and decoding delay factors, so that the video and audio media are synchronized with no special handling at the decoder.

To achieve this object, the invention provides a video and audio synchronization method using timestamp compensation, and an MPEG-2 encoder device using it, characterized in that the difference between the video and audio decoding timestamps of the MPEG-2 encoder is compensated in the audio so that video and audio are synchronized at the decoder.

Description

Video and Audio Synchronization Method Using Timestamp Compensation and MPEG-2 Encoder Device Using It

FIG. 1 shows the configuration of a general video and audio codec system and the delay model of each component.

FIG. 2 shows an MPEG-2 encoder device using the timestamp compensation method according to the present invention.

* Description of reference numerals for the main parts of the drawings *

11: video encoder
12: audio encoder
13: multiplexing unit
14: demultiplexing unit
15: video decoder
16: audio decoder
21: video encoder
22: audio encoder
23: video PES packetizer
24: audio PES packetizer
25, 26: receive buffers
27: multiplexer
28: timestamp compensator

The quality of a time-dependent multimedia service depends on the continuity of the media objects that make up the service and on the synchronization between those media objects. This applies equally to on-demand services and isochronous services. Continuity of a media object enables stable playback of each individual medium, which means that the period at which the media object is presented is constant. Synchronization between time-dependent media objects is made possible by establishing their temporal relationship. MPEG-2 data produced by an encoder system in which this temporal relationship has not been established is difficult to synchronize at the decoder. In such cases a workaround is used in which the decoding time or playback time of the medium with the smaller delay is adjusted artificially. This workaround is unsuitable for real-time services, however, and the best approach is for the encoder to insert accurate timestamps.

Accordingly, to solve the above problem, the present invention seeks to provide a method, and an encoder device using it, in which the encoder inserts timestamps that account for coding and decoding delay factors, so that the video and audio media are synchronized without any special handling at the decoder.

The above and other objects, features, and advantages will become clearer from the following detailed description taken together with the accompanying drawings. Embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 shows the configuration of a general video and audio codec system and the delay model of each component. The MPEG-2 encoder consists of a video encoder (11), an audio encoder (12), and a multiplexing unit (13). The MPEG-2 decoder consists of a demultiplexing unit (14), a video decoder (15), and an audio decoder (16). Each component of this model introduces its own coding, decoding, multiplexing, or demultiplexing delay. Let the video and audio encoder delays be d_ve and d_ae, and the video and audio decoder delays be d_vd and d_ad, respectively. The delays of the multiplexing unit (13) and the demultiplexing unit (14) are defined as d_mux and d_demux, respectively. The transport stream generated by the encoder is transmitted over a communication network, and the delay introduced by the network is called d_net. These delay parameters are described below.

d_ve is the sum of two intervals: the preprocessing time needed to convert the analog video signal from the camera into a digital signal, generate the color signal components that the video encoder (11) codes, and store the data in frame memory; and the time from the preprocessed frame data until the first compressed video data is produced.

Similarly, d_ae is the time from when the analog audio signal from the microphone is converted into a digital signal and stored in memory until the first coded data is produced. The corresponding delays at the decoder are d_vd and d_ad, respectively. d_mux is the time taken for the compressed video and audio data to be multiplexed and transmitted, and d_demux is the time taken for the data sent from the multiplexing unit (13) to be demultiplexed and delivered to the video and audio decoders (15, 16).

Unlike audio, video has one further delay element of its own: the delay due to the virtual buffer. The virtual buffer is used to convert the variable-bit-rate output of the video encoder into a constant-bit-rate output. A fixed amount of data is stored in this buffer before output begins, which ultimately prevents buffer underflow or overflow at the decoder. This delay element is defined as d_{vbv_delay}.

These delay elements share a common time base: the value of a 33-bit, 90 kHz counter derived from a 27 MHz local oscillator. This counter is called the System Time Clock counter (hereinafter STC). The STC is the time reference in the system decoder.
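As a rough illustration of this time base, the following Python sketch derives a 33-bit, 90 kHz counter value from 27 MHz oscillator cycles. The divide-by-300 ratio follows from the two frequencies given in the text; the function names and the wraparound-tolerant difference helper are illustrative assumptions, not part of the patent.

```python
# Sketch of a 33-bit, 90 kHz STC derived from a 27 MHz oscillator.
# Frequencies and counter width come from the text; names are hypothetical.
OSC_HZ = 27_000_000
STC_HZ = 90_000
STC_BITS = 33
STC_MODULUS = 1 << STC_BITS
DIVIDER = OSC_HZ // STC_HZ  # oscillator cycles per STC tick


def stc_from_oscillator_cycles(cycles: int) -> int:
    """Return the STC value after `cycles` cycles of the 27 MHz oscillator."""
    return (cycles // DIVIDER) % STC_MODULUS


def stc_diff(later: int, earlier: int) -> int:
    """Elapsed ticks between two STC readings, tolerating one wraparound."""
    return (later - earlier) % STC_MODULUS


def ticks_to_seconds(ticks: int) -> float:
    """Convert a tick count at 90 kHz into seconds."""
    return ticks / STC_HZ
```

With these definitions, one second of oscillator cycles corresponds to 90,000 STC ticks, and the counter wraps after 2^33 ticks, which is roughly 26.5 hours. Delay differences such as d_ve should therefore be computed modulo 2^33, as `stc_diff` does.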

The Decoding Time Stamp (hereinafter DTS) and the Presentation Time Stamp (hereinafter PTS) for video and audio are inserted by the encoder. The DTS is information for managing the decoding time, and the PTS is a timestamp carrying information for managing the time of playback output.

These values are inserted per fixed unit, called an Access Unit (hereinafter AU); a frame is one example. The insertion unit for video can therefore be one frame picture, and for audio it is called an audio frame.

The DTS for the k-th AU input from the camera is given by the following equation.

DTS_video(k) = E_video(k) + d_ve(k) + d_{vbv_delay}(k) + d_net,  k ≥ 0    (1)

In this equation, E_video(k) is the STC value at the moment the k-th picture is input. Equation 1 means that an accurate DTS requires accounting for the delay from the input of the k-th picture, through preprocessing, until the first video data is produced (d_ve(k)), the virtual buffer delay (d_{vbv_delay}(k)), and the network transmission delay (d_net). Unlike the DTS, whose value varies with the coding mode of the picture and the buffer state, the PTS, which indicates when the picture is presented at the decoder, can be expressed by the following equation.

PTS_video(k) = DTS_video(k) + d_vd(k) + α    (2)

In this equation, d_vd(k) is the time from when the video decoder starts decoding the compressed, transmitted picture until the picture data has been reconstructed, stored in frame memory, and presented. MPEG-2 video is divided into intra (I) pictures, forward-predicted (P) pictures, and bidirectionally interpolated (B) pictures that reference an I (or P) picture and a P picture. Depending on whether B pictures are present, the decoder reorders frames at playback. Equation 2 is the PTS with this taken into account, where α is the difference between decoding time and playback time caused by the frame reordering. When coding in a mode without B pictures, α = 0, and since d_vd(k) is relatively small it can be neglected, giving PTS_video(k) = DTS_video(k).
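The video timestamp relations can be sketched in a few lines of Python. This is a minimal illustration, assuming all delays are already expressed in 90 kHz STC ticks and that the presentation time is modeled as the decoding time plus the decoder delay and the reordering offset α described above. The function names and default arguments are hypothetical.

```python
# Minimal sketch of the video DTS (Equation 1) and PTS relations.
# All quantities are in 90 kHz ticks; names are illustrative only.
STC_MODULUS = 1 << 33  # timestamps wrap like the 33-bit STC


def video_dts(e_video: int, d_ve: int, d_vbv_delay: int, d_net: int = 0) -> int:
    """DTS_video(k) = E_video(k) + d_ve(k) + d_vbv_delay(k) + d_net."""
    return (e_video + d_ve + d_vbv_delay + d_net) % STC_MODULUS


def video_pts(dts: int, d_vd: int = 0, alpha: int = 0) -> int:
    """PTS modeled as DTS + d_vd + alpha (alpha = 0 when no B pictures)."""
    return (dts + d_vd + alpha) % STC_MODULUS
```

For example, `video_dts(1000, 3000, 9000)` gives 13000, and with no B pictures and a negligible decoder delay `video_pts(13000)` returns the same value. Passing `alpha=3600` (one frame period at 25 frames per second, chosen only for illustration) shifts the presentation time to 16600.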

As with video, a DTS and a PTS are defined for audio, as in the following equations.

DTS_audio(n) = E_audio(n) + d_ae(n) + d_net,  n ≥ 0    (3)

PTS_audio(n) = DTS_audio(n) + d_ad(n)
             = E_audio(n) + d_ae(n) + d_ad(n) + d_net    (4)

In Equations 3 and 4, E_audio(n) is the STC value at the moment the n-th audio AU is input, and d_ae(n) is the time from the input of the n-th audio until the first compressed data of that audio AU is output. d_ad(n) is the time taken for the compressed, transmitted audio data to be reconstructed and output. The audio PTS and DTS are closely tied to the sampling frequency used by the audio codec and take different values accordingly. The network delay d_net has the same value as for video.
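To make the dependence on the sampling frequency concrete, the sketch below computes audio timestamps per Equations 3 and 4. It assumes an audio frame of 1152 samples (as in MPEG-1 Layer II); that frame size, the truncation to whole ticks, and all function names are assumptions made for illustration and are not stated in the patent.

```python
# Sketch of audio DTS/PTS (Equations 3 and 4) in 90 kHz ticks.
# SAMPLES_PER_FRAME is an assumption (MPEG-1 Layer II framing).
STC_HZ = 90_000
SAMPLES_PER_FRAME = 1152


def audio_input_time(e_audio0: int, n: int, sample_rate_hz: int) -> int:
    """E_audio(n): STC at which the n-th audio frame starts, truncated to ticks."""
    return e_audio0 + (n * SAMPLES_PER_FRAME * STC_HZ) // sample_rate_hz


def audio_dts(e_audio: int, d_ae: int, d_net: int = 0) -> int:
    """DTS_audio(n) = E_audio(n) + d_ae(n) + d_net."""
    return e_audio + d_ae + d_net


def audio_pts(dts: int, d_ad: int = 0) -> int:
    """PTS_audio(n) = DTS_audio(n) + d_ad(n)."""
    return dts + d_ad
```

At 48 kHz each frame advances E_audio by exactly 2160 ticks, while at 44.1 kHz the per-frame step is not an integer number of ticks, so the cumulative form used in `audio_input_time` avoids accumulating rounding error.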

As seen above, the factors to consider in inter-media synchronization are the coding and decoding delays of each medium. In an MPEG-2 codec system the video coding and decoding delay is generally larger than that of audio, and it varies with the number of B pictures, which use the bidirectional prediction mode. To synchronize the media at the decoder, the coding and decoding delay difference must therefore be compensated by adding a fixed value to the audio DTS and PTS, and once applied, the compensation value must be kept until the service ends.

The media objects can be synchronized using the video and audio DTS analyzed above. Because the audio coding and decoding delay is small relative to that of video, correcting for the initial delay difference is enough to synchronize the media at the decoder. The correction value is the difference between the first video DTS and the first audio DTS, which are as follows.

DTS_video(0) = E_video(0) + d_ve(0) + d_{vbv_delay}(0) + d_net    (5)

DTS_audio(0) = E_audio(0) + d_ae(0) + d_net    (6)

In Equations 5 and 6, the initial input instants of the two media, E_video(0) and E_audio(0), are the same instant and therefore have the same STC value. The compensation value D_compen for the audio DTS that synchronizes the media is thus given by the following equation.

D_compen = DTS_video(0) - DTS_audio(0)
         = d_ve(0) + d_{vbv_delay}(0) - d_ae(0)    (7)
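The cancellation behind Equation 7 can be checked numerically. The sketch below is only an illustration with made-up tick values: it computes the first video and audio DTS from Equations 5 and 6 and confirms that their difference equals d_ve(0) + d_{vbv_delay}(0) - d_ae(0) regardless of the shared start time or the network delay. Function names are hypothetical.

```python
# Numerical check of Equation 7 using Equations 5 and 6.
# Values are in 90 kHz ticks and are purely illustrative.


def dts_video0(e0: int, d_ve0: int, d_vbv_delay0: int, d_net: int) -> int:
    """Equation 5: first video DTS."""
    return e0 + d_ve0 + d_vbv_delay0 + d_net


def dts_audio0(e0: int, d_ae0: int, d_net: int) -> int:
    """Equation 6: first audio DTS (same start instant e0 as video)."""
    return e0 + d_ae0 + d_net


def compensation_value(d_ve0: int, d_vbv_delay0: int, d_ae0: int) -> int:
    """Equation 7: D_compen = d_ve(0) + d_vbv_delay(0) - d_ae(0)."""
    return d_ve0 + d_vbv_delay0 - d_ae0


def difference_matches(e0: int, d_net: int) -> bool:
    """True if DTS_video(0) - DTS_audio(0) equals D_compen for sample delays."""
    d_ve0, d_vbv0, d_ae0 = 3000, 9000, 500
    diff = dts_video0(e0, d_ve0, d_vbv0, d_net) - dts_audio0(e0, d_ae0, d_net)
    return diff == compensation_value(d_ve0, d_vbv0, d_ae0)
```

Because E(0) and d_net appear in both DTS expressions, they drop out of the difference, which is why the encoder can compute D_compen without knowing the network delay.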

Therefore, when audio is served together with video, the audio timestamps take the following compensated values, written here as DTS'_audio and PTS'_audio, so that the media are synchronized.

DTS'_audio(n) = DTS_audio(n) + D_compen
              = E_audio(n) + d_ae(n) + d_net + D_compen
              = E_audio(n) + d_ae(n) - d_ae(0) + d_ve(0) + d_{vbv_delay}(0) + d_net,  n ≥ 0    (8)

PTS'_audio(n) = DTS'_audio(n) + d_ad(n)    (9)
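A short sketch of how the compensated audio timestamps might be computed follows. It assumes that the compensated DTS is the uncompensated audio DTS plus D_compen and that the compensated PTS adds the audio decoder delay on top, with all values in 90 kHz ticks wrapped to 33 bits. The function name and argument order are illustrative assumptions.

```python
# Sketch of compensated audio timestamps: audio DTS shifted by D_compen,
# and PTS derived from the shifted DTS. Names are hypothetical.
STC_MODULUS = 1 << 33


def compensated_audio_timestamps(e_audio_n, d_ae_n, d_compen, d_ad_n=0, d_net=0):
    """Return (DTS', PTS') for the n-th audio AU, wrapped to 33 bits."""
    dts = (e_audio_n + d_ae_n + d_net + d_compen) % STC_MODULUS
    pts = (dts + d_ad_n) % STC_MODULUS
    return dts, pts


def first_video_dts(e0, d_ve0, d_vbv_delay0, d_net=0):
    """First video DTS, used here only to check alignment at n = 0."""
    return (e0 + d_ve0 + d_vbv_delay0 + d_net) % STC_MODULUS
```

With E(0) = 1000, d_ve(0) = 3000, d_{vbv_delay}(0) = 9000, and d_ae(0) = 500, the compensation value is 11500 and the compensated audio DTS for n = 0 is 13000, which equals the first video DTS. That equality is the alignment the compensation is meant to produce.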

The quantitative analysis above yields the audio timestamp correction needed to synchronize video and audio fully. Some delay elements cannot be known to the encoder unless the information is supplied to it: the decoder's decoding delays d_vd(0) and d_ad(0), and the network delay d_net. These elements are small enough to be neglected in comparison with the total delay.

FIG. 2 shows the configuration of an MPEG-2 encoder device that uses timestamp compensation with the delay elements d_vd(0), d_ad(0), and d_net excluded. A conventional encoder device consists of a video encoder (21); an audio encoder (22); a video PES packetizer (23) that packetizes the coded video signal and outputs it to a transport stream multiplexer (27); an audio PES packetizer (24) that packetizes the coded audio signal and outputs it to the transport stream multiplexer (27); a receive buffer (25) that receives video data from the video encoder (21) and passes it to the video PES packetizer (23); and a receive buffer (26) that receives audio data from the audio encoder (22) and passes it to the audio PES packetizer (24).

The present invention adds to this encoder device a timestamp compensator (28), which determines the compensation value D_compen and produces the compensated audio DTS and PTS values.

The video encoder (21) signals the time at which a frame picture is input from the camera to the 90 kHz counter and the timestamp compensator (28) by means of an encoding-time indication bit, and the compensator (28) uses this signal to store the value of the 90 kHz counter. This value is E_video(0). A/D conversion of the audio data starts at the same moment (E_audio(0)). When the video AU indication bit, which marks the time at which the first byte of the video AU enters the receive buffer (25), reaches the compensator, d_ve(0) is computed as the time elapsed since the stored value; d_ae(0) is computed for audio in the same way. d_{vbv_delay}(0) is the time required for the video data, beginning with the first picture data header, to accumulate to the amount defined for the virtual buffer, and the time at which that amount is reached may be supplied to the compensator. This procedure yields the compensation value D_compen of Equation 7, from which the compensated audio DTS and PTS are obtained, so that the decoder can play back video and audio in synchronization.
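The event sequence described above can be modeled as a small state machine. The following Python class is a hypothetical sketch of the timestamp compensator (28), not the patented circuit: the method names are invented, each indication bit is modeled as a call carrying the current 90 kHz counter value, and d_{vbv_delay}(0) is assumed to be measured from the arrival of the first video AU byte.

```python
# Hypothetical software model of the timestamp compensator (28).
# Each "indication bit" is modeled as a method call with the current STC.
from typing import Optional, Tuple

STC_MODULUS = 1 << 33  # 33-bit, 90 kHz counter


def elapsed(later: int, earlier: int) -> int:
    """Ticks from `earlier` to `later`, tolerating one counter wraparound."""
    return (later - earlier) % STC_MODULUS


class TimestampCompensator:
    def __init__(self) -> None:
        self.e0: Optional[int] = None            # E_video(0) = E_audio(0)
        self.video_au_stc: Optional[int] = None  # first video AU byte in buffer 25
        self.d_ve0: Optional[int] = None
        self.d_ae0: Optional[int] = None
        self.d_vbv0: Optional[int] = None

    def on_encoding_start(self, stc: int) -> None:
        """Encoding-time indication: latch the counter as E(0)."""
        self.e0 = stc

    def on_video_au(self, stc: int) -> None:
        """Video AU indication: only the first occurrence defines d_ve(0)."""
        if self.d_ve0 is None:
            self.video_au_stc = stc
            self.d_ve0 = elapsed(stc, self.e0)

    def on_audio_au(self, stc: int) -> None:
        """Audio AU indication: only the first occurrence defines d_ae(0)."""
        if self.d_ae0 is None:
            self.d_ae0 = elapsed(stc, self.e0)

    def on_vbv_level_reached(self, stc: int) -> None:
        """Assumption: d_vbv_delay(0) counts from the first video AU byte."""
        if self.d_vbv0 is None:
            self.d_vbv0 = elapsed(stc, self.video_au_stc)

    @property
    def d_compen(self) -> int:
        """Equation 7: d_ve(0) + d_vbv_delay(0) - d_ae(0)."""
        return self.d_ve0 + self.d_vbv0 - self.d_ae0

    def compensate_audio(self, dts: int, pts: int) -> Tuple[int, int]:
        """Shift uncompensated audio DTS and PTS by D_compen."""
        c = self.d_compen
        return (dts + c) % STC_MODULUS, (pts + c) % STC_MODULUS
```

Feeding the model the events `on_encoding_start(1000)`, `on_audio_au(1500)`, `on_video_au(4000)`, and `on_vbv_level_reached(13000)` gives d_ve(0) = 3000, d_ae(0) = 500, d_{vbv_delay}(0) = 9000, and D_compen = 11500. The first audio AU, whose uncompensated DTS is 1500, is then stamped 13000, matching the first video DTS.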

As described above, the present invention accurately determines the video and audio DTS, which indicate when each AU must be decoded, and compensates the video-audio DTS difference at the encoder so that video and audio are synchronized at the decoder. It is expected to be widely used in devices that employ MPEG-2 encoders and decoders, such as HDTV, video-on-demand services, and DTV.

The preferred embodiments of the present invention are disclosed for illustrative purposes. Those skilled in the art can make various modifications, changes, and additions within the spirit and scope of the invention, and such modifications and changes should be regarded as falling within the scope of the following claims.

Claims

In a method for synchronizing MPEG-2 video and audio,

a video and audio synchronization method using timestamp compensation, characterized in that the difference value D_compen between the video and audio decoding timestamps of an MPEG-2 encoder is compensated in the audio so that video and audio are synchronized at the decoder.

The method according to claim 1,

wherein the decoding timestamp difference value D_compen between video and audio that is compensated in the audio is given by the following equation.

D_compen = d_ve(0) + d_{vbv_delay}(0) - d_ae(0)

(d_ve(0): delay from the input of the first picture, through preprocessing, until the first video data is produced

d_{vbv_delay}(0): virtual buffer delay for the first input picture

d_ae(0): time from the input of the first audio until the first compressed audio AU data is output)

In an MPEG-2 encoder device comprising a video encoder, an audio encoder, a video PES packetizer that packetizes the coded video signal and outputs it to a transport stream multiplexer, an audio PES packetizer that packetizes the coded audio signal and outputs it to the transport stream multiplexer, a receive buffer that receives video data from the video encoder and passes it to the video PES packetizer, and a receive buffer that receives audio data from the audio encoder and passes it to the audio PES packetizer,

an MPEG-2 encoder device using timestamp compensation, further comprising a timestamp compensator that takes as inputs a video access unit indication bit indicating the time at which the first byte of a video access unit enters the receive buffer from the video encoder, an encoding start indication bit indicating the time at which a frame picture is input to the encoder, and an audio access unit indication bit indicating the time at which the first byte of an audio access unit enters the receive buffer; calculates the decoding timestamp difference value between video and audio; compensates the audio decoding timestamp by that difference value; and outputs the decoding timestamps and presentation timestamps of video and audio to the video and audio PES packetizers.