KR20060010829A

KR20060010829A - Device for recording video data and audio data

Info

Publication number: KR20060010829A
Application number: KR1020057022680A
Authority: KR
Inventors: 테츠야 오카다; 다이스케 히라나카
Original assignee: 소니 가부시끼 가이샤
Priority date: 2003-06-12
Filing date: 2004-06-03
Publication date: 2006-02-02
Also published as: JP4305065B2; KR101006593B1; EP1633138B1; CN100521766C; EP1633138A1; JP2005006095A; US20060140280A1; US7738772B2; EP1633138A4; CN1802851A; WO2004112391A1

Abstract

When a pause is requested, an audio delay time is calculated: the delay of the audio data frame relative to the video data frame. During the pause, a frame shift time, which is the offset between the frame start times of the video data and the audio data, is monitored. When release of the pause is requested, an audio correction time for that pause is calculated from the audio delay time and the frame shift time. The audio correction times are accumulated over the pause requests. When the accumulated value indicates that the audio data is ahead of the video data, the video data is delayed by one frame relative to the audio data; when it indicates that the audio data lags the video data, the audio data is delayed by one frame relative to the video data.

Description

Recording device for video data and audio data {DEVICE FOR RECORDING VIDEO DATA AND AUDIO DATA}

The present invention relates to an audio/video synchronization processing apparatus and an audio/video synchronization processing method for synchronizing video data and audio data, and to an audio/video recording apparatus. In particular, it relates to an AV (audio/video) synchronization technique used when video data and audio data are paused.

For example, in an audio/video synchronization processing apparatus (AV recording apparatus) such as the input stage of an MPEG encoder, the video and audio input signals generally have different frame lengths (frame periods). Such an apparatus is also characterized in that audio data and video data are each captured in units of frames. The configuration and operation of a conventional AV recording apparatus of this kind are described below.

FIG. 15 is a system configuration diagram of a conventional AV recording apparatus.

This system comprises a data control unit 2a and a system encoder 3a, both of which receive control instructions from a host (HOST) 1a.

In the data control unit 2a, an audio/video control unit (AV_CTRL) 21a receives the control instructions from the host 1a and, based on time information from a timer (TIMER) 24a, controls an audio control unit 22a and a video control unit 26a.

Hereinafter, the audio/video control unit is referred to as the AV control unit.

The AV control unit 21a controls the input of audio data (A_DATA) by issuing control instructions to the audio control unit (A_CTRL) 22a. The input audio data is stored in an audio data memory (A_MEM) 23a.

The AV control unit 21a likewise controls the input of video data (V_DATA) by issuing control instructions to the video control unit (V_CTRL) 26a. The input video data is stored in a video data memory (V_MEM) 25a.

Based on the time information from the timer 24a, the data control unit 2a supplies the system encoder 3a with audio data (A_PTS) and video data (V_PTS) to which a PTS (Presentation Time Stamp) has been added as time information.

The system encoder 3a is controlled by control instructions from the host 1a. An audio encoder (A_ENC) 31a encodes the PTS-stamped audio data from the data control unit 2a. A video encoder (V_ENC) 33a encodes the PTS-stamped video data from the data control unit 2a. A multiplexer (MPX) 32a multiplexes the data encoded by the audio encoder 31a and the video encoder 33a to generate a bit stream (BSD).

However, in many AV recording apparatuses that include an MPEG encoder, hardware constraints prevent the frame periods of the video data and audio data from being changed. In that case, if pause processing is performed with reference to the video data frames, the audio data becomes offset from the video data when the pause is subsequently released (AV synchronization deviation).

If this problem is not handled appropriately, the synchronization deviation accumulates and is perceived by the viewer as a sense of incongruity.

The conventional problem is described concretely below with reference to FIG. 16.

FIG. 16 shows an example of the AV synchronization deviation that arises when pause and pause release are controlled.

In the conventional AV recording apparatus shown in FIG. 15, data capture can be controlled only in units of frames, and the frame periods of the video data and audio data (video_frame_time and audio_frame_time, respectively) cannot be changed even during a pause.

In FIG. 16, when a pause request (denoted "P" in the figure) is received from the host 1a, the data control unit 2a applies it at time t161, the frame boundary of video data 1. At time t161 the audio data is partway through a frame period, so the pause request takes effect at the next audio frame boundary; this produces tp161 as the difference between the video data and the audio data at the pause.

During the pause, the frame periods of the video data and the audio data remain unchanged, and the difference tp161 between the video data and audio data at the pause remains uncorrected.

When a pause release request (denoted "P_RL" in the figure) is received from the host (CPU) 1a, the data control unit 2a applies it at time t162, when input of video data n (VDn) starts. If, at pause release, the timing of the audio data relative to the video data were adjusted to account for tp161, the difference between the video data and audio data at the pause, no AV synchronization deviation would arise.

However, because the video and audio frame periods differ, a difference tp162, the interval from the pause release time t162 to the start of input of audio data n (ADn), arises between the audio data and the video data at pause release. As a result, an AV synchronization deviation tp163, determined by the difference tp161 at the pause and the difference tp162 at the release, arises at pause release.

In particular, when the frame periods of the video data and audio data cannot be changed, tp163 can accumulate with each pause request and may come to be perceived as a sense of incongruity.

An object of the present invention is to provide an AV synchronization processing apparatus and method that do not cause AV synchronization deviation in an AV recording apparatus in which the video data and audio data have different frame lengths and those frame lengths cannot be changed.

The present invention has been made in view of the above problem. A first aspect is an audio/video synchronization processing apparatus that performs synchronization processing on video data and audio data having mutually different predetermined frame lengths, the apparatus comprising:

timer means;

storage means for storing the start time of each frame of the video data and audio data, the time of a pause request, and the time of a pause release request, as measured by the timer means; and

control means for determining, based on the start time of each frame of the video data and audio data, the time of the pause request, and the time of the pause release request, whether to delay either the video data or the audio data in units of frames after the pause release request, or to delay neither.

The control means:

at a pause request, calculates an audio delay time, which is the delay of the audio data frame measured from the video data frame boundary;

after the pause request, monitors, at each frame start time of the video data, a frame shift time, which is the difference between the frame start time of the audio data and that of the video data;

calculates an audio correction time based on the audio delay time and the frame shift time at the pause release request corresponding to that pause request; and

determines, based on a cumulative audio correction time obtained by accumulating the audio correction time calculated for each pause request, whether to delay either the video data or the audio data in units of frames after the pause release request, or to delay neither.

According to the first aspect of the present invention, the delay of the audio data relative to the video data at the moment of the pause request (the audio delay time) is acquired, and the frame shift time between the video data and audio data is monitored continuously during the pause. Whenever a pause release request arrives, the reproduction timing of the audio data after pause release is adjusted so that the deviation of the audio data from the video data is kept to one audio data frame or less, which greatly suppresses AV synchronization deviation.

A second aspect of the present invention is an audio/video recording apparatus that generates multiplexed data containing video data and audio data having mutually different predetermined frame lengths, the apparatus comprising:

timer means;

storage means for storing the start time of each frame of the video data and audio data, the time of a pause request, and the time of a pause release request, as measured by the timer means;

synchronization control means for performing, in units of frames, synchronization processing of the audio data after the pause release request, based on the start time of each frame of the video data and audio data, the time of the pause request, and the time of the pause release request; and

multiplexed data generating means for generating the multiplexed data by adding time information to the video data and to the audio data synchronized by the synchronization control means.

According to the second aspect of the present invention, the delay of the audio data relative to the video data at the moment of the pause request (the audio delay time) is acquired, and the frame shift time between the video data and audio data is monitored continuously during the pause. Whenever a pause release request arrives, the reproduction timing of the audio data after pause release is adjusted so that the deviation of the audio data from the video data is kept to one audio data frame or less, so that multiplexed data with greatly suppressed AV synchronization deviation can be generated.

FIG. 1 is a diagram showing the system configuration of an AV recording apparatus as one embodiment of the present invention.

FIG. 2 is a flowchart showing the processing performed when the AV control unit 21 receives a START request from the host 1.

FIG. 3 is a timing chart for explaining the video PTS (V_PTS) and audio PTS (A_PTS) generated when data input starts.

FIG. 4 is a flowchart showing the processing that adds a PTS when the data control unit 2 supplies audio data to the system encoder 3.

FIG. 5 is a flowchart showing the processing that adds a PTS when the data control unit 2 supplies video data to the system encoder 3.

FIG. 6 is a flowchart showing the processing performed by the AV control unit 21 in response to a pause request from the host 1.

FIG. 7 is a timing chart showing the processing for a pause request.

FIG. 8 is a flowchart showing the processing performed after a pause request from the host 1 has been handled (processing during the pause).

FIG. 9 is a diagram illustrating the method of measuring the frame shift time (f_count).

FIG. 10 is a flowchart showing the processing performed by the AV control unit 21 when a pause release request is received from the host 1.

FIG. 11 is a timing chart illustrating the method of calculating the audio correction time (a_diff) when the shift time is being measured during the pause.

FIG. 12 is a timing chart illustrating the method of calculating the audio correction time (a_diff) when the shift time is not being measured during the pause.

FIG. 13 is a diagram for explaining the processing that eliminates AV synchronization deviation by delaying the resumption of video data input by one frame.

FIG. 14 is a diagram for explaining the processing that eliminates AV synchronization deviation by delaying the resumption of audio data input by one frame.

FIG. 15 is a diagram showing the system configuration of a conventional AV recording apparatus.

FIG. 16 is a timing chart showing the pause and pause release processing of a conventional AV recording apparatus.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings.

FIG. 1 shows an AV recording apparatus that is one embodiment of the audio/video synchronization processing apparatus according to the present invention. The AV recording apparatus shown in FIG. 1 has the same system configuration as the conventional AV recording apparatus shown in FIG. 15, but is characterized by the control performed in the AV control unit 21.

The following are described in order: the processing in the AV control unit (AV_CTRL) 21 based on a START request from the host (HOST) 1, normal-state processing, processing based on a pause request from the host 1, processing during a pause, processing based on a pause release request from the host 1, and the processing that eliminates the AV synchronization deviation caused by pause and pause release requests.

First, the processing performed by the AV control unit 21 in response to a START request from the host 1 is described.

FIG. 2 is a flowchart showing the processing performed when the AV control unit 21 receives a START request from the host 1.

When the AV control unit 21 receives a START request from the host 1, it acquires time information from the timer 24 and stores it as STC_offset in a memory (not shown). The timer (TIMER) 24 is, for example, a timer that runs on a 90 kHz clock.

FIG. 2 shows the processing flow in the data control unit 2 for a START request from the host 1.

First, on receiving a START request from the host 1, the AV control unit 21 waits for a video data frame boundary. When it detects the boundary (ST21), it acquires time information from the timer 24 and saves it as STC_offset (ST22).

Next, it instructs the video control unit (V_CTRL) 26 to start video data input (ST23) and instructs the audio control unit (A_CTRL) 22 to start audio data input (ST24), which completes the processing of the START request from the host (CPU) 1.

FIG. 3 is a timing chart for explaining the video PTS (V_PTS) and audio PTS (A_PTS) generated when data input starts.

In FIG. 3, on receiving a START request from the host 1, the AV control unit 21 of the data control unit 2 starts input of the video data and audio data with reference to the video frame. It acquires the start time t31 from the timer 24 and saves it as STC_offset.

Thereafter, at each frame boundary of the video data and of the audio data, it acquires the current time from the timer 24 and outputs that time minus the STC_offset saved at START (t31) to the system encoder 3 as the PTS.

For example, in FIG. 3, when a video data frame boundary is detected, time t32 is acquired from the timer 24, and the system encoder 3 is given the PTS of the video data together with the video input data. Likewise, when an audio data frame boundary is detected, time t33 is acquired from the timer 24, and the system encoder 3 is given the PTS of the audio data together with the audio data.
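The PTS rule described here (the timer reading at a frame boundary minus STC_offset) can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the class and function names are hypothetical, timer wraparound is ignored, and the tick values in the usage note are made-up examples on the 90 kHz clock that the text gives as an example.

```python
TIMER_HZ = 90_000  # example timer clock mentioned in the text (90 kHz)


class PtsStamper:
    """Converts raw timer readings into PTS values relative to START.

    Hypothetical helper: stc_offset is the timer value saved at the
    first video frame boundary after the START request (STC_offset).
    """

    def __init__(self, stc_offset: int) -> None:
        self.stc_offset = stc_offset

    def pts(self, timer_now: int) -> int:
        # PTS = current timer value - STC_offset (wraparound not handled)
        return timer_now - self.stc_offset


def ticks_to_ms(ticks: int) -> float:
    """Converts timer ticks to milliseconds at the assumed 90 kHz clock."""
    return ticks * 1000.0 / TIMER_HZ
```

For instance, with `PtsStamper(stc_offset=1_000_000)`, a frame boundary read at timer value 1,003,003 receives PTS 3003, which is about 33.4 ms at 90 kHz. The same stamper serves both the audio and the video path, since both subtract the same STC_offset.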

Next, the normal-state processing that follows the handling of the START request from the host 1 is described.

FIG. 4 is a flowchart showing the processing that adds a PTS when the data control unit 2 supplies audio data to the system encoder 3.

When the AV control unit 21 detects an audio data frame boundary (ST41), it acquires time information from the timer 24 and saves it (ST42). It then generates the audio PTS from the STC_offset saved at START and the acquired time information (ST43). Finally, it passes the audio frame data with the PTS information attached to the audio encoder (A_ENC) 31 of the system encoder 3 (ST44).

In normal operation, this processing is performed for every audio input frame.

FIG. 5 is a flowchart showing the processing that adds a PTS when the data control unit 2 supplies video data to the system encoder 3.

When the AV control unit 21 detects a video data frame boundary (ST51), it acquires time information from the timer 24 and saves it (ST52). It then generates the video PTS from the STC_offset saved at START and the acquired time information (ST53). Finally, it passes the video frame data with the PTS information attached to the video encoder 33 of the system encoder 3 (ST54).

In normal operation, this processing is performed for every video input frame.

Following the flowcharts of FIGS. 4 and 5, input of each kind of data is started, and audio data and video data that carry a PTS and are AV-synchronized are supplied from the data control unit 2 to the system encoder 3.

Next, the processing for a pause request from the host 1 is described.

FIG. 6 is a flowchart showing the processing performed by the AV control unit 21 in response to a pause request from the host 1. The time information that the AV control unit 21 acquires from the timer 24 when it receives a pause request from the host 1 is called pause_STC_offset.

On receiving a pause request from the host 1, the AV control unit 21 waits for a video data frame boundary. When it detects the boundary (ST61), it acquires pause_STC_offset as time information from the timer 24 (ST62). It then instructs the video control unit 26 to stop video data input (ST63) and, based on the time information from the timer 24, starts measuring the shift time between the audio data and the video data (ST64).

Next, it waits for an audio data frame boundary. When it detects the boundary (ST65), it ends the shift time measurement based on the time information from the timer 24 (ST66) and saves the measured shift time as the audio delay time a_delay (ST67). It then issues an instruction to stop audio data input (ST68), which completes the processing of the pause request from the host 1.

FIG. 7 is a timing chart showing the processing for the pause request shown in FIG. 6.

In FIG. 7, on receiving a pause request from the host 1, the AV control unit 21 stops video data input at a video data frame boundary. The time t71 acquired from the timer 24 at that moment is saved as pause_STC_offset. When the next audio data frame boundary after time t71 is detected, time t72 is acquired from the timer 24.

The difference between time t72 and time t71 is saved as a_delay, and audio data input is then paused.
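The pause handling of FIGS. 6 and 7 reduces to two timer readings and one subtraction, which the following Python sketch illustrates. The names `PauseRecord` and `handle_pause` are hypothetical, the timer values are assumed to be plain integer ticks with no wraparound, and the caller is assumed to supply the readings t71 and t72 taken at the frame boundaries.

```python
from dataclasses import dataclass


@dataclass
class PauseRecord:
    pause_stc_offset: int  # t71: video frame boundary at which video input stops
    a_delay: int           # t72 - t71: audio delay time saved in ST67


def handle_pause(t71: int, t72: int) -> PauseRecord:
    """Records the state saved for one pause request.

    t71 is the timer value at the video frame boundary (ST61/ST62);
    t72 is the timer value at the next audio frame boundary (ST65/ST66).
    """
    if t72 < t71:
        # The audio boundary is by definition the next one after t71.
        raise ValueError("audio boundary must not precede the video boundary")
    return PauseRecord(pause_stc_offset=t71, a_delay=t72 - t71)
```

With illustrative readings t71 = 9009 and t72 = 10800, `handle_pause` returns an `a_delay` of 1791 ticks.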

Next, the processing performed after the pause request from the host 1 has been handled (from time t72 in FIG. 7 onward), that is, the processing during the pause, is described with reference to the flowchart of FIG. 8.

During the pause, the frame shift time (f_count), which is the shift between the audio data frames and the video data frames, is measured as described below.

In FIG. 8, it is first determined whether the apparatus is currently paused (ST81). If so, the AV control unit 21 waits for an audio data frame boundary; when it detects one (ST82), it acquires and saves time information from the timer 24 and starts measuring the frame shift time between the audio data and the video data (ST83).

Next, it waits for a video data frame boundary; when it detects one (ST84), it acquires and saves time information from the timer 24 and ends the frame shift time measurement (ST85).

It then records the frame shift time (f_count), obtained from the measurement start time in ST83 and the measurement end time in ST85 (ST86).

This processing is repeated throughout the pause, so that measurement of the frame shift time (f_count) continues. Because f_count is overwritten in the memory within the AV control unit 21, it always holds the most recent shift time between the audio data and video data during the pause. f_count is kept up to date because it is impossible to predict when a pause release request will arrive, and the apparatus must be ready for it.

FIG. 9 illustrates the method of measuring the frame shift time (f_count) described with reference to the flowchart of FIG. 8.

When an audio data frame boundary is detected, the AV control unit 21 acquires time information t91 from the timer 24 and starts measuring the shift time between the audio data and the video data.

When a video data frame boundary is then detected, the AV control unit 21 acquires time information t92 from the timer 24 and computes the shift time between the audio data and the video data (t92 - t91). The shift time measured in this way is the frame shift time (f_count).

This control is repeated during the pause with the audio data as the reference, so that the latest frame shift time (f_count) is always retained. In FIG. 9, the latest value of f_count is the difference between time t95 and time t96 (t96 - t95).
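The measurement loop of FIGS. 8 and 9 can be sketched in Python as an event-driven monitor. This is an illustrative reading of the flowchart, not the actual firmware: `FrameShiftMonitor` and its method names are hypothetical, and the handling of a second audio boundary that arrives before the next video boundary (it is ignored, because the flow is then waiting in ST84) is an assumption.

```python
from typing import Optional


class FrameShiftMonitor:
    """Keeps the latest frame shift time (f_count) during a pause."""

    def __init__(self) -> None:
        self.f_count: Optional[int] = None        # latest completed measurement
        self._audio_start: Optional[int] = None   # timer value saved in ST83

    @property
    def measuring(self) -> bool:
        # True between an audio frame boundary and the next video frame boundary.
        return self._audio_start is not None

    def on_audio_boundary(self, t: int) -> None:
        # ST82/ST83: start a measurement unless one is already running
        # (assumption: the flow is waiting for a video boundary in that case).
        if self._audio_start is None:
            self._audio_start = t

    def on_video_boundary(self, t: int) -> None:
        # ST84 to ST86: finish the measurement and overwrite f_count.
        if self._audio_start is None:
            return  # no measurement in progress; keep the previous f_count
        self.f_count = t - self._audio_start
        self._audio_start = None
```

Feeding the monitor an audio boundary at tick 100 and a video boundary at tick 160 leaves `f_count` at 60, which is the t92 - t91 computation of FIG. 9; later boundary pairs simply overwrite the value, so the most recent measurement is always available when a pause release request arrives.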

Next, the processing performed in response to a pause release request from the host 1 is described.

Specifically, the following describes how, when a pause release request arrives from the host 1, the AV control unit 21 uses the audio/video offsets measured at the time of the pause request and during the pause to decide whether to delay the resumption of audio data input, to delay the resumption of video data input, or to delay neither, and thereby eliminates the AV synchronization deviation.

FIG. 10 is a flowchart showing the processing performed by the AV control unit 21 when a pause release request is received from the host 1.

In the flowchart of FIG. 10, a_diff denotes the audio correction time, that is, the offset between the audio data and the video data arising between the pause and the pause release. total_audio_delay denotes the cumulative audio correction time, a variable that accumulates the offset of the audio data relative to the video data; it is initialized to 0 at system initialization.

As already described, the frame shift time (f_count) is updated at the timing of a video data frame boundary. Accordingly, in the flowchart of FIG. 10, "measuring the shift time between the audio data and the video data" refers to the interval on the time axis from an audio data frame boundary to the following video data frame boundary.

For example, the intervals indicated by the horizontal arrows in FIG. 9 are intervals in which the shift time is being measured; all other times are intervals in which it is not being measured.

In FIG. 10, on receiving a pause release request from the host 1, the AV control unit 21 waits for a frame boundary of the video data. When it detects that boundary (ST101), it updates STC_offset (ST102).

Next, it determines whether the shift time between the audio data and the video data is currently being measured (ST103). If so, it obtains the audio correction time (a_diff), which is the audio/video shift between the pause and the pause release, from equation (1) described later (ST104). If not, it obtains a_diff from equation (2) described later (ST105).

The audio correction time (a_diff) is described in detail later. It is derived from the pause-time audio delay time (a_delay) and the frame shift time (f_count), and represents the offset of the audio data relative to the video data that should be corrected at pause release for this particular pause. A positive a_diff means that the audio data lags the video data; a negative a_diff means that the audio data leads the video data.

Next, the audio correction time (a_diff) obtained in step ST104 or ST105 is added to the cumulative audio correction time (total_audio_delay) (ST106).

Thus total_audio_delay, whose initial value at system start-up is 0, is accumulated in step ST106 over successive pauses for as long as the system is operating. Whereas a_diff is the audio offset to be corrected for each individual pause, total_audio_delay is the running sum of every a_diff, and it is therefore the value by which the audio data actually has to be corrected relative to the video data.

Step ST107 onward decides how to control the AV synchronization deviation on the basis of total_audio_delay, which accumulates with each pause while the system is operating. Specifically, these steps decide whether the offset of the audio data relative to the video data should be corrected and, if so, whether the audio data or the video data is to be delayed.

First, if in step ST107 the cumulative audio correction time (total_audio_delay) is negative, that is, the audio data leads, the duration of one video frame is added to total_audio_delay (ST108), and the resumption of the video data is then actually delayed by one frame. This delay is implemented by holding off the resumption of video data input until the next frame boundary of the video data is detected (ST109).

When the video frame boundary is detected, input of the video data is resumed (ST110).

If in step ST107 total_audio_delay is not negative, that is, the audio data is in step with or lags the video data, input of the video data is resumed immediately without delay (ST110), and processing proceeds to step ST111.

In step ST111, if the positive total_audio_delay is equal to or greater than one audio data frame (audio_frame_time), the resumption of the audio data must be delayed, so processing proceeds to step ST112 and onward.

If total_audio_delay is non-negative but less than one audio data frame, the audio data is resumed without delay (ST114).

In step ST112, the duration of one audio frame is subtracted from total_audio_delay, and the resumption of the audio data is then actually delayed by one frame. This delay is implemented by holding off the resumption of audio data input until the next frame boundary of the audio data is detected (ST113).

When the audio frame boundary is detected, input of the audio data is resumed (ST114).
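The decision made in steps ST106 to ST114 can be sketched as a small pure function. This is an illustrative reading of the description of FIG. 10, not the actual firmware: the names `ResumeDecision` and `decide_resume` are hypothetical, all times are assumed to share one unit, and it is assumed that the ST111 test is evaluated after the video branch in either case, as the text suggests.

```python
from dataclasses import dataclass


@dataclass
class ResumeDecision:
    delay_video: bool         # wait one video frame before resuming video input
    delay_audio: bool         # wait one audio frame before resuming audio input
    total_audio_delay: float  # updated cumulative audio correction time


def decide_resume(total_audio_delay: float, a_diff: float,
                  audio_frame_time: float,
                  video_frame_time: float) -> ResumeDecision:
    # ST106: accumulate this pause's audio correction time.
    total = total_audio_delay + a_diff

    # ST107: negative total means the audio data leads the video data.
    delay_video = total < 0
    if delay_video:
        # ST108: account for the one-frame video delay (ST109/ST110).
        total += video_frame_time

    # ST111: audio lags by one audio frame or more.
    delay_audio = total >= audio_frame_time
    if delay_audio:
        # ST112: account for the one-frame audio delay (ST113/ST114).
        total -= audio_frame_time

    return ResumeDecision(delay_video, delay_audio, total)


# Example with assumed frame periods of 24 (audio) and 33 (video) time units.
print(decide_resume(0, -20, 24, 33))  # audio leads: video is delayed
print(decide_resume(0, 27, 24, 33))   # audio lags by a frame or more: audio is delayed
print(decide_resume(0, 10, 24, 33))   # small lag: neither is delayed
```

Keeping the function free of I/O separates the bookkeeping on total_audio_delay from the actual waiting on frame boundaries, which the device performs in ST109 and ST113.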

Next, the method of calculating the audio correction time (a_diff) in steps ST104 and ST105 of FIG. 10 is described with reference to FIGS. 11 and 12.

FIG. 11 is a timing chart illustrating how a_diff is calculated when the pause release request arrives while the shift time is being measured.

The timing chart of FIG. 11 shows the case in which f_count is being measured, that is, the pause release request (P_RL) from the host 1 to the AV control unit 21 arrives between an audio data frame boundary and the following video data frame boundary, so a_diff is calculated using the value of f_count obtained after the pause release request.

The procedure performed in step ST104 of FIG. 10 to calculate a_diff is described below with reference to FIG. 11.

On receiving the pause release request from the host 1, the AV control unit 21 acquires time t111 from the timer 24 in step with the video frame period, and resets STC_offset on the basis of the pause_STC_offset saved at the time of the pause request.

The frame shift time (f_count) is also measured at time t111.

Here, a_delay is, as already described, the pause-time audio delay time, that is, the shift between the audio data and the video data at the time of the pause; it was calculated and saved when the pause was requested. audio_frame_time is the frame period of the audio data.

As is clear from FIG. 11, the audio correction time (a_diff) is given by equation (1) below.

a_diff = a_delay + f_count - audio_frame_time … (1)

FIG. 12 is a timing chart illustrating how a_diff is calculated when the pause release request arrives while the shift time is not being measured.

The timing chart of FIG. 12 shows the case in which f_count is not being measured, that is, the pause release request from the host 1 to the AV control unit 21 arrives between a video data frame boundary and the following audio data frame boundary, so a_diff is calculated using the f_count obtained before the pause release request.

The procedure performed in step ST105 of FIG. 10 to calculate a_diff is described below with reference to FIG. 12.

On receiving the pause release request from the host 1, the AV control unit 21 acquires time t121 from the timer 24 in step with the video frame period, and resets STC_offset on the basis of the pause_STC_offset saved at the time of the pause request.

Here, a_delay is, as already described, the pause-time audio delay time, that is, the shift between the audio data and the video data at the time of the pause; it was calculated and saved when the pause was requested.

audio_frame_time is the frame period of the audio data.

video_frame_time is the frame period of the video data.

As is clear from FIG. 12, the audio correction time (a_diff) is given by equation (2) below.

a_diff = a_delay + f_count - audio_frame_time + video_frame_time … (2)
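Equations (1) and (2) differ only in the extra video_frame_time term, so both can be expressed in one small function. The sketch below is illustrative only: the function name `audio_correction_time` and the `measuring` flag are hypothetical, and all values are assumed to be in the same time unit.

```python
def audio_correction_time(a_delay: float, f_count: float,
                          audio_frame_time: float, video_frame_time: float,
                          measuring: bool) -> float:
    """Return a_diff for one pause.

    measuring=True  -> equation (1): the release request arrived between an
                       audio frame boundary and the following video boundary.
    measuring=False -> equation (2): the release request arrived outside that
                       interval, so the previously stored f_count is used.
    """
    a_diff = a_delay + f_count - audio_frame_time          # equation (1)
    if not measuring:
        a_diff += video_frame_time                         # extra term of (2)
    return a_diff


# Assumed example values: a_delay=10, f_count=8, audio frame=24, video frame=33.
print(audio_correction_time(10, 8, 24, 33, measuring=True))
print(audio_correction_time(10, 8, 24, 33, measuring=False))
```

With these assumed values the first call gives a negative result (audio leading) and the second a positive one (audio lagging), following the sign convention stated for a_diff.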

Next, the processing that eliminates the AV synchronization deviation at pause release is described concretely with reference to FIGS. 13 and 14.

FIG. 13 is a diagram for explaining how the AV synchronization deviation is eliminated by delaying the resumption of video data input by one frame.

As already described with reference to the pause release flowchart of FIG. 10, the resumption of video data input is delayed when the cumulative audio correction time (total_audio_delay) is negative (ST107): the AV synchronization deviation is corrected (ST108), and the resumption of the video data is held off by one frame until the next video frame boundary is found (ST109).

In FIG. 13, on receiving the pause release request from the host 1, the AV control unit 21 waits for a frame boundary of the video data. When it detects that boundary (time t131), it calculates total_audio_delay according to the processing flow of FIG. 10. Because the result is negative, it waits for one video frame and then resumes video data input (time t132).

FIG. 14 is a diagram for explaining how the AV synchronization deviation is eliminated by delaying the resumption of audio data input by one frame.

As described with reference to the pause release flowchart of FIG. 10, the resumption of audio data input is delayed when the cumulative audio correction time (total_audio_delay) is one audio frame or more: the AV synchronization deviation is corrected (ST112), and the resumption of the audio data is held off by one frame until the next audio frame boundary is found (ST113).

In FIG. 14, on receiving the pause release request from the host 1, the AV control unit 21 waits for a frame boundary of the video data. When it detects that boundary (time t141), total_audio_delay is at least one audio frame, so it waits for one audio frame and then resumes audio data input (time t142).

As is clear from steps ST107 and ST111 of FIG. 10, when total_audio_delay is non-negative and less than one audio frame, the resumption of neither the audio data input nor the video data input is delayed. In that case, the offset between the audio data and the video data produced by the current pause remains accumulated in total_audio_delay.

Note, however, that even when the AV synchronization deviation is corrected by delaying the resumption of either the audio data input or the video data input, the processing of steps ST108 and ST112 in FIG. 10 does not bring total_audio_delay to 0, so the deviation is never eliminated completely.

Nevertheless, with the AV recording apparatus according to the present invention, total_audio_delay is always kept within one audio data frame while the apparatus is operating, so the difference is not perceptible to the viewer, and the AV synchronization deviation is eliminated to a sufficient degree.
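How the residual offset carries over from one pause to the next can be illustrated with a short simulation. This is a sketch under assumptions, not a proof of the bound stated above: the function `simulate_pauses` is hypothetical, the a_diff values and frame periods (24 and 33 time units) are made up for the example, and the two tests mirror steps ST107 and ST111 as described for FIG. 10.

```python
def simulate_pauses(a_diffs, audio_frame_time, video_frame_time):
    """Apply the per-pause correction to a sequence of a_diff values.

    Returns a list of (action, total_audio_delay) pairs, one per pause.
    """
    total_audio_delay = 0  # initialised to 0 at system start
    history = []
    for a_diff in a_diffs:
        total_audio_delay += a_diff                 # ST106
        actions = []
        if total_audio_delay < 0:                   # ST107: audio leads
            total_audio_delay += video_frame_time   # ST108
            actions.append("delay video")
        if total_audio_delay >= audio_frame_time:   # ST111: audio lags a frame
            total_audio_delay -= audio_frame_time   # ST112
            actions.append("delay audio")
        history.append((", ".join(actions) or "none", total_audio_delay))
    return history


for action, residual in simulate_pauses([10, 10, 10, -20], 24, 33):
    print(action, residual)
```

In this run the first two pauses cause no delay and simply accumulate, the third crosses one audio frame and triggers an audio delay, and the fourth triggers a video delay; after each correction a non-zero residual remains, which in this example stays below one audio frame.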

The present invention is applicable to apparatuses that record or reproduce audio data and video data in synchronization.

Claims

An audio/video synchronization processing apparatus for performing synchronization processing on video data and audio data having respectively different predetermined frame lengths, the apparatus comprising:

Timer means,

Storage means for storing a start time of each frame of the video data and audio data timed by the timer means, a time of a pause request, and a time of a pause release request;

Control means for determining, on the basis of the start time of each frame of the video data and audio data, the time of the pause request, and the time of the pause release request, whether to delay either the video data or the audio data in units of frames after the pause release request, or to delay neither.

The apparatus of claim 1,

The control means,

When a pause request is made, calculates an audio delay time, which is the delay time of a frame of the audio data measured from a frame boundary of the video data,

After the pause request, monitors, at each frame start time of the video data, a frame shift time, which is the difference of the frame start time of the audio data relative to the video data,

Calculates an audio correction time based on the audio delay time and the frame shift time at the time of the pause release request corresponding to the pause request, and

On the basis of a cumulative audio correction time obtained by accumulating the audio correction time calculated for each pause request, determines whether to delay either the video data or the audio data in units of frames after the pause release request, or to delay neither.

The apparatus of claim 2,

The control means,

On the basis of the cumulative audio correction time, when it is determined that the audio data leads the video data, delays the video data by one frame with respect to the audio data after the pause release request.

The apparatus of claim 2,

The control means,

On the basis of the cumulative audio correction time, when it is determined that the audio data lags the video data by one frame or more, delays the audio data by one frame with respect to the video data after the pause release request.

An audio/video synchronization processing method for performing synchronization processing on video data and audio data each having a different predetermined frame length, the method comprising:

Calculating, at the time of a pause request, an audio delay time, which is the delay time of a frame of the audio data measured from a frame boundary of the video data;

Monitoring, at each frame start time of the video data after the pause request, a frame shift time, which is the difference of the frame start time of the audio data relative to the video data;

Calculating an audio correction time based on the audio delay time and the frame shift time when the pause release request is made; and

Determining, on the basis of a cumulative audio correction time obtained by accumulating the audio correction time, whether to delay either the video data or the audio data in units of frames after the pause release request, or to delay neither.

The method of claim 5,

On the basis of the cumulative audio correction time, when it is determined that the audio data leads the video data, the video data is delayed by one frame with respect to the audio data after the pause release request.

The method of claim 5,

On the basis of the cumulative audio correction time, when it is determined that the audio data lags the video data by one frame or more, the audio data is delayed by one frame with respect to the video data after the pause release request.

An audio / video recording apparatus for generating multiplexed data including video data and audio data having different predetermined frame lengths, respectively,

Timer means,

Storage means for storing a start time of each frame of the video data and audio data, a time of a pause request, and a time of a pause release request, each timed by the timer means;

Synchronization control means for performing synchronization processing of the audio data after the pause release request on a frame-by-frame basis based on a start time of each frame of the video data and audio data, the time of the pause request, and the time of the pause release request;

Multiplexed data generating means for generating the multiplexed data by adding time information to the video data and the audio data synchronized by the synchronization control means.