JP2019125994A - Video and audio reproduction device

Info

Publication number: JP2019125994A
Application number: JP2018007061A
Authority: JP
Inventors: 隆弘津木; Takahiro Tsuki
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2018-01-19
Filing date: 2018-01-19
Publication date: 2019-07-25

Abstract

To synchronize video data and audio data. SOLUTION: Video data and audio data are each compressed by a compression unit and input to a memory, where each is delayed. The delayed video data and audio data are output from the memory and then decompressed by a decompression unit. A control unit determines the delay time in the memory for each of the video data and the audio data on the basis of format information of the video data and the audio data. By adjusting the delay times, the video data and the audio data are synchronized. SELECTED DRAWING: Figure 1

Description

The present invention relates to a video and audio reproduction apparatus, and more particularly to a video and audio reproduction apparatus that synchronizes video and audio at the time of reproduction.

In an apparatus that reproduces video and audio, typified by a television receiver, the signal processing of video data and the signal processing of audio data may be performed separately. In such a case, the time taken to process the video data and the time taken to process the audio data do not necessarily coincide. In general, video data is larger than audio data, so the signal processing of video data takes longer than that of audio data. As a result, the reproduction timings of the video and the audio of the television receiver do not match, and the video content and the audio fall out of step.

A technique that addresses this problem by matching the reproduction timing of the video and the audio of the television receiver (synchronizing the video and audio reproduction timing) is called lip sync.

One technique for achieving such lip sync is disclosed in Patent Document 1.

Patent Document 1: Japanese Patent No. 5590186

The technique disclosed in Patent Document 1 places a memory in each of the video data signal-processing path and the audio data signal-processing path, and synchronizes the reproduction timing of the video and the audio by controlling the timing at which data is written to and read from the memories.

In the technique disclosed in Patent Document 1, the memory capacity determines how much deviation between the video and the audio can be corrected. If the video data and the audio data deviate by more than the memory capacity allows, the reproduction timing of the video and the audio cannot be synchronized.

To solve the above problem, the present invention has the following configuration.

That is, a video and audio reproduction apparatus that outputs video data and audio data with their timing aligned comprises: a video processing unit that extracts format information of input video data and/or audio data; a control unit that specifies the processing time of the video data and/or audio data based on the format information and detects the time difference between the video data and the audio data; and a memory that delays the video data and/or audio data based on the time difference between the video data and the audio data detected by the control unit.

Here, the control unit may specify the processing time of the video data and/or audio data based on information on how the video data and/or audio data is processed, in addition to the format information.

The video and audio reproduction apparatus according to the present invention also comprises a compression unit that compresses the video data and/or audio data and a decompression unit that decompresses the compressed video data and/or audio data. The compression unit compresses the video data and/or audio data before it is input to the memory, and the decompression unit decompresses the video data and/or audio data output from the memory.

Here, the compression unit reduces the number of pixels, the frame rate, or the number of bits of the video data.

Alternatively, the compression unit reduces the sampling rate or the number of bits of the audio data.

According to the present invention, a large deviation between video data and audio data can be corrected and lip sync achieved even with a small memory capacity.

FIG. 1 is a block diagram of a video and audio reproduction apparatus according to the present invention.
FIG. 2 is an example of a table of processing times by format information and processing method.
FIG. 3 is an example of a table of the times required for compression and decompression.
FIG. 4 is another block diagram of the video and audio reproduction apparatus according to the present invention.

Preferred embodiments of the video and audio reproduction apparatus of the present invention are described below with reference to the drawings. In the following description, components denoted by the same reference numerals in different drawings are the same, and their description may be omitted. The present invention is not limited to the examples in these embodiments, and includes all modifications within the scope of the matters described in the claims and their equivalents.
First Embodiment
FIG. 1 is a block diagram of the video and audio reproduction apparatus of the present invention.
In FIG. 1, reference numeral 1 denotes the video and audio reproduction apparatus, specifically a television receiver or the like. Reference numeral 2 denotes a source of video and audio, specifically a DVD player or the like. Reference numeral 101 denotes a video processing unit, 102 a video compression unit, 103 a video memory, 104 a video decompression unit, 105 a display unit, 106 an audio processing unit, 107 an audio compression unit, 108 an audio memory, 109 an audio decompression unit, 110 a speaker, and 111 a control unit.

Video data output from the source, such as a DVD player, is input to the video processing unit 101. Similarly, audio data output from the source is input to the audio processing unit 106.

The video processing unit 101 extracts format information of the input video data. Video formats include, for example, 8K, 4K, and 2K. The 8K video format has, for example, 7680 × 4320 pixels, a 60 Hz progressive frame rate, 10 luminance bits, and 10 chrominance bits. The 4K video format has, for example, 3840 × 2160 pixels, a 60 Hz progressive frame rate, 10 luminance bits, and 10 chrominance bits. The 2K video format has, for example, 1920 × 1080 pixels, a 60 Hz interlaced frame rate, 8 luminance bits, and 8 chrominance bits.
The format information of the input video data extracted by the video processing unit 101 is input to the control unit 111.

Similarly, the audio processing unit 106 extracts format information of the input audio data. Audio formats include, for example, AAC and ALS.
The format information of the input audio data extracted by the audio processing unit 106 is input to the control unit 111.

The control unit 111 holds a table of processing times by format information and processing method. For video data, the processing methods are called "Standard", "Theater", "Dynamic", "Custom", and so on; each adjusts the luminance level, contrast, hue, and the like to produce a video signal that reflects the content (genre) of the video and the user's preferences.
Similarly, for audio data, the processing methods are called "Standard", "Theater", "Music", "Auto", and so on; each adjusts the frequency characteristics, sound pressure, and the like to produce an audio signal that reflects the type (genre) of music and the user's preferences.

Such a table is represented, for example, as shown in FIG. 2. FIG. 2(a) is an example of a table of processing times by format information and processing method for video data, and FIG. 2(b) is the corresponding example for audio data. A processing time is defined for each combination of format information and processing method.
These tables make it possible to determine the time that the video processing unit 101 and the audio processing unit 106 take to process the video data and the audio data. The control unit 111 can thereby determine the time difference between the video data and the audio data output from the video processing unit 101 and the audio processing unit 106.
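
The table lookup performed by the control unit 111 can be sketched as follows. This is only an illustration under stated assumptions: the millisecond values are placeholders (the text does not reproduce the entries of FIG. 2), and the names `VIDEO_TABLE_MS`, `AUDIO_TABLE_MS`, and `processing_time_difference_ms` are hypothetical.

```python
# Hypothetical processing-time tables keyed by (format, processing method).
# All millisecond values are placeholders, not figures taken from FIG. 2.
VIDEO_TABLE_MS = {
    ("8K", "standard"): 120,
    ("8K", "theater"): 150,
    ("4K", "standard"): 60,
    ("4K", "theater"): 80,
    ("2K", "standard"): 30,
    ("2K", "theater"): 40,
}
AUDIO_TABLE_MS = {
    ("AAC", "standard"): 10,
    ("AAC", "music"): 15,
    ("ALS", "standard"): 20,
    ("ALS", "music"): 25,
}


def processing_time_difference_ms(video_key, audio_key):
    """Return video processing time minus audio processing time, in ms."""
    return VIDEO_TABLE_MS[video_key] - AUDIO_TABLE_MS[audio_key]


diff = processing_time_difference_ms(("4K", "theater"), ("AAC", "standard"))
print(diff)  # 80 - 10 = 70 with the placeholder values above
```

A positive result means the video path is the slower one, so the audio would have to be held back by that amount.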

The video data output from the video processing unit 101 is then input to the video compression unit 102. The video compression unit 102 reduces the data amount of the video data. Examples of such reduction include converting 8K format video to 4K format video, converting 8K format video to 2K format video, and converting 4K format video to 2K format video. Specifically, the number of pixels, the frame rate, or the number of bits is reduced.
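
To give a sense of scale, the following sketch estimates the uncompressed size of one frame for the example formats listed earlier. It assumes one luminance and two chrominance samples per pixel (a 4:4:4 layout), which the text does not specify, and it ignores frame rate and interlacing. The names `FORMATS`, `bits_per_frame`, and `reduction_ratio` are hypothetical.

```python
# Pixel counts and bit depths are the example values from the description.
# The 4:4:4 sample layout is an assumption made only for this sketch.
FORMATS = {
    "8K": {"width": 7680, "height": 4320, "luma_bits": 10, "chroma_bits": 10},
    "4K": {"width": 3840, "height": 2160, "luma_bits": 10, "chroma_bits": 10},
    "2K": {"width": 1920, "height": 1080, "luma_bits": 8, "chroma_bits": 8},
}


def bits_per_frame(name):
    """Uncompressed size of one full frame in bits under the 4:4:4 assumption."""
    f = FORMATS[name]
    bits_per_pixel = f["luma_bits"] + 2 * f["chroma_bits"]
    return f["width"] * f["height"] * bits_per_pixel


def reduction_ratio(src, dst):
    """How many times smaller a frame becomes when converted from src to dst."""
    return bits_per_frame(src) / bits_per_frame(dst)


for src, dst in [("8K", "4K"), ("8K", "2K"), ("4K", "2K")]:
    print(f"{src} -> {dst}: {reduction_ratio(src, dst):.1f}x smaller")
```

Under these assumptions the three conversions shrink each frame by factors of 4, 20, and 5 respectively; the 2K cases gain extra reduction from the lower bit depth.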

Similarly, the audio data output from the audio processing unit 106 is then input to the audio compression unit 107. The audio compression unit 107 reduces the data amount of the audio data, for example by lowering the sampling rate or reducing the number of bits.

The control unit 111 holds a table of the time taken to reduce the data amount of each of the video data and the audio data. The table of the time taken to reduce the data amount of the video data is, for example, as shown in FIG. 3(a).

The video data whose data amount has been reduced by the video compression unit 102 is input to the video memory 103. The video memory 103 delays the video data by a predetermined time by controlling the timing at which data is written to and read from the memory.

Similarly, the audio data whose data amount has been reduced by the audio compression unit 107 is input to the audio memory 108. The audio memory 108 delays the audio data by a predetermined time by controlling the timing at which data is written to and read from the memory.
Reducing the data amount of the video data and the audio data before they are input to the memories in this way lengthens the time by which the memories can delay them.
How the predetermined time is determined is described later.
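
The relationship between data amount and achievable delay can be illustrated with a small calculation: for a fixed memory capacity, the longest delay is the capacity divided by the data rate. The sketch below uses uncompressed PCM audio; the capacity, sampling rates, bit depths, and the name `max_delay_seconds` are all hypothetical values chosen for illustration.

```python
def max_delay_seconds(memory_bytes, sample_rate_hz, bits_per_sample, channels):
    """Longest delay a buffer of memory_bytes can hold for uncompressed PCM audio."""
    bytes_per_second = sample_rate_hz * bits_per_sample * channels / 8
    return memory_bytes / bytes_per_second


MEMORY_BYTES = 192_000  # hypothetical audio memory capacity

# Hypothetical input: 48 kHz, 16-bit, 2 channels -> 192,000 bytes per second.
original = max_delay_seconds(MEMORY_BYTES, 48_000, 16, 2)
# After halving both the sampling rate and the bit depth -> 48,000 bytes per second.
reduced = max_delay_seconds(MEMORY_BYTES, 24_000, 8, 2)

print(original, reduced)
```

With these made-up numbers, the same memory holds four times as much delay once the data amount is reduced to a quarter.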

The video data delayed by the predetermined time in the video memory 103 is input to the video decompression unit 104. The video decompression unit 104 restores the data amount reduced by the video compression unit 102. For example, when the video compression unit 102 has reduced 8K format video to the data amount of 4K format video, the reduced 4K format video is restored to the data amount of 8K format video. Similarly, when 8K format video has been reduced to the data amount of 2K format video, the reduced 2K format video is restored to the data amount of 8K format video, and when 4K format video has been reduced to the data amount of 2K format video, the reduced 2K format video is restored to the data amount of 4K format video.
Specifically, pixels are interpolated, the frame rate is increased, or the number of bits is increased.

Similarly, the audio data delayed by the predetermined time in the audio memory 108 is input to the audio decompression unit 109. The audio decompression unit 109 restores the data amount reduced by the audio compression unit 107.
For example, when the audio compression unit 107 has reduced the audio data by lowering the sampling rate, the sampling rate is raised to restore the audio data amount. Similarly, when the audio compression unit 107 has reduced the audio data by reducing the number of bits, the number of bits is increased to restore the audio data amount.
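
One way to picture the reduce-then-restore round trip for audio is the sketch below. It is not the method of the audio compression unit 107 or the audio decompression unit 109, which the text does not detail: it simply drops every second sample and the low-order bits, then rebuilds the stream by shifting the bits back and linearly interpolating the missing samples. The function names and sample values are hypothetical, and a practical decimator would low-pass filter before dropping samples to avoid aliasing.

```python
def reduce_audio(samples, drop_bits=8):
    """Keep every second sample and discard the low-order bits."""
    return [s >> drop_bits for s in samples[::2]]


def restore_audio(reduced, drop_bits=8):
    """Shift bits back and rebuild skipped samples by linear interpolation."""
    widened = [s << drop_bits for s in reduced]
    restored = []
    for i, s in enumerate(widened):
        restored.append(s)
        # Interpolate toward the next kept sample; repeat the last one at the end.
        nxt = widened[i + 1] if i + 1 < len(widened) else s
        restored.append((s + nxt) // 2)
    return restored


pcm = [0, 256, 512, 768, 1024, 1280, 1536, 1792]  # made-up 16-bit samples
stored = reduce_audio(pcm)
print(stored)
print(restore_audio(stored))
```

The restored stream has the original length and bit width, but the round trip is lossy: in this example the final sample comes back as 1536 rather than 1792.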

The control unit 111 holds a table of the time taken to restore the data amount of each of the video data and the audio data. The table of the time taken to restore the data amount of the video data is, for example, as shown in FIG. 3(b).

The control unit 111 thus knows the time taken by each of the video processing unit 101, the video compression unit 102, and the video decompression unit 104, and can therefore determine the total time taken by these three units. Similarly, the control unit 111 knows the time taken by each of the audio processing unit 106, the audio compression unit 107, and the audio decompression unit 109, and can therefore determine the total time taken by these three units.

Consequently, the control unit 111 can also determine the difference between the total time taken by the video processing unit 101, the video compression unit 102, and the video decompression unit 104 and the total time taken by the audio processing unit 106, the audio compression unit 107, and the audio decompression unit 109. Lip sync (synchronization) between the video data and the audio data is achieved by controlling the delay times of the video memory 103 and the audio memory 108 (the predetermined times mentioned above) so as to eliminate this time difference.
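
The delay calculation described above can be sketched as follows. The per-stage times are placeholders, the dictionary keys and the name `memory_delays_ms` are hypothetical, and the choice to delay only the faster path (leaving the slower path with no added delay) is one possible policy that the text does not prescribe.

```python
def memory_delays_ms(video_stage_ms, audio_stage_ms):
    """Return (video_memory_delay, audio_memory_delay) that equalizes both paths.

    Each argument is a dict with hypothetical keys "process", "compress",
    and "decompress", holding per-stage times in milliseconds.
    """
    video_total = sum(video_stage_ms.values())
    audio_total = sum(audio_stage_ms.values())
    diff = video_total - audio_total
    # Delay only the faster path; the slower path gets no added delay.
    if diff >= 0:
        return 0, diff
    return -diff, 0


video = {"process": 80, "compress": 15, "decompress": 15}  # placeholder values
audio = {"process": 10, "compress": 2, "decompress": 3}    # placeholder values
print(memory_delays_ms(video, audio))  # (0, 95) with these placeholder values
```

Here the video path totals 110 ms and the audio path 15 ms, so the audio memory would hold the audio for 95 ms.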

According to this embodiment, the video data and the audio data are each compressed (reduced in data amount) before being delayed in the memories, so a large delay can be obtained even when the memory capacity is small, and lip sync (synchronization) between the video data and the audio data can be achieved even when the deviation between them is large.

The lip-synced video data is displayed on the display unit 105, and the audio data is emitted from the speaker 110.
Second Embodiment
The first embodiment was described as performing compression/decompression processing on both the video data and the audio data. However, depending on the memory capacity and the amount of deviation between the video data and the audio data, it is not always necessary to perform compression/decompression processing on both.

The second embodiment controls whether to perform compression/decompression processing on the video data and the audio data according to the memory capacity and the amount of deviation between the video data and the audio data.

The control unit 111 holds the table of processing times by format information and processing method, the table of the time taken to reduce the data amount of each of the video data and the audio data, and the table of the time taken to restore the data amount of each of the video data and the audio data.
From these tables, the time taken by each of the video processing unit 101, the video compression unit 102, and the video decompression unit 104 can be determined. Similarly, the time taken by each of the audio processing unit 106, the audio compression unit 107, and the audio decompression unit 109 can be determined.

In addition, the control unit 111 is given information on the memory capacity of the video memory 103 and the memory capacity of the audio memory 108.
With this information, the control unit 111 performs control as follows. When the difference between the time taken by the video processing unit 101 and the time taken by the audio processing unit 106 is shorter than the time by which the video memory 103 and the audio memory 108 can delay the data, compression/decompression processing of the video data and the audio data is not performed. When that difference is longer than the time by which the video memory 103 and the audio memory 108 can delay the data, compression/decompression processing of the video data and the audio data is performed.
Control may also be performed so that only one of the video data and the audio data is compressed and decompressed.
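
The decision made in the second embodiment can be sketched as a simple comparison. This reflects one reading of the text, namely that the memory on the faster path is the one that must absorb the time difference; the name `needs_compression`, its parameters, and the example values are hypothetical, and treating an exactly equal difference as "no compression needed" is an assumption because the text does not cover that case.

```python
def needs_compression(video_ms, audio_ms, max_video_delay_ms, max_audio_delay_ms):
    """Decide whether compression/decompression is needed before buffering.

    video_ms / audio_ms: processing times of units 101 and 106.
    max_*_delay_ms: longest delay each memory can hold for uncompressed data,
    derived from the memory capacities known to the control unit.
    """
    diff = video_ms - audio_ms
    if diff >= 0:
        # Video is slower, so the audio memory must absorb the difference.
        return diff > max_audio_delay_ms
    # Audio is slower, so the video memory must absorb the difference.
    return -diff > max_video_delay_ms


print(needs_compression(80, 10, 50, 100))   # difference of 70 ms fits in 100 ms
print(needs_compression(150, 10, 50, 100))  # difference of 140 ms exceeds 100 ms
print(needs_compression(10, 60, 40, 100))   # audio slower by 50 ms, video memory holds 40 ms
```

Because the check is made per path, the same comparison also indicates which of the two streams would need compressing when only one of them is processed.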
Third Embodiment
In the first and second embodiments, lip sync is achieved by delaying the video data and the audio data in their respective memories, but it is not always necessary to provide a memory for both the video data and the audio data.
For example, when an AV amplifier is inserted between the source (for example, a DVD player) and the video and audio reproduction apparatus, the audio data clearly lags behind the video data, so the video and audio reproduction apparatus only needs a memory in the video data processing path. The third embodiment describes such a case.

FIG. 4 is a block diagram of a video and audio reproduction apparatus according to the third embodiment of the present invention.
In FIG. 4, reference numeral 3 denotes an AV amplifier. Video data output from the source 2 (for example, a DVD player) is input to the video processing unit 101 of the video and audio reproduction apparatus 1. Audio data output from the source 2, on the other hand, is input to the AV amplifier 3. The audio data output from the AV amplifier 3 is input to the audio processing unit 106 of the video and audio reproduction apparatus 1.

Because the audio data lags behind the video data as a result of passing through the AV amplifier 3, the video and audio reproduction apparatus 1 delays only the video data, using the video memory 103.
The audio data is emitted from the speaker 110 without being delayed in a memory.
Lip sync can thus be achieved while eliminating the memory in the audio data path.

1: video and audio reproduction apparatus, 2: source, 3: AV amplifier, 101: video processing unit, 102: video compression unit, 103: video memory, 104: video decompression unit, 105: display unit, 106: audio processing unit, 107: audio compression unit, 108: audio memory, 109: audio decompression unit, 110: speaker, 111: control unit

Claims

1. A video and audio reproduction apparatus that synchronizes and outputs video data and audio data, comprising:
a video processing unit that extracts format information of input video data and/or audio data;
a control unit that specifies a processing time of the video data and/or audio data based on the format information and detects a time difference between the video data and the audio data; and
a memory that delays the video data and/or audio data based on the time difference between the video data and the audio data detected by the control unit.

2. The video and audio reproduction apparatus according to claim 1, wherein the control unit specifies the processing time of the video data and/or audio data based on information on how the video data and/or audio data is processed, in addition to the format information.

3. The video and audio reproduction apparatus according to claim 1 or 2, further comprising:
a compression unit that compresses the video data and/or audio data; and
a decompression unit that decompresses the compressed video data and/or audio data,
wherein the compression unit compresses the video data and/or audio data before input to the memory, and the decompression unit decompresses the video data and/or audio data output from the memory.

4. The video and audio reproduction apparatus according to claim 3, wherein the compression unit reduces the number of pixels, the frame rate, or the number of bits of the video data.

5. The video and audio reproduction apparatus according to claim 3, wherein the compression unit reduces the sampling rate or the number of bits of the audio data.