JP6343171B2 - Receiver

Info

Publication number: JP6343171B2
Application number: JP2014086585A
Authority: JP
Inventors: 正芳大西; 欣司松村
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2014-04-18
Filing date: 2014-04-18
Publication date: 2018-06-13
Anticipated expiration: 2034-04-18
Also published as: JP2015207873A

Description

The present invention relates to a receiving apparatus that receives signals transmitted over a plurality of different transmission paths.

In recent years, as broadcasting has become digital and communication has become broadband, systems that realize broadcasting/communication cooperation services have been researched and developed. Such a system comprises a first imaging device; a first sending device that sends out, as a broadcast wave, a broadcast signal generated from the data (an audio signal and a video signal) supplied by the first imaging device; a second imaging device; a second sending device that sends out, by IP transmission, a communication signal generated from the data (an audio signal and a video signal) supplied by the second imaging device; and a receiving apparatus (receiver) that receives the broadcast signal and the communication signal. Hereinafter, the first sending device and the second sending device are collectively referred to as the sending device (transmitter).

When multimedia content composed of a plurality of information components (video, audio, subtitles, and so on) is delivered over a plurality of different transmission paths (for example, broadcasting and communication), the receiving apparatus must be able to present the components with their playback timing synchronized. This is because the time each packet of a component takes to reach the receiver differs from one transmission path to another.

For example, the transmitter adds a time stamp to each component. A time stamp is added to every distribution unit of each component (a unit of delivery such as a video or audio frame) and indicates the time at which the data of that unit should be presented by the receiver. The receiver uses these time stamps to synchronize the playback timing of the components.

Specifically, based on the time stamp values, the receiver buffers components that arrive early as needed and synchronizes them with components that arrive late, thereby aligning the presentation timing.
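The conventional time-stamp alignment described above can be sketched as follows. This is a minimal illustration only: the function name `align_by_timestamp`, the `(timestamp, payload)` tuple layout, and the sample data are assumptions made here, not part of the description.

```python
from collections import deque

def align_by_timestamp(early_units, late_units):
    """Pair units of two components that carry the same presentation time stamp.

    Each unit is a (timestamp, payload) tuple listed in arrival order.
    Units of the early component wait in a buffer until the matching
    unit of the late component arrives.
    """
    buffer = deque(early_units)  # the early component is held back
    presented = []
    for ts, late_payload in late_units:
        # Drop buffered units that are older than the unit that just arrived.
        while buffer and buffer[0][0] < ts:
            buffer.popleft()
        if buffer and buffer[0][0] == ts:
            _, early_payload = buffer.popleft()
            presented.append((ts, early_payload, late_payload))
    return presented

# Hypothetical data: video arrives first, audio starts one unit later.
video = [(0, "v0"), (1, "v1"), (2, "v2")]
audio = [(1, "a1"), (2, "a2")]
pairs = align_by_timestamp(video, audio)
```

This approach only works when both components are stamped against the same clock, which is the dependency the invention aims to remove.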

As described above, time stamps are added on the transmitter side. The transmitter has a system clock that is common to the entire system and, using it as a reference, adds a time stamp to each distribution unit of each component before delivery.

Patent Document 1: JP 2012-10009 A
Patent Document 2: JP 2012-205075 A

In such a system there is an independent transmitter for each path, and the time stamps depend on the system clock of each transmitter. Various schemes that achieve synchronized presentation by synchronizing the system clocks of the transmitters have been proposed (for example, Patent Documents 1 and 2). With such schemes, however, the service provider must put in place a mechanism for synchronizing the system clocks of the transmitters.

An object of the present invention is to provide a receiving apparatus that can perform synchronization control of signals transmitted over a plurality of different transmission paths without any adjustment of time stamps on the transmitter side.

A receiving apparatus according to the present invention comprises: a first communication unit that receives a first signal transmitted over a first transmission path; a first separation unit that separates the first signal into a first video signal and a first audio signal; a first buffer unit that stores the first video signal and the first audio signal; a first video decoding unit that decodes the first video signal; a first audio decoding unit that decodes the first audio signal; a first feature extraction unit that extracts feature information from the first audio signal decoded by the first audio decoding unit and generates a first feature signal; a second communication unit that receives a second signal transmitted over a second transmission path different from the first transmission path; a second separation unit that separates the second signal into a second video signal and a second audio signal; a second buffer unit that stores the second video signal and the second audio signal; a second video decoding unit that decodes the second video signal; a second audio decoding unit that decodes the second audio signal; a second feature extraction unit that extracts feature information from the second audio signal decoded by the second audio decoding unit and generates a second feature signal; a delay processing unit that calculates a delay amount of the second feature signal based on the first feature signal; and a reproduction clock generation unit that determines, based on the delay amount calculated by the delay processing unit, the timing at which a reproduction clock is output. When the reproduction clock generated by the reproduction clock generation unit is input, the first buffer unit outputs the first video signal to the first video decoding unit and the first audio signal to the first audio decoding unit. The first feature extraction unit converts the first audio signal into frequency-domain waveform information by Fourier transform, and the second feature extraction unit converts the second audio signal into frequency-domain waveform information by Fourier transform.

With this configuration, the receiving apparatus calculates the delay amount of the second feature signal based on the first feature signal and, after a delay corresponding to that amount, outputs the first video signal stored in the first buffer unit to the first video decoding unit and the first audio signal to the first audio decoding unit, thereby synchronizing the first signal and the second signal. Signals transmitted over a plurality of different transmission paths can therefore be synchronized without adjusting time stamps on the transmitter side.

In the receiving apparatus, the first feature extraction unit may reduce the amount of information by sampling the first audio signal at a predetermined sampling frequency and convert the reduced first audio signal into frequency-domain waveform information by Fourier transform, and the second feature extraction unit may reduce the amount of information by sampling the second audio signal at a predetermined sampling frequency and convert the reduced second audio signal into frequency-domain waveform information by Fourier transform.

With this configuration, the receiving apparatus reduces the amount of information while retaining the characteristic portion of the first signal received by the first communication unit, and likewise reduces the amount of information while retaining the characteristic portion of the second signal received by the second communication unit, so the delay processing unit can calculate the delay amount efficiently.

In the receiving apparatus, the first feature extraction unit may generate the first feature signal by removing high-frequency components from the first audio signal obtained by the Fourier transform, and the second feature extraction unit may generate the second feature signal by removing high-frequency components from the second audio signal obtained by the Fourier transform.

With this configuration, the receiving apparatus generates the first feature signal by reducing the amount of information while retaining the characteristic portion of the first signal received by the first communication unit, and generates the second feature signal by reducing the amount of information while retaining the characteristic portion of the second signal received by the second communication unit, so the delay processing unit can calculate the delay amount efficiently.

According to the present invention, synchronization control of signals transmitted over a plurality of different transmission paths can be performed.

FIG. 1 is a diagram showing the configuration of the broadcasting/communication cooperation system.
FIG. 2 is a diagram showing a specific configuration of the broadcasting/communication cooperation system.
FIG. 3 is a flowchart explaining the processing procedure of the receiver.
FIG. 4 is a flowchart explaining the processing procedure of the first feature extraction unit.
FIG. 5 is a flowchart explaining the processing procedure of the delay processing unit.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.
As shown in FIG. 1, the broadcasting/communication cooperation system 100 comprises a broadcasting station 1, a broadcasting antenna 2, a receiver 3, and an IP transmission server 4, and realizes a service in which broadcasting and communication cooperate. Specifically, in the broadcasting/communication cooperation system 100, the receiver 3 links a broadcasting service based on the ISDB (Integrated Services Digital Broadcasting) system with a communication service over an IP network such as the Internet N. The user can use the broadcasting/communication cooperation service through the receiver 3. In this embodiment the broadcasting/communication cooperation system 100 is described as conforming to the ISDB system, but it is not limited to ISDB and may use another system.

The broadcast signal broadcast from the broadcasting station 1 (broadcasting antenna 2) is identical to the conventional digital broadcast signal transmitted by broadcasting equipment and, as described in detail later, is specified by the ARIB (Association of Radio Industries and Businesses) standards.

Although not shown, the broadcasting station 1 has general digital broadcasting equipment consisting of program scheduling equipment, program sending equipment, transmission equipment, and the like.

The broadcasting station 1 also broadcasts a broadcast signal that contains content. The content includes programs, which are normal content broadcast according to a broadcast schedule, and events, which are emergency content that occurs asynchronously with the programs.

The receiver 3 of the broadcasting/communication cooperation system 100 configured in this way has a function of simultaneously processing video and audio delivered by broadcast wave and over the IP network, using the ISDB system, a standard for handling video and audio. One configuration example for realizing this function is described below.

As shown in FIG. 2, the receiver 3 comprises a first communication unit 11, a first separation unit 12, a first buffer unit 13, a first video decoding unit 14, a first audio decoding unit 15, a first feature extraction unit 16, a second communication unit 17, a second separation unit 18, a second buffer unit 19, a second video decoding unit 20, a second audio decoding unit 21, a second feature extraction unit 22, a delay processing unit 23, and a reproduction clock generation unit 24. In the following description, the first communication unit 11 receives the broadcast signal and the second communication unit 17 receives the communication signal transmitted over an IP network such as the Internet N. The configuration is not limited to this, however: the first communication unit 11 may receive the communication signal while the second communication unit 17 receives the broadcast signal.

For example, when broadcasting a baseball game, the broadcasting station 1 captures video of the pitcher with a first camera 101 and video of the batter with a second camera 102. A microphone 103 picks up the cheering in the stadium or the announcer's voice.
The broadcasting station 1 encodes the video captured by the first camera 101 with a first video encoding unit 104, encodes the audio picked up by the microphone 103 with a first audio encoding unit 105, multiplexes the encoded video and audio with a first multiplexing unit 106, and broadcasts the multiplexed signal to each home as a broadcast signal via the broadcasting antenna 2.
The broadcasting station 1 also transmits the video captured by the second camera 102 and the audio picked up by the microphone 103 to the IP transmission server 4.
The IP transmission server 4 encodes the video captured by the second camera 102 with a second video encoding unit 107, encodes the audio picked up by the microphone 103 with a second audio encoding unit 108, multiplexes the encoded video and audio with a second multiplexing unit 109, and transmits the multiplexed signal as a communication signal over an IP network such as the Internet N.
In other words, the broadcast signal and the communication signal contain different video but the same audio.

The first communication unit 11 receives the first signal transmitted over the first transmission path. Specifically, the first communication unit 11 receives the broadcast signal (first signal) broadcast from the broadcasting station 1 via the broadcasting antenna 2.

The first separation unit 12 separates the first signal into a first video signal and a first audio signal.
The first buffer unit 13 stores the first video signal and the first audio signal.
The first video decoding unit 14 decodes the first video signal.
The first audio decoding unit 15 decodes the first audio signal.
The first feature extraction unit 16 extracts feature information from the first audio signal decoded by the first audio decoding unit 15 and generates a first feature signal.

The second communication unit 17 receives the second signal transmitted over a second transmission path different from the first transmission path. Specifically, the second communication unit 17 receives the communication signal (second signal) transmitted from the IP transmission server 4 over an IP network such as the Internet N.

When the first communication unit 11 receives the broadcast signal (first signal), the receiver 3 extracts the unique content ID contained in the broadcast signal and sends that content ID to the IP transmission server 4, thereby requesting the communication signal associated with the broadcast signal received by the first communication unit 11. The IP transmission server 4 transmits the content corresponding to the content ID sent from the receiver 3 to the receiver 3.

The second separation unit 18 separates the second signal into a second video signal and a second audio signal.
The second buffer unit 19 stores the second video signal and the second audio signal.
The second video decoding unit 20 decodes the second video signal.
The second audio decoding unit 21 decodes the second audio signal.
The second feature extraction unit 22 extracts feature information from the second audio signal decoded by the second audio decoding unit 21 and generates a second feature signal.

The delay processing unit 23 calculates the delay amount of the second feature signal based on the first feature signal.
The reproduction clock generation unit 24 determines the timing at which the reproduction clock is output, based on the delay amount calculated by the delay processing unit 23.
When the reproduction clock generated by the reproduction clock generation unit 24 is input, the first buffer unit 13 outputs the first video signal to the first video decoding unit 14 and the first audio signal to the first audio decoding unit 15.
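The interaction between the reproduction clock and the first buffer unit can be sketched roughly as below. This is a simplified model under assumptions made here: the class `GatedBuffer`, the helper `clock_times`, the tick period, and the unit labels are hypothetical and do not come from the description, which does not specify how the clock is realized.

```python
from collections import deque

class GatedBuffer:
    """Holds demultiplexed units until a reproduction clock tick arrives."""

    def __init__(self):
        self._units = deque()

    def store(self, video_unit, audio_unit):
        """Accumulate one video/audio pair from the separation unit."""
        self._units.append((video_unit, audio_unit))

    def on_clock(self):
        """Release the oldest stored pair to the decoders, or None if empty."""
        return self._units.popleft() if self._units else None

def clock_times(start, delay, period, count):
    """Tick times of a clock whose first tick is postponed by `delay` seconds."""
    return [start + delay + i * period for i in range(count)]

buffer = GatedBuffer()
for i in range(3):
    buffer.store(f"video{i}", f"audio{i}")

# Hypothetical numbers: a 0.5 s measured delay and a 0.25 s tick period.
ticks = clock_times(start=0.0, delay=0.5, period=0.25, count=3)
released = [(t, buffer.on_clock()) for t in ticks]
```

Postponing only the first tick shifts the whole first stream by the measured delay while leaving its internal pacing unchanged.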

The first video signal decoded by the first video decoding unit 14, the first audio signal decoded by the first audio decoding unit 15, the second video signal decoded by the second video decoding unit 20, and the second audio signal decoded by the second audio decoding unit 21 are supplied to an output unit 25. The output unit 25 consists of a display and a speaker (not shown).

The processing procedure of the receiver 3 will now be described with reference to the flowchart shown in FIG. 3. In the following, it is assumed that the first communication unit 11 has received the broadcast signal (first signal) broadcast from the broadcasting station 1 via the broadcasting antenna 2, and that the second communication unit 17 has received the communication signal (second signal) transmitted from the IP transmission server 4 over an IP network such as the Internet N.

In step S1, the first separation unit 12 separates the first signal into a first video signal and a first audio signal.
In step S2, the first audio decoding unit 15 decodes the first audio signal.
In step S3, the first feature extraction unit 16 extracts feature information from the first audio signal and generates a first feature signal.

In step S4, the second separation unit 18 separates the second signal into a second video signal and a second audio signal.
In step S5, the second audio decoding unit 21 decodes the second audio signal.
In step S6, the second feature extraction unit 22 extracts feature information from the second audio signal and generates a second feature signal.
In step S7, the delay processing unit 23 calculates the delay amount of the second feature signal based on the first feature signal.

In step S8, the reproduction clock generation unit 24 determines the timing at which the reproduction clock is output, based on the delay amount calculated by the delay processing unit 23. The reproduction clock generation unit 24 supplies the reproduction clock to the first buffer unit 13 at the determined timing.

In this way, the receiver 3 calculates the delay amount of the second feature signal based on the first feature signal and, after a delay corresponding to that amount, outputs the first video signal stored in the first buffer unit 13 to the first video decoding unit 14 and the first audio signal to the first audio decoding unit 15, thereby synchronizing the first signal and the second signal. The receiver 3 can therefore perform, by itself, synchronization control of signals transmitted over a plurality of different transmission paths without any adjustment of time stamps on the sending side (the broadcasting station 1 and the IP transmission server 4).

The first feature extraction unit 16 may reduce the amount of information by sampling the first audio signal at a predetermined sampling frequency, apply a Fourier transform to the reduced first audio signal, and generate the first feature signal by removing high-frequency components from the first audio signal obtained by the Fourier transform.

The second feature extraction unit 22 may reduce the amount of information by sampling the second audio signal at a predetermined sampling frequency, apply a Fourier transform to the reduced second audio signal, and generate the second feature signal by removing high-frequency components from the second audio signal obtained by the Fourier transform.

The processing procedure of the first feature extraction unit 16 will now be described with reference to the flowchart shown in FIG. 4. The processing procedure of the second feature extraction unit 22 is basically the same as the procedure below.
In step S11, the first feature extraction unit 16 samples the first audio signal at a frequency fs (for example, 22 kHz) that is lower than the sampling frequency used for the first audio signal at the broadcasting station 1 (for example, 44.1 kHz). This step thins out the information in the first audio signal and reduces the overall amount of data. The sampling frequency fs is variable.
In step S12, the first feature extraction unit 16 applies a Fourier transform to the sampled first audio signal. This step converts the first audio signal into frequency-domain waveform information.
In step S13, the first feature extraction unit 16 removes high-frequency components from the first audio signal based on a cutoff frequency fc and generates the first feature signal. The cutoff frequency fc is variable.
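Steps S11 to S13 can be illustrated with the following sketch. It makes several assumptions that are not in the description: the thinning is done by keeping every n-th sample with an integer step and no anti-aliasing filter, a naive DFT stands in for whatever transform the receiver actually uses, high bins are zeroed rather than discarded, and the names `dft` and `extract_feature` and the test tone are hypothetical.

```python
import cmath
import math

def dft(samples):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(samples)
    return [sum(s * cmath.exp(-2j * cmath.pi * m * k / n)
                for k, s in enumerate(samples))
            for m in range(n)]

def extract_feature(samples, source_rate, fs, fc):
    """Steps S11 to S13: thin out samples, transform, zero bins above fc."""
    step = max(1, round(source_rate / fs))   # S11: keep every step-th sample
    reduced = samples[::step]
    rate = source_rate / step                # effective rate after thinning
    spectrum = dft(reduced)                  # S12: frequency-domain waveform
    n = len(reduced)
    feature = []
    for m, value in enumerate(spectrum):
        # Bins above n/2 mirror the negative frequencies of a real signal.
        freq = min(m, n - m) * rate / n
        feature.append(value if freq <= fc else 0j)   # S13: remove high band
    return feature

# Hypothetical test tone: 100 Hz plus 1500 Hz, sampled at 8 kHz.
source_rate = 8000
signal = [math.sin(2 * math.pi * 100 * k / source_rate)
          + math.sin(2 * math.pi * 1500 * k / source_rate)
          for k in range(160)]
feature = extract_feature(signal, source_rate, fs=4000, fc=500)
```

In this example the 100 Hz component survives in bin 2 while the 1500 Hz component in bin 30 is zeroed. A practical implementation would probably low-pass filter before thinning to limit aliasing, which this sketch ignores.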

With this configuration, the receiver 3 generates the first feature signal by reducing the amount of information while retaining the characteristic portion of the first signal received by the first communication unit 11, and generates the second feature signal by reducing the amount of information while retaining the characteristic portion of the second signal received by the second communication unit 17, so the downstream delay processing unit 23 can calculate the delay amount efficiently and in a short processing time.

The delay processing unit 23 may calculate a correlation function based on the first feature signal and the second feature signal, apply an inverse Fourier transform to the calculated correlation function, and calculate the delay amount from the inverse-transformed correlation function.

The processing procedure of the delay processing unit 23 will now be described with reference to the flowchart shown in FIG. 5. In the following, the first feature signal is referred to as frequency-domain waveform information X(f) and the second feature signal as frequency-domain waveform information Y(f).

In step S21, the delay processing unit 23 calculates the correlation function Z(f) in the frequency domain:
$$Z(f) = X^{*}(f) \times Y(f) \tag{1}$$
where $^{*}$ denotes the complex conjugate.

In step S22, the delay processing unit 23 performs an inverse fast Fourier transform (inverse FFT) to calculate the correlation function Z(t) in the time domain:
$$Z(t) = F^{-1}\bigl(X^{*}(f) \times Y(f)\bigr) \tag{2}$$
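As a brief check of why the peak of Z(t) identifies the delay, consider the idealized case in which the second audio signal is an exact copy of the first delayed by τ. This idealization, the time-domain signals x(t) and y(t) corresponding to X(f) and Y(f), and the autocorrelation r_xx of x are assumptions introduced here for illustration:

```latex
\begin{align*}
y(t) &= x(t-\tau) \;\Longrightarrow\; Y(f) = X(f)\,e^{-j 2\pi f \tau},\\
Z(f) &= X^{*}(f)\,Y(f) = |X(f)|^{2}\,e^{-j 2\pi f \tau},\\
Z(t) &= F^{-1}\bigl(|X(f)|^{2}\,e^{-j 2\pi f \tau}\bigr) = r_{xx}(t-\tau).
\end{align*}
```

Because an autocorrelation attains its maximum at zero lag, Z(t) in this idealized case peaks at t = τ.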

In step S23, the delay processing unit 23 calculates the delay amount τ. For example, the delay processing unit 23 finds the value of t at which the correlation function Z(t) is maximized and takes that t as the delay amount τ.
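Steps S21 to S23 can be sketched as follows. This is an illustration under assumptions made here, not the receiver's actual implementation: the example starts from short time-domain sequences and computes their spectra itself, a naive DFT replaces the FFT, the correlation is circular, and the names `dft` and `estimate_delay` and the sample values are hypothetical.

```python
import cmath

def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for m in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * m * k / n)
                  for k, v in enumerate(values))
        result.append(acc / n if inverse else acc)
    return result

def estimate_delay(x, y):
    """Return the lag, in samples, at which y best matches a delayed copy of x."""
    X = dft(x)
    Y = dft(y)
    Z_f = [a.conjugate() * b for a, b in zip(X, Y)]  # equation (1)
    Z_t = dft(Z_f, inverse=True)                     # equation (2)
    # Step S23: the index of the largest correlation value is the delay.
    return max(range(len(Z_t)), key=lambda t: Z_t[t].real)

x = [0.0, 1.0, 3.0, -2.0, 0.5] + [0.0] * 11   # 16 samples
y = x[-3:] + x[:-3]                           # x delayed by 3 samples (circular shift)
tau = estimate_delay(x, y)
```

Here `tau` is a lag in samples; dividing it by the sampling rate of the feature signals would give the delay in seconds. Because the correlation is circular, a lag above half the sequence length would correspond to the second signal leading rather than lagging.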

With this configuration, the receiver 3 calculates a correlation function based on the first feature signal and the second feature signal, applies an inverse FFT to the calculated correlation function, calculates the delay amount from the inverse-transformed correlation function and, after a delay corresponding to that amount, outputs the first video signal stored in the first buffer unit 13 to the first video decoding unit 14 and the first audio signal to the first audio decoding unit 15, thereby synchronizing the first signal and the second signal. The receiver 3 can therefore perform, by itself, synchronization control of signals transmitted over a plurality of different transmission paths without any adjustment of time stamps on the sending side (the broadcasting station 1 and the IP transmission server 4).

This embodiment has mainly described the configuration and operation of a receiver that performs synchronization control of signals transmitted over a plurality of different transmission paths without adjusting time stamps on the transmitter side. The invention is not limited to this, however, and may also be embodied as a method and a program that comprise the respective components and perform synchronization control of signals transmitted over a plurality of different transmission paths without adjusting time stamps on the transmitter side.

Furthermore, the functions of the receiver may be realized by recording a program for realizing them on a computer-readable recording medium and having a computer system read and execute the program recorded on that medium.

The "computer system" referred to here includes an OS and hardware such as peripheral devices. A "computer-readable recording medium" refers to portable media such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, and to storage devices such as a hard disk built into the computer system.

A "computer-readable recording medium" may also include something that holds the program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line. It may likewise include something that holds the program for a certain period of time, such as volatile memory inside the computer system that acts as the server or client in that case. The program may implement only part of the functions described above, or it may implement those functions in combination with a program already recorded in the computer system.

Description of symbols

1 Broadcast station
2 Broadcast antenna
3 Receiver
4 IP transmission server
11 First communication unit
12 First separation unit
13 First buffer unit
14 First video decoding unit
15 First audio decoding unit
16 First feature extraction unit
17 Second communication unit
18 Second separation unit
19 Second buffer unit
20 Second video decoding unit
21 Second audio decoding unit
22 Second feature extraction unit
23 Delay processing unit
24 Reproduction clock generation unit
25 Output unit
100 Broadcast-communication cooperation system

Claims

A receiving device comprising:
a first communication unit that receives a first signal transmitted over a first transmission path;
a first separation unit that separates the first signal into a first video signal and a first audio signal;
a first buffer unit that stores the first video signal and the first audio signal;
a first video decoding unit that decodes the first video signal;
a first audio decoding unit that decodes the first audio signal;
a first feature extraction unit that extracts feature information from the first audio signal decoded by the first audio decoding unit to generate a first feature signal;
a second communication unit that receives a second signal transmitted over a second transmission path different from the first transmission path;
a second separation unit that separates the second signal into a second video signal and a second audio signal;
a second buffer unit that stores the second video signal and the second audio signal;
a second video decoding unit that decodes the second video signal;
a second audio decoding unit that decodes the second audio signal;
a second feature extraction unit that extracts feature information from the second audio signal decoded by the second audio decoding unit to generate a second feature signal;
a delay processing unit that calculates a delay amount of the second feature signal based on the first feature signal; and
a reproduction clock generation unit that determines a timing for outputting a reproduction clock based on the delay amount calculated by the delay processing unit,
wherein the first buffer unit, when the reproduction clock generated by the reproduction clock generation unit is input, outputs the first video signal to the first video decoding unit and outputs the first audio signal to the first audio decoding unit,
the first feature extraction unit converts the first audio signal into frequency-domain waveform information by Fourier transform, and
the second feature extraction unit converts the second audio signal into frequency-domain waveform information by Fourier transform.

The receiving device according to claim 1, wherein
the first feature extraction unit reduces the amount of information by sampling the first audio signal at a predetermined sampling frequency, and converts the first audio signal with the reduced amount of information into frequency-domain waveform information by Fourier transform, and
the second feature extraction unit reduces the amount of information by sampling the second audio signal at a predetermined sampling frequency, and converts the second audio signal with the reduced amount of information into frequency-domain waveform information by Fourier transform.

The receiving device according to claim 1 or 2, wherein
the first feature extraction unit generates the first feature signal by removing high-frequency components from the Fourier-transformed first audio signal, and
the second feature extraction unit generates the second feature signal by removing high-frequency components from the Fourier-transformed second audio signal.
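
As an informal illustration of the feature extraction recited in the dependent claims (resampling to reduce the amount of information, Fourier transform, removal of high-frequency components), the following Python sketch may help. It is not part of the claims. The function name `extract_feature` and the parameters `decimation` and `keep_bins` are hypothetical, the publication does not specify a sampling frequency or a cutoff, and the resampling here simply keeps every `decimation`-th sample without the anti-aliasing filter a practical implementation would likely add.

```python
import numpy as np


def extract_feature(audio, decimation, keep_bins):
    """Return a frequency-domain feature signal for a block of decoded audio.

    audio      -- 1-D sequence of decoded audio samples
    decimation -- keep every `decimation`-th sample (assumed parameter)
    keep_bins  -- number of low-frequency FFT bins to retain (assumed parameter)
    """
    samples = np.asarray(audio, dtype=float)

    # Reduce the amount of information by sampling at a lower rate.
    # Naive decimation: no anti-aliasing filter is applied in this sketch.
    reduced = samples[::decimation]

    # Convert the reduced signal to frequency-domain waveform information.
    spectrum = np.fft.rfft(reduced)

    # Remove high-frequency components by zeroing the upper bins.
    spectrum[keep_bins:] = 0.0
    return spectrum


# Example: one second of audio at 8000 Hz with tones at 100 Hz and 900 Hz.
t = np.arange(8000) / 8000.0
audio = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 900 * t)
feature = extract_feature(audio, decimation=4, keep_bins=500)
```

With these example values the reduced signal has 2000 samples at 2000 Hz, so each bin is 1 Hz wide; the 100 Hz tone is retained in the feature signal and the 900 Hz tone is removed.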