JP2005057638A

JP2005057638A - Digital broadcasting device and reception terminal device corresponding to digital broadcasting

Info

Publication number: JP2005057638A
Application number: JP2003288718A
Authority: JP
Inventors: Hiroyoshi Urakawa; 裕喜浦川; Noriaki Oomoto; 紀顕大本; Etsumi Sakaguchi; 悦美坂口
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2003-08-07
Filing date: 2003-08-07
Publication date: 2005-03-03
Anticipated expiration: 2023-08-07
Also published as: JP4269840B2

Abstract

PROBLEM TO BE SOLVED: To solve the problem that TV viewers whose eyesight, reading speed, or hearing has deteriorated, such as the elderly, have difficulty understanding quickly displayed text or rapidly spoken speech when obtaining speech and subtitle text information.
SOLUTION: A digital broadcasting system consists of a digital broadcasting transmission device 50 and a reception device 60 that receives the broadcast. The digital broadcasting transmission device 50 can broadcast a plurality of kinds of speech information, at least one of which includes speech data whose speech speed has been converted, and can broadcast a plurality of kinds of subtitle information, at least one of which can include subtitle data whose subtitle display speed has been converted.
COPYRIGHT: (C)2005, JPO&NCIPI

Description

The present invention relates to a digital signal processing apparatus, and more particularly to a receiving terminal apparatus that performs digital broadcast signal processing and data signal processing.

In recent years, television receivers have diversified in function, and receivers that perform a variety of operations are known. One example is described in Patent Document 1, which is explained here with reference to the drawings as a conventional broadcast receiving apparatus. FIG. 7 shows an example configuration of a conventional television receiver, and FIG. 8 shows an example of its operation. The receiver described here is a "television for the elderly". The conventional broadcast receiving apparatus is described below with reference to FIGS. 7 and 8.

FIG. 7 is a block diagram of the conventional broadcast receiving apparatus. The configuration described here is that of an example that reads aloud the telops (on-screen text) shown on a television for the elderly. Reference numeral 101 denotes the television body, which comprises: a tuning/demodulation unit that tunes to, receives, and demodulates a television signal to extract a video signal and an audio signal; a remote control unit that receives and demodulates a remote control signal; a video switching unit (11b) that switches between a supplied video signal and the video signal demodulated by the television itself for display output; an audio switching unit (11a) that switches between a supplied audio signal and the audio signal demodulated by the television itself for output; a speaker unit that outputs sound; and a display unit that displays images.

Reference numeral 102 denotes a support device, which comprises: a decoding unit (122) that decodes the code of the demodulated remote control signal; an image memory unit (123) that stores the television video signal frame by frame; a character recognition unit (121) that recognizes telop characters contained in the television video signal; a phrase extraction unit (133) that extracts the phrases of the telop from the recognized characters; a sentence generation unit (126) that generates a sentence with pauses inserted between phrases; a speech rate adjustment unit (134) that adjusts the rate at which the generated sentence is read out; a speech synthesis unit (125) that synthesizes speech from the rate-adjusted sentence and converts it into an audio signal; and a control unit (136) that controls each unit. In the actual device configuration, the video switching unit (11b) and the audio switching unit (11a) are provided in the television body 101, while the image memory unit (123), character recognition unit (121), phrase extraction unit (133), sentence generation unit (126), speech rate adjustment unit (134), speech synthesis unit (125), and control unit (136) are provided in the support device. The support device can be built as, for example, a unit or a circuit board that attaches to and detaches from a connector on the television body 101, so that the television body 101 and the support device can be separate.

The operation of the conventional example that reads telops aloud on a television for the elderly is described with reference to FIGS. 7 and 8. FIG. 8 is an explanatory diagram of the telop read-aloud method, showing (a) a screen with an example telop and (b) examples of generated sentences. The demodulated television video signal is stored frame by frame in the image memory unit (123) at a required interval, for example once per second. The character recognition unit (121) recognizes the telop characters contained in the video signal stored in the image memory unit (123); for example, it recognizes and extracts the characters one at a time from the telop "○○市を中心に地震が・・・" ("An earthquake centered on XX City...") shown in FIG. 8(a).

The phrase extraction unit (133) extracts the telop's phrases from the recognized characters, for example "○○市を" ("XX City"), "中心に" ("centered on"), "地震が" ("an earthquake"), and so on. The sentence generation unit (126) generates a sentence with pauses inserted between phrases, for example a sentence with the pause mark "・" between phrases, as shown in (イ) of FIG. 8(b).

The speech rate adjustment unit (134) adjusts the rate at which the generated sentence is read out, for example reading it out at two characters per second, and supplies it to the speech synthesis unit (125). The speech synthesis unit (125) synthesizes speech from the rate-adjusted sentence to generate an audio signal, which it supplies to the audio switching unit (11a) connected at the next stage. The audio switching unit (11a) switches between the supplied audio signal and the television audio signal demodulated by the television itself and outputs the result to the speaker unit. In an example in which a memory unit (135) is added to the above configuration, as shown in FIG. 7, the sentence generated by the sentence generation unit (126) is stored in the memory unit (135) and can be read out repeatedly as needed, so the news content is conveyed more reliably. Further details of how the sentence generation unit (126) generates sentences are described below.

As shown in (イ), (ロ), and (ハ) of FIG. 8(b), the sentence generation unit (126) is not limited to a single kind of pause ("・" in the figure); it can insert pauses of arbitrary length, for example in 10-millisecond steps. It also prepends chime sound data ("☆" in the figure) to the sentence. The speech synthesis unit (125) synthesizes the chime at arbitrary frequencies, for example 400 Hz and 800 Hz.

In an example in which a first storage unit (131) that registers words and a discrimination unit (127) that identifies words are added to the above configuration, as shown in FIG. 7, the discrimination unit (127) identifies important words in the telop and, based on the result, the sentence generation unit (126) generates a sentence in which the important word is repeated. For example, as shown in (ロ) of FIG. 8(b), "地震" ("earthquake") is registered in the first storage unit (131); when the discrimination unit (127) identifies the important word "地震" in the telop, a sentence that repeats it is generated, which conveys the news content more reliably.

Furthermore, the discrimination unit (127) can identify sentence-ending phrases in the telop, and based on the result the sentence generation unit (126) can generate a sentence that omits the telop's sentence-ending phrase. For example, as shown in (ハ) of FIG. 8(b), sentence-ending phrases such as "しました" and "した" (the polite and plain forms of "did") are registered in association with each other in the first storage unit (131), and the sentence generation unit (126) can generate a sentence that omits a sentence-ending phrase such as "しました". Alternatively, the discrimination unit (127) may identify the sentence-ending phrase "しました" in the telop, and the sentence generation unit may generate a sentence in which it is replaced by the shorter ending "した" ((ニ) of FIG. 8(b)).
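The sentence variants (イ) to (ニ) described for FIG. 8(b) can be illustrated with a short sketch. It is not taken from Patent Document 1: the function name `generate_sentence`, its parameters, and the choice to repeat the whole phrase containing an important word are assumptions made for illustration only. Phrase extraction and speech synthesis are outside its scope.

```python
# Illustrative sketch only; all names here are hypothetical.
PAUSE = "・"  # pause mark inserted between phrases, as in FIG. 8(b)
CHIME = "☆"  # chime mark placed at the head of the sentence


def generate_sentence(phrases, important=(), endings=None, chime=True):
    """Build a read-aloud sentence from already extracted telop phrases.

    phrases   -- list of phrases from the phrase extraction step
    important -- registered important words; a phrase containing one is repeated
    endings   -- maps a sentence-ending phrase to its replacement;
                 an empty string omits the ending altogether
    """
    phrases = list(phrases)
    endings = endings or {}

    # Shorten or omit the sentence-ending phrase (variants (ハ) and (ニ)).
    if phrases and phrases[-1] in endings:
        replacement = endings[phrases[-1]]
        phrases[-1:] = [replacement] if replacement else []

    out = []
    for phrase in phrases:
        out.append(phrase)
        # Repeat phrases that contain a registered important word (variant (ロ)).
        if any(word in phrase for word in important):
            out.append(phrase)

    body = PAUSE.join(out)  # pauses between phrases (variant (イ))
    return CHIME + body if chime else body


print(generate_sentence(["○○市を", "中心に", "地震が"]))
```

A real implementation would also carry a pause length for each pause, since the text describes pauses adjustable in 10-millisecond steps; this sketch uses a single pause mark for brevity.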

These functions shorten the time needed to read the text aloud, so when a telop spans multiple pages it becomes easier to keep the read-aloud in step with the on-screen telop display. In an example in which an on-screen unit (124) is added to the above configuration, as shown in FIG. 7, the on-screen unit (124) generates an on-screen video signal from the generated sentence and supplies it to the video switching unit (11b). Fixed-form telops such as broadcast news flashes can therefore be displayed freely in an easy-to-understand form, for example in large characters or with important words repeated.

As described above, the conventional broadcast receiver performs character recognition on telops shown in the video and converts their display speed, and it converts audio speed in synchronization with the telop display.
JP 2000-69390 A (FIG. 1)

When a viewer obtains text information on a conventional broadcast receiver, viewers whose eyesight or reading speed has declined, such as the elderly, find it hard to follow the content when text is displayed quickly. The conventional receiver addresses this by analyzing the text in image frames and allowing the display speed of specific text to be changed. However, an analog broadcast signal carries no speed-converted audio data or subtitle (telop) data, so the receiver has to capture the image and then perform everything from recognition and decoding to display adjustment itself. This requires additional capture memory and recognition and decoding circuits, which increases circuit scale. Furthermore, the only audio function provided is changing the read-aloud speed of the voice guidance for the telop portion, so the speed of the broadcast program's own audio cannot be converted.

To solve this problem, in the present invention, digital broadcasting capable of transmitting broadcast data composed of video information, audio information, text information, and the like can broadcast a plurality of types of audio information, at least one of which contains audio data whose speech speed has been converted.

Likewise, the digital broadcasting can broadcast a plurality of types of subtitle information, at least one of which contains subtitle data whose subtitle display speed has been converted.

The receiving terminal comprises: a tuner capable of receiving digital broadcasts; demodulation means for selecting the desired transport stream from the tuner output; transport decoding means for selecting and outputting video and audio data, and for selecting and acquiring subtitle information data, from the transport stream output by the demodulation means; a RAM for expanding and storing the subtitle information data; video decoding means capable of decoding the video data acquired by the transport decoding means; audio decoding means capable of decoding the audio data acquired by the transport decoding means; video compositing means capable of combining the output of the video decoding means with the decoded subtitle data; and control means for controlling the demodulation means, the transport decoding means, the video decoding means, the audio decoding means, and the video compositing means. At the user's selection, the terminal can decode the speed-converted audio data sent in the broadcast and can convert the subtitle speed to match the speed-converted audio.

With the same configuration, when subtitle data with converted display speed is also broadcast together with the speed-converted audio data, the terminal can automatically select the speed-converted subtitle data when the user selects the speed-converted audio data. Conversely, when the user selects the subtitle data with converted display speed, the terminal can automatically select, decode, and output the speed-converted audio data.

As described above, according to the present invention, digital broadcasting capable of transmitting broadcast data composed of video information, audio information, text information, and the like can broadcast a plurality of types of audio information, at least one of which contains speed-converted audio data. Elderly users and others who find fast speech in television broadcasts hard to catch, and therefore struggle to understand the content, can thus obtain audio information easily from the speed-converted audio broadcast.

Likewise, the digital broadcasting can broadcast a plurality of types of subtitle information, at least one of which contains subtitle data whose display speed has been converted. Elderly users and others who cannot finish reading quickly displayed subtitles before the next one appears, and therefore struggle to understand the content, can thus obtain subtitle information easily from the speed-converted subtitle broadcast.

Further, a receiving terminal device comprises: a tuner capable of receiving digital broadcasts; demodulation means for selecting the desired transport stream from the tuner output; transport decoding means for selecting and outputting video and audio data, and for selecting and acquiring subtitle information data, from the transport stream output by the demodulation means; a RAM for expanding and storing the subtitle information data; video decoding means capable of decoding the video data acquired by the transport decoding means; audio decoding means capable of decoding the audio data acquired by the transport decoding means; video compositing means capable of combining the output of the video decoding means with the decoded subtitle data; and control means for controlling the demodulation means, the transport decoding means, the video decoding means, the audio decoding means, and the video compositing means. With this configuration, the terminal can, at the user's selection, decode the speed-converted audio data sent in the broadcast and convert the subtitle speed to match the speed-converted audio.

With the same configuration as the above receiving terminal device, when subtitle data with converted display speed is also broadcast together with the speed-converted audio data, the speed-converted subtitle data can be selected automatically when the user selects the speed-converted audio data.

Likewise, with the same configuration as the above receiving terminal device, when subtitle data with converted display speed is also broadcast together with the speed-converted audio data, the speed-converted audio data can be selected and decoded automatically when the user selects the subtitle data with converted display speed.
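The linked selection described above, in which choosing one speed-converted stream automatically selects its counterpart, can be sketched as follows. The function name `choose`, the variant labels "normal" and "slow", and the shape of the `available` mapping are hypothetical. The fallback to the normal variant when no converted counterpart is broadcast is a simplification made here; for that case the description instead has the receiver itself adjust the subtitle speed to the converted audio.

```python
# Illustrative sketch only; names and labels are hypothetical.

def choose(kind, variant, available):
    """Return the audio and subtitle variants to decode after a user choice.

    kind      -- "audio" or "subtitle": the stream type the user selected
    variant   -- e.g. "normal" or "slow" (speed-converted)
    available -- variants present in the broadcast, for example
                 {"audio": {"normal", "slow"}, "subtitle": {"normal", "slow"}}
    """
    if kind not in ("audio", "subtitle"):
        raise ValueError(f"unknown stream kind: {kind}")
    if variant not in available[kind]:
        raise ValueError(f"{variant} {kind} is not being broadcast")

    other = "subtitle" if kind == "audio" else "audio"
    # Follow the user's choice on the other stream type when it is broadcast;
    # otherwise stay on the normal variant (an assumption of this sketch).
    linked = variant if variant in available[other] else "normal"
    return {kind: variant, other: linked}


both = {"audio": {"normal", "slow"}, "subtitle": {"normal", "slow"}}
print(choose("audio", "slow", both))
```

The same function covers both directions, so selecting slow subtitles yields slow audio in the same way.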

The principal feature of the present invention is that speed-converted audio and subtitles are transmitted in the broadcast wave and can be output at the receiving terminal in response to the user's instruction. The broadcasting and receiving terminal devices of the present invention achieve the aim of making all of the information accessible to viewers, such as the elderly, who find normal speech and subtitle display speeds too fast.

(Embodiment 1)
A first embodiment of the digital broadcasting device of the present invention is described below with reference to FIGS. 1 and 2, taking digital television broadcasting as an example.

In FIG. 1, 50 is a device for transmitting digital broadcasts, and 60 is a receiving terminal for receiving digital broadcasts. In the digital broadcasting system, the signal output from the digital broadcasting device 50 is received by the digital broadcast receiving terminal 60, processed, and shown on a display device. The digital broadcasting device 50 is mainly installed at a television station, and the digital broadcast transmitted by the station is received at home by the digital broadcast receiving terminal 60. The digital broadcast receiving terminal 60 may be, for example, a home television receiver, a computer capable of receiving digital broadcasts, a portable receiving terminal, or a mobile phone.

The digital broadcasting device 50 has the following configuration: 21 is a video encoder that accepts a video signal and encodes it into an MPEG signal; 22 is a first audio encoder that accepts an audio signal and encodes it into an MPEG signal; 23 is a second audio encoder that accepts a speed-converted version of the audio input to 22 and encodes it into an MPEG signal; 24 is a first subtitle data creation device that creates subtitle data synchronized with the audio output of audio encoder 22; 25 is a second subtitle data creation device that creates subtitle data synchronized with the output of audio encoder 23; 26 is a multiplexer that multiplexes the outputs of devices 21 to 25 and outputs them as a transport stream; 27 is modulation means that modulates the signal according to the modulation scheme specified for the broadcasting system; 28 is radio wave transmission means that uplinks the output of modulation means 27 to the transmitting antenna; and 29 is the output terminal to the transmitting antenna.

The operation of the digital broadcasting device 50 of FIG. 1 configured as above is described below.

In digital television broadcasting, data such as video, audio, and subtitles is created by the broadcasting device. For video, source material such as videotape or digital camera footage is input to the video encoder 21, which encodes it and outputs it as an MPEG2 video stream. Audio is handled in the same way: audio material is input to the audio encoder 22, which encodes it and outputs it as an MPEG2 audio stream. Data broadcasts such as subtitle broadcasts are generated by the subtitle data creation device 24 and output as a signal synchronized with the video and audio. An ordinary broadcast signal is created and transmitted using these three devices: the video encoder 21, the audio encoder 22, and the subtitle data creation device 24.

The digital broadcasting device 50 of this embodiment further includes a second audio encoder 23. The second audio encoder 23 receives a speed-converted version of the audio material input to the first audio encoder 22, encodes it, and outputs it as an MPEG2 audio stream. In addition, the second subtitle data creation device 25 generates subtitle data synchronized with the audio material input to the second audio encoder and outputs it as a signal synchronized with the speed-converted audio.
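The description does not state how the second subtitle data creation device 25 derives subtitle timing from the speed-converted audio. As one possible illustration, the sketch below assumes the audio is slowed uniformly by a constant factor and scales each subtitle cue's start and end times by the same factor. The `Cue` type, the `retime` function, and the uniform-rate assumption are all hypothetical and not part of the described apparatus.

```python
# Illustrative sketch only; assumes a uniform speed change, which the
# description does not specify.
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    start: float  # display start time in seconds
    end: float    # display end time in seconds
    text: str


def retime(cues, rate):
    """Rescale cue times for audio played at `rate` times normal speed.

    rate < 1.0 means slower speech (0.5 is half speed), so every cue
    starts later and stays on screen longer.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    scale = 1.0 / rate
    return [Cue(c.start * scale, c.end * scale, c.text) for c in cues]


normal = [Cue(2.0, 4.0, "first subtitle"), Cue(4.0, 5.0, "second subtitle")]
for cue in retime(normal, 0.5):
    print(cue)
```

At half speed each cue in this sketch is displayed for twice as long, which is the effect the description aims for: more reading time per subtitle.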

The multiplexer 26 multiplexes the output streams of the video encoder 21, the first audio encoder 22, the second audio encoder 23, the first subtitle data creation device 24, and the second subtitle data creation device 25 as an MPEG2 system stream and outputs the result to the modulation means 27. The modulation means 27 applies the modulation appropriate to the broadcasting system (for example, 8PSK modulation for Japanese BS digital and CS110 digital broadcasting, and OFDM modulation for terrestrial digital broadcasting) and outputs the modulated signal to the radio wave transmission means 28. The radio wave transmission means 28 uplinks the signal to the transmitting antenna and sends it out as a broadcast wave. The remainder of Embodiment 1 describes a broadcasting scheme that can broadcast speed-converted audio signals.

To broadcast speed-converted audio, the transmitting system is formed by adding the second audio encoder 23 to a broadcasting device consisting of the video encoder 21 and the first audio encoder 22. The second audio encoder 23 receives, as its audio material, a speed-converted version of the audio input to the first audio encoder, encodes it into an MPEG signal, and outputs it to the multiplexer 26.

The multiplexed audio is transmitted in the stream structure shown in FIG. 2. In FIG. 2, "AAU" stands for "Audio Access Unit", the unit of decoding; the receiver decodes one such unit at a time. In an MPEG stream, a plurality of audio signals can be multiplexed into the "Audio Data" within the AAU, so normal-speed audio and speed-converted audio can be multiplexed and transmitted as a broadcast.

The digital broadcasting device of this embodiment can multiplex and transmit audio signals of different speeds, and the audio speed can be selected when the receiving device decodes and outputs the audio. Viewers can therefore listen at a speed suited to the video or audio source. In particular, elderly users who find fast speech in television broadcasts hard to catch, and therefore struggle to understand the content, can obtain audio information easily by selecting the speed-converted audio broadcast.

(Embodiment 2)
Next, a second embodiment of the digital broadcasting device of the present invention is described with reference to FIGS. 1 and 3. Components that are the same as in Embodiment 1 are given the same reference numerals, and their description is omitted. This embodiment describes a broadcasting scheme that can broadcast subtitle signals whose subtitle speed has been converted.

To broadcast speed-converted subtitles, the broadcasting system consists of the digital broadcasting device of Embodiment 1 plus the second subtitle data creation device 25. The second subtitle data creation device 25 creates subtitle material synchronized with the audio material input to the second audio encoder 23 and outputs it to the multiplexer 26. The multiplexed subtitles are transmitted in the structure shown in FIG. 3. (For details, see ARIB STD-B24, Volume 1, Part 3, Chapter 9, "Transmission of Subtitles and Superimposed Text".)

Subtitles are normally transmitted in independent PES format. FIG. 3 shows the data group, the correspondence between subtitle data and data group identification, and the subtitle management data. Subtitle data is grouped according to the data group structure, placed in independent PES packets (asynchronous or synchronous), and transmitted. One item of subtitle data consists of at most 256 data groups. The data group identifies the type of data: subtitle management data or subtitle text data.

Data group identification (DGI) allows up to eight kinds of subtitle text to be transmitted, which enables multiplexed multilingual transmission in languages such as Japanese and English. The subtitle management data carries information that enables time control during reception and playback. Based on this information, the receiver decodes the subtitles and controls their display timing.

Here, subtitle material synchronized with the audio material input to the second audio encoder 23 is created, multiplexed into one of the eight transmittable subtitle texts, and broadcast as speed-converted subtitle information. Elderly users and others often cannot finish reading subtitles displayed at high speed before the display moves on, and so have difficulty understanding the content. Broadcasting speed-converted subtitles makes it easier for viewers such as elderly people with declining eyesight to obtain the subtitle information.
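The allocation described above, in which one of the eight subtitle texts is used for the speed-converted subtitles, can be sketched as a small lookup table. This is only an illustrative model: the `CaptionSlot` class, the `speed_converted` flag, and the language codes are hypothetical names introduced here, not fields defined by ARIB STD-B24 or by this document.

```python
from dataclasses import dataclass

MAX_CAPTION_SLOTS = 8  # the text states that up to eight subtitle texts can be carried


@dataclass(frozen=True)
class CaptionSlot:
    index: int             # 0..7, position among the eight subtitle texts
    language: str          # hypothetical language code, e.g. "jpn" or "eng"
    speed_converted: bool  # hypothetical flag, not an ARIB field


def build_caption_table(slots):
    """Validate a list of caption slots and index them by slot number."""
    if len(slots) > MAX_CAPTION_SLOTS:
        raise ValueError("at most eight subtitle texts can be multiplexed")
    table = {}
    for slot in slots:
        if not 0 <= slot.index < MAX_CAPTION_SLOTS:
            raise ValueError(f"slot index out of range: {slot.index}")
        if slot.index in table:
            raise ValueError(f"duplicate slot index: {slot.index}")
        table[slot.index] = slot
    return table


def find_caption(table, language, speed_converted):
    """Return the first slot matching the language and speed variant, or None."""
    for index in sorted(table):
        slot = table[index]
        if slot.language == language and slot.speed_converted == speed_converted:
            return slot
    return None
```

A receiver-side lookup such as `find_caption(table, "jpn", True)` would then return the slot holding the speed-converted Japanese subtitles, if the broadcaster has provided one.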

(Embodiment 3)
Next, an embodiment of the digital broadcast receiving terminal device of the present invention will be described with reference to FIGS. 1 and 4. Components identical to those in Embodiment 1 and Embodiment 2 are denoted by the same reference numerals, and their description is omitted. In this embodiment, the broadcast multiplexes only the video, normal audio, normal subtitles, and speed-converted audio described above.

The reference numerals in FIG. 1 denote the following components of the receiving terminal:
1: antenna signal input terminal.
2: tuner that selects a channel from the broadcast signal received at the antenna signal input terminal 1.
3: demodulating means that demodulates the signal selected by the tuner according to its broadcasting system.
4: transport decoding means capable of packet ID selection, synchronized playback, and program information selection according to the transport stream information that constitutes the broadcast data.
5: video decoding means capable of decoding video stream packets from the transport decoding means 4.
6: audio decoding means capable of decoding audio stream packets from the transport decoding means 4.
7: CPU (referred to below as the control means 7) that controls the demodulating means 3, transport decoding means 4, video decoding means 5, audio decoding means 6, video synthesizing means 10, RAM memory 8, and nonvolatile memory 9.
8: RAM memory in which subtitle information data and OSD video information are expanded and stored.
9: nonvolatile memory capable of storing the program used to control the receiving terminal device and execute its functions.
10: video synthesizing means capable of combining the output of the video decoding means 5 with the subtitle data expanded and decoded by the control means 7 and the RAM 8.
11: output terminal of the video synthesizing means 10.
12: output terminal of the audio decoding means 6.
13: remote control input terminal capable of receiving operation instructions from the user.

The operation of the digital broadcast receiving terminal device 60 of FIG. 1 (hereinafter, the receiving terminal), configured as described above, is explained below. The nonvolatile memory 9 stores the program used to control the receiving terminal 60 and execute its functions. The control means 7 reads the program from the nonvolatile memory 9 and controls each means according to it. A flowchart of the operation of this embodiment is shown in FIG. 4.

A broadcast signal received by an antenna (not shown) enters through the antenna signal input terminal 1 and is fed to the tuner 2, which selects the desired program. The signal output from the tuner 2 is input to the demodulating means 3. The demodulating means 3 demodulates the signal with the method appropriate to the incoming broadcast signal, performs error correction, and outputs a transport stream to the transport decoding means 4.

The transport decoding means 4 performs packet ID (PID) processing. It detects matches between the PID candidates set in the transport decoding means 4 and the PID extracted from each input transport stream packet, and performs data format processing per PID. Data format processing extracts data of a specified format from transport stream packets. In ARIB- and DVB-compliant broadcasts, SI (Service Information) sections are sent using a bandwidth of several Mbps. Section filter conditions can be registered in advance, and any section whose section header (the fields preceding the section data bytes) matches at least one of them can be extracted. This SI information can be managed by the CPU at all times. The video stream is sent to the video decoding means 5 and the audio stream to the audio decoding means 6, while the SI information acquired by the transport decoding means 4 is expanded and stored in the RAM 8. The subtitle information expanded in the RAM 8 has the same structure as shown in FIG. 3.
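The PID matching step can be illustrated with a minimal software sketch. It relies only on the standard MPEG-2 transport stream packet layout (188-byte packets, sync byte 0x47, 13-bit PID in bytes 1 and 2). The function names are hypothetical, and an actual receiver would normally do this in demultiplexer hardware rather than in Python.

```python
TS_PACKET_SIZE = 188  # fixed MPEG-2 transport stream packet length in bytes
SYNC_BYTE = 0x47      # first byte of every transport stream packet


def packet_pid(packet: bytes) -> int:
    """Extract the 13-bit PID from one transport stream packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a transport stream packet is 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")
    # PID = low 5 bits of byte 1 followed by all 8 bits of byte 2
    return ((packet[1] & 0x1F) << 8) | packet[2]


def filter_by_pid(stream: bytes, wanted_pids):
    """Group packets by PID, keeping only the PIDs registered in wanted_pids."""
    wanted = set(wanted_pids)
    selected = {pid: [] for pid in wanted}
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        pid = packet_pid(packet)
        if pid in wanted:
            selected[pid].append(packet)
    return selected
```

Registering the video, audio, and subtitle PIDs in `wanted_pids` corresponds to setting the PID candidates in the transport decoding means 4; packets with any other PID are simply dropped.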

The control means 7 recognizes the signal from the remote control input terminal 13, which can receive operation instructions from the user, and determines whether the requested audio presentation is normal audio or speech-speed-converted audio (FIG. 4, step 1). When normal audio is selected, the transport decoding means 4 acquires the normal-speed audio information and the subtitle information, and transfers the audio information to the audio decoding means 6 and the subtitle information to the RAM 8 (step 2).

The audio decoding means 6 decodes and outputs the input audio stream, and the control means 7 acquires the subtitle information held in the RAM 8, decodes it, and expands it (step 3). The expanded subtitle information is transferred to the video synthesizing means 10, which combines it with the output of the video decoding means 5 according to the display timing information sent in the broadcast, and outputs the result (step 4).

If, on the other hand, the control means 7 finds in step 1 that speech-speed-converted audio is selected, the transport decoding means 4 acquires the speech-speed-converted audio information and the subtitle information, and transfers the audio information to the audio decoding means 6 and the subtitle information to the RAM 8 (step 5). The audio decoding means 6 decodes and outputs the input audio stream, and the control means 7 acquires the subtitle information held in the RAM 8, decodes it, and expands it in the RAM 8 (step 6).

Next, the expanded subtitle information is transferred to the video synthesizing means 10. The receiver determines the presentation timing of the speech-speed-converted audio, and the video synthesizing means 10 combines the subtitle information with the output of the video decoding means 5 so that the subtitles match the audio, and outputs the result (step 7).

As a result, when the user selects speech-speed-converted audio, the receiving terminal automatically converts the speed of the normal-speed subtitles sent in the broadcast so that they are synchronized with the audio. This lets elderly viewers and others with declining hearing or eyesight reliably obtain both the audio and the text information.
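The document does not specify how the receiver derives the adjusted subtitle timing in step 7. As one possible illustration only, the sketch below lengthens each subtitle's display time by a factor while never overlapping the next subtitle, so the overall programme timeline stays unchanged. The `CaptionCue` class, the `extend_display` function, and the `factor` parameter are all assumptions introduced here, not the method of the invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaptionCue:
    start: float  # display start time in seconds
    end: float    # display end time in seconds
    text: str


def extend_display(cues, factor):
    """Hold each cue on screen `factor` times longer, capped at the next cue's start.

    `cues` must be sorted by start time. Start times are left untouched.
    """
    if factor < 1.0:
        raise ValueError("factor must be at least 1.0 (slower display)")
    result = []
    for i, cue in enumerate(cues):
        new_end = cue.start + (cue.end - cue.start) * factor
        if i + 1 < len(cues):
            # Do not run into the next subtitle.
            new_end = min(new_end, cues[i + 1].start)
        # Never shorten a cue below its original duration.
        new_end = max(new_end, cue.end)
        result.append(CaptionCue(cue.start, new_end, cue.text))
    return result
```

In practice the factor would have to follow the actual timing of the speed-converted audio, which this sketch does not model.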

In addition, the receiving terminal 60 includes means for making the user aware that speed conversion is in effect when speech-speed-converted audio is selected. For example, it outputs a signal that causes the display device to show "speech speed conversion in progress" on screen, so that the user can recognize that the audio being heard differs from the source audio.

(Embodiment 4)
Next, another embodiment of the digital broadcast receiving terminal device of the present invention will be described with reference to FIGS. 1 and 5. Components identical to those in the embodiments described above are denoted by the same reference numerals, and their description is omitted. In this embodiment, the broadcast multiplexes the video, normal audio, normal subtitles, speed-converted audio, and speed-converted subtitles described above. A flowchart of the operation of this embodiment is shown in FIG. 5.

The control means 7 recognizes the signal from the remote control input terminal 13, which can receive operation instructions from the user, and determines whether the requested audio presentation is normal audio or speech-speed-converted audio (FIG. 5, step 11). When normal-speed audio is selected, the transport decoding means 4 acquires the normal-speed audio information and subtitle information, and transfers the audio information to the audio decoding means 6 and the subtitle information to the RAM 8 (step 12).

The audio decoding means 6 decodes and outputs the input audio stream, and the control means 7 acquires the subtitle information held in the RAM 8 and decodes it (step 13). The decoded subtitle information is transferred to the video synthesizing means 10, which combines it with the output of the video decoding means 5 according to the normal display timing information sent in the broadcast, and outputs the result (step 14).

If, on the other hand, the control means 7 finds in step 11 that speech-speed-converted audio is selected, the transport decoding means 4 acquires the speech-speed-converted audio information and the speed-converted subtitle information, and transfers the audio information to the audio decoding means 6 and the subtitle information to the RAM 8 (step 15).

The audio decoding means 6 decodes and outputs the input audio stream, and the control means 7 acquires the subtitle information held in the RAM 8 and decodes it (step 16). The decoded subtitle information is transferred to the video synthesizing means 10, which combines it with the output of the video decoding means 5 according to the speed-converted display timing information sent in the broadcast, and outputs the result (step 17).

As a result, when the user selects speech-speed-converted audio, the receiver automatically displays the speed-converted subtitle information sent in the broadcast. This lets elderly viewers and others with declining hearing or eyesight reliably obtain both the audio and the text information.
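The selection logic of steps 11 to 17 can be summarised in a short sketch. The `Speed` enum, the `select_streams` function, and the dictionary keyed by stream kind and speed are hypothetical constructs for illustration. The fallback branch, which returns the normal subtitles with a flag asking the receiver to adjust their timing itself, is an assumption that mirrors the situation of Embodiment 3, where no speed-converted subtitles are broadcast.

```python
from enum import Enum


class Speed(Enum):
    NORMAL = "normal"
    CONVERTED = "converted"


def select_streams(requested, streams):
    """Pick the audio and subtitle streams for the requested audio speed.

    `streams` maps (kind, Speed) pairs to a stream identifier such as a PID.
    Returns (audio_stream, subtitle_stream, needs_retiming).
    """
    # Raises KeyError if the requested audio variant is not in the broadcast.
    audio = streams[("audio", requested)]
    subtitle_key = ("subtitle", requested)
    if subtitle_key in streams:
        # Matching subtitles are broadcast: use them as they are.
        return audio, streams[subtitle_key], False
    # Otherwise fall back to normal subtitles; with converted audio the
    # receiver has to adjust the subtitle timing on its own.
    fallback = streams[("subtitle", Speed.NORMAL)]
    return audio, fallback, requested is Speed.CONVERTED
```

With all four variants present, requesting `Speed.CONVERTED` yields the converted audio and converted subtitles together, which is the automatic pairing this embodiment describes.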

(Embodiment 5)
Next, a fifth embodiment of the digital broadcast receiving terminal device of the present invention will be described with reference to FIGS. 1 and 6. Components identical to those described in Embodiments 1 to 4 are denoted by the same reference numerals, and their description is omitted here. In this embodiment, the broadcast multiplexes the video, normal audio, normal subtitles, speed-converted audio, and speed-converted subtitles described above. A flowchart of this operation is shown in FIG. 6.

The control means 7 recognizes the signal from the remote control input terminal 13, which can receive operation instructions from the user, and determines whether the requested subtitle presentation is normal subtitles or speed-converted subtitles (step 21). When normal subtitles are selected, the transport decoding means 4 acquires the normal-speed audio information and subtitle information, and transfers the audio information to the audio decoding means 6 and the subtitle information to the RAM 8 (step 22).

The audio decoding means 6 decodes and outputs the input audio stream, and the control means 7 acquires the subtitle information held in the RAM 8 and decodes it (step 23). The decoded subtitle information is transferred to the video synthesizing means 10, which combines it with the output of the video decoding means 5 according to the normal display timing information sent in the broadcast, and outputs the result (step 24).

If, on the other hand, the control means 7 finds in step 21 that speed-converted subtitles are selected, the transport decoding means 4 acquires the speech-speed-converted audio information and the speed-converted subtitle information, and transfers the audio information to the audio decoding means 6 and the subtitle information to the RAM 8 (step 25).

The audio decoding means 6 decodes and outputs the input audio stream, and the control means 7 acquires the subtitle information held in the RAM 8 and decodes it (step 26). The decoded subtitle information is then transferred to the video synthesizing means 10, which combines it with the output of the video decoding means 5 according to the speed-converted display timing information sent in the broadcast, and outputs the result (step 27).

As a result, when the user selects speed-converted subtitles, the receiver automatically outputs the speed-converted audio information sent in the broadcast. This lets elderly viewers and others with declining hearing or eyesight reliably obtain both the text and the audio information.

As described above, the present invention addresses the problem that elderly users and others have difficulty understanding television broadcasts when fast speech is hard to follow. By transmitting speed-converted audio, it makes audio information easier to obtain for viewers such as elderly people with declining hearing, and it is useful in the field of digital broadcasting, which aims at broadcasts that are accessible to elderly and disabled people.

The invention also addresses the problem that elderly users and others cannot finish reading subtitles displayed at high speed before the display moves on, and so have difficulty understanding the content. By transmitting speed-converted subtitles, it makes subtitle information easier to obtain for viewers such as elderly people with declining eyesight, and it is useful in the field of digital broadcasting, which aims at broadcasts that are accessible to elderly and disabled people.

Furthermore, the receiver can decode, at the user's selection, the speed-converted audio data sent in the broadcast, and can convert the subtitle speed to match the speed-converted audio. This is useful when fast speech is hard to follow or when subtitles displayed at high speed cannot be read in time.

Furthermore, when subtitle data with a converted display speed is broadcast together with speed-converted audio data, and the user selects the speed-converted audio data, the receiver can automatically select the speed-converted data for the subtitle display as well. The user only has to request audio speed conversion to obtain subtitle speed conversion too, which is useful in receiving terminals for digital broadcasting that aims to be accessible to elderly and disabled people.

Likewise, when subtitle data with a converted display speed is broadcast together with speed-converted audio data, and the user selects the data with the converted subtitle display speed, the receiver can automatically select and decode the speed-converted data for the audio output as well. The user only has to request subtitle speed conversion to obtain audio speed conversion too, which is useful in receiving terminals for digital broadcasting that aims to be accessible to elderly and disabled people.

The present invention provides the same effects whether the receiving terminal is a home television receiver or computer, or a portable display device, portable computer, mobile phone, or the like.

FIG. 1: Diagram showing the configuration of the digital broadcasting apparatus and the receiving terminal device of the present invention
FIG. 2: Diagram showing the data structure of an MPEG audio bitstream
FIG. 3: Diagram showing an example of the structure defined by ARIB for "Transmission of Subtitles and Superimposed Characters"
FIG. 4: Flowchart showing an example of the operation of the present invention
FIG. 5: Flowchart showing an example of the operation of the present invention
FIG. 6: Flowchart showing an example of the operation of the present invention
FIG. 7: Diagram showing the configuration of a conventional television broadcast receiver
FIG. 8: Diagram explaining an example of the functions of the same television broadcast receiver

Explanation of symbols

1 Antenna signal input terminal
2 Tuner
3 Demodulating means
4 Transport decoding means
5 Video decoding means
6 Audio decoding means
7 Control means
8 RAM memory
9 Nonvolatile memory for programs
10 Video synthesizing means
11 Video output terminal
12 Audio output terminal
13 Remote control signal input terminal
21 Video encoder
22 First audio encoder
23 Second audio encoder
24 First subtitle data creation device
25 Second subtitle data creation device
26 Multiplexing device
27 Modulation device
28 Radio wave transmission means
29 Transmission antenna input terminal

Claims

A digital broadcasting apparatus for digital broadcasting capable of transmitting broadcast data composed of video information, audio information, text information, and the like, wherein a plurality of types of audio information can be broadcast, and at least one of the types includes audio data whose audio speed has been converted.

A digital broadcasting apparatus for digital broadcasting capable of transmitting broadcast data composed of video information, audio information, text information, and the like, wherein a plurality of types of subtitle information can be broadcast, and at least one of the types includes subtitle data whose subtitle display speed has been converted.

A digital broadcast-compatible receiving terminal device comprising: a tuner capable of receiving digital broadcasts; demodulating means for selecting a desired transport stream from the tuner output; transport decoding means for selecting and outputting the video stream and audio data, and for selecting and acquiring subtitle information data, from the transport stream output by the demodulating means; a RAM memory for expanding and storing the subtitle information data; video decoding means capable of decoding the video data acquired by the transport decoding means; audio decoding means capable of decoding the audio data acquired by the transport decoding means; video synthesizing means capable of combining the output of the video decoding means with decoded subtitle data; and control means for controlling the demodulating means, the transport decoding means, the video decoding means, the audio decoding means, and the video synthesizing means, wherein the device can decode, at the user's selection, audio data with converted audio speed sent in the broadcast, and can convert the subtitle speed to match the speed-converted audio.

The digital broadcast-compatible receiving terminal device according to claim 3, wherein, when subtitle data with a converted display speed is broadcast together with data with a converted audio speed and the user selects the audio data with the converted audio speed, the speed-converted data can be selected automatically for the subtitle display as well.

The digital broadcast-compatible receiving terminal device according to claim 3, wherein, when subtitle data with a converted display speed is broadcast together with data with a converted audio speed and the user selects the data with the converted subtitle display speed, the speed-converted data can be selected and decoded automatically for the audio output as well.

The digital broadcast-compatible receiving terminal device according to claim 3, wherein an indication is displayed that audio data with a converted audio speed is being output.