JP2008091972A

JP2008091972A - System for automatically outputting recorded narration corresponding to still image

Info

Publication number: JP2008091972A
Application number: JP2006266952A
Authority: JP
Inventors: Mizuho Maehara; みずほ前原; Futao Ozawa; 二穂小澤
Original assignee: Daiichikosho Co Ltd
Current assignee: Daiichikosho Co Ltd
Priority date: 2006-09-29
Filing date: 2006-09-29
Publication date: 2008-04-17
Anticipated expiration: 2026-09-29
Also published as: JP4679480B2

Abstract

PROBLEM TO BE SOLVED: To improve the entertainment value provided by narration and to stimulate the desire to watch, in a system for automatically outputting recorded narration corresponding to still images that sequentially switches and displays a series of still images associated with dialogue and outputs narration at predetermined points.

SOLUTION: Recorded narration corresponding to predetermined dialogue associated with each still image in the series is stored in an NR data DB 38. Voice data, obtained when a voice recognition means 31 of a voice processing section 21 analyzes the input dialogue speech, is compared by a voice comparison means 32 with the dialogue keywords in a comparison table 33 that correspond to the still image currently being displayed. On a match, an NR control means 35 extracts the corresponding recorded narration from the NR data DB 38, based on the information individually identified by the matched dialogue keyword, and sends it to an NR output control section 18 for output.

COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to an automatic recorded narration output system for still images, which sequentially switches and displays a series of still images associated with dialogue and outputs narration at predetermined points.

In recent years, digital picture-story shows have become known, in which a series of digital still images is treated like a traditional paper picture-story show (kamishibai) and the image for each scene is displayed in turn while the dialogue is read aloud. For such digital picture-story shows, it is desirable to improve their entertainment value and to stimulate the audience's desire to watch.

Conventionally, a digital picture-story show displays a still image, the dialogue corresponding to that still image is read aloud, and the scenes are switched one after another. Playing narration corresponding to each scene (still image) is one conceivable way to improve entertainment value. For narration, a voice different from those of the characters is usually preferable. Moreover, unlike the characters' lines in the play, narration recorded in advance does not affect the spoken dialogue; rather, recording it in advance makes the delivery of the dialogue more efficient.

Incidentally, the patent document cited below discloses a technique for playing sound effects during a conversation. For a mobile communication device, it discloses storing sound effect data in association with keywords in advance and, when voice recognition means detects a keyword contained in the user's call voice signal, sending the sound effect data corresponding to that keyword to the mobile communication device.

JP 2002-051116 A

However, in the above patent document the sound effect data and the keywords are simply associated with each other, so the same sound effect is always played even when the same keyword occurs several times in the voice signal over the course of a call. If this technique were applied to narration output in the digital picture-story show described above, the same keyword would trigger the same recorded narration even after the still image has changed, and the effect of the recorded narration would fade. In addition, outputting recorded narration between lines of dialogue would require manual operations, such as selecting the narration and issuing an output trigger, which is burdensome.

The present invention has been made in view of the above problems. Its object is to provide an automatic recorded narration output system for still images that can output recorded narration without any special operation, thereby improving the entertainment value provided by narration and stimulating the desire to watch.

To solve the above problems, the invention of claim 1 is an automatic recorded narration output system for still images that sequentially switches and displays a series of still images associated with dialogue and outputs narration at predetermined points. The system comprises a still image storage unit, a recorded narration storage unit, a collation table, a voice input unit, a voice processing unit, recorded narration control means, and a first display unit. The still image storage unit stores a predetermined number of sets, each set being a plurality of still images forming a series. The recorded narration storage unit stores, for each still image in the series, recorded narration data corresponding to predetermined dialogue keywords among the dialogue associated with that still image. The collation table associates the dialogue keywords for each still image with information on the corresponding recorded narration data stored in the recorded narration storage unit. The voice input unit receives dialogue speech. The voice processing unit collates voice data generated by analyzing the input dialogue speech against the dialogue keywords in the collation table that correspond to the still image currently displayed on the first display unit and, on a match, outputs the information on the recorded narration data individually identified by the matched dialogue keyword. The recorded narration control means, in accordance with that information from the voice processing unit, extracts from the recorded narration storage unit the recorded narration data corresponding to the still image currently displayed on the first display unit and causes it to be output.

The invention of claim 2 further comprises a dialogue storage unit, dialogue display control means, and a second display unit, wherein the dialogue storage unit stores dialogue data for each still image in the series, and the dialogue display control means extracts from the dialogue storage unit the dialogue data corresponding to the currently displayed still image and displays it on the second display unit. The invention of claim 3 comprises second image display control means in addition to first image display control means that controls display of the still image on the first display unit, and displays on the second display unit the still image currently displayed on the first display unit.

According to the present invention, recorded narration is stored for each still image in the series, corresponding to predetermined dialogue among the dialogue associated with that still image. Voice data obtained by analyzing the input dialogue speech is collated against the dialogue keywords in the collation table and, on a match, recorded narration is output over the displayed still image based on the information on the recorded narration data individually identified by the matched dialogue keyword. Recorded narration can therefore be output without any special operation, and the same dialogue keyword produces a different recorded narration when a different still image is displayed. This improves the entertainment value provided by narration and stimulates the desire to watch.

The best mode for carrying out the present invention is described below with reference to the drawings.
FIG. 1 shows a block diagram of a first embodiment of the automatic recorded narration output system for still images according to the present invention. In FIG. 1, the automatic recorded narration output system 11 for still images comprises, as appropriate, a bus 12, a central control unit 13, a ROM 14, a RAM 15, a display control unit 16, an image display unit 17 serving as the first display unit, an NR (narration) output control unit 18, a mixing amplifier 19, a microphone 20 serving as the voice input unit, a speaker 21, a voice processing unit 22, a storage device 23, and an operation unit 24.

The voice processing unit 22 includes voice recognition means 31 and voice collation means 32. The storage device 23 stores a collation table 33, NR control means 34 serving as the recorded narration control means, first image display control means 35, an image DB (image database) 36 serving as the still image storage unit, and an NR data DB (database) 37 serving as the recorded narration storage unit.

The central control unit 13 is a physical CPU that performs overall processing and control of the system, and carries out algorithmic processing based on the programs stored in the ROM 14. The RAM 15 serves as a work area into which various programs are loaded and executed. It is constituted, for example, by semiconductor memory, and the concept also covers the case where it is constructed virtually on a hard disk.

The display control unit 16 comprises an electronic circuit, together with its accompanying program, that decodes the image sent from the first image display control means 35 (described later) and displays it on the image display unit 17. Examples of the image display unit 17 include a projector screen, a cathode ray tube (CRT) display, a liquid crystal display (LCD), and a plasma display panel (PDP).

The NR output control unit 18 is an electronic circuit that decodes the recorded narration data (file) sent from the NR control means 34 (described later) and outputs it to the mixing amplifier 19. The mixing amplifier 19 amplifies the reader's dialogue speech input from the microphone 20 and outputs it from the speaker 21, and also amplifies the recorded narration audio sent from the NR output control unit 18 and outputs it from the speaker 21. The operation unit 24 is provided with switches such as a power button and a start button.

The voice recognition means 31 of the voice processing unit 22 is a program that analyzes the input dialogue speech and converts it into voice data; it is loaded into the RAM 15 and executed. In this embodiment, for example, the analog dialogue speech signal is converted to digital form and turned into font-coded voice data. Alternatively, the voice data may be obtained by rendering the analog waveform as an image and analyzing it.

The voice collation means 32 of the voice processing unit 22 is a program that collates the voice data from the voice recognition means 31 against the dialogue keywords in the collation table 33 (described with reference to FIG. 2) that correspond to the still image currently displayed on the image display unit 17 and, on a match, outputs the NR file name individually identified by the matched dialogue keyword. It is loaded into the RAM 15 and executed.

The voice data is collated against a dialogue keyword as follows, for example. The voice data (font codes) that is input and analyzed as the speech arrives is first collated against the first sound unit (font code) of the dialogue keyword; on a match, it is then collated in turn against the second and subsequent sound units (font codes). When all the sound units (font codes) of the dialogue keyword have matched, the NR file name individually identified by that dialogue keyword is output. Conversely, if even one unit fails to match, collation starts again from the first sound unit of the dialogue keyword. The collation table 33 stored in the storage device 23, an example of which is described with reference to FIG. 2, is a table that associates the dialogue keywords for each still image with a file name (NR file name) identifying the recorded narration for that still image, which serves as the information on the recorded narration data.
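By way of illustration only, the unit-by-unit collation described above might be sketched as follows. The class name `KeywordMatcher`, the helper `first_match`, the representation of sound units as strings, and the file name used in the usage note are all assumptions made for this sketch; the specification does not prescribe any particular implementation. The choice to let a mismatching unit immediately begin a new match is likewise an assumption.

```python
from typing import Optional, Sequence


class KeywordMatcher:
    """Collates one dialogue keyword against sound units arriving one at a time."""

    def __init__(self, keyword_units: Sequence[str], nr_file_name: str) -> None:
        if not keyword_units:
            raise ValueError("keyword must contain at least one sound unit")
        self.keyword_units = list(keyword_units)
        self.nr_file_name = nr_file_name
        self.position = 0  # index of the next keyword unit to compare

    def feed(self, unit: str) -> Optional[str]:
        """Consume one sound unit; return the NR file name on a full match."""
        if unit == self.keyword_units[self.position]:
            self.position += 1
        else:
            # Mismatch: start again from the first sound unit of the keyword.
            # Assumption: the mismatching unit itself may begin a new match.
            self.position = 1 if unit == self.keyword_units[0] else 0
        if self.position == len(self.keyword_units):
            self.position = 0  # ready for the next occurrence
            return self.nr_file_name
        return None


def first_match(matcher: KeywordMatcher, units: Sequence[str]) -> Optional[str]:
    """Feed units in order and return the first NR file name produced, if any."""
    for unit in units:
        result = matcher.feed(unit)
        if result is not None:
            return result
    return None
```

For example, a matcher built with `KeywordMatcher(["mu", "ka", "shi"], "nr_01_01.wav")` reports its file name once the three units have arrived consecutively. Note that this simple restart can miss a match that overlaps a failed partial match; a failure-function approach such as Knuth-Morris-Pratt would avoid that, but the text describes only the restart from the first unit.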

The NR control means 34 stored in the storage device 23 is a program that, based on the NR file name output by the voice collation means 32 as the collation result (the information on the recorded narration data), extracts from the NR data DB 37 the recorded narration data (file) corresponding to the still image currently displayed on the image display unit 17 and sends it to the NR output control unit 18. It is loaded into the RAM 15 and executed. The NR output control unit 18 decodes the recorded narration data it receives and outputs it to the mixing amplifier 19.

The first image display control means 35 stored in the storage device 23 is a program that causes the image display unit 17 to switch through and display the series of still images in sequence in response to a predetermined switching signal (for example, from a switching button on the operation unit 24). It is loaded into the RAM 15 and executed. The image DB 36, which is the still image storage unit stored in the storage device 23, is a database that stores a predetermined number of sets, each set being a plurality of still images forming a series. The NR data DB 37 is a database that stores, for each still image in the series, the recorded narration data associated with that still image, namely the recorded narration data corresponding to predetermined dialogue keywords within the dialogue.

FIG. 2 is an explanatory diagram of the collation table and the NR table of FIG. 1. In the collation table 33 shown in FIG. 2, the still image of the first image, for example, is associated with a predetermined number of dialogue keywords corresponding to that image and with their respective NR file names. Assuming the series runs to the still image of the tenth image, a predetermined number of dialogue keywords and NR file names are associated with each still image. Because the dialogue keywords defined for each still image are associated with their own NR file names, the collation table 33 distinguishes the same dialogue keyword appearing in different still images by assigning it different NR file names. Within a single still image, duplicate dialogue keywords can be avoided by varying the length (number of words) of the keywords.
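As a rough sketch of the table structure described above, the collation table could be modeled as a mapping from image number to that image's own keyword-to-NR-file-name mapping. The keywords, file names, and the helper `lookup_nr_file` below are hypothetical and are not taken from FIG. 2.

```python
from typing import Dict, Optional

# image number -> {dialogue keyword -> NR file name}
# All keywords and file names here are hypothetical examples.
CollationTable = Dict[int, Dict[str, str]]

COLLATION_TABLE: CollationTable = {
    1: {"once upon a time": "nr_01_01.wav", "peach": "nr_01_02.wav"},
    2: {"peach": "nr_02_01.wav"},
}


def lookup_nr_file(table: CollationTable, image_no: int, keyword: str) -> Optional[str]:
    """Return the NR file name for a keyword of the given image, or None."""
    return table.get(image_no, {}).get(keyword)
```

Keying the table by image number first is what lets the keyword "peach" select a different narration file for image 1 than for image 2.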

FIG. 3 shows a flowchart of the recorded narration control of FIG. 1. In the automatic recorded narration output system 11 for still images, the first image display control means 35 extracts a given still image from the image DB 36 and sends it to the display control unit 16, which decodes the image data and displays it on the image display unit 17. When the reader then reads aloud the dialogue corresponding to that image into the microphone 20, the dialogue speech input to the microphone 20 is amplified by the mixing amplifier 19 and output from the speaker 21, and at the same time is input continuously to the voice recognition means 31 of the voice processing unit 22 (step (S) 1).

That is, the voice processing unit 22 obtains from the first image display control means 35 information on the still image currently displayed on the image display unit 17, the voice recognition means 31 analyzes the input dialogue speech into voice data, and the voice collation means 32 collates that voice data in real time, in sequence, against the dialogue keywords in the collation table 33 that correspond to the currently displayed still image (S2). Collation is repeated until a match is found (S3).

If the collation by the voice collation means 32 (S3) finds a match, the NR file name associated in the collation table 33 with the matched dialogue keyword is output to the NR control means 34 (S4). Based on the NR file name received from the voice collation means 32, the NR control means 34 extracts the recorded narration data from the NR data DB 37 and sends it to the NR output control unit 18 (S5).

The NR output control unit 18 decodes the recorded narration data for the NR file name it has been sent and outputs it to the mixing amplifier 19, which amplifies the recorded narration audio and outputs it from the speaker 21. These processes are repeated up to the final image (for example, the still image of the tenth image) (S6).
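The control flow of steps S1 to S6 might be sketched as follows. This is a simplified illustration under several assumptions: recognized speech is represented as plain text, keyword collation is reduced to a substring test, the NR data DB is a dictionary of bytes, and the function name `run_narration_control` and its `output` callback are invented for the sketch.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def run_narration_control(
    utterances_by_image: Iterable[Tuple[int, List[str]]],
    collation_table: Dict[int, Dict[str, str]],
    nr_data_db: Dict[str, bytes],
    output: Callable[[bytes], None],
) -> List[str]:
    """Play the narration for each keyword heard while its image is displayed.

    Returns the NR file names in the order they were output.
    """
    played: List[str] = []
    # S6: repeat for each image up to the final one.
    for image_no, utterances in utterances_by_image:
        # Only the keywords of the currently displayed image are collated.
        keywords = collation_table.get(image_no, {})
        for speech in utterances:                      # S1: dialogue speech input
            for keyword, nr_file in keywords.items():  # S2: collate
                if keyword in speech:                  # S3: match found
                    # S4: the NR file name is handed to narration control.
                    # S5: the narration data is extracted and sent for output.
                    output(nr_data_db[nr_file])
                    played.append(nr_file)
    return played
```

Because the keyword set is re-selected whenever the image changes, the same spoken keyword can trigger different narration files on different images without any operator action.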

In this way, recorded narration can be output without any special operation, and the same dialogue keyword produces a different recorded narration when a different still image is displayed. This improves the entertainment value provided by narration and stimulates the desire to watch.

Next, FIG. 4 shows a block diagram of a second embodiment of the automatic recorded narration output system for still images according to the present invention, and FIG. 5 shows an explanatory diagram of the dialogue display of FIG. 4. FIG. 4(A) is a block diagram of a main part of the still image display system 11. The storage device 23 additionally stores a dialogue DB 41, dialogue display control means 42, and second image display control means 43. A transmission/reception unit 44 is provided on the bus 12, and a remote display terminal 45 serving as the second display unit is provided, to which data is transmitted wirelessly from the transmission/reception unit 44. The remote display terminal 45 comprises, as appropriate, a second display unit 46, a second display control unit 47, and a transmission/reception unit 48.

The dialogue DB 41 is a database that stores the dialogue data (dialogue file names) associated with each of the still images in the series stored in the image DB 36. In this case, as shown in FIG. 4(B), the collation table 33A associates, for example, a predetermined number of dialogue keywords, taken from the dialogue corresponding to each still image (that is, from the dialogue data identified by the dialogue file name stored in the dialogue DB 41), with their respective NR file names. Assuming the series runs to the still image of the tenth image, a predetermined number of dialogue keywords and NR file names are associated with each still image.

The data of the dialogue DB 41 may instead be stored in the image DB 36 in association with the corresponding images. The dialogue display control means 42 is a program that extracts from the dialogue DB 41 the dialogue data corresponding to the still image currently displayed on the image display unit 17 and sends it to the transmission/reception unit 44 for the remote display terminal 45. It is loaded into the RAM 15 and executed. The second image display control means 43 is a program that extracts from the image DB 36 the still image currently displayed on the image display unit 17 and sends it to the transmission/reception unit 44. It is loaded into the RAM 15 and executed.

The transmission/reception unit 44 is an electronic circuit that transmits data wirelessly to the transmission/reception unit 48 of the remote display terminal 45. The connection may be made wirelessly, for example by IR or by a Bluetooth piconet connection, but a wired connection may also be used. The second display unit 46 of the remote display terminal 45 is a display that shows the still image currently displayed on the image display unit 17 and the dialogue data corresponding to that still image. The second display control unit 47 of the remote display terminal 45 decodes the still image data and dialogue data received by the transmission/reception unit 48, displays the still image on the second display unit 46, and displays the dialogue data over it as a caption.

That is, when the system displays a given image, the second image display control means 43 extracts the data of the still image currently displayed on the image display unit 17 and sends it to the remote display terminal 45 for display on the second display unit 46. The dialogue display control means 42 extracts the corresponding dialogue data from the dialogue DB 41, marks the portions that are dialogue keywords by underlining, a different character size, a different display color, or the like, as shown in FIG. 5, and sends the result to the remote display terminal 45 for display.
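One way the keyword portions of the dialogue could be marked before being sent for display is sketched below. The function name `mark_keywords` and the bracket markers are assumptions standing in for the underline, size, or color styling mentioned above; a real display would substitute its own markup.

```python
import re
from typing import Iterable


def mark_keywords(
    dialogue: str,
    keywords: Iterable[str],
    open_mark: str = "[",
    close_mark: str = "]",
) -> str:
    """Wrap every dialogue keyword occurring in the dialogue text with markers."""
    # Longest keywords first, so a short keyword never splits a longer one.
    ordered = sorted({k for k in keywords if k}, key=len, reverse=True)
    if not ordered:
        return dialogue
    # A single pass over the text avoids marking inside an existing mark.
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    return pattern.sub(lambda m: f"{open_mark}{m.group(0)}{close_mark}", dialogue)
```

With this sketch, the reader sees at a glance which words of the line will trigger recorded narration when spoken.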

The reader then reads aloud while looking at the still image and dialogue displayed on the remote display terminal 45. When dialogue matching a dialogue keyword (FIG. 4(B)) is read aloud, the NR control means 34 extracts the recorded narration data from the NR data DB 37, based on the NR file name sent from the voice collation means 32 as the collation result, and sends it to the NR output control unit 18, so that the recorded narration is output from the speaker 21. This is repeated until the final still image has been displayed.

The remote display terminal 45 has been described as displaying both the still image currently displayed on the image display unit 17 and the dialogue, but only the dialogue may be displayed. Displaying the same still image on the remote display terminal 45 has the effect of helping the reader put feeling into reading the dialogue. Conversely, only the still image may be displayed on the remote display terminal 45. Displaying the dialogue with the points at which recorded narration is output marked lets the reader recognize the points (lines) at which the recorded narration will play. Because the reader reads aloud into the microphone 20 while looking at the remote display terminal 45, the remote display terminal 45 may also be provided with a microphone function.

In this way, the reader reads aloud while looking at the dialogue displayed on the remote display terminal 45, in particular dialogue in which the keywords that trigger recorded narration are marked, or at that dialogue together with the still image, which is convenient for the reader. As in the first embodiment, recorded narration can be output without any special operation, and the same dialogue keyword produces a different recorded narration when a different still image is displayed. This improves the entertainment value provided by narration and stimulates the desire to watch.

The automatic recorded narration output system for still images of the present invention can be used in a system in which the dialogue of a digital picture-story show is associated with still images and which, while a still image is displayed, outputs the recorded narration corresponding to that still image.

FIG. 1 is a block diagram of a first embodiment of the automatic recorded narration output system for still images according to the present invention.
FIG. 2 is an explanatory diagram of the collation table of FIG. 1.
FIG. 3 is a flowchart of the recorded narration control of FIG. 1.
FIG. 4 is an explanatory diagram of a second embodiment of the automatic recorded narration output system for still images according to the present invention.
FIG. 5 is an explanatory diagram of the dialogue display of FIG. 4.

Explanation of symbols

11 Automatic recorded narration output system for still images
17 Image display unit
18 NR output control unit
22 Voice processing unit
31 Voice recognition means
32 Voice collation means
33 Collation table
34 NR control means
35 First image display control means
36 Image DB
37 NR data DB
41 Dialogue DB
42 Dialogue display control means
43 Second image display control means
45 Remote display terminal

Claims

1. An automatic recorded narration output system for still images that sequentially switches and displays a series of still images associated with dialogue and outputs narration at predetermined points, the system comprising:
a still image storage unit, a recorded narration storage unit, a collation table, a voice input unit, a voice processing unit, recorded narration control means, and a first display unit, wherein
the still image storage unit stores a predetermined number of sets, each set being a plurality of still images forming a series,
the recorded narration storage unit stores, for each still image in the series, recorded narration data corresponding to predetermined dialogue keywords among the dialogue associated with that still image,
the collation table associates the dialogue keywords for each still image with information on the corresponding recorded narration data stored in the recorded narration storage unit,
the voice input unit receives dialogue speech,
the voice processing unit collates voice data generated by analyzing the input dialogue speech against the dialogue keywords in the collation table that correspond to the still image currently displayed on the first display unit and, on a match, outputs the information on the recorded narration data individually identified by the matched dialogue keyword, and
the recorded narration control means, in accordance with the information on the recorded narration data from the voice processing unit, extracts from the recorded narration storage unit the recorded narration data corresponding to the still image currently displayed on the first display unit and causes it to be output.

2. The automatic recorded narration output system for still images according to claim 1, further comprising a dialogue storage unit, dialogue display control means, and a second display unit, wherein
the dialogue storage unit stores dialogue data for each still image in the series, and
the dialogue display control means extracts from the dialogue storage unit the dialogue data corresponding to the currently displayed still image and displays it on the second display unit.

3. The automatic recorded narration output system for still images according to claim 2, comprising second image display control means in addition to first image display control means that controls display of the still image on the first display unit, wherein the still image currently displayed on the first display unit is displayed on the second display unit.