JP3350583B2

JP3350583B2 - Synchronized output of audio / image / text information

Info

Publication number: JP3350583B2
Application number: JP30858593A
Authority: JP
Inventors: 弘二荒井
Original assignee: 株式会社ハドソン
Priority date: 1993-11-16
Filing date: 1993-11-16
Publication date: 2002-11-25
Anticipated expiration: 2017-11-25
Also published as: JPH07141521A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a method for simultaneously performing sampled playback of recorded human voice, playback of moving-image data, and display of text.

[0002]

[Prior Art] From the standpoint of a computer device processing data and presenting information to a user, data can be classified into audio data, image data, and text data. As processing power has improved, images can be played back as moving images as well as still images, and audio is no longer limited to electronically generated sound: natural sounds such as the human voice can be recorded and played back. Both sound quality and image quality have improved.

[0003] In addition, recording image data on magneto-optical disks with enormous storage capacity, such as CD-ROM or LD-ROM, has made long continuous playback of moving images possible.

[0004] On the playback side as well, techniques for compressing and decompressing audio and image data have improved, so that data can be processed at high compression ratios without impairing sound or image quality. These high compression ratios result from improved compression and decompression circuit logic and from more precise compression and decompression processing made possible by greater computer processing power.

[0005] Information is presented either by outputting audio, images, or text alone, or by outputting a combination of them simultaneously. For single output, only the quality of the presented information and the presentation speed need to be considered. When several presentation forms, such as audio and images, are output simultaneously, there are cases where the forms are only loosely related and it suffices simply to output them at the same time, and cases where they are closely related and synchronized output is preferable. When closely related data cannot be created at the same time, synchronizing their output becomes a problem.

[0006] A typical problematic case is the combination of image data in which a character speaks in a moving image, so-called animation, with audio data of speech such as narration. Synchronized output can be achieved by recording the audio data to match the moving image.

[0007] Another example is the combination of image data in which a character speaks in an animation with text data showing what is said. Simultaneous output is achieved either by displaying text per image data block, so that the corresponding text appears in the scenes that need it, or by displaying a series of characters continuously at a suitable speed from the start of the scene.

[0008]

[Problems to Be Solved by the Invention] As described above, conventional computer devices have used various measures to achieve simultaneous output by adjusting the output of audio, text, and other relatively small data to the moving image, whose data volume is large. Text output adjustment, however, still amounts to no more than block-by-block display or continuous display from the start of a scene, which is far from sufficient.

[0009] When speech audio data is combined with a moving image, presentation of the information content depends mainly on the audio. Computer devices that output text simultaneously with a moving image and speech are not seen, but outputting text together with the moving image and speech is effective as an aid to understanding for the recipient, or as a means of conveying information reliably. As seen in teletext broadcasting and the like, there is demand for accommodating people with hearing impairments and for presenting information in two languages.

[0010] When a conventional computer device outputs a moving image, speech, and text simultaneously, the moving image and speech can be synchronized, but the text output adjustment is inadequate. When exactly the same information is presented by speech and by text alongside an animated moving image, it can serve as auditory information, as visual information, or as reliably conveyed information through dual audiovisual presentation. Problems can nevertheless arise from the inadequate text output adjustment.

[0011] When the text presents spoken dialogue, for example, the text and the moving image should combine to form specific information. If text that is not synchronized with the moving image is used as visual-only information, however, it may not be grasped as a unified whole. When the information is used as dual audiovisual presentation, text presented later than the speech is not used effectively, and when text is presented ahead of the speech, the audio information cannot be used effectively.

[0012] An object of the present invention is to obtain a computer device capable of synchronized output of three-way information consisting of audio, images, and text, and thereby to make effective use of composite information.

[0013]

[Means for Solving the Problems] To solve the above problems, the synchronized output device of the present invention uses text data created in advance with, as its reference unit, the playback unit of audio data such as narration recorded for each scene of the moving image, and includes means for controlling display character by character or word by word within that reference unit when the text data is displayed. Text data to be output in synchronization with the audio and moving image is created in advance using a specific character code. The device further includes means for repeating, until the text data ends, a process in which each time one unit (for example, one character) of the text data in the specific character code is displayed, the image data corresponding to that unit is animated and output.

[0014] In this way, for text data created so as to be synchronized with audio playback, display control is performed character by character or word by word for each audio playback unit, so that the output of the audio data and the text data is synchronized. By using the unit display amount of the text data, which is synchronized with audio playback, to control the animation processing of the image data, the output of the image data is synchronized with the display of the text data. The result is a device capable of synchronized output of three-way information: audio data, text data, and image data.
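As a rough illustration of display control within one audio playback unit, the sketch below assigns a display time to each character or word of the text that belongs to that unit. The even spacing, the `PlaybackUnit` type, and the `schedule_units` function are assumptions made for this example only; the description does not state how display times are chosen within a unit.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PlaybackUnit:
    """One audio playback unit (for example, one narration clip) and its text."""
    start_s: float      # time at which the audio unit starts, in seconds
    duration_s: float   # length of the audio unit, in seconds
    text: str           # text authored to match this audio unit


def schedule_units(unit: PlaybackUnit, by_word: bool = False) -> list[tuple[float, str]]:
    """Return (display_time, piece) pairs for one playback unit.

    Assumption: pieces are spread evenly over the unit's duration.
    """
    pieces = unit.text.split() if by_word else list(unit.text)
    if not pieces:
        return []
    step = unit.duration_s / len(pieces)
    return [(unit.start_s + i * step, piece) for i, piece in enumerate(pieces)]
```

Calling `schedule_units(PlaybackUnit(0.0, 2.0, "hello world"), by_word=True)` places "hello" at 0.0 s and "world" at 1.0 s, so the text for a unit never runs ahead of or behind the audio of that unit by more than the unit's own length.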

[0015] FIG. 1 is a flowchart outlining an example of the operation of the synchronized output device of the present invention. Text data that requires synchronized output with audio and a moving image is treated as a narration message; text data that is not a narration message is output in the usual way, and processing ends. When the text data is a narration message, display of the screen animation data is started, followed by playback of the audio data. Audio playback then runs continuously until the audio data ends.

[0016] A flag embedded in the text data of a narration message specifies a wait for the display of the text data. When the wait processing ends, the message is displayed. If the text data to be displayed is not an ASCII character, animation processing and output are not synchronized with message display, and the text display is synchronized only with audio playback. If the text data to be displayed is an ASCII character, animation processing is executed, for example, each time one ASCII character is displayed. Because execution of the animation processing is thus regulated by text data that is output in synchronization with the audio, synchronized output of image data, text data, and audio data becomes possible.
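The flow described for FIG. 1 can be sketched as a small function that returns an ordered event log. This is only an illustration under stated assumptions: the flag byte `WAIT_FLAG`, the encoding of the delay as the character that follows it, the tick unit, and the event names are all hypothetical, since the description does not specify the flag format or the animation interface.

```python
from __future__ import annotations

WAIT_FLAG = "\x01"  # hypothetical marker: the next character encodes a delay in ticks


def run_message(data: str, is_narration: bool) -> list[str]:
    """Return the ordered event log for one piece of text data."""
    if not is_narration:
        # Ordinary text needs no synchronization: output it as usual and stop.
        return [f"show:{data}"]

    # Narration message: animation display starts first, then audio playback,
    # which runs on to the end of the audio data.
    events = ["start_animation", "start_audio"]
    i = 0
    while i < len(data):
        ch = data[i]
        if ch == WAIT_FLAG:
            # Embedded flag: hold text display for the encoded number of ticks.
            ticks = ord(data[i + 1]) if i + 1 < len(data) else 0
            events.append(f"wait:{ticks}")
            i += 2
            continue
        events.append(f"show:{ch}")
        if ch.isascii():
            # An ASCII character was displayed: run one animation step.
            events.append("animate")
        i += 1
    return events
```

In this sketch the delays embedded in the text pace the display against the audio, and animation steps are issued only when an ASCII character is displayed, so non-ASCII text follows the audio without driving the animation.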

[0017]

[Effects of the Invention] As described above, the synchronized output device of the present invention controls the display of text data, created so as to be synchronized with audio playback, in step with the audio output units, so that the output of the audio data and the text data is synchronized. By controlling the animation processing of the image data on the basis of the unit display amount of the text data synchronized with audio playback, the output of the image data is synchronized with the display of the text data, yielding a device capable of synchronized output of three-way information: audio data, text data, and image data. Forming composite information from audio, text, and image data makes it possible to provide advanced audiovisual technology on a computer device. In particular, synchronized output of text data and image data broadens support for people with hearing impairments, and synchronized output of text data and audio data allows information to be presented in two languages.

[Brief description of the drawings]

[FIG. 1] A flowchart outlining an example of the operation of the synchronized output device of the present invention.

Continuation of the front page

(51) Int. Cl.⁷: H04N 5/91 (FI: H04N 5/91 Z)

(56) References: JP-A-5-143708 (JP, A); JP-A-62-225875 (JP, A); JP-A-4-38782 (JP, A); JP-A-5-284464 (JP, A); 光永知生 and two others, "Speech-Synchronized Animation Generation System" (音声同期アニメーション生成システム), Technical Report of the Institute of Television Engineers of Japan, May 20, 1993, Vol. 17, No. 28, pp. 17-22

(58) Fields searched (Int. Cl.⁷, DB name): G06T 13/00; G06T 11/60; G06T 11/80; G11B 20/00; H04N 5/91; CSDB (Japan Patent Office)

Claims

(57) [Claims]

1. A method for synchronized output of audio, image, and text information in a computer device that processes and outputs data expressed as audio, images, and text, comprising: means for controlling display character by character or word by word, for each playback unit of the audio data, when processing and displaying text data that has been created in a specific character code using as its reference a specific playback unit of audio data recorded in advance for each scene of a moving image; and means for repeating, as appropriate, a process in which each time a given unit of the text data in the specific character code is displayed, image data corresponding to that unit is subjected to moving-image expansion processing and output.