JPH1166345A

JPH1166345A - Image acoustic processor and recording medium

Info

Publication number: JPH1166345A
Application number: JP9216163A
Authority: JP
Inventors: Takayuki Yamaguchi; 貴之山口
Original assignee: Sega Enterprises Ltd
Current assignee: Sega Corp
Priority date: 1997-08-11
Filing date: 1997-08-11
Publication date: 1999-03-09

Abstract

PROBLEM TO BE SOLVED: To provide an image acoustic processor and a recording medium with expressive power that naturally simulates the way a person actually pronounces words, giving the user the illusion that a model is really singing and talking. SOLUTION: The processor is provided with a sound generating means 12 which generates sound; image displaying means 10 and 11 which display an image that represents the movement of a mouth when pronouncing a syllable pronounced by the sound generating means; and storing means 102 and 121 which store image data expressing a mouth pronouncing a vowel, one set for each kind of vowel. The means 10 and 11 identify the vowel included in each syllable pronounced by the means 12, read from the means 102 and 121 the image data representing the mouth pronouncing the identified vowel, and display an image based on the read image data.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an image sound processing apparatus that presents an image of a person or the like uttering a voice together with that voice, and in particular to an improvement in image sound processing technology capable of expressing the movement of the mouth when a person actually speaks.

[0002]

[Prior Art] With the development of computer graphics technology, it has become possible to build models simulating real people from polygons and the like. Among such systems were image processing apparatuses that move the model's mouth in time with a voice to show the model singing or talking.

[0003] In practice, the movement of the mouth when a person pronounces language is complex. Conventional image processing apparatuses therefore held only two kinds of image data, mouth open and mouth closed, and displayed images that alternated the model's mouth between these two in time with the voice.

[0004] However, conventional image processing apparatuses could not express mouth movement skillful enough for the model to appear to be actually pronouncing words. That is, two expressions alone, open and closed, cannot imitate the variety of mouth movements a person makes when speaking, so even when the mouth moved in perfect synchronization with the voice, only unnatural mouth movement could be reproduced.

[0005] On the other hand, storing image data capable of expressing every mouth movement requires an enormous amount of data, which makes that approach unsuitable for industrial application.

[0006]

[Problems to Be Solved by the Invention] To resolve these drawbacks of conventional image processing apparatuses, an object of the present invention is to provide an image sound processing apparatus with the expressive power to naturally simulate the way a person actually pronounces words, giving the illusion that the model is really singing or talking, and a recording medium on which program data realizing the apparatus is recorded.

[0007] That is, a first object of the present invention is to provide an image sound processing technique that can simulate the way a person actually pronounces a syllable using a fixed amount of image data.

[0008] A second object of the present invention is to provide an image sound processing technique that can naturally simulate the way a person pronounces even syllables that call for a special mouth movement.

[0009] A third object of the present invention is to provide an image sound processing technique that can naturally simulate the way a person pronounces even when syllables with the same vowel occur in succession.

[0010] A fourth object of the present invention is to provide an image processing technique that can simulate more naturally the way a person pronounces even when the vowels in consecutive syllables differ.

[0011] A fifth object of the present invention is to provide an image sound processing technique that can naturally simulate the way a person pronounces even when syllables containing the moraic nasal (撥音, "ん") occur in succession.

[0012]

[Means for Solving the Problems] The invention that solves the first object is an image sound processing apparatus for generating sound in time with an image expressing mouth movement, comprising: sound generating means for generating sound; image display means for displaying an image expressing the movement of the mouth when a syllable is pronounced; and storage means storing image data for expressing the mouth pronouncing a vowel, one set for each kind of vowel.

[0013] The image display means identifies the vowel included in the syllable that the sound generating means is made to pronounce, reads from the storage means the image data for expressing the mouth pronouncing the identified vowel, and displays an image based on the read image data.

[0014] Here, a syllable is a unit of sound that gives the impression of a single cohesive sound. Vowels are the five sounds "あ", "い", "う", "え", and "お" in Japanese, and 'a', 'i', 'u', 'e', and 'o' in English. The vowel included in a syllable mainly means the vowel at the end of the syllable.
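The lookup described in [0013] and [0014] can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names and image identifiers are hypothetical, and reading the vowel from the Unicode character name is a convenient shortcut for the example, not a method described in the patent.

```python
import unicodedata

# Hypothetical identifiers: one stored mouth image per kind of vowel.
MOUTH_IMAGES = {
    "a": "mouth_a",
    "i": "mouth_i",
    "u": "mouth_u",
    "e": "mouth_e",
    "o": "mouth_o",
}

def vowel_of(kana: str) -> str:
    """Return the final vowel of a single hiragana syllable."""
    # Unicode names such as "HIRAGANA LETTER KO" end with the syllable's vowel.
    return unicodedata.name(kana)[-1].lower()

def mouth_image_for(kana: str) -> str:
    """Look up the stored mouth image for the vowel of the syllable."""
    # Only the five vowels are covered here; "ん" would need its own entry.
    return MOUTH_IMAGES[vowel_of(kana)]

print(vowel_of("こ"))         # o
print(mouth_image_for("ち"))  # mouth_i
```

Keying the table on the vowel alone is what keeps the amount of image data fixed: every syllable ending in the same vowel shares one image.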

[0015] In the invention that solves the second object, the storage means stores, in addition to the image data for expressing the mouth pronouncing each vowel, image data for expressing a half-open mouth. When the syllable that the sound generating means is made to pronounce is a predetermined specific syllable, the image display means uses the half-open mouth image data stored in the storage means and the image data for the mouth pronouncing the vowel in that syllable to display the image of the half-open mouth followed by the image of the mouth pronouncing the vowel.

[0016] A half-open mouth means, for example, a mouth opening midway between "あ (a)" and "え (e)".

[0017] In the invention that solves the third object, the storage means stores, in addition to the image data for expressing the mouth pronouncing each vowel, image data for expressing a half-open mouth. When the image display means identifies that consecutive syllables to be pronounced by the sound generating means contain the same vowel, it uses the half-open mouth image data stored in the storage means to display an image of the half-open mouth between the images of the mouth pronouncing each of the consecutive syllables.

[0018] In the invention that solves the fourth object, the storage means stores, in addition to the image data for expressing the mouth pronouncing each vowel, image data for one or more combinations of different vowels, each expressing a mouth opening midway between the two vowels of that combination. When the sound generating means is made to pronounce consecutive syllables containing different vowels, the image display means uses the image data stored in the storage means to display, between the images of the mouth pronouncing each of the different vowels, an image expressing the mouth opening midway between the two vowels.

[0019] In the invention that solves the fifth object, the storage means stores, in addition to the image data for expressing the mouth pronouncing each vowel, image data for expressing a closed mouth. When the syllable that the sound generating means is made to pronounce contains the moraic nasal, the image display means displays an image based on the closed-mouth image data stored in the storage means.

[0020] In the present invention, the storage means corresponds to a storage device. The image display means corresponds to a processing circuit and an image display circuit. The sound generating circuit generates the sound of each syllable based on waveform data supplied together with the image data expressing mouth movement.

[0021] In the invention that solves the second object, the specific syllables include some or all of さ (sa), せ (se), た (ta), ま (ma), み (mi), む (mu), め (me), も (mo), わ (wa), を (wo), ば (ba), び (bi), ぶ (bu), べ (be), ぼ (bo), ぱ (pa), ぴ (pi), ぷ (pu), ぺ (pe), and ぽ (po).
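A rough sketch of the specific-syllable rule from [0015] and [0021]: the set below copies the syllables listed in [0021], while the function name, the two-slot split of a syllable, and the "half" label are assumptions made for illustration.

```python
import unicodedata

# Syllables listed in [0021] as candidates for the half-open lead-in.
SPECIAL_SYLLABLES = set("させたまみむめもわをばびぶべぼぱぴぷぺぽ")

def mouth_frames(kana: str) -> list:
    """Return the mouth shapes shown over the two halves of one syllable."""
    # The Unicode name (e.g. "HIRAGANA LETTER WA") ends with the vowel.
    vowel = unicodedata.name(kana)[-1].lower()
    if kana in SPECIAL_SYLLABLES:
        # Half-open mouth first, then the mouth for the syllable's vowel.
        return ["half", vowel]
    return [vowel, vowel]

print(mouth_frames("わ"))  # ['half', 'a']
print(mouth_frames("こ"))  # ['o', 'o']
```

Splitting each syllable into exactly two equal slots is a simplification; the description only requires that the half-open image precede the vowel image.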

[0022] In the invention that solves the first object, syllables are identified on the basis of character data.

[0023] Syllables in the present invention may include, in addition to vowels, any sound such as the moraic nasal.

[0024] Recording media in the present invention include, for example, floppy disks, hard disks, magnetic tape, magneto-optical disks, CD-ROMs, DVDs, ROM cartridges, battery-backed RAM memory cartridges, flash memory cartridges, and nonvolatile RAM cartridges.

[0025] They also include communication media, such as wired communication media (telephone lines and the like) and wireless communication media (microwave links and the like). The Internet is also included among the communication media referred to here.

[0026] A recording medium is one on which information (mainly digital data and programs) is recorded by some physical means and which can cause a processing device such as a computer or a dedicated processor to perform a predetermined function. In short, anything that downloads a program to a computer by some means and causes it to execute a predetermined function will do.

[0027]

[Embodiments of the Invention] In a preferred embodiment, the image sound processing apparatus of the present invention is applied to a game apparatus. The embodiment is described below with reference to the drawings.

[0028] (Apparatus configuration) FIG. 1 shows an external view of the game apparatus of this embodiment. As shown in FIG. 1, the game apparatus comprises a game apparatus main body 1, a pad 2b, and a monitor device 5.

[0029] The game apparatus main body 1 houses the image sound processing apparatus of the present invention. Its exterior has an interface (I/F) section 1a for connecting a ROM cartridge and a CD-ROM drive 1b for reading CD-ROMs. The pad 2b has a cross-shaped cursor key and a plurality of operation buttons, and is configured to generate operation signals based on the player's operation. These operation signals can be supplied to the connector 2a via the cable 2c. The pad 2b is detachably attached to the game apparatus main body 1 by the connector 2a, and two pads can be connected so that two players can operate at the same time. A mouse, remote control, keyboard, or the like may be connected in place of the pad. The monitor device 5 is connected to a video output terminal and an audio output terminal (not shown) of the game apparatus main body 1 via a video cable 4a and an audio cable 4b.

[0030] (Block configuration) FIG. 2 shows a block diagram of a game apparatus to which the image sound processing apparatus of the present invention is applied. As shown in FIG. 2, the game apparatus consists of a CPU block 10, a video block 11, a sound block 12, and a subsystem 13.

[0031] (CPU block) The CPU block 10 is part of the image display means of the present invention and serves as the processing circuit. It advances game processing according to the program and controls the image sound processing of the present invention, and consists of an SCU (System Control Unit) 100, a main CPU 101, a RAM 102, a ROM 103, a sub CPU 104, a CPU bus 105, and so on.

[0032] The main CPU 101 contains a DSP (Digital Signal Processor) and is configured to execute at high speed processing based on program data transferred from the CD-ROM 1. The RAM 102 stores the application software program data read from the CD-ROM 1, sound control program data, waveform data, and image data for expressing the mouth shape that pronounces each syllable. The RAM 102 can also be used as a work area for MPEG image decoding and as an error-correction data cache for CD-ROM decoding. The ROM 103 is configured to store initial program data used for initialization of the apparatus. The SCU 100 is configured to supervise data transfers over the buses 105, 106, and 107. The SCU 100 also contains a DMA controller, and is configured to transfer image data needed during game execution, stored in the RAM 102 and elsewhere, to the VRAM in the video block 11, and to transfer the sound control program data and waveform data to the sound block 12.

[0033] The sub CPU 104 is called the SMPC (System Manager & Peripheral Control) and is configured to collect operation signals from the pad 2b in response to requests from the main CPU 101.

[0034] (Video block configuration) The video block 11 is part of the image display means of the present invention and serves as the image display circuit. It comprises a VDP (Video Display Processor) 120, a VDP 130, a VRAM 121, frame buffers 122 and 123, a VRAM 131, and a memory 132.

[0035] The VRAM 121 is configured to store drawing commands transferred by the main CPU 101 via the SCU 100.

[0036] The VDP 120 is configured to generate bitmap-format image data, deform figures, and perform color operations such as shadows and shading, based on the drawing commands stored in the VRAM 121, and to write the generated image data to the frame buffers 122 and 123.

[0037] The frame buffers 122 and 123 are configured to store the image data generated by the VDP 120.

[0038] The VRAM 131 is configured to store background image data, data tables needed to implement the functions of the VDP 130, and the like.

[0039] The VDP 130 is configured to perform image processing on the basis of the image data stored in the VRAM 131 and the frame buffers 122 and 123, such as window processing for setting a selection frame, shadowing, enlargement and reduction, rotation, mosaic processing, movement processing, and hidden-surface processing such as clipping and display priority processing, and to store the resulting display image data in the memory 132.

[0040] The memory 132 is configured to hold the drawing image data stored by the VDP 130 and to output it to the encoder 160.

[0041] The encoder 160 is configured to convert the drawing image data stored in the memory 132 into a video signal format, perform D/A conversion, and supply the result to the monitor device 5.

[0042] The monitor device 5 is configured to display an image based on the supplied video signal.

[0043] (Sound block) The sound block 12 is the sound generating means and sound generating circuit of the present invention, and consists of a DSP 140 and a CPU 141.

[0044] The CPU 141 is configured to transfer the sound control program data and waveform data transferred from the main CPU 101 to the DSP 140. The DSP 140 has a built-in sound memory. Under the control of the CPU 141, it refers to the waveform data to perform waveform generation by a PCM sound source or an FM sound source, delay data generation, and speech synthesis, and outputs the generated waveform data to the D/A converter 170. Through these operations the DSP 140 provides functions such as frequency control, volume control, FM computation, modulation, speech synthesis, and reverb. The D/A converter 170 is configured to convert the waveform data generated by the DSP 140 into a two-channel signal and supply it to the speakers 5a and 5b.

[0045] The waveform data includes human voice. The sound control program data includes character codes indicating the syllables of that voice. These codes are, for example, the lyric data of a song sung by the model being displayed.

[0046] (Subsystem) The subsystem 13 comprises the CD-ROM drive 1b, a CD interface (I/F) 180, a CPU 181, an MPEG audio circuit 182, and an MPEG video circuit 183.

[0047] The CD-ROM drive 1b reads the application software program data, image data, sound control program data, and waveform data from the CD-ROM, and the CD interface 180 is configured to supply them to the CPU block 10. The CPU 181 is configured to control the MPEG audio circuit 182 and the MPEG video circuit 183 so as to decode image data and sound data compressed by high-efficiency coding under the MPEG standard.

[0048] (Explanation of principle) Next, the principle of the present invention is described. In the game of this embodiment, a model simulating a real pop idol is built from so-called polygons and displayed. For example, as shown in FIG. 8, the model's face is made up of many minute triangular or quadrilateral polygons. To display an image, ahead of the next display timing (for example, the vertical synchronization timing), the main CPU 101 transfers to the VRAM 121 of the video block 11 polygon data specifying each vertex coordinate and so on, together with coordinate data specifying the spatial position of each polygon in the world coordinate system. In the video block 11, the VDP 120 refers to these data to generate segments such as the model's face shown in FIG. 8, and applies (pastes) texture data, the surface pattern, to each polygon in bitmap format. The VDP 120 then performs shading to smooth the segment surfaces, completing a segment image that resembles the real idol, as shown in FIG. 9. The encoder 160 converts this image into a video signal matched to the timing at which it should be displayed, and the model's face is displayed on the monitor device 5.

[0049] By performing the above processing while updating the spatial coordinates of the polygons every frame period (the period in which one screen is displayed), the apparatus displays an image of a model that moves its mouth, closes its eyelids, and moves its eyes like a real person.

[0050] (Expressing mouth shape by vowel) When a person actually pronounces a single syllable, the vowel occupies a comparatively long part of the pronunciation period. Based on this principle, the present invention determines the vowel of the syllable that the sound block 12 is made to pronounce, and displays an image expressing the mouth pronouncing that vowel at the same time as the syllable is pronounced. The images displayed for the vowels "あ (a)", "い (i)", "う (u)", "え (e)", and "お (o)" are, for example, as shown in FIGS. 10 to 14.

[0051] Take the word "こんにちは" (konnichiwa, "hello") as an example. If every syllable has the same length, then while the sound block pronounces "こんにちは" from the waveform data, the displayed mouth moves as if pronouncing "おんいいあ" (o-n-i-i-a), as shown in FIG. 4.
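The "こんにちは" example can be traced with a short sketch that assigns one mouth shape per equal-length syllable. The function name and the choice of ten frames per syllable are assumptions for illustration; the label "n" stands for the mouth shown for "ん".

```python
import unicodedata

def mouth_timeline(text: str, frames_per_syllable: int = 10) -> list:
    """Return (start_frame, mouth_label) pairs, one per syllable of equal length."""
    timeline = []
    for index, kana in enumerate(text):
        # The Unicode name (e.g. "HIRAGANA LETTER NI") ends with the vowel,
        # or with "N" for the moraic nasal.
        mouth = unicodedata.name(kana)[-1].lower()
        timeline.append((index * frames_per_syllable, mouth))
    return timeline

print(mouth_timeline("こんにちは"))
# [(0, 'o'), (10, 'n'), (20, 'i'), (30, 'i'), (40, 'a')]
```

The two consecutive 'i' entries at frames 20 and 30 show the mouth holding one shape across "に" and "ち".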

[0052] With this processing, displaying the vowel corresponding to each syllable during that syllable's pronunciation period lets the player perceive, to a good approximation, that the model is pronouncing the syllable.

[0053] (Expressing mouth shape when syllables with the same vowel are consecutive) Referring to FIG. 4, the consecutive syllables "に" (ni) and "ち" (chi) in the word "こんにちは" contain the same vowel "い (i)". The displayed mouth therefore keeps the same shape without moving across these two syllables, which can look unnatural.

[0054] To deal with this, when the vowel of the previously pronounced syllable and the vowel of the next syllable to be pronounced are the same, the present invention additionally inserts an image expressing a half-open mouth. For example, as shown in FIG. 5, in the first half of the second of the consecutive same-vowel syllables, it inserts an image of a half-open mouth, expressing a mouth state midway between "あ" and "え" as shown in FIG. 15.

[0055] With this processing, even when the same vowel continues across two consecutive syllables, a mouth shape different from that of either vowel is displayed between them, so more natural mouth movement can be expressed.
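A minimal sketch of the same-vowel rule in [0054], assuming each syllable is split into a first and second half and that the half-open image is labeled "half" (both are illustrative choices, as is the function name):

```python
import unicodedata

def mouth_slots(text: str) -> list:
    """Return (kana, first_half_mouth, second_half_mouth) for each syllable."""
    slots = []
    previous_vowel = None
    for kana in text:
        vowel = unicodedata.name(kana)[-1].lower()
        # Same vowel as the previous syllable: open with the half-open mouth.
        first_half = "half" if vowel == previous_vowel else vowel
        slots.append((kana, first_half, vowel))
        previous_vowel = vowel
    return slots

for slot in mouth_slots("こんにちは"):
    print(slot)
# ('こ', 'o', 'o')
# ('ん', 'n', 'n')
# ('に', 'i', 'i')
# ('ち', 'half', 'i')
# ('は', 'a', 'a')
```

Only "ち" receives the half-open lead-in, because it is the second of the two consecutive syllables with the vowel "い".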

[0056] The half-open image need not be inserted at the beginning of the second of the two consecutive syllables; it may instead be placed at the end of the first syllable, or may span both the end of the first syllable and the beginning of the second. The length for which the half-open mouth image is inserted is not limited to half the syllable length and may be shorter or longer. Furthermore, the half-open mouth shape need only differ from the mouth shape for pronouncing either vowel; besides the mouth midway between "あ" and "え" shown in FIG. 15, another mouth shape such as one between "あ" and "お" may be used.

[0057] (Expressing mouth shape for specific syllables) Examining the mouth shapes used when actually pronouncing each syllable in FIG. 5, in the syllable "わ" (written "は") the mouth shape for the consonant at the start of the syllable differs greatly from the mouth shape for its vowel "あ". Syllables like this, in which the mouth shape changes greatly between consonant and vowel within a single syllable, include those listed in Table 1.

[0058]

[Table 1]

[0059] Expressing such a syllable with only the mouth shape of its vowel looks unnatural. In the present invention, therefore, when the syllable to be pronounced is one of those listed in Table 1, a half-open mouth is displayed at the start of the syllable. For example, as shown in FIG. 6, an image expressing a half-open mouth as in FIG. 15 is displayed in the first half of the syllable "わ".

[0060] With this processing, even for a syllable in which the mouth shape changes greatly between its consonant and its vowel, a half-open mouth shape is displayed at the start of the syllable, so natural mouth movement can be expressed.

[0061] As in the case where consecutive syllables contain the same vowel, the half-open mouth shape can be varied in many ways.

【００６２】（異なる母音を含む連続した音節間におけ
る口の形のさらに自然な表現）一方、連続した音節が異
なる母音を含んでいる場合であっても、最初の音節にお
ける母音を発音するときの口の形と次の音節における母
音を発音するときの口の形が大きく異なる場合がある。
また、両音節の間には子音も入り、これら両音節を実際
に発音する人間の口の形は、単に異なる母音を連続して
発音させたものとは相違する。したがって、異なる母音
を含む連続した音節を発音する口を模擬する場合に、こ
れら二種類の異なる母音を発音する口を表現するための
画像を連続して表現するだけでなく、これら二種類の画
像の合間に、これら二種類の母音を発音する口の中間的
な表現をさせる画像を挿入すれば、さらに自然な人間の
口の動きを模擬できる。(More Natural Expression of Mouth Shape Between Consecutive Syllables Containing Different Vowels) On the other hand, even when consecutive syllables contain different vowels, the vowel in the first syllable is pronounced. The shape of the mouth and the shape of the mouth when the vowel in the next syllable is pronounced may be significantly different.
In addition, consonants are included between the two syllables, and the shape of the mouth of a human who actually pronounces these two syllables is different from that in which different vowels are simply produced continuously. Therefore, when simulating a mouth that pronounces a continuous syllable containing different vowels, not only an image for expressing the mouth that pronounces these two different vowels, but also these two types of images By inserting an image that gives an intermediate expression of the mouth that produces these two types of vowels, a more natural human mouth movement can be simulated.

[0063] Accordingly, in the present invention, when consecutive syllables containing different vowels are pronounced, an image expressing a mouth opening intermediate between the two vowels is displayed between the images expressing the mouths that pronounce each vowel, as shown, for example, in FIG. 7. The figure shows the case of simulating the utterance "まっくらです" ("makkura desu"). The first syllable, "まっ" ("ma" followed by a geminate stop), is one of the special syllables described above, so a half-open mouth such as that in FIG. 15 is displayed. For the remaining syllables, a mouth intermediate between the vowels of the adjacent syllables is displayed. For "まっ" and "く" ("ku"), and for "く" and "ら" ("ra"), the vowels involved form the same combination, "あ" ("a") and "う" ("u"), so the same intermediate mouth image is displayed.

[0064] Image data for expressing intermediate mouth openings is prepared according to the combinations of adjacent syllables. For example, suppose the basic image data expresses seven mouth shapes in total: the mouths for the five vowels "あ" ("a"), "い" ("i"), "う" ("u"), "え" ("e"), and "お" ("o"), the mouth for the moraic nasal "ん" ("n"), and the special half-open mouth described above. In that case, image data is prepared for every combination of two shapes chosen from these seven, that is, ₇C₂ = 21 images.
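The count of 21 can be checked with a short sketch. The shape names and the `mid_…` image identifiers below are hypothetical labels chosen for illustration; the patent does not specify how the image data is keyed. The sketch keys each intermediate image by an unordered pair, so that the pair ("a", "u") and the pair ("u", "a") resolve to the same image, as in the "まっ"/"く" and "く"/"ら" example.

```python
from itertools import combinations

# Seven base mouth shapes named in the text: five vowels, the moraic nasal,
# and the half-open mouth. The string labels are illustrative assumptions.
BASE_SHAPES = ["a", "i", "u", "e", "o", "n", "half_open"]

# One intermediate image per unordered pair of base shapes.
# frozenset keys make the lookup independent of the order of the two shapes.
INTERMEDIATE = {
    frozenset(pair): "mid_{}_{}".format(*pair)
    for pair in combinations(BASE_SHAPES, 2)
}


def intermediate_image(prev_shape, next_shape):
    """Return the (hypothetical) image id for the mouth between two different shapes."""
    return INTERMEDIATE[frozenset((prev_shape, next_shape))]


if __name__ == "__main__":
    print(len(INTERMEDIATE))             # number of intermediate images prepared
    print(intermediate_image("u", "a"))  # same image as for ("a", "u")
```

Keying by unordered pair is one possible design; a table that distinguishes direction would need 42 entries instead.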

[0065] However, the intermediate mouth image may be inserted only for combinations that would look unnatural if the mouth images for the two different vowels were displayed one directly after the other. For example, the mouth shape changes little between the vowels "あ" ("a") and "え" ("e"), so when syllables containing these two vowels occur consecutively, the intermediate mouth image need not be inserted.

[0066] Furthermore, instead of an image expressing a mouth shape intermediate between different vowels as described above, an image expressing some other intermediate mouth may be used. For example, an image of an intermediate mouth that can simulate the mouth shape used to pronounce the consonant of the following syllable may be used.

[0067] The period during which the intermediate mouth image is displayed is not limited to the first half of the second of the two consecutive syllables. It may instead be the second half of the first syllable, or it may span both the end of the first syllable and the start of the second. Likewise, the duration for which the intermediate mouth image is inserted is not limited to half the syllable length and may be shorter or longer.
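The timing options above can be sketched as a single function. This is a minimal illustration under assumed conventions: times are in seconds, and the parameters `before` and `after` are hypothetical names for the fraction of the first syllable's tail and of the second syllable's head that the intermediate image covers. The patent does not prescribe this parameterization.

```python
def intermediate_window(boundary, first_len, second_len, before=0.0, after=0.5):
    """Return (start, end) times for showing the intermediate mouth image.

    boundary   -- time at which the first syllable ends and the second begins
    first_len  -- duration of the first syllable
    second_len -- duration of the second syllable
    before     -- fraction of the first syllable's tail to cover (assumed parameter)
    after      -- fraction of the second syllable's head to cover (assumed parameter)
    """
    if not (0.0 <= before <= 1.0 and 0.0 <= after <= 1.0):
        raise ValueError("fractions must lie between 0 and 1")
    start = boundary - before * first_len
    end = boundary + after * second_len
    return start, end
```

The defaults correspond to the first half of the second syllable; `before=0.5, after=0.0` gives the second half of the first syllable, and nonzero values for both make the window straddle the boundary.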

[0068] (Expression of the mouth shape when the syllable is a moraic nasal) The syllable known as the moraic nasal (hatsuon), that is, the syllable "ん" ("n"), contains no vowel. Therefore, in the present invention, when the syllable is a moraic nasal, an image of a closed mouth is displayed, as shown in FIG. 16.

[0069] With this processing, an image of a closed mouth is displayed when the syllable is a moraic nasal, so natural mouth movement can be expressed in that case as well.

[0070] Table 2 summarizes the processing described above by relating each syllable to the image displayed for it.

【００７１】[0071]

【表２】 [Table 2]

[0072] (Description of operation) Next, processing based on the above principles is described with reference to the flowchart of FIG. 3.

[0073] Step S1 (initial setting): When the game starts, the program data for the main CPU 101, the image data, the sound control program data for the CPU 141, and the waveform data are read from the cartridge interface circuit 1a or the CD-ROM and transferred to the RAM 102. Thereafter, the main CPU 101 executes the main CPU program stored in the RAM 102, and the CPU 141 executes the sound control program that was transferred by the main CPU 101 and stored in the DSP 140.

[0074] Step S2: At each image update timing, the main CPU 101 reads the lyrics data contained in the sound control program data that is transferred to the sound block 12. By referring in sequence to the characters in the lyrics data, it can determine which syllable the sound block 12 will pronounce next.

[0075] To this end, the main CPU 101 reads from the lyrics data the syllable character to be pronounced next and refers to table data, included in the program data, that defines correspondences such as those shown in Table 2. It then transfers to the video block 11 the image data for expressing the mouth that pronounces the vowel identified from this table, namely a set of coordinate data stored in the RAM 102 that specifies the spatial positions of the polygons, so that the model's mouth is displayed in the shape corresponding to the syllable being pronounced.
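The table lookup described here can be sketched as follows. Table 2 itself is not reproduced in this text, so the `ROMAJI` entries are an illustrative excerpt covering only the FIG. 4 example "こんにちは", and the image identifiers simply reuse the figure numbers (FIGS. 10 to 14 for the vowels, FIG. 16 for the closed mouth). Both tables and the function name are assumptions made for the example, not the patent's data layout.

```python
# Illustrative excerpt only: readings for the syllable characters of "こんにちは".
# The particle は is read "wa" in this word.
ROMAJI = {"こ": "ko", "ん": "n", "に": "ni", "ち": "chi", "は": "wa"}

# Mouth image per sound, named after the figures that depict each mouth.
MOUTH_IMAGE = {
    "a": "fig10",
    "i": "fig11",
    "u": "fig12",
    "e": "fig13",
    "o": "fig14",
    "n": "fig16",  # moraic nasal: closed mouth
}


def mouth_for(syllable_char):
    """Return the image id of the mouth shown for one syllable character."""
    reading = ROMAJI[syllable_char]
    sound = reading[-1]  # final letter is the vowel, or "n" for the moraic nasal
    return MOUTH_IMAGE[sound]


if __name__ == "__main__":
    print([mouth_for(ch) for ch in "こんにちは"])
```

Only as many mouth images as there are vowels (plus the closed mouth) are needed, however many syllable characters the table covers.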

[0076] Step S3: First, it is determined whether the syllable to be pronounced by the waveform data next transferred to the sound block 12 is one of the specific syllables listed in Table 1. If it is (S3; YES), an image of a half-open mouth is displayed (S4), and processing proceeds to step S8. That is, coordinate data defining the spatial positions of the polygons for the half-open mouth is transferred to the video block 11, and a mouth such as that in FIG. 15 is displayed. The display duration of the half-open mouth may be set differently for each specific syllable or may be set to a fixed length. It may also be set as a ratio of the half-open display time to the syllable length.

[0077] Step S5: If the syllable is not a specific syllable (S3; NO), it is determined whether the previously pronounced syllable and the syllable now to be pronounced contain the same vowel. If they do, image data for displaying a half-open mouth is transferred to the video block 11 as in step S3, an image of a half-open mouth is displayed in the same way as in step S4 (S6), and processing proceeds to step S8. The display duration of the half-open mouth is also set as in step S4.

[0078] Step S7: If the previously pronounced syllable and the syllable now to be pronounced do not contain the same vowel (S5; NO), the vowels or other sounds in the two syllables can be assumed to differ. The main CPU 101 therefore transfers to the video block 11 image data that can express a mouth intermediate between the sound (vowel) of the previous syllable and the sound of the syllable now to be pronounced (a vowel, the moraic nasal, or the half-open shape, or alternatively a consonant). The main CPU 101 also transfers the waveform data for pronouncing this syllable to the sound block 12. The display duration of the intermediate mouth is set as in step S4.

[0079] Step S8: After the image of the intermediate mouth has been displayed (S4, S6, or S7), it is determined whether the syllable to be pronounced next is a moraic nasal. If it is (S8; YES), coordinate data defining the spatial positions of the polygons that form a closed mouth is transferred to the video block 11, and a mouth such as that in FIG. 16 is displayed (S9).

[0080] If it is not a moraic nasal (S8; NO), the next syllable can be assumed to contain a vowel, so the mouth shape for pronouncing that vowel is identified, the coordinate data defining the spatial positions of the corresponding polygons is transferred to the video block 11, and a mouth such as that in one of FIGS. 10 to 14 is displayed (S10).

[0081] Step S11: If lyrics data remains (S11; NO), the next lyrics data is referred to again (S2). If the lyrics data has ended (S11; YES), the processing of the present invention ends.
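The branch structure of steps S3 through S11 can be sketched as a loop that emits the sequence of mouth images. This is a simplified illustration, not the patented implementation: syllables are given in romanized form, `SPECIAL_SYLLABLES` is a small illustrative subset because Table 1 is not reproduced here, the image labels are hypothetical, and skipping the transitional image for the very first syllable is an assumption the text does not address.

```python
# Illustrative subset of the specific syllables; the full list is in Table 1.
SPECIAL_SYLLABLES = {"ma", "wa"}


def sound_of(syllable):
    """Final letter of a romanized syllable: its vowel, or 'n' for the moraic nasal."""
    return syllable[-1]


def mouth_frames(syllables):
    """Return the ordered mouth-image labels for a list of romanized syllables."""
    frames = []
    prev = None  # sound of the previously pronounced syllable
    for syl in syllables:                        # S2 / S11: next syllable until data ends
        cur = sound_of(syl)
        if syl in SPECIAL_SYLLABLES:             # S3 YES -> S4: half-open mouth
            frames.append("half_open")
        elif prev is not None and cur == prev:   # S5 YES -> S6: half-open mouth
            frames.append("half_open")
        elif prev is not None:                   # S5 NO -> S7: intermediate mouth
            frames.append("mid_" + "_".join(sorted((prev, cur))))
        # S8: moraic nasal -> closed mouth (S9); otherwise the vowel's mouth (S10)
        frames.append("closed" if cur == "n" else cur)
        prev = cur
    return frames


if __name__ == "__main__":
    print(mouth_frames(["ko", "n", "ni", "chi", "wa"]))
```

Each syllable contributes at most one transitional image followed by its main mouth image, which mirrors the order S4/S6/S7 and then S9/S10 in the flowchart description.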

[0082] (Effects) As described above, according to this embodiment, the mouth for each syllable is represented by the mouth that pronounces the vowel contained in that syllable, so a model whose mouth moves naturally in time with the syllables can be expressed with a small amount of image data overall.

[0083] In particular, even when the same vowel occurs in two consecutive syllables, a half-open mouth shape that differs from every vowel is displayed between them, so more natural mouth movement can be expressed.

[0084] Also, even when two consecutive syllables contain different vowels, an image of an intermediate mouth shape is displayed between the images of the mouths that pronounce those vowels, so still more natural mouth movement can be expressed.

[0085] Furthermore, an image of a closed mouth is displayed when the syllable is a moraic nasal, so more natural mouth movement can be expressed.

[0086] (Other modifications) The present invention is not limited to the above embodiments and can be modified in various ways. For example, the above embodiment assumes that an idol character sings and therefore refers to lyrics data, but spoken lines may be used instead of lyrics. In that case, in step S2, the mouth shape is determined from data that represents the spoken lines in syllabic characters such as hiragana or katakana, or in Roman letters, instead of from lyrics data.

[0087] In the above embodiment the model is a human face, but the model is not limited to humans. The invention can be applied to animals, robots, and anything else that has a mouth and can depict human-like speech.

[0088] In the above embodiment the model is displayed using polygons, but a plurality of bitmap images, one for each mouth shape, may instead be stored and read out in turn for display. In this case the present invention is applied to two-dimensional display.

【００８９】[0089]

[Effects of the Invention] According to the present invention, it is possible to provide an image sound processing apparatus and a recording medium whose expressive power naturally simulates how a person actually speaks and creates the illusion that the model is really singing or talking.

[0090] In other words, according to the present invention, displaying the image of the mouth that pronounces the vowel of each syllable during that syllable's sounding period produces an image in which the model approximately appears to be speaking. Because image data is needed only for the vowels, a fixed, limited amount of image data suffices.

[0091] According to the present invention, human speech can be simulated naturally even when syllables containing the same vowel occur consecutively.

[0092] According to the present invention, even when syllables containing different vowels occur consecutively, an image of an intermediate mouth shape is inserted between the images of the mouths that pronounce each of the two vowels, so human speech can be simulated more naturally.

[0093] According to the present invention, an image of a closed mouth is displayed when the syllable is a moraic nasal, so the natural mouth movement of pronouncing the moraic nasal can be expressed.

[Brief description of the drawings]

FIG. 1 is an external view of a game device to which the image sound processing apparatus of the present invention is applied.

FIG. 2 is a block diagram of the game device provided with the image sound processing apparatus of the present invention.

FIG. 3 is a flowchart illustrating the operation of the present embodiment.

FIG. 4 shows the display order when "こんにちは" ("konnichiwa") is expressed using vowel mouth shapes only.

FIG. 5 shows the display order when syllables containing the same vowel occur consecutively.

FIG. 6 shows the display order for a specific syllable.

FIG. 7 shows the display order when adjacent syllables contain different sounds.

FIG. 8 is a configuration diagram of a model composed of polygons only.

FIG. 9 is a display image of a model to which texture data has been applied.

FIG. 10 is an example of a display image of the mouth pronouncing "あ" ("a").

FIG. 11 is an example of a display image of the mouth pronouncing "い" ("i").

FIG. 12 is an example of a display image of the mouth pronouncing "う" ("u").

FIG. 13 is an example of a display image of the mouth pronouncing "え" ("e").

FIG. 14 is an example of a display image of the mouth pronouncing "お" ("o").

FIG. 15 is an example of a display image of a half-open mouth.

FIG. 16 is an example of a display image of a closed mouth (moraic nasal).

[Description of Reference Signs]
10: CPU block (part of the image display means; processing circuit)
11: Video block (part of the image display means; image display circuit)
12: Sound block (sound generating means; sound generation circuit)
102: RAM (part of the storage means; part of the storage circuit)
121: VRAM (part of the storage means; part of the storage circuit)

Continued from the front page: (51) Int. Cl.⁶ / FI: G10L 3/00; G06F 15/62 350A

Claims

[Claims]

1. An image sound processing apparatus that generates sound in accordance with an image representing mouth movement, comprising: sound generating means for generating sound; image display means for displaying an image representing the movement of the mouth when a syllable is pronounced; and storage means for storing image data for expressing a mouth pronouncing a vowel, one set for each type of vowel, wherein the image display means identifies the vowel included in a syllable to be pronounced by the sound generating means, reads from the storage means the image data for expressing the mouth pronouncing the identified vowel, and displays an image based on the read image data.

2. The image sound processing apparatus according to claim 1, wherein the storage means stores, in addition to the image data for expressing a mouth pronouncing the vowel, image data for expressing a half-open mouth, and wherein, when the syllable to be pronounced by the sound generating means is a predetermined specific syllable, the image display means displays, based on the image data for expressing the half-open mouth stored in the storage means and the image data for expressing the mouth pronouncing the vowel included in the syllable, an image representing the mouth pronouncing the vowel following an image representing the half-open mouth.

3. The image sound processing apparatus according to claim 1, wherein the storage means stores, in addition to the image data for expressing a mouth pronouncing the vowel, image data for expressing a half-open mouth, and wherein, when the image display means determines that consecutive syllables to be pronounced by the sound generating means contain the same vowel, it displays, based on the image data for expressing the half-open mouth stored in the storage means, an image representing the half-open mouth between the images expressing the mouths that pronounce the respective consecutive syllables.

4. The image sound processing apparatus according to claim 1, wherein the storage means stores, in addition to the image data for expressing a mouth pronouncing the vowel, image data for expressing, for each of one or more combinations of different vowels, a mouth intermediate between the two vowels of that combination, and wherein, when the sound generating means pronounces consecutive syllables containing different vowels, the image display means displays, based on the image data stored in the storage means, an image expressing a mouth opening intermediate between the two vowels between the images expressing the mouths that pronounce each of the different vowels.

5. The image sound processing apparatus according to claim 1, wherein the storage means stores, in addition to the image data for expressing a mouth pronouncing the vowel, image data for expressing a closed mouth, and wherein, when a syllable to be pronounced by the sound generating means includes a moraic nasal, the image display means displays an image based on the image data for expressing the closed mouth stored in the storage means.

6. An image sound processing apparatus that generates sound in accordance with an image representing mouth movement, comprising: a storage device storing image data for expressing a mouth pronouncing a vowel, waveform data for generating sound, and program data; a sound generation circuit that generates sound based on the waveform data read from the storage device; an image display circuit that displays an image representing mouth movement based on the image data read from the storage device; and a processing circuit that controls the sound generation circuit and the image display circuit based on the program data read from the storage device, wherein the processing circuit identifies the vowel included in the syllable pronounced by the waveform data, causes sound based on the waveform data to be generated, transfers the image data for expressing the mouth pronouncing the vowel to the image display circuit, and causes an image based on the image data to be displayed.

7. The storage device stores image data for expressing a half-open mouth in addition to image data for expressing a mouth when the vowel is pronounced. When the syllable pronounced by the waveform data is a predetermined specific syllable, image data for expressing the half-open mouth stored in the storage device while generating a sound based on the waveform data. And transferring image data for expressing the mouth when the vowel included in the syllable is to be generated to the image display circuit, and expressing the mouth when the vowel is generated following the image expressing the half-open mouth. 7. An image to be displayed is displayed.
3. The image acoustic processing device according to claim 1.

8. The image sound processing apparatus according to claim 6, wherein the storage device stores, in addition to the image data for expressing the mouth pronouncing the vowel, image data for expressing a half-open mouth, and wherein, when the processing circuit determines that consecutive syllables pronounced by the waveform data contain the same vowel, it causes sound based on the waveform data to be generated, transfers to the image display circuit the image data stored in the storage device for expressing the mouth pronouncing the vowel included in those syllables and the image data for expressing the half-open mouth, and causes an image representing the half-open mouth to be displayed between the images expressing the mouths that pronounce the respective syllables.

9. The image sound processing apparatus according to claim 6, wherein the storage device stores, in addition to the image data for expressing the mouth pronouncing the vowel, image data for expressing, for each of one or more combinations of different vowels, a mouth intermediate between the two vowels of that combination, and wherein, when consecutive syllables containing different vowels are pronounced, the processing circuit causes, based on the image data stored in the storage device, an image expressing a mouth opening intermediate between the two vowels to be displayed between the images expressing the mouths that pronounce each of the different vowels.

10. The image sound processing apparatus according to claim 6, wherein the storage device stores, in addition to the image data for expressing the mouth pronouncing the vowel, image data for expressing a closed mouth, and wherein, when a syllable pronounced by the waveform data includes a moraic nasal, the processing circuit causes sound based on the waveform data to be generated, transfers the image data for expressing the closed mouth stored in the storage device to the image display circuit, and causes an image representing the closed mouth to be displayed.

11. The specific syllable includes 'sa',
'Set', 'ta', 'ma', 'mi', 'mu', 'me',
'Well', 'Wa', 'O', 'B', 'Bi', 'Bu',
The image according to claim 2, wherein the image includes part or all of “be”, “bo”, “ぱ”, “ぴ”, “ぷ”, “ぺ”, and “ぽ”. Sound processing device.

12. The image sound processing apparatus according to claim 1, wherein the syllable is identified based on character data.

13. A machine-readable recording medium storing program data for causing a computer to execute the steps of: identifying a vowel included in a syllable pronounced by waveform data; generating sound based on the waveform data; transferring image data for expressing a mouth pronouncing the vowel to an image display circuit; and displaying an image based on the image data.

14. A machine-readable recording medium storing program data for causing a computer to execute the steps of: identifying whether a syllable pronounced by waveform data is a predetermined specific syllable; generating sound based on the waveform data when the syllable is identified as the specific syllable; transferring, when the syllable is identified as the specific syllable, image data for expressing a half-open mouth and image data for expressing a mouth pronouncing the vowel included in the syllable to an image display circuit; and displaying an image representing the mouth pronouncing the vowel following an image representing the half-open mouth.

15. A machine-readable recording medium storing program data for causing a computer to execute the steps of: identifying whether consecutive syllables pronounced by waveform data contain the same vowel; generating sound based on the waveform data when the consecutive syllables are identified as containing the same vowel; transferring, when the consecutive syllables are identified as containing the same vowel, image data for expressing a mouth pronouncing the vowel included in the syllables and image data for expressing a half-open mouth to an image display circuit; and displaying an image representing the half-open mouth between the images expressing the mouths that pronounce each of the consecutive syllables.

16. A machine-readable recording medium storing program data for causing a computer to execute the steps of: identifying whether consecutive syllables pronounced by waveform data contain different vowels; generating sound based on the waveform data when the consecutive syllables are identified as containing different vowels; transferring, when the consecutive syllables are identified as containing different vowels, image data for expressing a mouth pronouncing each vowel and image data for expressing a mouth opening intermediate between the different vowels to an image display circuit; and displaying an image expressing the mouth opening intermediate between the two vowels between the images expressing the mouths that pronounce each of the different vowels.

17. A machine-readable recording medium storing program data for causing a computer to execute the steps of: identifying whether a syllable pronounced by waveform data includes a moraic nasal; generating sound based on the waveform data when the syllable is identified as including a moraic nasal; transferring image data for expressing a closed mouth to an image display circuit when the syllable is identified as including a moraic nasal; and displaying an image representing the closed mouth.