JP2017217425A

JP2017217425A - Game control device, game control method, and game control program

Info

Publication number: JP2017217425A
Application number: JP2016116741A
Authority: JP
Inventors: 正明牧野; Masaaki Makino
Original assignee: Individual
Current assignee: Individual
Priority date: 2016-06-12
Filing date: 2016-06-12
Publication date: 2017-12-14

Abstract

PROBLEM TO BE SOLVED: To improve the playability of a game that can be operated by voice.
SOLUTION: A voice complexity calculation unit 123 calculates, from voice analysis data, a voice complexity, which is an index of how complex a voice signal is. A voice command storage unit 113 stores voice commands, each combining a voice signal, its voice complexity, and an execution command. A voice command selection unit 125 receives a voice signal from outside, has a voice analysis unit 122 calculate its voice analysis data, compares that data with the voice analysis data of the voice signal of each stored voice command, and selects the voice command with the highest similarity. A voice command execution unit 126 executes the execution command of the selected voice command and changes the processing content based on the voice complexity.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a game control device, a game control method, and a game control program that can be operated by voice input.

Game software on dedicated game consoles and mobile terminals is generally operated with a dedicated game controller or a touch panel, but some game software can be operated by speaking into a microphone.

For example, Patent Document 1 discloses a technique that makes a video game more entertaining, and operating its characters more enjoyable, by allowing the player character to be operated by voice as well as by a controller. According to that document, saying "It's a flamethrower!" instructs the player character to switch its weapon to a flamethrower, and saying "Hang in there!" lowers the player character's fear parameter.

Patent Document 1: JP 2002-248261 A

However, the prior art merely associates a voice with a command and does not consider how difficult the voice itself is to input. Because the nature of the voice has no influence on the effect of the associated command, inputting a short, simple voice is more advantageous for progressing through the game than inputting a long, complex one.

The present invention has been made in view of the above. Its object is to improve the playability of a voice-operable game by converting voice input by the user into a command and varying the executed command according to the complexity of the voice.

To solve the above problem, a game control device according to the present invention includes: a voice analysis unit that calculates voice analysis data, which is data quantifying the acoustic characteristics of a voice signal; a voice complexity calculation unit that calculates, from the voice analysis data, a voice complexity, which is an index of the complexity of the voice signal; a voice command storage unit that stores voice commands, each combining a voice signal, the voice complexity of that voice signal, and an execution command indicating the processing to execute for that voice signal; a voice command selection unit that has the voice analysis unit analyze a first voice signal input from outside to obtain its voice analysis data, compares the voice analysis data of the voice signal of each voice command stored in the voice command storage unit with the voice analysis data of the first voice signal, and selects the voice command for which the highest similarity is calculated; and a voice command execution unit that executes the processing of the execution command of the selected voice command and changes the processing content of the execution command based on the voice complexity of the selected voice command.

A game control method according to the present invention includes the steps of: calculating voice analysis data, which is data quantifying the acoustic characteristics of a voice signal; calculating, from the voice analysis data, a voice complexity, which is an index of the complexity of the voice signal; storing voice commands, each combining a voice signal, the voice complexity of that voice signal, and an execution command indicating the processing to execute for that voice signal; taking a voice signal input from outside as a first voice signal, obtaining the voice analysis data of the first voice signal, comparing the voice analysis data of the voice signal of each stored voice command with the voice analysis data of the first voice signal, and selecting the voice command for which the highest similarity is calculated; and executing the processing of the execution command of the selected voice command while changing the processing content of the execution command based on the voice complexity of the selected voice command.

A game control program according to the present invention causes a computer to execute the steps of: calculating voice analysis data, which is data quantifying the acoustic characteristics of a voice signal; calculating, from the voice analysis data, a voice complexity, which is an index of the complexity of the voice signal; storing voice commands, each combining a voice signal, the voice complexity of that voice signal, and an execution command indicating the processing to execute for that voice signal; taking a voice signal input from outside as a first voice signal, obtaining the voice analysis data of the first voice signal, comparing the voice analysis data of the voice signal of each stored voice command with the voice analysis data of the first voice signal, and selecting the voice command for which the highest similarity is calculated; and executing the processing of the execution command of the selected voice command while changing the processing content of the execution command based on the voice complexity of the selected voice command.

Because the complexity of the voice signal directly affects game operations and effects, the game control device of the present invention can provide a more engaging game.

FIG. 1 is a block diagram showing a configuration example of the game control device according to Embodiment 1.
FIG. 2 is a diagram showing an example of the format stored in the voice analysis data storage unit according to Embodiment 1.
FIG. 3 is a diagram showing an example of the format stored in the voice command storage unit according to Embodiment 1.
FIG. 4 is a flowchart showing an example of the processing flow of the voice analysis unit according to Embodiment 1.
FIG. 5 is a diagram showing an example of an amplitude spectrum obtained by analyzing voice data according to Embodiment 1.
FIG. 6 is a diagram showing an example of the UI presented by the voice command creation unit according to Embodiment 1.
FIG. 7 is a flowchart showing an example of the processing flow in which the voice command creation unit according to Embodiment 1 registers the player's voice.
FIG. 8 is a flowchart showing the processing flow of the voice command selection unit according to Embodiment 1.
FIG. 9 is a diagram showing an example of a game screen according to Embodiment 1.
FIG. 10 is a diagram showing an example in which the voice command execution unit according to Embodiment 1 has executed a command.
FIG. 11 is a diagram showing an example in which the voice command execution unit according to Embodiment 1 has varied the effect of a command according to the voice complexity.

The configuration of the game control device according to Embodiment 1 is described with reference to FIG. 1. FIG. 1 is a functional block diagram showing a configuration example of the game control device according to Embodiment 1.

As shown in FIG. 1, the game control device 100 includes an input unit 101, a display unit 102, a microphone 103, a storage unit 110, and a control unit 120.

The input unit 101 includes a keyboard, a mouse, a touch panel, and the like, and accepts various kinds of information for the game control device 100 through user operation. The display unit 102 includes a display device such as a monitor (or a display, touch panel, or the like) and a speaker, and outputs various kinds of information from the game control device 100. The microphone 103 is provided to capture the player's voice.

The storage unit 110 is a storage device such as RAM (Random Access Memory), a hard disk, an optical disk, or flash memory, and stores the data required for the various processes of the control unit 120 and the results of those processes.

The storage unit 110 includes a voice data storage unit 111, a voice analysis data storage unit 112, and a voice command storage unit 113.

The voice data storage unit 111 stores voice data, which is the digital data of a voice signal, together with a voice data ID that uniquely identifies that voice data.

The voice analysis data storage unit 112 stores voice analysis data, which is the result of acoustic analysis performed on voice data. Specifically, it stores each voice data ID together with the corresponding voice analysis data. The voice analysis data consists of time-series data of the amplitude spectrum for each frequency band, and the start and end positions of each voiced section, that is, each section in which voice is detected continuously.

FIG. 2 shows an example of the format of the voice analysis data storage unit 112. FIG. 2(A) stores the time-series data of the amplitude spectrum for each frequency band, and FIG. 2(B) stores the start and end positions of the voiced sections. FIG. 2(A) stores, for each time-series number, the voice data ID and the amplitude spectrum of frequency bands 1 to 500. The voice data in FIG. 2(A) was sampled at 44,100 Hz and analyzed with an FFT frame size of 2,048 samples, so the time resolution is about 46.4 ms and the frequency resolution is about 21.5 Hz. FIG. 2(B) stores the voice data ID, a section number, and the start and end positions of that section expressed as time-series numbers. Two voiced sections are stored for voice data ID 1: section 1 runs from time-series number 1 to 5, and section 2 from time-series number 7 to 10.
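The resolution figures quoted for FIG. 2(A) follow directly from the sampling frequency and the FFT frame size. The short check below reproduces them; the function and constant names are illustrative and do not come from the patent.

```python
SAMPLE_RATE_HZ = 44100   # sampling frequency stated for FIG. 2(A)
FRAME_SIZE = 2048        # FFT frame size in samples


def frame_resolution(sample_rate_hz: int, frame_size: int) -> tuple[float, float]:
    """Return (time resolution in ms, frequency resolution in Hz) for one FFT frame."""
    time_ms = frame_size / sample_rate_hz * 1000.0
    freq_hz = sample_rate_hz / frame_size
    return time_ms, freq_hz


time_ms, freq_hz = frame_resolution(SAMPLE_RATE_HZ, FRAME_SIZE)
print(f"{time_ms:.1f} ms, {freq_hz:.1f} Hz")  # prints "46.4 ms, 21.5 Hz"
```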

The voice command storage unit 113 stores the execution command that corresponds to each piece of voice data. Specifically, it stores a combination of a voice data ID, a voice complexity indicating how complex that voice data is, and an execution command ID that identifies the execution command. This combination is called a voice command.

An execution command is a command that is actually carried out as an in-game operation, such as "cast recovery magic" or "use weapon special move A". The voice complexity is described later.

FIG. 3 shows an example of the format of the voice command storage unit 113. In this example, each entry stores a voice command ID that uniquely identifies the voice command, a voice data ID, an execution command ID, and a voice complexity. For the voice command whose voice command ID is 1, the stored voice data ID is "voice data 1", the execution command ID is "flame magic", and the voice complexity is 100.
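A record with the four fields described for FIG. 3 could be modeled as below. This is a minimal sketch: the class name, field names, and the in-memory dictionary standing in for the voice command storage unit 113 are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceCommand:
    """One entry of the voice command store (field names are hypothetical)."""
    voice_command_id: int
    voice_data_id: str
    execution_command_id: str
    voice_complexity: float


# The entry described for FIG. 3.
command_1 = VoiceCommand(1, "voice data 1", "flame magic", 100)

# Hypothetical in-memory stand-in for the voice command storage unit 113.
voice_command_store = {command_1.voice_command_id: command_1}
```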

In this embodiment the storage unit 110 is described as part of the game control device 100, but some or all of it may instead be located in an external device connected over a network.

The control unit 120 executes a control program and programs that define various processing procedures. It therefore has the internal memory needed to run those programs and is built from electronic components such as an ASIC (Application Specific Integrated Circuit) or an SoC (System on a Chip). The control unit 120 further includes a voice signal conversion unit 121, a voice analysis unit 122, a voice complexity calculation unit 123, a voice command creation unit 124, a voice command selection unit 125, and a voice command execution unit 126.

The voice signal conversion unit 121 quantizes the analog voice signal input from the microphone 103 into a digital signal and converts it into voice data. The converted digital data is assigned a voice data ID, an identifier that uniquely identifies the voice data, and is stored in the voice data storage unit 111.

The voice analysis unit 122 analyzes the signal of the voice data created by the voice signal conversion unit 121 and creates voice analysis data. Its processing flow is described with reference to the flowchart of FIG. 4.

First, the voice analysis unit 122 uses the voice data ID passed to it to read the voice data itself from the voice data storage unit 111 (step S100).

Next, the voice analysis unit 122 applies a fast Fourier transform to the loaded voice data to calculate time-frequency data (step S110).

Next, the voice analysis unit 122 calculates the amplitude spectrum of the calculated frequency components (step S120).
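Steps S110 and S120 can be sketched as follows. To stay dependency-free, the sketch uses a naive O(n²) discrete Fourier transform in place of the FFT named in the text (it gives the same values, only more slowly), and it assumes non-overlapping frames. The function names are illustrative.

```python
import cmath
import math


def amplitude_spectrum(frame: list[float]) -> list[float]:
    """Amplitude |X[k]| for bands k = 0 .. n/2 of one frame, via a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        spectrum.append(abs(acc))
    return spectrum


def spectrogram(samples: list[float], frame_size: int) -> list[list[float]]:
    """Split samples into non-overlapping frames (an assumption) and analyze each one."""
    usable = len(samples) - len(samples) % frame_size
    return [amplitude_spectrum(samples[i:i + frame_size]) for i in range(0, usable, frame_size)]


# A sine wave completing 2 cycles per 16-sample frame should peak in band 2.
tone = [math.sin(2 * math.pi * 2 * i / 16) for i in range(32)]
frames = spectrogram(tone, 16)
peak_band = max(range(len(frames[0])), key=lambda k: frames[0][k])
```

Each inner list corresponds to one time-series row of the kind stored in FIG. 2(A).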

Next, the voice analysis unit 122 extracts voiced sections, that is, sections in which the amplitude spectrum in a predetermined frequency band stays at or above a certain level, meaning that sound at or above that level was detected (step S130).

FIG. 5 is an example of a graph with time on the horizontal axis and the average amplitude spectrum in a predetermined frequency band on the vertical axis. Threshold 1100 is the amplitude-spectrum level at or above which sound is considered present. In FIG. 5, the voice analysis unit 122 extracts three voiced sections: 1000, 1010, and 1020. The amplitude spectrum is nonzero between voiced sections 1010 and 1020, but because it falls below threshold 1100 there, the voice analysis unit 122 detects 1010 and 1020 as separate voiced sections.
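The thresholding in step S130 amounts to finding runs of consecutive frames whose level is at or above the threshold. The sketch below assumes the per-frame level has already been reduced to one number (for example the band average plotted in FIG. 5); the function name and sample values are made up for illustration.

```python
def find_voiced_sections(levels: list[float], threshold: float) -> list[tuple[int, int]]:
    """Return (start, end) time-series numbers, 1-based and inclusive, of runs at or above threshold."""
    sections = []
    start = None
    for index, level in enumerate(levels, start=1):
        if level >= threshold and start is None:
            start = index                        # a voiced section begins
        elif level < threshold and start is not None:
            sections.append((start, index - 1))  # the section ended on the previous frame
            start = None
    if start is not None:
        sections.append((start, len(levels)))    # section still open at the end of the data
    return sections


# Frame 6 is nonzero but below the threshold, so it splits the voice into two sections.
levels = [5, 6, 7, 6, 5, 1, 4, 5, 6, 5]
print(find_voiced_sections(levels, threshold=3))  # prints [(1, 5), (7, 10)]
```

The sample values were chosen so that the result matches the two sections listed for voice data ID 1 in FIG. 2(B).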

Next, the voice analysis unit 122 saves the voice analysis data in the voice analysis data storage unit 112 (step S140).

This concludes the description of the voice analysis unit 122.

The voice complexity calculation unit 123 calculates the voice complexity from the created voice analysis data.

The voice complexity is a numerical index of how complex the input voice is as an acoustic signal; the larger the value, the more complex the voice. It is calculated from parameters such as the length of time during which voice is input (the voiced section length) and the number of voiced sections. The voiced section length is obtained by retrieving from the voice analysis data storage unit 112 the lengths of all voiced sections corresponding to the voice data ID and summing them. The number of voiced sections is the count of voiced sections stored in the voice analysis data storage unit 112 for that voice data ID.

The voice complexity is calculated as a monotonically increasing function of the voiced section length and the number of voiced sections. For example, with voiced section length L (in seconds) and number of voiced sections N, the voice complexity C is calculated by Equation 1 below.

In Equation 1, a and b are arbitrary coefficients.
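Equation 1 is not reproduced in this text. The sketch below therefore assumes the simplest form that fits the description, C = a·L + b·N, which increases monotonically in both L and N when a and b are positive. The linear form, the default coefficient values, and the function name are all assumptions.

```python
FRAME_SECONDS = 2048 / 44100  # duration of one time-series step, from the FIG. 2 parameters


def voice_complexity(sections, a=10.0, b=5.0, frame_seconds=FRAME_SECONDS):
    """Assumed linear form C = a * L + b * N (not confirmed by the source).

    sections: list of (start, end) time-series numbers, inclusive.
    """
    total_frames = sum(end - start + 1 for start, end in sections)
    length_seconds = total_frames * frame_seconds   # L: total voiced section length
    section_count = len(sections)                   # N: number of voiced sections
    return a * length_seconds + b * section_count


one_section = voice_complexity([(1, 5)])
two_sections = voice_complexity([(1, 5), (7, 10)])
```

With these inputs, `two_sections` is larger than `one_section` because both the total voiced length and the section count increase.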

The voice complexity calculation unit 123 may of course use parameters other than the length and number of voiced sections. For example, it may count the number of consonants that appear and increase the voice complexity as that count grows.

Conversely, a monotonous voice such as "aaah" may be detected and penalized with a penalty term that lowers the voice complexity. As a concrete example, when the amplitude spectrum in a frequency band corresponding to voiced sound is at or above a predetermined value, and that condition persists in the same frequency band for at least a predetermined time, the voice complexity is calculated as in Equation 2, where Tθ is the predetermined time and T is the continuously detected time.

To make the calculated voice complexity C grow faster as the voiced sections get longer, a predetermined exponent may be applied to the voiced section length L or the number of voiced sections N in Equation 1. For example, Equation 3 below raises the voiced section length L to the power 1.05, so the rate of increase of the voice complexity C rises gently as L grows.
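Equation 3 is not reproduced in this text. As an illustration only, if Equation 1 is assumed to have the linear form $C = aL + bN$ (an assumption; the text states only that it is monotonically increasing with coefficients a and b), the exponent variant and its slope with respect to L would be:

```latex
C = a L^{1.05} + b N, \qquad \frac{\partial C}{\partial L} = 1.05\, a\, L^{0.05}
```

Under that assumption the slope is itself increasing in L for a > 0, and only slowly because the exponent 0.05 is small, which matches the gentle acceleration described.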

The voice command creation unit 124 presents the user with a UI for creating voice commands and stores each created voice command in the voice command storage unit 113. The UI lets the user record a voice and register the ID of the execution command that the voice triggers. FIG. 6 shows an example of this UI.

The execution command list 1200 in FIG. 6 lists the execution commands that can be selected. The user selects an execution command through the input unit 101. The currently selected execution command is shown with a check mark; in FIG. 6, "ice magic" is selected.

The record button 1210 starts voice recording. When the record button 1210 is pressed, the voice command creation unit 124 begins accepting the player's voice input. The processing flow of the voice command creation unit 124 when the record button 1210 is pressed is described with reference to the flowchart of FIG. 7.

First, the voice command creation unit 124 accepts the player's voice input through the microphone 103 (step S200).

Next, the voice command creation unit 124 passes the input voice to the voice signal conversion unit 121, which converts the voice signal into digital voice data (step S210).

Next, the voice command creation unit 124 passes the voice data to the voice analysis unit 122 and has it calculate the voice analysis data (step S220).

Next, the voice command creation unit 124 has the voice complexity calculation unit 123 calculate the voice complexity from the voice analysis data (step S230).

This is the processing performed by the voice command creation unit 124 when the record button 1210 is pressed.

Returning to FIG. 6, the save button 1220 stores the voice command. When the save button 1220 is pressed, the voice command creation unit 124 combines the execution command ID of the selected execution command, the voice data ID of the recorded voice, and the calculated voice complexity, and stores the combination in the voice command storage unit 113. The cancel button 1230 leaves the voice command creation screen without saving the voice command.

This concludes the description of the voice command creation unit 124.

The voice command selection unit 125 compares the input voice with the voice data of the voice commands stored in the voice command storage unit 113 and selects the closest voice command. Its processing flow is described with reference to the flowchart of FIG. 8.

First, the voice command selection unit 125 accepts voice input from the microphone 103 (step S300).

Next, the voice command selection unit 125 passes the input voice signal to the voice signal conversion unit 121, which converts it into voice data (step S310).

Next, the voice command selection unit 125 passes the voice data obtained in step SXXX to the voice analysis unit 122 and obtains the voice analysis data (step S320).

Next, the voice command selection unit 125 reads all created voice commands from the voice command storage unit 113 and calculates the similarity between the voice analysis data corresponding to each voice command's voice data ID and the voice analysis data calculated in step S320 (step S330).

The similarity is calculated by comparing the amplitude spectra and finding the error between them. The amplitude spectrum in the voice analysis data is an N × T matrix of N frequency bands by T time steps. The frequency bands are the same for all voice analysis data, but T varies with the recording length. To find the error between two sets of data, the time axis is therefore corrected using dynamic time warping.

Dynamic time warping is a technique for comparing two sets of data whose time series differ in length. Specifically, let the two sets be data A and data B, with time series ta(α) (0 ≤ α ≤ A) for data A and tb(β) (0 ≤ β ≤ B, A ≤ B) for data B. First, ta(0) is compared with each element of tb(β), and the tb(β0) with the smallest Euclidean distance is selected. The possible range of β0 is 0 ≤ β0 ≤ (B − A).

Next, ta(1) is compared with each element of tb(β) in the same way, and the tb(β1) with the smallest Euclidean distance is selected. The possible range of β1 is β0 < β1 ≤ (B − β0) − (A − 1). Generalizing, the possible range of β for ta(m) is given by Equation 4 below.

This comparison is repeated up to ta(A), and the sum E of the distances found is calculated. The similarity S is then calculated from E by Equation 5 below.
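The frame-by-frame matching described above can be sketched as a greedy, strictly increasing alignment. Note that this follows the procedure as the text describes it and is not the classical dynamic-programming form of dynamic time warping. Equations 4 and 5 are not reproduced in this text, so two details are assumptions: the upper bound on β for frame m is taken as the last index that still leaves one frame of the longer series for each remaining frame of the shorter one (which agrees with the stated range of β0), and the similarity is taken as S = 1 / (1 + E).

```python
import math


def euclidean(u, v):
    """Euclidean distance between two spectrum frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def greedy_alignment_error(shorter, longer):
    """Sum E of per-frame minimum distances under a strictly increasing greedy alignment."""
    if len(shorter) > len(longer):
        shorter, longer = longer, shorter
    total = 0.0
    previous = -1
    for m, frame in enumerate(shorter):
        # Assumed bound: keep enough frames of `longer` for the remaining frames of `shorter`.
        last_allowed = len(longer) - (len(shorter) - m)
        best = min(range(previous + 1, last_allowed + 1),
                   key=lambda beta: euclidean(frame, longer[beta]))
        total += euclidean(frame, longer[best])
        previous = best
    return total


def similarity(error):
    """Assumed mapping from the total error E to a similarity in (0, 1]."""
    return 1.0 / (1.0 + error)


data_a = [[0.0], [1.0], [2.0]]
data_b = [[0.0], [0.0], [1.0], [1.0], [2.0]]   # same shape as data_a, stretched in time
data_c = [[5.0], [5.0], [5.0], [5.0], [5.0]]   # unrelated voice
```

Here `data_b` aligns to `data_a` with zero total error, so its similarity is 1.0, while `data_c` yields a positive error and a lower similarity.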

Next, the voice command selection unit 125 selects the voice command with the highest of the similarities calculated in step S330 (step S340).

Next, it is determined whether the similarity of the voice command selected in step S340 is at or above a predetermined threshold (step S350). If it is below the threshold, no voice command is considered to match and processing ends. If it is at or above the threshold, processing proceeds to step S360.

If the similarity of the selected voice command is at or above the predetermined threshold (YES in step S350), the voice command selection unit 125 makes the final selection, treating that voice command as the one that was input (step S360).
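Steps S340 to S360 reduce to picking the highest similarity and rejecting it when it falls below the threshold. A minimal sketch, assuming the similarities from step S330 are available as a mapping from a command identifier to a score (the function name and data shape are illustrative):

```python
def select_voice_command(similarities: dict[str, float], threshold: float):
    """Return the ID of the most similar command, or None if nothing clears the threshold."""
    if not similarities:
        return None
    best_id = max(similarities, key=similarities.get)   # step S340: highest similarity
    if similarities[best_id] < threshold:               # step S350: below threshold, no match
        return None
    return best_id                                       # step S360: final selection


scores = {"flame magic": 0.8, "ice magic": 0.4}
```

With these scores, a threshold of 0.5 selects "flame magic", and a threshold of 0.9 returns `None`.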

The voice command execution unit 126 carries out the voice command selected by the voice command selection unit 125 as actual processing.

For example, if the selected voice command is voice command 1 shown in FIG. 3, the voice command type is "flame magic". In this case, the "flame magic" processing associated with that voice command type is performed. At the same time, the degree of the voice command's effect is varied according to its complexity. In the "flame magic" example, changes such as altering the appearance or increasing the power are applied according to the complexity.
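How the complexity modifies the effect is left open by the text. One possible rule, shown purely as an illustration, scales both power and visual size linearly with the voice complexity; the scaling formula, base values, and names below are hypothetical.

```python
def flame_magic_effect(voice_complexity: float,
                       base_power: float = 10.0,
                       base_radius: float = 1.0) -> dict[str, float]:
    """Scale the 'flame magic' effect by voice complexity (hypothetical linear rule)."""
    scale = 1.0 + voice_complexity / 100.0   # assumed rule: complexity 100 doubles the effect
    return {"power": base_power * scale, "fireball_radius": base_radius * scale}


simple = flame_magic_effect(voice_complexity=20)
elaborate = flame_magic_effect(voice_complexity=100)
```

Under this rule the more complex voice produces both a stronger and a visibly larger fireball, which is the kind of difference shown between FIG. 10 and FIG. 11.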

The operation of the voice command selection unit 125 and the voice command execution unit 126 will be described with reference to the example game screen in FIG. 9. In FIG. 9, a player character 1300, a microphone button 1310, and an enemy 1320 are arranged on the display unit 102. The objective of the game is to defeat the enemy 1320 by using voice input to cast magic, which is a voice command.

When the microphone button 1310 is pressed, the voice command selection unit 125 accepts voice input and selects a voice command according to the input voice. The voice command execution unit 126 then executes the predetermined processing for the selected voice command.

FIG. 10 shows a state in which the voice command type "flame magic" has been selected by the voice command selection unit 125 and "flame magic" has been activated. Specifically, the voice command execution unit 126 moves a fireball 1330, the result of activating "flame magic", from the player character 1300 toward the enemy 1320.

FIG. 11 shows an example with the same voice command type "flame magic" as in FIG. 10, but where the selected voice command has a higher voice complexity. In FIG. 11, the fireball 1340 is larger than the fireball 1330 shown in FIG. 10. That is, because the voice complexity is higher, a flashier effect is displayed.

This concludes the description of the game control device 100.

All or part of the game control device in this embodiment may be installed in a computer composed of a CPU (Central Processing Unit) and storage devices such as RAM (Random Access Memory) and an HDD (Hard Disk Drive), and implemented as a computer program.

In the above description, the content of the command executed by the voice command execution unit 126 is changed according to the voice complexity of the selected voice command. In addition, the similarity calculated by the voice command selection unit 125 may also be used to change the power or appearance.
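The variation that also uses the similarity could be sketched as a single multiplier. This is only one possible combination rule: the function name `effect_scale`, the product form, the `gain` default, and the assumption that the similarity lies in the range 0 to 1 are all hypothetical and are not stated in the source.

```python
def effect_scale(complexity: float, similarity: float, gain: float = 0.25) -> float:
    """Combine voice complexity and match similarity into one multiplier.

    Assumes similarity lies in [0, 1]; values outside that range are clamped.
    """
    similarity = min(max(similarity, 0.0), 1.0)
    # Complexity raises the multiplier above 1, as in the complexity-only case.
    complexity_factor = 1.0 + gain * max(complexity, 0.0)
    # A less faithful utterance (lower similarity) weakens the effect.
    return complexity_factor * similarity


print(effect_scale(complexity=4.0, similarity=1.0))  # 2.0
print(effect_scale(complexity=4.0, similarity=0.5))  # 1.0
```

Under this rule, a complex command spoken imprecisely can end up no stronger than a simple command spoken accurately, which rewards both elaborate commands and careful reproduction.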

Description of symbols
100 Game control device
101 Input unit
102 Display unit
103 Microphone
110 Storage unit
111 Voice data storage unit
112 Voice analysis data storage unit
113 Voice command storage unit
120 Control unit
121 Voice signal conversion unit
122 Voice analysis unit
123 Voice complexity calculation unit
124 Voice command creation unit
125 Voice command selection unit
126 Voice command execution unit
1000-1020 Voiced sections
1100 Threshold
1200 Execution command list
1210 Record button
1220 Save button
1230 Cancel button
1300 Player character
1310 Microphone button
1320 Enemy
1330, 1340 Fireballs

Claims

A game control device comprising: a voice analysis unit that calculates voice analysis data, which is data on the acoustic characteristics of a voice signal; a voice complexity calculation unit that calculates, from the voice analysis data, a voice complexity, which is an index indicating the complexity of the voice signal; a voice command storage unit that stores a voice command combining the voice signal, the voice complexity of the voice signal, and an execution command indicating the execution process corresponding to the voice signal; a voice command selection unit that takes a voice signal input from the outside as a first voice signal, acquires the voice analysis data of the first voice signal, compares the voice analysis data of the voice signal corresponding to each stored voice command with the voice analysis data of the first voice signal, and selects the voice command for which the highest similarity is calculated; and a voice command execution unit that executes the processing of the execution command of the selected voice command and changes the processing content of the execution command based on the voice complexity of the selected voice command.

The game control device according to claim 1, comprising a voice complexity calculation unit that calculates, from the voice analysis data, the voice complexity, which is an index indicating the complexity of the voice signal, and that detects a monotonous repetitive pattern in the voice analysis data and reduces the calculated voice complexity accordingly.

The game control device according to claim 1, wherein the voice command execution unit executes the processing of the execution command of the selected voice command and changes the processing content of the execution command based on the voice complexity of the selected voice command and the similarity.

A game control method comprising: a step of calculating voice analysis data, which is data on the acoustic characteristics of a voice signal; a step of calculating, from the voice analysis data, a voice complexity, which is an index indicating the complexity of the voice signal; a step of storing a voice command combining the voice signal, the voice complexity of the voice signal, and an execution command indicating the execution process corresponding to the voice signal; a step of taking a voice signal input from the outside as a first voice signal, acquiring the voice analysis data of the first voice signal, comparing the voice analysis data of the voice signal corresponding to each stored voice command with the voice analysis data of the first voice signal, and selecting the voice command for which the highest similarity is calculated; and a step of executing the processing of the execution command of the selected voice command and changing the processing content of the execution command based on the voice complexity of the selected voice command.

A game control program that causes a computer to execute: a step of calculating voice analysis data, which is data on the acoustic characteristics of a voice signal; a step of calculating, from the voice analysis data, a voice complexity, which is an index indicating the complexity of the voice signal; a step of storing a voice command combining the voice signal, the voice complexity of the voice signal, and an execution command indicating the execution process corresponding to the voice signal; a step of taking a voice signal input from the outside as a first voice signal, acquiring the voice analysis data of the first voice signal, comparing the voice analysis data of the voice signal corresponding to each stored voice command with the voice analysis data of the first voice signal, and selecting the voice command for which the highest similarity is calculated; and a step of executing the processing of the execution command of the selected voice command and changing the processing content of the execution command based on the voice complexity of the selected voice command.