WO2003015884A1 - Massively online game comprising a voice modulation and compression system - Google Patents

Massively online game comprising a voice modulation and compression system

Info

Publication number
WO2003015884A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
player
game
modulation
message
Prior art date
Application number
PCT/CH2002/000436
Other languages
French (fr)
Inventor
Olivier Morgan
Original Assignee
Komodo Entertainment Software Sa
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Komodo Entertainment Software Sa
Publication of WO2003015884A1

Classifications

    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/85 Providing additional services to players
    • A63F13/87 Communicating with other players during game play, e.g. by e-mail or chat
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/20 Input arrangements for video game devices
    • A63F13/21 Input arrangements for video game devices characterised by their sensors, purposes or types
    • A63F13/215 Input arrangements for video game devices characterised by their sensors, purposes or types comprising means for detecting acoustic signals, e.g. using a microphone
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/30 Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers
    • A63F13/33 Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers using wide area network [WAN] connections
    • A63F13/335 Interconnection arrangements between game servers and game devices; Interconnection arrangements between game devices; Interconnection arrangements between game servers using wide area network [WAN] connections using Internet
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/033 Voice editing, e.g. manipulating the voice of the synthesiser
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/30 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by output arrangements for receiving control signals generated by the game device
    • A63F2300/302 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game characterized by output arrangements for receiving control signals generated by the game device specially adapted for receiving control signals not targeted to a display device or game input means, e.g. vibrating driver's seat, scent dispenser
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/60 Methods for processing data by generating or executing the game program
    • A63F2300/6063 Methods for processing data by generating or executing the game program for sound processing

Definitions

  • The present invention relates to Massively Online Games.
  • To understand the context, it is important to understand what the name "Massively Online Games" means (for the purposes of this document, the expression "massively online games" is henceforth replaced by its English abbreviation MOG: Massively Online Game).
  • MOGs are, as their name suggests, computer games in which a large number of players connect to a single server to play together in a virtual environment defined by a computer program. The growing success of this kind of game can be explained by the fact that, for the first time, the player interacts directly with other players instead of being confronted with the limited artificial intelligence of a computer program. It is therefore easy to understand that the key element of this kind of game is communication between the players.
  • The first known MOG is "Ultima Online", produced by Origin and published by Electronic Arts. Following its success, more recent titles have appeared, such as "EverQuest", produced by Verant Interactive and published by Sony Online Entertainment, and "Anarchy Online", produced and published by Funcom. All these MOGs were designed to work on a single platform: the PC. They therefore naturally opted for "chat", an inter-player communication system that has already proven itself. This communication system is based on the exchange of text over the Internet. A user types a text on the keyboard of his PC and sends it either to one other user or to several other users connected to the Internet. These companies chose this solution because it is easy to implement and reliable. However, this system does not convey the voice; it conveys only the content of the message in text form.
  • The object of the present invention is to provide an improved communication system for MOGs.
  • Instead of communicating with text, players are offered a system that allows them to send a voice message to other players.
  • For this system to work, the player needs a system to capture the sound of his voice as well as another system to emit the sound coming from the game.
  • The player utters a sound, a word or a sentence into his sound input system.
  • The amount of information that can be captured is limited to a duration of the order of ten seconds.
  • The player is warned when he reaches the maximum recording capacity. Once the message has been captured, the system processes the information.
  • The first step of the invention consists in isolating the player's voice in the audio signal (Fig. 1).
  • The sound of the game comes either from loudspeakers 10 or from headphones 11. If the player wears headphones, the system will recognize the absence of the sounds produced by the game. If he listens to the sounds through loudspeakers, the system will recognize the presence of the sounds that it emitted a few hundredths of a second earlier and subtract them from the input signal. In both cases, a signal is obtained that contains only the player's voice.
  • The second step is the compression of the voice to a size of less than 4 bps. This compression can be done in several ways:
  • A first system detects each phoneme in the message (note: the system treats blanks or pauses as a particular phoneme, which is also recognized and to which the following parameters also apply); identifies it with the most similar known phoneme, using a dictionary of phonemes 110 and simple grammar rules 112 applied as a function of the previously recognized phonemes 111 (a specific dictionary and specific grammar rules for each language); records the duration of the phoneme 113 (the position of the phoneme in the signal also gives its duration: t_end - t_start), as well as the intonation component of the voice in the time interval defined by the phoneme 120.
  • The audio signal can then be transcribed into a chain of symbols, each comprising an indication of duration and an indication of intonation 130. This chain of symbols is much smaller than the original message carried by the voice while retaining its characteristic features.
  • The third step is the synthesis of the voice.
  • As for compression, several synthesis systems can be applied:
  • The fourth step is the modulation of the voice.
  • The synthetic voice as such does not yet correspond to the character embodied by the player in the game.
  • For each character a modulation range is available, making it possible to give the character, for example, the higher voice of a woman or the lower voice of a man.
  • The voice is synthesized with a default modulation value chosen within the modulation range authorized by the game for each character.
  • The player can then listen to his message and modify the modulation within the authorized range until he is satisfied.
  • This operation is an initialization of the voice components of his character, and the player will not have to return to it each time he sends a message. Once the player has chosen the modulation, it is recorded by the program and reused at each message synthesis.
  • The steps described above constitute the processing chain necessary to transform and transport the voice of a player to the other connected players (Fig. 4). However, once the player has recorded his message, he is not obliged to send it immediately. He can, as seen in the processing chain, listen to it to make sure that the modulation suits him and that the content will be understandable by other players 310. The content of his message is therefore stored 311 until the player decides to send it 312.
  • This invention applies to massively online games.
  • The best way to illustrate the application of this invention is to describe a phase of one of these games.
  • A player X is in front of his game system (PC, console, set-top box or other). He does not wear headphones and therefore hears the sounds emitted by the game through a loudspeaker system.
  • On his screen, the computer program displays the setting in which he must move, as well as the incarnation of the other players in the form of one or more characters (a single character or a small team of characters for role-playing games, an entire army for real-time strategy games, etc.).
  • Player X can then communicate with the other players either by moving his character in the virtual world (nodding, waving, etc.) or by speaking into a microphone.
  • Fig. 1: this drawing represents the sound environment in which the player finds himself at the moment when he records sounds, words or sentences intended for other players.
  • In case 10 the player listens to the game sounds through loudspeakers; his messages will therefore be recorded with the ambient noise of the game before being processed.
  • In case 11 the player listens to the game sounds through his headphones; his messages will therefore be recorded without the ambient noise of the game.
  • Fig. 2: this drawing represents one of the possible means of compressing the voice.
  • In this case the compression works by transforming the voice into a chain of symbols.
  • Each symbol is composed of a phoneme detected in the original voice message (by comparison with a dictionary of phonemes 110 and the application of simple grammar rules 112 establishing, among other things, the possible successions of detected phonemes), to which are added an intonation value for the phoneme 120 and its duration 113.
  • Fig. 3: this drawing represents one of the possible means of synthesizing the voice following the compression carried out in Fig. 2.
  • The signal coded as a chain of symbols is divided into its components of phoneme 210, duration 212 and intonation 213. Using a library of sounds 211, comprising in particular one sound for each phoneme, the system synthesizes an artificial voice incorporating the duration of each phoneme and its intonation.
  • Fig. 4: this drawing represents an overview of the invention.
  • A player records his voice message, listens to it and adjusts the modulation within a range predefined by the game 310.
  • The compressed message is stored 311 pending the player's decision to send the message.
  • the message is transported 312 by means to the other players authorized to hear the message (authorization defined by the computer program of the game according to certain parameters: proximity or other).
  • the message is synthesized, then modulated with the modulation component chosen by the sending player.

Abstract

The invention concerns a massively online game incorporating a voice compression and modulation system for enhancing the player's sensations when he is immersed in said virtual environment.

Description

Title
MASSIVELY ONLINE GAMES INCLUDING A VOICE MODULATION AND COMPRESSION SYSTEM
Description
1. Field of the invention
The present invention relates to Massively Online Games. To understand the context, it is important to understand what the name "Massively Online Games" means (for the purposes of this document, the expression "massively online games" is henceforth replaced by its English abbreviation MOG: Massively Online Game). MOGs are, as their name suggests, computer games in which a large number of players connect to a single server to play together in a virtual environment defined by a computer program. The growing success of this kind of game can be explained by the fact that, for the first time, the player interacts directly with other players instead of being confronted with the limited artificial intelligence of a computer program. It is therefore easy to understand that the key element of this kind of game is communication between the players.
2. Discussion of the context
The first known MOG is "Ultima Online", produced by Origin and published by Electronic Arts. Following its success, more recent titles have appeared, such as "EverQuest", produced by Verant Interactive and published by Sony Online Entertainment, and "Anarchy Online", produced and published by Funcom. All these MOGs were designed to work on a single platform: the PC. They therefore naturally opted for "chat", an inter-player communication system that has already proven itself. This communication system is based on the exchange of text over the Internet. A user types a text on the keyboard of his PC and sends it either to one other user or to several other users connected to the Internet. These companies chose this solution because it is easy to implement and reliable. However, this system does not convey the voice; it conveys only the content of the message in text form.
3. Summary of the invention
The object of the present invention is to provide an improved communication system for MOGs. Instead of communicating with text, players are offered a system that allows them to send a voice message to other players. For this system to work, the player needs a system to capture the sound of his voice as well as another system to emit the sound coming from the game. The player utters a sound, a word or a sentence into his sound input system. The amount of information that can be captured is limited to a duration of the order of ten seconds. The player is warned when he reaches the maximum recording capacity. Once the message has been captured, the system processes the information.
The first step of the invention consists in isolating the player's voice in the audio signal (Fig. 1). There are two possible cases: the sound of the game comes either from loudspeakers 10 or from headphones 11. If the player wears headphones, the system will recognize the absence of the sounds produced by the game. If he listens to the sounds through loudspeakers, the system will recognize the presence of the sounds that it emitted a few hundredths of a second earlier and subtract them from the input signal. In both cases, a signal is obtained that contains only the player's voice.
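The loudspeaker case of this first step can be sketched as a simple delayed subtraction. This is only an illustration under stated assumptions: the function name `isolate_voice`, the fixed `delay` (in samples) and the single `gain` factor are hypothetical, and the text does not say how the system estimates them. A real acoustic path would need an adaptive filter rather than one fixed delay and gain.

```python
def isolate_voice(mic, game_out, delay, gain=1.0):
    """Subtract the delayed game sound from the microphone signal.

    mic      -- samples captured by the microphone (voice + game sound)
    game_out -- samples the game sent to the loudspeakers
    delay    -- assumed loudspeaker-to-microphone delay, in samples
    gain     -- assumed level of the game sound at the microphone
                (0.0 models the headphone case: nothing to subtract)
    """
    voice = []
    for n, sample in enumerate(mic):
        k = n - delay  # index of the game sample heard at time n
        echo = gain * game_out[k] if 0 <= k < len(game_out) else 0.0
        voice.append(sample - echo)
    return voice


# Toy signals: the microphone hears the voice plus the game sound,
# delayed by 2 samples and attenuated by half.
voice_in = [0.5, -0.5, 0.25, 0.0, 0.0]
game = [1.0, 2.0, 3.0, 4.0, 5.0]
mic_in = [v + (0.5 * game[n - 2] if n >= 2 else 0.0)
          for n, v in enumerate(voice_in)]
recovered = isolate_voice(mic_in, game, delay=2, gain=0.5)
```

With headphones (case 11), calling the same function with `gain=0.0` returns the microphone signal unchanged.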
The second step is the compression of the voice to a size of less than 4 bps. This compression can be done in several ways:
- Direct voice compression system: such as MPEG, WAVE, etc.
- Detection of the phonemes contained in the voice message (Fig. 2): a first system detects each phoneme in the message (note: the system treats blanks or pauses as a particular phoneme, which is also recognized and to which the following parameters also apply); identifies it with the most similar known phoneme, using a dictionary of phonemes 110 and simple grammar rules 112 applied as a function of the previously recognized phonemes 111 (a specific dictionary and specific grammar rules for each language); records the duration of the phoneme 113 (the position of the phoneme in the signal also gives its duration: t_end - t_start), as well as the intonation component of the voice in the time interval defined by the phoneme 120. The audio signal can then be transcribed into a chain of symbols, each comprising an indication of duration and an indication of intonation 130. This chain of symbols is much smaller than the original message carried by the voice while retaining its characteristic features.
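A minimal sketch of such a chain of symbols, assuming the phonemes have already been detected. Everything concrete here is hypothetical and not taken from the patent: the tiny `PHONEMES` dictionary, the three-byte layout (phoneme index, duration in 10 ms units, signed intonation value) and the names `Symbol`, `encode` and `decode`. The pause is stored as an ordinary phoneme (`"_"`), as the text describes.

```python
import struct
from dataclasses import dataclass

# Hypothetical phoneme dictionary; "_" stands for a blank or pause.
PHONEMES = ["_", "a", "p", "r", "e", "l", "y", "i"]


@dataclass(frozen=True)
class Symbol:
    phoneme: str
    t_start: float   # position of the phoneme in the signal, seconds
    t_end: float
    intonation: int  # assumed signed pitch value, -128..127

    @property
    def duration(self):
        # Duration follows from the position: t_end - t_start.
        return self.t_end - self.t_start


def encode(symbols):
    """Pack each symbol into 3 bytes: index, duration (10 ms units), intonation."""
    out = bytearray()
    for s in symbols:
        ticks = min(255, round(s.duration * 100))
        out += struct.pack("<BBb", PHONEMES.index(s.phoneme), ticks, s.intonation)
    return bytes(out)


def decode(data):
    """Return (phoneme, duration in seconds, intonation) tuples."""
    return [(PHONEMES[p], ticks / 100, tone)
            for p, ticks, tone in struct.iter_unpack("<BBb", data)]


chain = [Symbol("a", 0.0, 0.12, 5), Symbol("_", 0.12, 0.30, 0)]
data = encode(chain)
decoded = decode(data)
```

In this layout each phoneme costs three bytes regardless of how long it lasts, which is the sense in which the chain is far smaller than the sampled voice it stands for.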
The third step is the synthesis of the voice. As for compression, several synthesis systems can be applied:
- Direct voice synthesis: if a direct voice compression system has been used, synthesis is provided by the decoder of that compression system, whether MPEG, WAVE or any other system.
- Synthesis of the voice from a chain of phonemes (Fig. 3): this system works only if the voice has been compressed in the form of a chain of phonemes with the corresponding duration and intonation information, as described above. The system produces a sound for each phoneme 210 using a library of sounds 211 (a specific library for each language). The duration of the sound 212 and its intonation 213 are defined by the factors that accompany each phoneme. The chain of symbols is thus transcribed back into an understandable message carrying the emotional characteristics of the original message.
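The phoneme-chain synthesis can be illustrated with a toy sketch. The assumptions are deliberately crude and are not the patent's method: the hypothetical `SOUND_LIBRARY` holds a single base frequency per phoneme instead of a recorded sound, intonation is interpreted as a semitone offset, and the sample rate is arbitrary. The point is only to show how duration and intonation shape the sound produced for each symbol.

```python
import math

SAMPLE_RATE = 8000  # assumed output rate, samples per second

# Hypothetical library of sounds: one base frequency (Hz) per phoneme.
# 0.0 marks the pause phoneme, which is rendered as silence.
SOUND_LIBRARY = {"_": 0.0, "a": 220.0, "i": 330.0}


def synthesize(chain):
    """Turn (phoneme, duration_s, intonation) tuples into audio samples."""
    samples = []
    for phoneme, duration, intonation in chain:
        base = SOUND_LIBRARY[phoneme]
        # Intonation treated as a semitone shift of the base sound (assumption).
        freq = base * 2 ** (intonation / 12)
        count = round(duration * SAMPLE_RATE)
        for n in range(count):
            if base:
                samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
            else:
                samples.append(0.0)
    return samples


audio = synthesize([("a", 0.5, 0), ("_", 0.25, 0), ("i", 0.25, 2)])
```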
The fourth step is the modulation of the voice. The synthetic voice as such does not yet correspond to the character embodied by the player in the game. For each character a modulation range is available, making it possible to give the character, for example, the higher voice of a woman or the lower voice of a man. The voice is synthesized with a default modulation value chosen within the modulation range authorized by the game for each character. The player can then listen to his message and modify the modulation within the authorized range until he is satisfied. This operation is an initialization of the voice components of his character, and the player will not have to return to it each time he sends a message. Once the player has chosen the modulation, it is recorded by the program and reused at each message synthesis.
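The rules for the modulation value (a per-character range, a default, player adjustment within the range, and reuse of the saved choice) can be sketched as follows. The class name `VoiceProfile`, its fields and the numeric values are hypothetical; the text does not define what unit the modulation value has.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceProfile:
    low: float                      # lower bound authorized by the game
    high: float                     # upper bound authorized by the game
    default: float                  # value used until the player chooses
    chosen: Optional[float] = None  # saved once, reused for every message

    def set_modulation(self, value):
        """Save the player's choice, clamped to the authorized range."""
        self.chosen = max(self.low, min(self.high, value))
        return self.chosen

    def current(self):
        """Modulation applied at each message synthesis."""
        return self.default if self.chosen is None else self.chosen


profile = VoiceProfile(low=1.2, high=1.8, default=1.5)
before = profile.current()             # default value
clamped = profile.set_modulation(2.5)  # request outside the range
after = profile.current()              # saved value, reused later
```

Clamping is one possible way to enforce the authorized range; a game could equally reject out-of-range requests.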
The steps described above constitute the processing chain necessary to transform and transport the voice of a player to the other connected players (Fig. 4). However, once the player has recorded his message, he is not obliged to send it immediately. He can, as seen in the processing chain, listen to it to make sure that the modulation suits him and that the content will be understandable by other players 310. The content of his message is therefore stored 311 until the player decides to send it 312.
4. Description of the invention
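The listen, store and send sequence (310, 311, 312) amounts to holding the compressed message until the player confirms. A minimal sketch, assuming a hypothetical `Outbox` class and a `transport` callable standing in for whatever network layer the game uses:

```python
class Outbox:
    """Holds one compressed message until the player decides to send it."""

    def __init__(self, transport):
        self.transport = transport  # hypothetical callable that delivers bytes
        self.pending = None

    def store(self, compressed):
        # Step 311: keep the compressed message; nothing is sent yet.
        self.pending = compressed

    def preview(self):
        # Step 310: the player can listen to the stored message first.
        return self.pending

    def send(self):
        # Step 312: transmit only on the player's explicit decision.
        if self.pending is None:
            raise RuntimeError("no message recorded")
        self.transport(self.pending)
        self.pending = None


delivered = []
box = Outbox(delivered.append)
box.store(b"\x01\x0c\x05")
heard = box.preview()
sent_before = list(delivered)  # still empty: storing does not send
box.send()
```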
As previously indicated, this invention applies to massively online games. The best way to illustrate the application of this invention is to describe a phase of one of these games. A player X is in front of his game system (PC, console, set-top box or other). He does not wear headphones and therefore hears the sounds emitted by the game through a loudspeaker system. On his screen, the computer program displays the setting in which he must move, as well as the incarnation of the other players in the form of one or more characters (a single character or a small team of characters for role-playing games, an entire army for real-time strategy games, etc.). Player X can then communicate with the other players either by moving his character in the virtual world (nodding, waving, etc.) or by speaking into a microphone.
For the purposes of this example, let us say that this is a massively online role-playing game. Player X has chosen to embody a cheerful-looking woman in her sixties who dispenses her sayings to anyone willing to listen. In the virtual world he is in a park where many birds sing in the rain. Player X sees another character on his screen. He approaches it in the virtual world, indicates to the system that he is about to start a recording, for example by pressing a button, and utters into his microphone one of her favorite sayings: "after the rain always comes good weather". While he records his sentence, an indicator displayed on the screen shows the maximum duration of the message, for example 10 seconds, as well as the time used, in this case 3 seconds. His saying is recorded by the system, but it is mixed with the various sounds emitted by the game (falling rain, birds singing, footsteps, etc.). His sentence is processed to keep only the sound of his voice, then compressed and stored pending a decision by the player. Player X wants to give the voice of his sixty-year-old character a personal touch, so he asks the system to play his message through his loudspeakers. The message is automatically modulated with a default value that transforms his voice into that of an elderly woman. At each listening he adjusts the modulation within the range allowed by the game until he is satisfied. Once the modulation is determined, it is saved and the player no longer has to adjust it. He then decides to send his saying to the other character. That character, as well as all the characters that the computer program authorizes to hear it (characters nearby or meeting other conditions), will receive the saying and hear it through their loudspeakers, added to the ambient sounds.
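The choice of recipients in this example can be sketched with proximity as the only condition. The function name `authorized_listeners`, the 2D coordinates and the `hearing_radius` parameter are hypothetical; the text leaves the authorization rule to the game program and mentions proximity only as one possible condition.

```python
import math


def authorized_listeners(sender, players, hearing_radius):
    """Return the names of the players close enough to hear the sender.

    players -- mapping of player name to (x, y) position in the virtual world
    """
    sx, sy = players[sender]
    return sorted(
        name
        for name, (x, y) in players.items()
        if name != sender and math.hypot(x - sx, y - sy) <= hearing_radius
    )


positions = {"X": (0, 0), "Y": (3, 4), "Z": (30, 40)}
listeners = authorized_listeners("X", positions, hearing_radius=10)
```

Here player Y, five units away, receives the saying, while player Z, fifty units away, does not.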
5. List of drawings
Fig. 1: this drawing shows the sound environment of the player at the moment he records sounds, words or sentences intended for other players. In case 10, the player listens to the game sounds on loudspeakers, so his messages are recorded together with the game's ambient noise before being processed. In case 11, the player listens to the game sounds through a headset, so his messages are recorded without the game's ambient noise.
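The application does not say how the game sounds are removed in case 10. One simple possibility, shown here only as an assumption, relies on the fact that the game knows the signal it is playing: estimate how loudly that signal reaches the microphone and subtract it. The function name `remove_game_audio` and the single-gain model are hypothetical; a practical system would also have to handle delay and room acoustics, typically with an adaptive echo canceller.

```python
from typing import List


def remove_game_audio(mic: List[float], game: List[float]) -> List[float]:
    """Subtract the known game signal from the microphone recording.

    Assumes both signals are time-aligned and that the game sound reaches
    the microphone scaled by one unknown gain (a strong simplification).
    """
    energy = sum(g * g for g in game)
    if energy == 0.0:
        return list(mic)  # the game was silent, nothing to remove
    # Least-squares estimate of the gain at which the game leaks into the mic.
    gain = sum(m * g for m, g in zip(mic, game)) / energy
    return [m - gain * g for m, g in zip(mic, game)]


# Toy signals: the voice is uncorrelated with the game sound in this example.
voice = [1.0, -1.0, 1.0, -1.0]
game_sound = [1.0, 1.0, -1.0, -1.0]
microphone = [v + 0.5 * g for v, g in zip(voice, game_sound)]

cleaned = remove_game_audio(microphone, game_sound)
print(cleaned)
```

In case 11 (headset) no such step is needed, since the microphone does not pick up the game sounds.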
Fig. 2: this drawing shows one possible means of compressing the voice. Here, compression is achieved by transforming the voice into a string of symbols. Each symbol consists of a phoneme detected in the original voice message (by comparison against a phoneme dictionary 110 and by applying simple grammar rules 112 that establish, among other things, which phonemes may follow one another), to which are added the phoneme's intonation data 120 and its duration 113.
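To make the symbol string concrete, the sketch below packs each detected phoneme with its intonation and duration into three bytes. It is an illustration under stated assumptions, not the encoding of the application: the toy `PHONEME_DICTIONARY`, the `Symbol` type, the 0..255 intonation scale and the 10 ms duration step are all hypothetical, and phoneme detection and the grammar rules are taken as already applied.

```python
import struct
from typing import List, NamedTuple

# Toy phoneme dictionary (hypothetical); a real one would cover a whole language.
PHONEME_DICTIONARY = ["a", "p", "r", "e", "l", "y", "i"]


class Symbol(NamedTuple):
    phoneme: str       # entry of PHONEME_DICTIONARY
    intonation: int    # pitch level, 0..255 (assumed scale)
    duration_ms: int   # duration in milliseconds


def encode_symbols(symbols: List[Symbol]) -> bytes:
    """Pack each symbol into 3 bytes: phoneme index, intonation, duration / 10 ms."""
    out = bytearray()
    for s in symbols:
        index = PHONEME_DICTIONARY.index(s.phoneme)
        duration_units = min(255, s.duration_ms // 10)  # 10 ms steps, capped at 2.55 s
        out += struct.pack("BBB", index, s.intonation, duration_units)
    return bytes(out)


def decode_symbols(data: bytes) -> List[Symbol]:
    """Inverse of encode_symbols (durations come back rounded down to 10 ms)."""
    return [
        Symbol(PHONEME_DICTIONARY[index], intonation, units * 10)
        for index, intonation, units in struct.iter_unpack("BBB", data)
    ]


message = [Symbol("a", 120, 90), Symbol("p", 100, 40),
           Symbol("r", 110, 60), Symbol("e", 130, 110)]
packed = encode_symbols(message)
print(len(packed), "bytes for", len(message), "phonemes")
```

For scale, three seconds of raw audio at an assumed 8 kHz, 16-bit mono occupy 3 × 8000 × 2 = 48,000 bytes, while a symbol string of this kind costs three bytes per phoneme.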
Fig. 3: this drawing shows one possible means of synthesizing the voice after the compression performed in Fig. 2. The signal, encoded as a string of symbols, is split into its phoneme 210, duration 212 and intonation 213 components. Using a sound library 211 that contains, in particular, one sound for each phoneme, the system synthesizes an artificial voice that incorporates the duration and intonation of each phoneme.
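The synthesis step can be sketched in the same spirit. In this illustration each entry of the sound library is reduced to a base pitch and rendered as a sine tone, which is only a stand-in for a recorded phoneme sound; `SOUND_LIBRARY`, `SAMPLE_RATE`, the treatment of intonation as a pitch multiplier and the function names are assumptions, not details from the application.

```python
import math
from typing import List, Tuple

SAMPLE_RATE = 8000  # samples per second (assumed)

# Hypothetical sound library: one base pitch in Hz per phoneme, standing in
# for the recorded sound that a real library would hold.
SOUND_LIBRARY = {"a": 220.0, "p": 110.0, "r": 150.0}


def tone(freq_hz: float, n_samples: int) -> List[float]:
    # Sine wave of the requested pitch and length.
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(n_samples)]


def synthesize(symbols: List[Tuple[str, int, float]]) -> List[float]:
    """Build a waveform from (phoneme, duration_ms, intonation) triples.

    Intonation is treated here as a multiplier on the phoneme's base pitch.
    """
    samples: List[float] = []
    for phoneme, duration_ms, intonation in symbols:
        n_samples = SAMPLE_RATE * duration_ms // 1000
        samples.extend(tone(SOUND_LIBRARY[phoneme] * intonation, n_samples))
    return samples


waveform = synthesize([("a", 100, 1.0), ("p", 50, 1.2)])
print(len(waveform), "samples")
```

Each phoneme contributes exactly as many samples as its duration requires, so the length of the synthetic message follows the timing of the original speech.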
Fig. 4: this drawing shows an overview of the invention. On one side, a player records his voice message, listens to it and adjusts the modulation within a range predefined by the game 310. The compressed message is stored 311 pending the player's decision to send it. The message is transported 312 to the other players authorized to hear it (the authorization is defined by the game's computer program according to certain parameters, such as proximity). The message is then synthesized and modulated with the modulation component chosen by the sending player.
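The transport step of this overview can be illustrated with a proximity rule, which the text names as one possible authorization parameter. The sketch is hypothetical throughout: `Character`, `VoiceMessage`, `authorized_listeners`, `deliver` and the hearing radius are invented for the example. The compressed symbols travel together with the sender's modulation factor, so that each receiver can synthesize and then modulate the voice.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Character:
    name: str
    x: float
    y: float


@dataclass(frozen=True)
class VoiceMessage:
    sender: str
    symbols: bytes      # compressed voice (string of symbols)
    modulation: float   # modulation chosen by the sending player


def authorized_listeners(sender: Character, others: List[Character],
                         hearing_radius: float = 20.0) -> List[Character]:
    # Proximity rule (assumed): a character hears the message if it is close enough.
    return [
        c for c in others
        if c.name != sender.name
        and math.hypot(c.x - sender.x, c.y - sender.y) <= hearing_radius
    ]


def deliver(message: VoiceMessage, sender: Character,
            others: List[Character]) -> Dict[str, VoiceMessage]:
    # Each authorized listener receives the compressed message and the modulation.
    return {c.name: message for c in authorized_listeners(sender, others)}


player_x = Character("player_x", 0.0, 0.0)
world = [player_x, Character("near", 3.0, 4.0), Character("far", 30.0, 40.0)]
inbox = deliver(VoiceMessage("player_x", b"\x00\x78\x09", 1.2), player_x, world)
print(sorted(inbox))
```

Other conditions mentioned in the text could be added as further predicates in `authorized_listeners` without changing the message format.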

Claims

1. A massively online game incorporating the following features:
- means for separating the player's voice from the sound emitted by the game;
- means for compressing the player's voice;
- means allowing the player to choose the modulation of his character's voice within a range determined by the game's computer program;
- storage and transport of the compressed information;
- means for synthesizing the voice;
- modulation of the synthetic voice according to the modulation factor corresponding to the character and/or to the player's choice.
PCT/CH2002/000436 2001-08-13 2002-08-12 Massively online game comprising a voice modulation and compression system WO2003015884A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CH1643/01 2001-08-13
CH16432001 2001-08-13

Publications (1)

Publication Number Publication Date
WO2003015884A1 true WO2003015884A1 (en) 2003-02-27

Family

ID=4565744

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CH2002/000436 WO2003015884A1 (en) 2001-08-13 2002-08-12 Massively online game comprising a voice modulation and compression system

Country Status (1)

Country Link
WO (1) WO2003015884A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2031584A1 (en) 2007-08-31 2009-03-04 Alcatel Lucent A voice synthesis method and interpersonal communication method, particularly for multiplayer online games
WO2011063642A1 (en) * 2009-11-27 2011-06-03 北京中星微电子有限公司 Audio data processing method and audio data processing system
US8892228B2 (en) 2008-06-10 2014-11-18 Dolby Laboratories Licensing Corporation Concealing audio artifacts

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0843168A2 (en) * 1996-11-19 1998-05-20 Sony Corporation An information processing apparatus, an information processing method, and a medium for use in a three-dimensional virtual reality space sharing system
WO2002039424A1 (en) * 2000-11-09 2002-05-16 Nokia Corporation Voice avatars for wireless multiuser entertainment services

Similar Documents

Publication Publication Date Title
CA2602633C (en) Device for communication for persons with speech and/or hearing handicap
CN104778945B (en) The system and method for responding to natural language speech utterance
US20030028380A1 (en) Speech system
Neumark Doing things with voices: Performativity and voice
US20090254826A1 (en) Portable Communications Device
CN109346076A (en) Interactive voice, method of speech processing, device and system
CN108228132A (en) Promote the establishment and playback of audio that user records
EP1277200A1 (en) Speech system
WO2009071795A1 (en) Automatic simultaneous interpretation system
US20220231873A1 (en) System for facilitating comprehensive multilingual virtual or real-time meeting with real-time translation
CN112512649A (en) Techniques for providing audio and video effects
EP3434022A1 (en) Method and device for controlling the setting of at least one audio and/or video parameter, corresponding terminal and computer program
CN112492400B (en) Interaction method, device, equipment, communication method and shooting method
Obadike Low Fidelity: Stereotyped Blackness in the Field of Sound
WO2003015884A1 (en) Massively online game comprising a voice modulation and compression system
CN110992984A (en) Audio processing method and device and storage medium
JPWO2019026395A1 (en) Information processing apparatus, information processing method, and program
WO2022041177A1 (en) Communication message processing method, device, and instant messaging client
JP2021071632A (en) Information processing device, information processing method, and, program
Devine Imperfect sound forever: loudness, listening, formations, and the historiography of sound reproduction
Batcho Revisiting the Howard Dean scream: sound exclusivity in broadcast news
JP2002006900A (en) Method and system for reducing and reproducing voice
Graham Ambient ageism: Exploring ageism in acoustic representations of older adults in AgeTech advertisements
WO2024001462A1 (en) Song playback method and apparatus, and computer device and computer-readable storage medium
Okkema Harvester of Desires: Gaming Amazon Echo through John Cayley's The Listeners.

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BY BZ CA CH CN CO CR CU CZ DE DM DZ EC EE ES FI GB GD GE GH HR HU ID IL IN IS JP KE KG KP KR LC LK LR LS LT LU LV MA MD MG MN MW MX MZ NO NZ OM PH PL PT RU SD SE SG SI SK SL TJ TM TN TR TZ UA UG US UZ VN YU ZA ZM

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ UG ZM ZW AM AZ BY KG KZ RU TJ TM AT BE BG CH CY CZ DK EE ES FI FR GB GR IE IT LU MC PT SE SK TR BF BJ CF CG CI GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP