CN206711600U - Voice interaction system with an emotion function based on a virtual reality environment - Google Patents


Info

Publication number
CN206711600U
CN206711600U (application CN201720170435.XU)
Authority
CN
China
Prior art keywords
voice
module
speech
unit
external server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201720170435.XU
Other languages
Chinese (zh)
Inventor
黄昌正
林正才
冀鸣
刘晓悦
叶永权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Science And Technology Co Ltd
Original Assignee
Guangzhou Science And Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Science And Technology Co Ltd filed Critical Guangzhou Science And Technology Co Ltd
Priority to CN201720170435.XU priority Critical patent/CN206711600U/en
Application granted granted Critical
Publication of CN206711600U publication Critical patent/CN206711600U/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

The utility model provides a voice interaction system with an emotion function based on a virtual reality environment, comprising a voice mobile terminal, a virtual environment terminal, and an external server. The voice mobile terminal collects and processes the user's speech and determines whether the resulting voice information is a control command or speech exchange information, then sends it to the virtual environment terminal, which performs the corresponding control operation or displays the corresponding emotion and action while playing the speech, so that virtual users can converse with one another in the virtual environment. The system thus no longer depends on a handle, its functions are not limited by the number of buttons, it is simple to operate, and the system is controlled by the user's speech. In addition, the user's mood and action information are extracted from the user's speech, so that in multiplayer games or applications virtual users can communicate with one another and express their own emotions in the virtual environment, genuinely achieving emotional communication and further improving the user's experience in the virtual environment.

Description

Voice interaction system with an emotion function based on a virtual reality environment
Technical field
The utility model relates to the technical field of virtual reality environments, and in particular to a voice interaction system with an emotion function based on a virtual reality environment.
Background technology
Virtual reality (VR) is a high technology that has emerged in recent years. Its principle is to use computer simulation to generate a three-dimensional virtual world and to provide the user with simulated sensory input such as vision, hearing, and touch, so that the user feels immersed in the scene and can observe things in the three-dimensional space in real time and without restriction. Interactive control is one of the important application directions of virtual reality technology and has also been a major source of demand driving its rapid development.
At present, some technology enterprises have launched corresponding virtual reality control devices, for example the Oculus Rift released by Oculus of the United States, the Gear released by Samsung of South Korea, and the HTC Vive released by HTC. However, the control systems of these virtual reality devices still rely on handle-based control.
Chinese patent 201610869534.7, "An interaction handle for virtual reality control", discloses an operation handle for a virtual reality environment. Its implementation is complex, its control is imprecise, and the number of control instructions is limited by the number of buttons. Moreover, such control methods cannot be used by people with impaired hand mobility, and for ordinary users the handle control flow is rather complicated: the function of each button must be understood before the device can be operated.
Chinese patent 201610270381.4, "A multi-user voice exchange method and device based on a virtual reality scene", simply realizes a voice-call function in multiplayer games in a virtual environment; the expressions, moods, and actions of the game characters cannot be seen in games in the virtual environment. The characters in the game merely show a fixed expression and a simple mouth movement while speaking, and there is no emotion in the speech exchange.
Content of the utility model
In view of the shortcomings of the prior art, the utility model provides a voice interaction system with an emotion function based on a virtual reality environment, thereby avoiding the problems of the complex operation caused in virtual environments by dependence on buttons and sensing equipment, and of functions being limited by the number of buttons.
The technical solution of the utility model is a voice interaction system with an emotion function based on a virtual reality environment, characterized in that it comprises a voice mobile terminal, a virtual environment terminal, and an external server; the external server is in communication connection with the voice mobile terminal and the virtual environment terminal respectively, and the voice mobile terminal is in communication connection with the virtual environment terminal;
The voice mobile terminal comprises a processor; a voice acquisition module connected with the processor for collecting the user's voice signal and pre-processing the collected voice signal;
a speech recognition module connected with the processor for converting the pre-processed voice signal into text information;
a speech emotion characteristic parameter extraction module connected with the processor for extracting parameters with emotional characteristics from the text information;
a memory module connected with the processor for storing the speech recognition data, voice control command database, and speech emotion database loaded and updated from the external server;
and a wireless communication module connected with the processor for communicating with the virtual environment terminal, so that the recognized control command or speech exchange information is sent to the virtual environment terminal, and for communicating with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory module.
The voice acquisition module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively, and the memory module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively;
The virtual environment terminal comprises a memory unit for storing the model library of virtual character emotional expressions and actions and the database of intonation and speech rate corresponding to speech emotions, both loaded and updated from the external server;
a voice playing module for playing the words or sentences in the received speech exchange information with the matched intonation and speech rate;
a display module for displaying the emotional expressions and actions of the virtual character corresponding to the speech exchange information;
and a communication module for communicating with the voice mobile terminal and with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory unit.
The voice acquisition module is mainly a microphone.
The speech recognition module comprises a speech feature extraction unit, a speech feature comparison unit, and a comparison result output unit; the speech feature extraction unit is connected with the speech feature comparison unit, and the speech feature comparison unit is connected with the comparison result output unit.
The speech emotion characteristic parameter extraction module comprises an emotional feature extraction unit, an emotional feature comparison unit, and an emotional feature output unit; the emotional feature extraction unit is connected with the emotional feature comparison unit, and the emotional feature comparison unit is connected with the emotional feature output unit.
The voice playing module comprises an intonation matching unit and a voice playing unit; the intonation matching unit is connected with the voice playing unit.
The display module comprises an action matching unit and a display unit; the action matching unit is connected with the display unit.
The voice mobile terminal connects with the virtual environment terminal. After the connection succeeds, the processors of the voice mobile terminal and the virtual environment terminal each send a database version query command to the external server, querying whether the versions of the speech recognition data, voice control command database, and speech emotion database stored in the memory module of the voice mobile terminal, and the versions of the model library of virtual character emotional expressions and actions and the database of intonation and speech rate corresponding to speech emotions stored in the memory unit of the virtual environment terminal, are consistent with those on the external server. If they are inconsistent, the corresponding latest-version data are loaded and updated from the external server into the corresponding memory module or memory unit, so that the data in the memory module and the memory unit are up to date;
The voice acquisition module collects the user's voice signal, performs pre-processing such as filtering and quantization on the collected voice signal, and sends it to the speech recognition module and the speech emotion characteristic parameter extraction module;
The speech recognition module, using the speech recognition data stored in the memory module, converts the voice signal into text information and matches the text information against the command data in the voice control command database to determine whether it is a control command. If it is a control command, the corresponding control command and parameters are generated and output to the virtual environment terminal, which performs the corresponding control operation;
If it is not a control command, it is speech exchange information. The speech emotion characteristic parameter extraction module analyses the waveform of the pre-processed voice signal and extracts the parameters with emotional characteristics; the extracted parameters are matched against the mood data in the speech emotion database to obtain the corresponding emotional characteristics, and the emotional characteristic information is mapped to the corresponding word or sentence and delivered to the virtual environment terminal.
The action matching unit of the virtual environment terminal matches the received emotional characteristics against the model library of virtual character emotional expressions and actions in the memory unit to obtain the emotional expression and action corresponding to the emotional characteristics, and the display unit displays them. The intonation matching unit matches the word or sentence corresponding to the emotional characteristics against the data in the database of intonation and speech rate corresponding to speech emotions, so as to obtain the intonation and speech rate corresponding to that word or sentence, and the voice playing unit plays the corresponding speech exchange information with that intonation and speech rate. The voice playing module and the display module play synchronously, so that virtual users can converse with one another in the virtual environment.
The beneficial effects of the utility model are: the system no longer depends on a handle, its functions are not limited by the number of buttons, it is simple to operate, and the system is controlled by the user's speech. In addition, the user's mood and action information are extracted from the user's speech information and played synchronously by the voice playing module and the display module, so that in multiplayer games or applications virtual users can communicate with one another and express their own emotions in the virtual environment, genuinely achieving emotional communication and further improving the user's experience in the virtual environment.
Brief description of the drawings
Fig. 1 is a system block diagram of the utility model.
Embodiment
Specific embodiments of the utility model are further described below in conjunction with the accompanying drawings:
As shown in Fig. 1, a voice interaction system with an emotion function based on a virtual reality environment is characterized in that it comprises a voice mobile terminal, a virtual environment terminal, and an external server; the external server is in communication connection with the voice mobile terminal and the virtual environment terminal respectively, and the voice mobile terminal is in communication connection with the virtual environment terminal;
The voice mobile terminal comprises:
a voice acquisition module, for collecting the user's voice signal and pre-processing the collected voice signal;
a speech recognition module, for converting the pre-processed voice signal into text information;
a speech emotion characteristic parameter extraction module, for extracting parameters with emotional characteristics from the pre-processed text information;
a memory module, for storing the speech recognition data, voice control command database, and speech emotion database loaded and updated from the external server;
a wireless communication module, for sending the recognized control command or speech exchange information to the virtual environment terminal, and for communicating with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory module;
and a processor, for processing the collected user voice information, or sending update commands to the external server to load and update the voice information stored in the memory module.
The processor is connected with the voice acquisition module, the speech recognition module, the speech emotion characteristic parameter extraction module, the memory module, and the wireless communication module respectively;
The voice acquisition module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively, and the memory module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively;
The virtual environment terminal comprises:
a memory unit, for storing the model library of virtual character emotional expressions and actions and the database of intonation and speech rate corresponding to speech emotions, both loaded and updated from the external server;
a voice playing module, for playing the received speech text information;
a display module, for displaying the emotional expressions and actions of the virtual character's speech expression;
and a communication module, for communicating with the voice mobile terminal and with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory unit.
The voice acquisition module is mainly a microphone.
The speech recognition module comprises a speech feature extraction unit, a speech feature comparison unit, and a comparison result output unit; the speech feature extraction unit is connected with the speech feature comparison unit, and the speech feature comparison unit is connected with the comparison result output unit.
The speech emotion characteristic parameter extraction module comprises an emotional feature extraction unit, an emotional feature comparison unit, and an emotional feature output unit; the emotional feature extraction unit is connected with the emotional feature comparison unit, and the emotional feature comparison unit is connected with the emotional feature output unit.
The voice playing module comprises an intonation matching unit and a voice playing unit; the intonation matching unit is connected with the voice playing unit.
The display module comprises an action matching unit and a display unit; the action matching unit is connected with the display unit.
The voice mobile terminal connects with the virtual environment terminal. After the connection succeeds, the processors of the voice mobile terminal and the virtual environment terminal each send a database version query command to the external server, querying whether the versions of the speech recognition data, voice control command database, and speech emotion database stored in the memory module of the voice mobile terminal, and the versions of the model library of virtual character emotional expressions and actions and the database of intonation and speech rate corresponding to speech emotions stored in the memory unit of the virtual environment terminal, are consistent with those on the external server. If they are inconsistent, the corresponding latest-version data are loaded and updated from the external server into the corresponding memory module or memory unit, so that the data in the memory module and the memory unit are up to date;
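By way of illustration only, the database version check described above can be sketched as a comparison of local and server version numbers. All names below (`sync_databases`, the database keys, the version values) are assumptions for the sketch, not identifiers from the patent:

```python
# Hypothetical local and server version tables for the three databases
# held in the voice mobile terminal's memory module.
LOCAL_VERSIONS = {
    "speech_recognition_data": 3,
    "voice_control_commands": 5,
    "speech_emotion_db": 2,
}

SERVER_VERSIONS = {
    "speech_recognition_data": 3,
    "voice_control_commands": 6,   # server holds a newer command database
    "speech_emotion_db": 2,
}

def sync_databases(local, server):
    """Return the list of stale databases and update them to the server version."""
    stale = [name for name, ver in local.items() if server.get(name, ver) > ver]
    for name in stale:
        local[name] = server[name]  # stand-in for downloading the new data packet
    return stale

updated = sync_databases(LOCAL_VERSIONS, SERVER_VERSIONS)
print(updated)  # ['voice_control_commands']
```

The virtual environment terminal would run the same check against its own memory unit (model library and intonation/speech-rate database).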
The voice acquisition module collects the user's voice signal, performs pre-processing such as filtering and quantization on the collected voice signal, and sends it to the speech recognition module and the speech emotion characteristic parameter extraction module;
The speech recognition module, using the speech recognition data stored in the memory module, converts the voice signal into text information and matches the text information against the command data in the voice control command database to determine whether it is a control command. If it is a control command, the corresponding control command and parameters are generated and output to the virtual environment terminal, which performs the corresponding control operation. The specific control operation may be a system menu operation, such as "menu", "return", "exit", "start", "confirm", or "cancel"; it may also be a human-computer interaction operation, such as an in-game operation like "advance 50 meters", "turn left 60 degrees", or "move right at a speed of 10 meters per second for 30 seconds";
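The command-matching step above can be sketched as a lookup of fixed commands plus pattern matching for parameterized ones. The command table and patterns below are illustrative assumptions; the patent does not specify a database format:

```python
import re

# Hypothetical fixed-command table (recognized text -> command code).
COMMANDS = {
    "menu": ("MENU", None),
    "return": ("RETURN", None),
    "exit": ("EXIT", None),
}

# A parameterized pattern for commands such as "advance 50 meters".
MOVE = re.compile(r"advance (\d+) meters")

def match_command(text):
    """Return (command, parameter) if the text is a control command, else None."""
    text = text.strip().lower()
    if text in COMMANDS:
        return COMMANDS[text]
    m = MOVE.fullmatch(text)
    if m:
        return ("ADVANCE", int(m.group(1)))
    return None  # not a command: treat as speech exchange information

print(match_command("advance 50 meters"))  # ('ADVANCE', 50)
print(match_command("hello there"))        # None
```

A `None` result corresponds to the patent's "not a control command" branch, which hands the text to the emotion pipeline instead.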
If it is not a control command, it is speech exchange information. The speech emotion characteristic parameter extraction module analyses the waveform of the pre-processed voice signal and extracts the parameters with emotional characteristics; the extracted parameters are matched against the mood data in the speech emotion database to obtain the corresponding emotional characteristics. The emotional characteristic information is then mapped to the corresponding word or sentence, and the emotional characteristics together with the mapped word or sentence are delivered to the virtual environment terminal;
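The matching of extracted parameters against the speech emotion database can be sketched as a nearest-prototype lookup. The feature layout (e.g. pitch mean, energy) and the prototype values are invented for illustration and are not the patent's actual data:

```python
import math

# Hypothetical prototype feature vectors per emotion (pitch mean in Hz,
# normalized energy); a stand-in for the speech emotion database.
EMOTION_PROTOTYPES = {
    "happy": (220.0, 0.8),
    "sad":   (140.0, 0.3),
    "angry": (260.0, 0.9),
}

def classify_emotion(features):
    """Match an extracted feature vector to the nearest emotion prototype."""
    return min(EMOTION_PROTOTYPES,
               key=lambda e: math.dist(features, EMOTION_PROTOTYPES[e]))

print(classify_emotion((150.0, 0.35)))  # sad
```

The resulting emotion label is what the patent calls the "emotional characteristics" that accompany the word or sentence to the virtual environment terminal.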
The action matching unit of the virtual environment terminal matches the received emotional characteristics against the model library of virtual character emotional expressions and actions in the memory unit to obtain the emotional expression and action corresponding to the emotional characteristics, and the display unit displays them. The intonation matching unit matches the word or sentence corresponding to the emotional characteristics against the data in the database of intonation and speech rate corresponding to speech emotions, so as to obtain the intonation and speech rate corresponding to that word or sentence, and the voice playing unit plays the corresponding speech exchange information with that intonation and speech rate. The voice playing module and the display module play synchronously, so that virtual users can converse with one another in the virtual environment.
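The two lookups above (expression/action and intonation/speech rate) can be sketched as table lookups keyed by the emotion label, producing one bundle for synchronized display and playback. The tables and the `render_utterance` helper are illustrative assumptions:

```python
# Hypothetical stand-ins for the model library of emotional expressions and
# actions, and for the intonation/speech-rate database.
EXPRESSION_ACTIONS = {
    "happy": ("smile", "jump"),
    "sad":   ("frown", "slump"),
}
INTONATION_RATE = {
    "happy": ("rising", 1.2),   # intonation contour, relative speech rate
    "sad":   ("falling", 0.8),
}

def render_utterance(emotion, sentence):
    """Bundle the matched expression/action with intonation and speech rate
    so the display unit and voice playing unit can run synchronously."""
    expression, action = EXPRESSION_ACTIONS[emotion]
    intonation, rate = INTONATION_RATE[emotion]
    return {
        "display": {"expression": expression, "action": action},
        "audio": {"text": sentence, "intonation": intonation, "rate": rate},
    }

frame = render_utterance("happy", "Nice to meet you!")
print(frame["display"]["expression"], frame["audio"]["rate"])  # smile 1.2
```

Returning both channels in one structure reflects the patent's requirement that playback and display be synchronized.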
The speech emotion data in the speech emotion database are mainly produced by training a prior-art classifier: emotional speech data are first collected as training samples; three characteristic parameters (MFCC parameters, formants, and zero-crossing rate) are then extracted from them and combined into feature vectors, and Gaussian mixture models are established. The Gaussian mixture models are classified by emotion category, forming an acoustic model database for each emotion category. When speech data carrying emotion are received, the characteristic parameters are extracted and matched against the acoustic model of each mood category, and the emotion information of the speech is finally obtained.
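As a toy illustration of this train-then-match scheme, the sketch below fits one Gaussian per emotion class over a single scalar feature and classifies by likelihood. A real implementation would extract MFCC, formant, and zero-crossing-rate features (e.g. with librosa) and fit a multi-component mixture per category (e.g. scikit-learn's `GaussianMixture`); the training values here are invented:

```python
import math
import statistics

# Hypothetical training samples per emotion (e.g. normalized energy values).
TRAINING = {
    "happy": [0.8, 0.9, 0.85, 0.95],
    "sad":   [0.2, 0.3, 0.25, 0.15],
}

def fit(samples_by_class):
    """Fit a mean/stddev pair per class: a one-component, 1-D 'mixture'."""
    return {c: (statistics.mean(xs), statistics.stdev(xs))
            for c, xs in samples_by_class.items()}

def log_likelihood(x, mean, std):
    # Gaussian log-density up to a constant shared by all classes.
    return -((x - mean) ** 2) / (2 * std ** 2) - math.log(std)

def classify(x, models):
    """Match a new feature against each class model; pick the most likely."""
    return max(models, key=lambda c: log_likelihood(x, *models[c]))

models = fit(TRAINING)
print(classify(0.9, models))  # happy
```

Each fitted model plays the role of one entry in the patent's "acoustic model database of each emotional category".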
The model library of virtual character emotional expressions and actions is mainly built with 3D modeling software: 3D character models are established for the expressions corresponding to each type of emotion, including some exaggerated expressions and the actions habitually made under each mood.
The above embodiments and description merely illustrate the principle and preferred embodiments of the utility model. Without departing from the spirit and scope of the utility model, various changes and improvements may be made, and these changes and improvements all fall within the claimed scope of the utility model.

Claims (6)

  1. A voice interaction system with an emotion function based on a virtual reality environment, characterized in that it comprises a voice mobile terminal, a virtual environment terminal, and an external server; the external server is in communication connection with the voice mobile terminal and the virtual environment terminal respectively, and the voice mobile terminal is in communication connection with the virtual environment terminal;
    the voice mobile terminal comprises a processor; a voice acquisition module connected with the processor for collecting the user's voice signal and pre-processing the collected voice signal;
    a speech recognition module connected with the processor for converting the pre-processed voice signal into text information;
    a speech emotion characteristic parameter extraction module connected with the processor for extracting parameters with emotional characteristics from the text information;
    a memory module connected with the processor for storing the speech recognition data, voice control command database, and speech emotion database loaded and updated from the external server;
    a wireless communication module connected with the processor for communicating with the virtual environment terminal, so that the recognized control command or speech exchange information is sent to the virtual environment terminal, and for communicating with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory module;
    the voice acquisition module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively;
    the memory module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively;
    the virtual environment terminal comprises a memory unit for storing the model library of virtual character emotional expressions and actions and the database of intonation and speech rate corresponding to speech emotions, both loaded and updated from the external server;
    a voice playing module for playing the words or sentences in the received speech exchange information with the matched intonation and speech rate;
    a display module for displaying the emotional expressions and actions of the virtual character corresponding to the speech exchange information;
    and a communication module for communicating with the voice mobile terminal and with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory unit.
  2. The voice interaction system with an emotion function based on a virtual reality environment according to claim 1, characterized in that the voice acquisition module is mainly a microphone.
  3. The voice interaction system with an emotion function based on a virtual reality environment according to claim 1, characterized in that the speech recognition module comprises a speech feature extraction unit, a speech feature comparison unit, and a comparison result output unit; the speech feature extraction unit is connected with the speech feature comparison unit, and the speech feature comparison unit is connected with the comparison result output unit.
  4. The voice interaction system with an emotion function based on a virtual reality environment according to claim 1, characterized in that the speech emotion characteristic parameter extraction module comprises an emotional feature extraction unit, an emotional feature comparison unit, and an emotional feature output unit; the emotional feature extraction unit is connected with the emotional feature comparison unit, and the emotional feature comparison unit is connected with the emotional feature output unit.
  5. The voice interaction system with an emotion function based on a virtual reality environment according to claim 1, characterized in that the voice playing module comprises an intonation matching unit and a voice playing unit; the intonation matching unit is connected with the voice playing unit.
  6. The voice interaction system with an emotion function based on a virtual reality environment according to claim 1, characterized in that the display module comprises an action matching unit and a display unit; the action matching unit is connected with the display unit.
CN201720170435.XU 2017-02-24 2017-02-24 The voice interactive system with emotive function based on reality environment Active CN206711600U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201720170435.XU CN206711600U (en) 2017-02-24 2017-02-24 The voice interactive system with emotive function based on reality environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201720170435.XU CN206711600U (en) 2017-02-24 2017-02-24 The voice interactive system with emotive function based on reality environment

Publications (1)

Publication Number Publication Date
CN206711600U true CN206711600U (en) 2017-12-05

Family

ID=60469149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201720170435.XU Active CN206711600U (en) 2017-02-24 2017-02-24 The voice interactive system with emotive function based on reality environment

Country Status (1)

Country Link
CN (1) CN206711600U (en)


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107018611A (en) * 2017-04-23 2017-08-04 黄石德龙自动化科技有限公司 A kind of wisdom lamp control system and control method based on speech recognition and emotion
CN108159687A (en) * 2017-12-19 2018-06-15 芋头科技(杭州)有限公司 A kind of automated induction systems and intelligent sound box equipment based on more people's interactive processes
CN108159687B (en) * 2017-12-19 2021-06-04 芋头科技(杭州)有限公司 Automatic guidance system and intelligent sound box equipment based on multi-person interaction process
CN108247633A (en) * 2017-12-27 2018-07-06 珠海格力节能环保制冷技术研究中心有限公司 The control method and system of robot
CN108247633B (en) * 2017-12-27 2021-09-03 珠海格力节能环保制冷技术研究中心有限公司 Robot control method and system
CN108986803A (en) * 2018-06-26 2018-12-11 北京小米移动软件有限公司 Scenery control method and device, electronic equipment, readable storage medium storing program for executing
CN109961152A (en) * 2019-03-14 2019-07-02 广州多益网络股份有限公司 Personalized interactive method, system, terminal device and the storage medium of virtual idol
CN110488975A (en) * 2019-08-19 2019-11-22 深圳市仝智科技有限公司 A kind of data processing method and relevant apparatus based on artificial intelligence
CN110488975B (en) * 2019-08-19 2021-04-13 深圳市仝智科技有限公司 Data processing method based on artificial intelligence and related device

Similar Documents

Publication Publication Date Title
CN106710590A (en) Voice interaction system with emotional function based on virtual reality environment and method
CN206711600U (en) The voice interactive system with emotive function based on reality environment
CN110427472A (en) The matched method, apparatus of intelligent customer service, terminal device and storage medium
CN110647636B (en) Interaction method, interaction device, terminal equipment and storage medium
CN111833418B (en) Animation interaction method, device, equipment and storage medium
CN110400251A (en) Method for processing video frequency, device, terminal device and storage medium
CN104461525B (en) A kind of intelligent consulting platform generation system that can customize
CN110413841A (en) Polymorphic exchange method, device, system, electronic equipment and storage medium
CN110286756A (en) Method for processing video frequency, device, system, terminal device and storage medium
CN107797663A (en) Multi-modal interaction processing method and system based on visual human
CN107294837A (en) Engaged in the dialogue interactive method and system using virtual robot
CN106200886A (en) A kind of intelligent movable toy manipulated alternately based on language and toy using method
CN109522835A (en) Children's book based on intelligent robot is read and exchange method and system
CN107831905A (en) A kind of virtual image exchange method and system based on line holographic projections equipment
CN204650422U (en) A kind of intelligent movable toy manipulated alternately based on language
CN104317389B (en) Method and device for identifying character role through action
CN106528859A (en) Data pushing system and method
CN108345385A (en) Virtual accompany runs the method and device that personage establishes and interacts
CN109324688A (en) Exchange method and system based on visual human's behavioral standard
CN108416420A (en) Limbs exchange method based on visual human and system
CN105244042B (en) A kind of speech emotional interactive device and method based on finite-state automata
CN107784355A (en) The multi-modal interaction data processing method of visual human and system
CN109343695A (en) Exchange method and system based on visual human's behavioral standard
WO2022089224A1 (en) Video communication method and apparatus, electronic device, computer readable storage medium, and computer program product
CN108052250A (en) Virtual idol deductive data processing method and system based on multi-modal interaction

Legal Events

Date Code Title Description
GR01 Patent grant