CN206711600U - Voice interaction system with an emotion function based on a virtual reality environment - Google Patents


Info

Publication number
CN206711600U
CN206711600U (application CN201720170435.XU)
Authority
CN
China
Prior art keywords
voice
module
speech
unit
external server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201720170435.XU
Other languages
Chinese (zh)
Inventor
黄昌正
林正才
冀鸣
刘晓悦
叶永权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Science And Technology Co Ltd
Original Assignee
Guangzhou Science And Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Science And Technology Co Ltd filed Critical Guangzhou Science And Technology Co Ltd
Priority to CN201720170435.XU priority Critical patent/CN206711600U/en
Application granted granted Critical
Publication of CN206711600U publication Critical patent/CN206711600U/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

The utility model provides a voice interaction system with an emotion function based on a virtual reality environment, comprising a voice mobile terminal, a virtual environment terminal, and an external server. The voice mobile terminal collects and processes the user's speech and determines whether the resulting voice information is a control command or speech exchange information, then sends it to the virtual environment terminal, which performs the corresponding control operation or displays the corresponding emotion and action while playing the speech, so that virtual users can converse with one another in the virtual environment. The system thus no longer depends on a handle, its functions are not limited by the number of buttons, it is simple to operate, and the system is controlled by the user's speech. In addition, the user's mood and action information are extracted from the user's speech, so that in multiplayer games or applications virtual users can communicate with one another and express their own emotions in the virtual environment, genuinely achieving emotional communication and further improving the user's experience in the virtual environment.

Description

Voice interaction system with an emotion function based on a virtual reality environment
Technical field
The utility model relates to the technical field of virtual reality environments, and in particular to a voice interaction system with an emotion function based on a virtual reality environment.
Background technology
Virtual reality (VR) is a high technology that has emerged in recent years. Its principle is to use computer simulation to generate a three-dimensional virtual world and to provide the user with simulated sensory input such as vision, hearing, and touch, so that the user feels immersed in the scene and can observe things in the three-dimensional space in real time and without restriction. Interactive control is one of the important application directions of virtual reality technology and has also been a major source of demand driving its rapid development.
At present, some technology enterprises have launched corresponding virtual reality control devices, for example the Oculus Rift released by Oculus of the United States, the Gear released by Samsung of South Korea, and the HTC Vive released by HTC. However, the control systems of these virtual reality devices still rely on handle-based control.
Chinese patent 201610869534.7, "An interaction handle for virtual reality control", discloses an operation handle for a virtual reality environment. Its implementation is complex, its control is imprecise, and the number of control instructions is limited by the number of buttons. Moreover, such control methods cannot be used by people with impaired hand mobility, and for ordinary users the handle control flow is rather complicated: the function of each button must be understood before the device can be operated.
Chinese patent 201610270381.4, "A multi-user voice exchange method and device based on a virtual reality scene", simply realizes a voice-call function in multiplayer games in a virtual environment; the expressions, moods, and actions of the game characters cannot be seen in games in the virtual environment. The characters in the game merely show a fixed expression and a simple mouth movement while speaking, and there is no emotion in the speech exchange.
Content of the utility model
In view of the shortcomings of the prior art, the utility model provides a voice interaction system with an emotion function based on a virtual reality environment, thereby avoiding the problems of the complex operation caused in virtual environments by dependence on buttons and sensing equipment, and of functions being limited by the number of buttons.
The technical solution of the utility model is a voice interaction system with an emotion function based on a virtual reality environment, characterized in that it comprises a voice mobile terminal, a virtual environment terminal, and an external server; the external server is in communication connection with the voice mobile terminal and the virtual environment terminal respectively, and the voice mobile terminal is in communication connection with the virtual environment terminal;
The voice mobile terminal comprises a processor; a voice acquisition module connected with the processor for collecting the user's voice signal and pre-processing the collected voice signal;
a speech recognition module connected with the processor for converting the pre-processed voice signal into text information;
a speech emotion characteristic parameter extraction module connected with the processor for extracting parameters with emotional characteristics from the text information;
a memory module connected with the processor for storing the speech recognition data, voice control command database, and speech emotion database loaded and updated from the external server;
and a wireless communication module connected with the processor for communicating with the virtual environment terminal, so that the recognized control command or speech exchange information is sent to the virtual environment terminal, and for communicating with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory module.
The voice acquisition module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively, and the memory module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively;
The virtual environment terminal comprises a memory unit for storing the model library of virtual character emotional expressions and actions and the database of intonation and speech rate corresponding to speech emotions, both loaded and updated from the external server;
a voice playing module for playing the words or sentences in the received speech exchange information with the matched intonation and speech rate;
a display module for displaying the emotional expressions and actions of the virtual character corresponding to the speech exchange information;
and a communication module for communicating with the voice mobile terminal and with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory unit.
The voice acquisition module is mainly a microphone.
The speech recognition module comprises a speech feature extraction unit, a speech feature comparison unit, and a comparison result output unit; the speech feature extraction unit is connected with the speech feature comparison unit, and the speech feature comparison unit is connected with the comparison result output unit.
The speech emotion characteristic parameter extraction module comprises an emotional feature extraction unit, an emotional feature comparison unit, and an emotional feature output unit; the emotional feature extraction unit is connected with the emotional feature comparison unit, and the emotional feature comparison unit is connected with the emotional feature output unit.
The voice playing module comprises an intonation matching unit and a voice playing unit; the intonation matching unit is connected with the voice playing unit.
The display module comprises an action matching unit and a display unit; the action matching unit is connected with the display unit.
The voice mobile terminal connects with the virtual environment terminal. After the connection succeeds, the processors of the voice mobile terminal and the virtual environment terminal each send a database version query command to the external server, querying whether the versions of the speech recognition data, voice control command database, and speech emotion database stored in the memory module of the voice mobile terminal, and the versions of the model library of virtual character emotional expressions and actions and the database of intonation and speech rate corresponding to speech emotions stored in the memory unit of the virtual environment terminal, are consistent with those on the external server. If they are inconsistent, the corresponding latest-version data are loaded and updated from the external server into the corresponding memory module or memory unit, so that the data in the memory module and the memory unit are up to date;
The voice acquisition module collects the user's voice signal, performs pre-processing such as filtering and quantization on the collected voice signal, and sends it to the speech recognition module and the speech emotion characteristic parameter extraction module;
The speech recognition module, using the speech recognition data stored in the memory module, converts the voice signal into text information and matches the text information against the command data in the voice control command database to determine whether it is a control command. If it is a control command, the corresponding control command and parameters are generated and output to the virtual environment terminal, which performs the corresponding control operation;
If it is not a control command, it is speech exchange information. The speech emotion characteristic parameter extraction module analyses the waveform of the pre-processed voice signal and extracts the parameters with emotional characteristics; the extracted parameters are matched against the mood data in the speech emotion database to obtain the corresponding emotional characteristics, and the emotional characteristic information is mapped to the corresponding word or sentence and delivered to the virtual environment terminal.
The action matching unit of the virtual environment terminal matches the received emotional characteristics against the model library of virtual character emotional expressions and actions in the memory unit to obtain the emotional expression and action corresponding to the emotional characteristics, and the display unit displays them. The intonation matching unit matches the word or sentence corresponding to the emotional characteristics against the data in the database of intonation and speech rate corresponding to speech emotions, so as to obtain the intonation and speech rate corresponding to that word or sentence, and the voice playing unit plays the corresponding speech exchange information with that intonation and speech rate. The voice playing module and the display module play synchronously, so that virtual users can converse with one another in the virtual environment.
The beneficial effects of the utility model are: the system no longer depends on a handle, its functions are not limited by the number of buttons, it is simple to operate, and the system is controlled by the user's speech. In addition, the user's mood and action information are extracted from the user's speech information and played synchronously by the voice playing module and the display module, so that in multiplayer games or applications virtual users can communicate with one another and express their own emotions in the virtual environment, genuinely achieving emotional communication and further improving the user's experience in the virtual environment.
Brief description of the drawings
Fig. 1 is a system block diagram of the utility model.
Embodiment
Specific embodiments of the utility model are further described below in conjunction with the accompanying drawings:
As shown in Fig. 1, a voice interaction system with an emotion function based on a virtual reality environment is characterized in that it comprises a voice mobile terminal, a virtual environment terminal, and an external server; the external server is in communication connection with the voice mobile terminal and the virtual environment terminal respectively, and the voice mobile terminal is in communication connection with the virtual environment terminal;
The voice mobile terminal comprises:
a voice acquisition module, for collecting the user's voice signal and pre-processing the collected voice signal;
a speech recognition module, for converting the pre-processed voice signal into text information;
a speech emotion characteristic parameter extraction module, for extracting parameters with emotional characteristics from the pre-processed text information;
a memory module, for storing the speech recognition data, voice control command database, and speech emotion database loaded and updated from the external server;
a wireless communication module, for sending the recognized control command or speech exchange information to the virtual environment terminal, and for communicating with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory module;
and a processor, for processing the collected user voice information, or sending update commands to the external server to load and update the voice information stored in the memory module.
The processor is connected with the voice acquisition module, the speech recognition module, the speech emotion characteristic parameter extraction module, the memory module, and the wireless communication module respectively;
The voice acquisition module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively, and the memory module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively;
The virtual environment terminal comprises:
a memory unit, for storing the model library of virtual character emotional expressions and actions and the database of intonation and speech rate corresponding to speech emotions, both loaded and updated from the external server;
a voice playing module, for playing the received speech text information;
a display module, for displaying the emotional expressions and actions of the virtual character's speech expression;
and a communication module, for communicating with the voice mobile terminal and with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory unit.
The voice acquisition module is mainly a microphone.
The speech recognition module comprises a speech feature extraction unit, a speech feature comparison unit, and a comparison result output unit; the speech feature extraction unit is connected with the speech feature comparison unit, and the speech feature comparison unit is connected with the comparison result output unit.
The speech emotion characteristic parameter extraction module comprises an emotional feature extraction unit, an emotional feature comparison unit, and an emotional feature output unit; the emotional feature extraction unit is connected with the emotional feature comparison unit, and the emotional feature comparison unit is connected with the emotional feature output unit.
The voice playing module comprises an intonation matching unit and a voice playing unit; the intonation matching unit is connected with the voice playing unit.
The display module comprises an action matching unit and a display unit; the action matching unit is connected with the display unit.
The voice mobile terminal connects with the virtual environment terminal. After the connection succeeds, the processors of the voice mobile terminal and the virtual environment terminal each send a database version query command to the external server, querying whether the versions of the speech recognition data, voice control command database, and speech emotion database stored in the memory module of the voice mobile terminal, and the versions of the model library of virtual character emotional expressions and actions and the database of intonation and speech rate corresponding to speech emotions stored in the memory unit of the virtual environment terminal, are consistent with those on the external server. If they are inconsistent, the corresponding latest-version data are loaded and updated from the external server into the corresponding memory module or memory unit, so that the data in the memory module and the memory unit are up to date;
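By way of illustration only, the database version check described above can be sketched as a comparison of local and server version numbers. All names below (`sync_databases`, the database keys, the version values) are assumptions for the sketch, not identifiers from the patent:

```python
# Hypothetical local and server version tables for the three databases
# held in the voice mobile terminal's memory module.
LOCAL_VERSIONS = {
    "speech_recognition_data": 3,
    "voice_control_commands": 5,
    "speech_emotion_db": 2,
}

SERVER_VERSIONS = {
    "speech_recognition_data": 3,
    "voice_control_commands": 6,   # server holds a newer command database
    "speech_emotion_db": 2,
}

def sync_databases(local, server):
    """Return the list of stale databases and update them to the server version."""
    stale = [name for name, ver in local.items() if server.get(name, ver) > ver]
    for name in stale:
        local[name] = server[name]  # stand-in for downloading the new data packet
    return stale

updated = sync_databases(LOCAL_VERSIONS, SERVER_VERSIONS)
print(updated)  # ['voice_control_commands']
```

The virtual environment terminal would run the same check against its own memory unit (model library and intonation/speech-rate database).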
The voice acquisition module collects the user's voice signal, performs pre-processing such as filtering and quantization on the collected voice signal, and sends it to the speech recognition module and the speech emotion characteristic parameter extraction module;
The speech recognition module, using the speech recognition data stored in the memory module, converts the voice signal into text information and matches the text information against the command data in the voice control command database to determine whether it is a control command. If it is a control command, the corresponding control command and parameters are generated and output to the virtual environment terminal, which performs the corresponding control operation. The specific control operation may be a system menu operation, such as "menu", "return", "exit", "start", "confirm", or "cancel"; it may also be a human-computer interaction operation, such as an in-game operation like "advance 50 meters", "turn left 60 degrees", or "move right at a speed of 10 meters per second for 30 seconds";
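The command-matching step above can be sketched as a lookup of fixed commands plus pattern matching for parameterized ones. The command table and patterns below are illustrative assumptions; the patent does not specify a database format:

```python
import re

# Hypothetical fixed-command table (recognized text -> command code).
COMMANDS = {
    "menu": ("MENU", None),
    "return": ("RETURN", None),
    "exit": ("EXIT", None),
}

# A parameterized pattern for commands such as "advance 50 meters".
MOVE = re.compile(r"advance (\d+) meters")

def match_command(text):
    """Return (command, parameter) if the text is a control command, else None."""
    text = text.strip().lower()
    if text in COMMANDS:
        return COMMANDS[text]
    m = MOVE.fullmatch(text)
    if m:
        return ("ADVANCE", int(m.group(1)))
    return None  # not a command: treat as speech exchange information

print(match_command("advance 50 meters"))  # ('ADVANCE', 50)
print(match_command("hello there"))        # None
```

A `None` result corresponds to the patent's "not a control command" branch, which hands the text to the emotion pipeline instead.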
If it is not a control command, it is speech exchange information. The speech emotion characteristic parameter extraction module analyses the waveform of the pre-processed voice signal and extracts the parameters with emotional characteristics; the extracted parameters are matched against the mood data in the speech emotion database to obtain the corresponding emotional characteristics. The emotional characteristic information is then mapped to the corresponding word or sentence, and the emotional characteristics together with the mapped word or sentence are delivered to the virtual environment terminal;
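The matching of extracted parameters against the speech emotion database can be sketched as a nearest-prototype lookup. The feature layout (e.g. pitch mean, energy) and the prototype values are invented for illustration and are not the patent's actual data:

```python
import math

# Hypothetical prototype feature vectors per emotion (pitch mean in Hz,
# normalized energy); a stand-in for the speech emotion database.
EMOTION_PROTOTYPES = {
    "happy": (220.0, 0.8),
    "sad":   (140.0, 0.3),
    "angry": (260.0, 0.9),
}

def classify_emotion(features):
    """Match an extracted feature vector to the nearest emotion prototype."""
    return min(EMOTION_PROTOTYPES,
               key=lambda e: math.dist(features, EMOTION_PROTOTYPES[e]))

print(classify_emotion((150.0, 0.35)))  # sad
```

The resulting emotion label is what the patent calls the "emotional characteristics" that accompany the word or sentence to the virtual environment terminal.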
The action matching unit of the virtual environment terminal matches the received emotional characteristics against the model library of virtual character emotional expressions and actions in the memory unit to obtain the emotional expression and action corresponding to the emotional characteristics, and the display unit displays them. The intonation matching unit matches the word or sentence corresponding to the emotional characteristics against the data in the database of intonation and speech rate corresponding to speech emotions, so as to obtain the intonation and speech rate corresponding to that word or sentence, and the voice playing unit plays the corresponding speech exchange information with that intonation and speech rate. The voice playing module and the display module play synchronously, so that virtual users can converse with one another in the virtual environment.
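The two lookups above (expression/action and intonation/speech rate) can be sketched as table lookups keyed by the emotion label, producing one bundle for synchronized display and playback. The tables and the `render_utterance` helper are illustrative assumptions:

```python
# Hypothetical stand-ins for the model library of emotional expressions and
# actions, and for the intonation/speech-rate database.
EXPRESSION_ACTIONS = {
    "happy": ("smile", "jump"),
    "sad":   ("frown", "slump"),
}
INTONATION_RATE = {
    "happy": ("rising", 1.2),   # intonation contour, relative speech rate
    "sad":   ("falling", 0.8),
}

def render_utterance(emotion, sentence):
    """Bundle the matched expression/action with intonation and speech rate
    so the display unit and voice playing unit can run synchronously."""
    expression, action = EXPRESSION_ACTIONS[emotion]
    intonation, rate = INTONATION_RATE[emotion]
    return {
        "display": {"expression": expression, "action": action},
        "audio": {"text": sentence, "intonation": intonation, "rate": rate},
    }

frame = render_utterance("happy", "Nice to meet you!")
print(frame["display"]["expression"], frame["audio"]["rate"])  # smile 1.2
```

Returning both channels in one structure reflects the patent's requirement that playback and display be synchronized.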
The speech emotion data in the speech emotion database are mainly produced by training a prior-art classifier: emotional speech data are first collected as training samples; three characteristic parameters (MFCC parameters, formants, and zero-crossing rate) are then extracted from them and combined into feature vectors, and Gaussian mixture models are established. The Gaussian mixture models are classified by emotion category, forming an acoustic model database for each emotion category. When speech data carrying emotion are received, the characteristic parameters are extracted and matched against the acoustic model of each mood category, and the emotion information of the speech is finally obtained.
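As a toy illustration of this train-then-match scheme, the sketch below fits one Gaussian per emotion class over a single scalar feature and classifies by likelihood. A real implementation would extract MFCC, formant, and zero-crossing-rate features (e.g. with librosa) and fit a multi-component mixture per category (e.g. scikit-learn's `GaussianMixture`); the training values here are invented:

```python
import math
import statistics

# Hypothetical training samples per emotion (e.g. normalized energy values).
TRAINING = {
    "happy": [0.8, 0.9, 0.85, 0.95],
    "sad":   [0.2, 0.3, 0.25, 0.15],
}

def fit(samples_by_class):
    """Fit a mean/stddev pair per class: a one-component, 1-D 'mixture'."""
    return {c: (statistics.mean(xs), statistics.stdev(xs))
            for c, xs in samples_by_class.items()}

def log_likelihood(x, mean, std):
    # Gaussian log-density up to a constant shared by all classes.
    return -((x - mean) ** 2) / (2 * std ** 2) - math.log(std)

def classify(x, models):
    """Match a new feature against each class model; pick the most likely."""
    return max(models, key=lambda c: log_likelihood(x, *models[c]))

models = fit(TRAINING)
print(classify(0.9, models))  # happy
```

Each fitted model plays the role of one entry in the patent's "acoustic model database of each emotional category".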
The model library of virtual character emotional expressions and actions is mainly built with 3D modeling software: 3D character models are established for the expressions corresponding to each type of emotion, including some exaggerated expressions and the actions habitually made under each mood.
The above embodiments and description merely illustrate the principle and preferred embodiments of the utility model. Without departing from the spirit and scope of the utility model, various changes and improvements may be made, and these changes and improvements all fall within the claimed scope of the utility model.

Claims (6)

  1. A voice interaction system with an emotion function based on a virtual reality environment, characterized in that it comprises a voice mobile terminal, a virtual environment terminal, and an external server; the external server is in communication connection with the voice mobile terminal and the virtual environment terminal respectively, and the voice mobile terminal is in communication connection with the virtual environment terminal;
    the voice mobile terminal comprises a processor; a voice acquisition module connected with the processor for collecting the user's voice signal and pre-processing the collected voice signal;
    a speech recognition module connected with the processor for converting the pre-processed voice signal into text information;
    a speech emotion characteristic parameter extraction module connected with the processor for extracting parameters with emotional characteristics from the text information;
    a memory module connected with the processor for storing the speech recognition data, voice control command database, and speech emotion database loaded and updated from the external server;
    a wireless communication module connected with the processor for communicating with the virtual environment terminal, so that the recognized control command or speech exchange information is sent to the virtual environment terminal, and for communicating with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory module;
    the voice acquisition module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively;
    the memory module is connected with the speech recognition module and the speech emotion characteristic parameter extraction module respectively;
    the virtual environment terminal comprises a memory unit for storing the model library of virtual character emotional expressions and actions and the database of intonation and speech rate corresponding to speech emotions, both loaded and updated from the external server;
    a voice playing module for playing the words or sentences in the received speech exchange information with the matched intonation and speech rate;
    a display module for displaying the emotional expressions and actions of the virtual character corresponding to the speech exchange information;
    and a communication module for communicating with the voice mobile terminal and with the external server, so that the corresponding data packets on the external server are loaded and updated into the memory unit.
  2. The voice interaction system with an emotion function based on a virtual reality environment according to claim 1, characterized in that the voice acquisition module is mainly a microphone.
  3. The voice interaction system with an emotion function based on a virtual reality environment according to claim 1, characterized in that the speech recognition module comprises a speech feature extraction unit, a speech feature comparison unit, and a comparison result output unit; the speech feature extraction unit is connected with the speech feature comparison unit, and the speech feature comparison unit is connected with the comparison result output unit.
  4. The voice interaction system with an emotion function based on a virtual reality environment according to claim 1, characterized in that the speech emotion characteristic parameter extraction module comprises an emotional feature extraction unit, an emotional feature comparison unit, and an emotional feature output unit; the emotional feature extraction unit is connected with the emotional feature comparison unit, and the emotional feature comparison unit is connected with the emotional feature output unit.
  5. The voice interaction system with an emotion function based on a virtual reality environment according to claim 1, characterized in that the voice playing module comprises an intonation matching unit and a voice playing unit; the intonation matching unit is connected with the voice playing unit.
  6. The voice interaction system with an emotion function based on a virtual reality environment according to claim 1, characterized in that the display module comprises an action matching unit and a display unit; the action matching unit is connected with the display unit.
CN201720170435.XU 2017-02-24 2017-02-24 The voice interactive system with emotive function based on reality environment Active CN206711600U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201720170435.XU CN206711600U (en) 2017-02-24 2017-02-24 The voice interactive system with emotive function based on reality environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201720170435.XU CN206711600U (en) 2017-02-24 2017-02-24 The voice interactive system with emotive function based on reality environment

Publications (1)

Publication Number Publication Date
CN206711600U true CN206711600U (en) 2017-12-05

Family

ID=60469149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201720170435.XU Active CN206711600U (en) 2017-02-24 2017-02-24 The voice interactive system with emotive function based on reality environment

Country Status (1)

Country Link
CN (1) CN206711600U (en)


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107018611A (en) * 2017-04-23 2017-08-04 黄石德龙自动化科技有限公司 A kind of wisdom lamp control system and control method based on speech recognition and emotion
CN108159687A (en) * 2017-12-19 2018-06-15 芋头科技(杭州)有限公司 A kind of automated induction systems and intelligent sound box equipment based on more people's interactive processes
CN108159687B (en) * 2017-12-19 2021-06-04 芋头科技(杭州)有限公司 Automatic guidance system and intelligent sound box equipment based on multi-person interaction process
CN108247633A (en) * 2017-12-27 2018-07-06 珠海格力节能环保制冷技术研究中心有限公司 The control method and system of robot
CN108247633B (en) * 2017-12-27 2021-09-03 珠海格力节能环保制冷技术研究中心有限公司 Robot control method and system
CN108986803A (en) * 2018-06-26 2018-12-11 北京小米移动软件有限公司 Scenery control method and device, electronic equipment, readable storage medium storing program for executing
CN109961152A (en) * 2019-03-14 2019-07-02 广州多益网络股份有限公司 Personalized interactive method, system, terminal device and the storage medium of virtual idol
CN110488975A (en) * 2019-08-19 2019-11-22 深圳市仝智科技有限公司 A kind of data processing method and relevant apparatus based on artificial intelligence
CN110488975B (en) * 2019-08-19 2021-04-13 深圳市仝智科技有限公司 Data processing method based on artificial intelligence and related device

Similar Documents

Publication Publication Date Title
CN106710590A (en) Voice interaction system with emotional function based on virtual reality environment and method
CN206711600U (en) The voice interactive system with emotive function based on reality environment
CN110427472A (en) The matched method, apparatus of intelligent customer service, terminal device and storage medium
CN110647636B (en) Interaction method, interaction device, terminal equipment and storage medium
CN111833418B (en) Animation interaction method, device, equipment and storage medium
CN110400251A (en) Method for processing video frequency, device, terminal device and storage medium
CN104461525B (en) A kind of intelligent consulting platform generation system that can customize
CN110413841A (en) Polymorphic exchange method, device, system, electronic equipment and storage medium
CN110286756A (en) Method for processing video frequency, device, system, terminal device and storage medium
CN107797663A (en) Multi-modal interaction processing method and system based on visual human
CN107294837A (en) Engaged in the dialogue interactive method and system using virtual robot
CN106200886A (en) A kind of intelligent movable toy manipulated alternately based on language and toy using method
CN109522835A (en) Children's book based on intelligent robot is read and exchange method and system
CN107831905A (en) A kind of virtual image exchange method and system based on line holographic projections equipment
CN204650422U (en) A kind of intelligent movable toy manipulated alternately based on language
CN104317389B (en) Method and device for identifying character role through action
CN106528859A (en) Data pushing system and method
CN108345385A (en) Virtual accompany runs the method and device that personage establishes and interacts
CN109324688A (en) Exchange method and system based on visual human's behavioral standard
CN108416420A (en) Limbs exchange method based on visual human and system
CN105244042B (en) A kind of speech emotional interactive device and method based on finite-state automata
CN107784355A (en) The multi-modal interaction data processing method of visual human and system
CN109343695A (en) Exchange method and system based on visual human's behavioral standard
WO2022089224A1 (en) Video communication method and apparatus, electronic device, computer readable storage medium, and computer program product
CN108052250A (en) Virtual idol deductive data processing method and system based on multi-modal interaction

Legal Events

Date Code Title Description
GR01 Patent grant