CN109324688A - Interaction method and system based on virtual human behavioral standards - Google Patents
Interaction method and system based on virtual human behavioral standards
- Publication number
- CN109324688A (application number CN201810953818.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- virtual human
- emotion
- human
- interaction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
Abstract
The present invention provides an interaction method based on virtual human behavioral standards. The virtual human is displayed on a smart device and, when in an interactive state, activates its speech, emotion, vision, and sensing capabilities. The method comprises: obtaining multi-modal interaction data and parsing it to determine the user's interaction intent; generating, according to that intent, virtual human language response data and corresponding virtual human behavior expression data, where the behavior expression data include the virtual human's head movement data, gaze control data, facial expression data, body movement data, and gesture beat data; and outputting the language response data in coordination with the behavior expression data. By outputting behavior expression data alongside the multi-modal response data, the present invention expresses the virtual human's emotions, so that communication between the user and the virtual human proceeds smoothly and the user enjoys an anthropomorphic interactive experience.
Description
Technical field
The present invention relates to the field of artificial intelligence, and in particular to an interaction method and system based on virtual human behavioral standards.
Background art
The development of multi-modal robot interaction systems aims to imitate human conversation and, within context, to mimic interaction between humans. At present, however, the development of multi-modal interaction systems for virtual humans remains incomplete: no virtual human capable of multi-modal interaction has yet appeared and, more importantly, no interactive product exists that interacts on the basis of the virtual human's own behavioral standards.
The present invention therefore provides an interaction method and system based on virtual human behavioral standards.
Summary of the invention
To solve the above problems, the present invention provides an interaction method based on virtual human behavioral standards. The virtual human is displayed on a smart device and activates its speech, emotion, vision, and sensing capabilities when in an interactive state. The method comprises the following steps:
obtaining multi-modal interaction data and parsing it to determine the user's interaction intent;
generating, according to the interaction intent, virtual human language response data and corresponding virtual human behavior expression data, where the behavior expression data include the virtual human's head movement data, gaze control data, facial expression data, body movement data, and gesture beat data;
outputting the virtual human language response data in coordination with the virtual human behavior expression data.
According to one embodiment of the present invention, the step of generating virtual human language response data and corresponding virtual human behavior expression data according to the interaction intent further comprises:
parsing the virtual human language response data and extracting the virtual human emotion information it contains;
obtaining the virtual human head movement data, gaze control data, facial expression data, body movement data, and gesture beat data that match the emotion information;
generating the virtual human behavior expression data from the matching result.
According to one embodiment of the present invention, the virtual human emotion information comprises positive emotions, negative emotions, attitude emotions, and communication emotions, where the positive emotions include happiness, confidence, expectation, and surprise; the negative emotions include anger, fear, sadness, and disgust; the attitude emotions include acceptance and non-acceptance; and the communication emotions include greeting and farewell.
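The four emotion categories and their concrete labels described above can be sketched as a small lookup structure. This is an illustrative Python sketch, not part of the patent; the dictionary and label names are assumptions:

```python
# Hypothetical sketch of the emotion taxonomy: four categories,
# each mapping to its set of concrete emotion labels.
EMOTION_TAXONOMY = {
    "positive": {"happy", "confident", "expectant", "surprised"},
    "negative": {"angry", "fearful", "sad", "disgusted"},
    "attitude": {"accepting", "non-accepting"},
    "communication": {"greeting", "farewell"},
}

def classify_emotion(label: str) -> str:
    """Return the category an emotion label belongs to, or 'unknown'."""
    for category, labels in EMOTION_TAXONOMY.items():
        if label in labels:
            return category
    return "unknown"
```

Such a table lets the behavior-matching step look up which category a piece of extracted emotion information falls into before selecting matching behavior data.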
According to one embodiment of the present invention, the step of generating the virtual human behavior expression data from the matching result further comprises:
classifying according to the virtual human emotion information, checking whether the emotions respectively represented by the virtual human's head movement data, gaze control data, facial expression data, body movement data, and gesture beat data conflict with one another, and replacing any conflicting data.
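A minimal sketch of such a conflict check, assuming each behavior channel carries a tag for the emotion category it expresses (the channel and clip names are illustrative, not from the patent): any channel whose category disagrees with the target emotion is swapped for a matching fallback.

```python
CHANNELS = ("head", "gaze", "face", "body", "gesture")

def resolve_conflicts(behavior: dict, target_category: str, library: dict) -> dict:
    """Replace any behavior channel whose emotion category conflicts
    with the target emotion category.

    behavior: channel -> (clip_id, emotion_category)
    library:  emotion_category -> {channel: fallback clip_id}
    """
    resolved = {}
    for channel in CHANNELS:
        clip_id, category = behavior[channel]
        if category == target_category:
            resolved[channel] = (clip_id, category)
        else:
            # Conflicting channel: substitute a clip matching the target emotion.
            resolved[channel] = (library[target_category][channel], target_category)
    return resolved
```

The point of the design is that all five channels end up expressing one coherent emotion rather than, say, a smiling face over an angry gesture.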
According to one embodiment of the present invention, the virtual human has a specific avatar and preset attributes. When generating the virtual human behavior expression data, any head movement data, gaze control data, facial expression data, body movement data, or gesture beat data inconsistent with the preset attributes does not participate in decision-making or output.
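Excluding behavior data inconsistent with the preset attributes can be sketched as a predicate over candidate clips; the attribute model below is a hypothetical illustration, not the patent's data format:

```python
def filter_candidates(candidates: list, preset_attributes: dict) -> list:
    """Keep only behavior candidates consistent with the virtual human's
    preset attributes; inconsistent ones take no part in decision or output.

    Each candidate carries a 'requires' dict of attribute constraints."""
    def consistent(candidate: dict) -> bool:
        return all(preset_attributes.get(key) == value
                   for key, value in candidate.get("requires", {}).items())
    return [c for c in candidates if consistent(c)]
```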
According to one embodiment of the present invention, the step of outputting the virtual human language response data in coordination with the virtual human behavior expression data comprises:
determining the output timing, degree of expression, and duration of the virtual human's head movement data, gaze control data, facial expression data, body movement data, and gesture beat data.
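The three scheduling quantities named here (output timing, degree of expression, and duration) can be bundled per behavior channel; the dataclass below is an illustrative assumption of one way to represent them:

```python
from dataclasses import dataclass

@dataclass
class BehaviorSchedule:
    channel: str        # e.g. "head", "gaze", "face", "body", "gesture"
    start_time: float   # output timing, seconds from start of speech
    intensity: float    # degree of expression, 0.0-1.0
    duration: float     # seconds the behavior is held

    def end_time(self) -> float:
        return self.start_time + self.duration

def overlaps(a: BehaviorSchedule, b: BehaviorSchedule) -> bool:
    """True if two scheduled behaviors are active at the same time."""
    return a.start_time < b.end_time() and b.start_time < a.end_time()
```

A coordinated-output step could use such records to decide which behaviors play together and which must be sequenced.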
According to one embodiment of the present invention, if the current scene is a scene without voice output, the matched head movement data, gaze control data, facial expression data, body movement data, and gesture beat data are output according to the virtual human's current state.
According to another aspect of the present invention, an interaction apparatus based on virtual human behavioral standards is also provided. The apparatus comprises:
an interaction intent acquisition module, configured to obtain multi-modal interaction data and parse it to determine the user's interaction intent;
a virtual human behavior expression data generation module, configured to generate, according to the interaction intent, virtual human language response data and corresponding virtual human behavior expression data, where the behavior expression data include the virtual human's head movement data, gaze control data, facial expression data, body movement data, and gesture beat data;
an output module, configured to output the virtual human language response data in coordination with the virtual human behavior expression data.
According to another aspect of the present invention, a program product is also provided, which runs for the virtual human and carries a series of instructions for executing any of the method steps described above.
According to another aspect of the present invention, an interaction system based on virtual human behavioral standards is also provided. The system comprises:
a smart device on which the virtual human is mounted, configured to obtain multi-modal interaction data and having the ability to output speech, emotion, expression, and movement, the smart device including a holographic device;
a cloud brain, configured to perform semantic understanding, visual recognition, cognitive computation, and emotion computation on the multi-modal interaction data, so as to decide that the virtual human outputs the virtual human behavior expression data and the virtual human language response data.
The interaction method and system based on virtual human behavioral standards provided by the present invention supply a virtual human that has a preset image and preset attributes and can interact with the user multi-modally. Moreover, when outputting multi-modal response data, the method and system can also output virtual human behavior expression data in coordination, expressing the virtual human's emotions, so that communication between the user and the virtual human proceeds smoothly and the user enjoys an anthropomorphic interactive experience.
Further features and advantages of the present invention will be set forth in the following description and will in part become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention can be realized and obtained through the structure particularly pointed out in the description, the claims, and the accompanying drawings.
Description of the drawings
The accompanying drawings provide a further understanding of the present invention and constitute a part of the specification. Together with the embodiments of the invention, they serve to explain the invention and are not to be construed as limiting it. In the drawings:
Fig. 1 shows an interaction schematic of the interaction system based on virtual human behavioral standards according to one embodiment of the invention;
Fig. 2 shows a structural block diagram of the interaction system based on virtual human behavioral standards according to one embodiment of the invention;
Fig. 3 shows a module block diagram of the interaction system based on virtual human behavioral standards according to one embodiment of the invention;
Fig. 4 shows a structural block diagram of the interaction system based on virtual human behavioral standards according to another embodiment of the invention;
Fig. 5 shows a flowchart of the interaction method based on virtual human behavioral standards according to one embodiment of the invention;
Fig. 6 shows a flowchart of generating virtual human behavior expression data in the interaction method based on virtual human behavioral standards according to one embodiment of the invention;
Fig. 7 shows a schematic of emotion parameter classification according to one embodiment of the invention;
Fig. 8 shows another flowchart of the interaction method based on virtual human behavioral standards according to one embodiment of the invention; and
Fig. 9 shows a flowchart of communication among the parties (user, smart device, and cloud brain) according to one embodiment of the invention.
Specific embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
For clarity, the following needs to be stated before the embodiments:
The virtual human mentioned in the present invention is mounted on a smart device supporting input/output modules such as perception and control. Its main user interface is a high-fidelity 3D virtual character image with distinctive character features. It supports multi-modal human-computer interaction and possesses AI capabilities such as natural language understanding, visual perception, touch perception, speech output, and emotional facial expression and movement output. Its social attributes, personality attributes, character skills, and so on are configurable, giving the user a virtual character with an intelligent, personalized, and fluent experience.
The smart device carrying the virtual human is one that has non-touch, non-mouse-and-keyboard screen input (holographic screen, TV screen, multimedia display screen, LED screen, etc.) and carries a camera; it may be a holographic device, a VR device, or a PC. Other smart devices are not excluded, such as handheld tablets, naked-eye 3D devices, and even smartphones.
The virtual human interacts with the user at the system level; an operating system runs on the system hardware, such as the built-in system of a holographic device, or Windows or macOS in the case of a PC.
The virtual human is a system application or an executable file.
The virtual robot obtains the user's multi-modal interaction data via the hardware of the smart device and, supported by the capabilities of the cloud brain, performs semantic understanding, visual recognition, cognitive computation, and emotion computation on the multi-modal interaction data to complete the decision-output process.
The cloud brain mentioned here is a terminal that provides the processing capability for the virtual human to perform semantic understanding (language semantic understanding, action semantic understanding, visual recognition, emotion computation, cognitive computation) of the user's interaction demands, realizing interaction with the user so as to decide the multi-modal response data the virtual human outputs.
Each embodiment of the present invention is described in detail below with reference to the accompanying drawings.
Fig. 1 shows an interaction schematic of the interaction system based on virtual human behavioral standards according to one embodiment of the invention. As shown in Fig. 1, multi-modal interaction involves a user 101, a smart device 102, a virtual human 103, and a cloud brain 104. The user 101 interacting with the virtual human may be a real person, another virtual human, or an embodied virtual human; the interaction of another virtual human or an embodied virtual human with the virtual human is similar to that of a single person with the virtual human. Therefore, Fig. 1 shows only the multi-modal interaction between a user (a person) and the virtual human.
In addition, the smart device 102 includes a display area 1021 and hardware support equipment 1022 (essentially a core processor). The display area 1021 is used to display the image of the virtual human 103; the hardware support equipment 1022 cooperates with the cloud brain 104 for data processing during interaction. The virtual human 103 requires a screen carrier for its presentation. The display area 1021 therefore includes a holographic screen, TV screen, multimedia display screen, LED screen, and the like.
The process of interaction between the virtual human and the user 101 in Fig. 1 is as follows:
The prerequisites or conditions for interaction are that the virtual human is carried and runs on the smart device 102 and that the virtual human has specific image characteristics. The virtual human possesses AI capabilities such as natural language understanding, visual perception, touch perception, speech output, and emotional facial expression and movement output. To support the virtual human's touch perception function, the smart device also needs to be equipped with a component having a touch perception function. According to one embodiment of the present invention, to improve the interactive experience, the virtual human is displayed in a preset area as soon as it is activated.
It should be noted that the image and outfit of the virtual human 103 are not limited to one mode. The virtual human 103 can have different images and outfits; its image is generally a high-polygon 3D animated figure. Each image of the virtual human 103 can also correspond to a variety of outfits, which may be classified by season or by occasion. These images and outfits may reside in the cloud brain 104 or in the smart device 102 and can be called up whenever they are needed.
The social attributes, personality attributes, and character skills of the virtual human 103 are likewise not limited to one kind. The virtual human 103 can have multiple social attributes, multiple personality attributes, and multiple character skills. These social attributes, personality attributes, and character skills can be combined freely rather than being fixed to one pairing, and the user can select and combine them as needed.
Specifically, the social attributes may include attributes such as appearance, name, dress, decoration, gender, birthplace, age, family relations, occupation, position, religious belief, relationship status, and educational background; the personality attributes may include attributes such as character and temperament; the character skills may include professional skills such as singing, dancing, storytelling, and training, and the display of character skills is not limited to skill displays of the limbs, expression, head, and/or mouth.
In this application, the virtual human's social attributes, personality attributes, character skills, and so on can make the parsing and decision results of the multi-modal interaction more inclined toward, or better suited to, that virtual human.
The multi-modal interaction process is as follows. First, multi-modal interaction data is obtained and parsed to determine the user's interaction intent. The receiving devices for obtaining multi-modal interaction data are mounted or configured on the smart device 102; they include a text receiver for receiving text, a speech receiver for receiving voice, a camera for receiving vision, an infrared device for receiving perception information, and so on.
Then, according to the interaction intent, virtual human language response data and corresponding virtual human behavior expression data are generated, where the behavior expression data include the virtual human's head movement data, gaze control data, facial expression data, body movement data, and gesture beat data.
Finally, the virtual human language response data is output in coordination with the virtual human behavior expression data.
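The three-step process just described (parse intent, generate the two response channels, output them together) can be sketched end to end as follows; the function names and the stub parsing and generation rules are illustrative assumptions, not the patent's implementation:

```python
def parse_intent(interaction_data: dict) -> str:
    """Step 1: parse multi-modal interaction data into a user intent.
    A real system would invoke semantic understanding in the cloud brain;
    a trivial keyword rule stands in for it here."""
    text = interaction_data.get("text", "")
    return "greet" if "hello" in text.lower() else "chat"

def generate_response(intent: str) -> tuple:
    """Step 2: generate language response data and matching behavior
    expression data (the five behavior channels) from the intent."""
    language = {"greet": "Hello! Nice to meet you.", "chat": "Tell me more."}[intent]
    behavior = {"head": "nod", "gaze": "to_user", "face": "smile",
                "body": "lean_forward",
                "gesture": "wave" if intent == "greet" else "idle"}
    return language, behavior

def interact(interaction_data: dict) -> dict:
    """Step 3: output language and behavior together as one response."""
    language, behavior = generate_response(parse_intent(interaction_data))
    return {"speech": language, "behavior": behavior}
```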
Fig. 2 shows a structural block diagram of the interaction system based on virtual human behavioral standards according to one embodiment of the invention. As shown in Fig. 2, completing multi-modal interaction requires a user 101, a smart device 102, and a cloud brain 104. The smart device 102 includes a receiving device 102A, a processing device 102B, an output device 102C, and a connecting device 102D. The cloud brain 104 includes a communication device 104A.
The interaction system based on virtual human behavioral standards provided by the present invention requires an unobstructed communication channel to be established among the user 101, the smart device 102, and the cloud brain 104 so that the interaction between the user 101 and the virtual human can be completed. To accomplish the interactive task, the smart device 102 and the cloud brain 104 are provided with devices and components that support completing the interaction. The party interacting with the virtual human may be one party or multiple parties.
The receiving device 102A is used to receive multi-modal interaction data. Examples of the receiving device 102A include a microphone for voice operation, a scanner, and a camera (detecting movements not involving touch, using visible or invisible wavelengths). The smart device 102 can obtain multi-modal interaction data through these input devices. The output device 102C is used to output the multi-modal reply data of the virtual human's interaction with the user 101; its configuration corresponds broadly to that of the receiving device 102A and is not described again here.
The processing device 102B is used to process the interaction data transmitted by the cloud brain 104 during interaction. The connecting device 102D is used for contact with the cloud brain 104; the processing device 102B handles the multi-modal interaction data pre-processed by the receiving device 102A or the data transmitted by the cloud brain 104. The connecting device 102D sends call instructions to invoke the robot capabilities on the cloud brain 104.
The communication device 104A included in the cloud brain 104 completes the correspondence with the smart device 102. It keeps in communication with the connecting device 102D on the smart device 102, receives the requests sent by the smart device 102, and sends the processing results issued by the cloud brain 104; it is the medium of communication between the smart device 102 and the cloud brain 104.
Fig. 3 shows a module block diagram of the interaction system based on virtual human behavioral standards according to another embodiment of the invention. As shown in Fig. 3, the system includes an interaction intent acquisition module 301, a generation module 302, and an output module 303. The interaction intent acquisition module 301 includes a text collection unit 3011, an audio collection unit 3012, a vision collection unit 3013, a perception collection unit 3014, and a parsing unit 3015. The generation module 302 includes a language response data generation unit 3021 and a behavior expression data generation unit 3022. The output module 303 includes a coordinated output unit 3031.
The interaction intent acquisition module 301 is used to obtain multi-modal interaction data and parse it to determine the user's interaction intent. The virtual human 103 is displayed by the smart device 102 and activates its speech, emotion, vision, and sensing capabilities when in an interactive state. The text collection unit 3011 collects text information; the audio collection unit 3012 collects audio information; the vision collection unit 3013 collects visual information; and the perception collection unit 3014 collects perception information. Examples of these collection units include a microphone for voice operation, a scanner, a camera, and sensing control equipment, for instance using rays of visible or invisible wavelengths, signals, environmental data, and so on. Multi-modal interaction data can be obtained through these input devices. The multi-modal interaction may include one or more of text, audio, vision, and perception data; the present invention is not restricted in this respect.
The generation module 302 is used to generate, according to the interaction intent, virtual human language response data and corresponding virtual human behavior expression data, where the behavior expression data include the virtual human's head movement data, gaze control data, facial expression data, body movement data, and gesture beat data.
The language response data generation unit 3021 generates the virtual human language response data according to the interaction intent. The behavior expression data generation unit 3022 generates the virtual human behavior expression data corresponding to the language response data.
The output module 303 is used to output the virtual human language response data in coordination with the behavior expression data. The coordinated output unit 3031 outputs the behavior expression data at suitable moments and positions while the multi-modal language response data is being output.
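How the four collection units of module 301 might merge their channels into one multi-modal record before the parsing unit runs can be sketched as below; the unit interface is an illustrative assumption, not the patent's design:

```python
class CollectionUnit:
    """Stand-in for one of the collection units 3011-3014: each unit
    owns a modality name and returns whatever it has captured (or None)."""
    def __init__(self, modality, captured=None):
        self.modality = modality
        self.captured = captured

    def collect(self):
        return self.captured

def gather_interaction_data(units):
    """Merge the non-empty outputs of all units into one multi-modal
    record keyed by modality, ready for the parsing unit 3015."""
    return {u.modality: data for u in units if (data := u.collect()) is not None}
```

This reflects the point above that an interaction may carry any subset of the text, audio, vision, and perception modalities.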
Fig. 4 shows a structural block diagram of the interaction system based on virtual human behavioral standards according to another embodiment of the invention. As shown in Fig. 4, completing the interaction requires a user 101, a smart device 102, and a cloud brain 104. The smart device 102 includes a human-machine interface 401, a data processing unit 402, an input/output device 403, and an interface unit 404. The cloud brain 104 includes a semantic understanding interface 1041, a visual recognition interface 1042, a cognitive computation interface 1043, and an emotion computation interface 1044.
The interaction system based on virtual human behavioral standards provided by the present invention includes the smart device 102 and the cloud brain 104. The virtual human 103 runs on the smart device 102; it has a preset image and preset attributes and can activate its speech, emotion, vision, and sensing capabilities when in an interactive state.
In one embodiment, the smart device 102 may include the human-machine interface 401, the data processing unit 402, the input/output device 403, and the interface unit 404. The human-machine interface 401 displays the running virtual human 103 in a preset area of the smart device 102.
The data processing unit 402 is used to process the data generated during the multi-modal interaction between the user 101 and the virtual human 103. The processor used may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor; the processor is the control center of the terminal, connecting the various parts of the entire terminal through various interfaces.
The smart device 102 includes a memory, which mainly comprises a program storage area and a data storage area. The program storage area can store the operating system and the applications required by at least one function (such as a sound playing function or an image playing function); the data storage area can store data created according to the use of the smart device 102 (such as audio data and browsing records). In addition, the memory may include high-speed random access memory and may also include non-volatile memory, such as a hard disk, memory, plug-in hard disk, smart media card (SMC), secure digital (SD) card, flash card, at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device.
The input/output device 403 is used to obtain multi-modal interaction data and to output the output data of the interaction. The interface unit 404 is used to open communication with the cloud brain 104 and, through its connection with the interfaces in the cloud brain 104, to invoke the virtual human capabilities in the cloud brain 104.
The cloud brain 104 includes the semantic understanding interface 1041, the visual recognition interface 1042, the cognitive computation interface 1043, and the emotion computation interface 1044. These interfaces open communication with the interface unit 404 in the smart device 102. Moreover, the cloud brain 104 also contains semantic understanding logic corresponding to the semantic understanding interface 1041, visual recognition logic corresponding to the visual recognition interface 1042, cognitive computation logic corresponding to the cognitive computation interface 1043, and emotion computation logic corresponding to the emotion computation interface 1044.
As shown in Fig. 4, each capability interface calls its corresponding logical processing in the multi-modal data parsing process. The interfaces are explained below.
The semantic understanding interface 1041 receives the specific voice instruction forwarded from the interface unit 404 and performs voice recognition on it, together with natural language processing based on a large corpus.
The visual recognition interface 1042 can, for human bodies, faces, and scenes, perform video content detection, recognition, tracking, and the like according to computer vision algorithms, deep learning algorithms, and so on. It recognizes images according to predetermined algorithms and gives quantitative detection results. It has an image pre-processing function, a feature extraction function, a decision function, and concrete application functions:
the image pre-processing function can perform basic processing on the collected visual data, including color space conversion, edge extraction, image transformation, and image thresholding;
the feature extraction function can extract feature information such as the skin color, color, texture, movement, and coordinates of the target in the image;
the decision function can distribute the feature information, according to a certain decision strategy, to the specific multi-modal output devices or multi-modal output applications that need it, for example realizing face detection, human limb recognition, and motion detection.
The cognitive computation interface 1043 receives the multi-modal data forwarded from the interface unit 404 and performs data acquisition, recognition, and learning to process the multi-modal data, so as to obtain the user portrait, knowledge graph, and so on, and thereby make rational decisions about the multi-modal output data.
The emotion computation interface 1044 receives the multi-modal data forwarded from the interface unit 404 and uses emotion computation logic (which may be emotion recognition technology) to calculate the user's current emotional state. Emotion recognition technology is an important component of emotion computation; emotion recognition research covers facial expression, voice, behavior, text, and physiological signal recognition, through which the user's emotional state can be judged. Emotion recognition technology may monitor the user's emotional state through visual emotion recognition alone, or through visual emotion recognition combined with audio emotion recognition, and is not limited to these. In the present embodiment, the combination of the two is preferably used to monitor mood.
When performing visual emotion recognition, the emotion computation interface 1044 collects images of human facial expressions with an image capture device, converts them into analyzable data, and then applies technologies such as image processing for expression and mood analysis. Understanding facial expressions usually requires detecting subtle changes in expression, such as changes in the cheek muscles and mouth, or raising of the eyebrows.
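The preferred combination of visual and audio emotion recognition could be realized as a simple late fusion of two per-modality score sets; the weighting scheme below is an illustrative assumption, not the patent's algorithm:

```python
def fuse_emotions(visual_scores: dict, audio_scores: dict,
                  visual_weight: float = 0.6) -> str:
    """Late fusion of visual and audio emotion recognition: combine
    per-emotion confidence scores with a fixed weight and return the
    emotion with the highest fused score."""
    audio_weight = 1.0 - visual_weight
    emotions = set(visual_scores) | set(audio_scores)
    fused = {e: visual_weight * visual_scores.get(e, 0.0)
                + audio_weight * audio_scores.get(e, 0.0)
             for e in emotions}
    return max(fused, key=fused.get)
```

A fixed weight is the simplest choice; a real system might instead weight each modality by its own confidence per frame.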
Fig. 5 shows the exchange method flow chart according to an embodiment of the invention based on visual human's behavioral standard.
As shown in Fig. 5, first, in step S501, multi-modal interaction data is obtained and parsed to obtain the user's interaction intent. During multi-modal interaction, the virtual robot obtains the multi-modal interaction data through the receiving devices on the smart device. The multi-modal interaction data may include text data, voice data, perception data, action data and the like.
Then, in step S502, virtual-human language response data and corresponding virtual-human behavior expression data are generated according to the interaction intent, where the behavior expression data include the virtual human's head movement data, gaze control data and facial expression data, body movement data and limb gesture data.
According to one embodiment of the invention, classification is carried out according to the virtual-human emotion information, and the virtual human's head movement data, gaze control data and facial expression data, body movement data and limb gesture data are checked for conflicts among the emotions they respectively represent; conflicting data are replaced.
According to one embodiment of the invention, the virtual human has a specific avatar and preset attributes. When the behavior expression data are generated, any head movement data, gaze control data and facial expression data, body movement data or limb gesture data of the virtual human that are inconsistent with the preset attributes take no part in decision-making or output.
Finally, in step S503, the virtual-human language response data is output in coordination with the behavior expression data. To make the virtual human more lifelike, behavior expression data needs to be output while interacting with the user; output together with the language response data, it gives the user a more realistic, human-like interactive experience.
According to one embodiment of the invention, in step S503 the output moment, degree of performance and duration of the virtual human's head movement data, gaze control data and facial expression data, body movement data and limb gesture data are determined.
According to one embodiment of the invention, if the current scene is one without voice output, the matched head movement data, gaze control data and facial expression data, body movement data and limb gesture data are output according to the virtual human's current state. For example, while listening to the user speak, the virtual human outputs no language data; its behavior expression is to lean forward, fix both eyes on the user, follow the user with its gaze, and incline an ear toward the user in attentive listening.
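A minimal sketch of how such state-matched behavior might be selected in a no-voice-output scene. The state and channel names are assumptions chosen for illustration, not taken from the patent.

```python
# Hypothetical behavior selection for a scene with no voice output:
# the virtual human's current state picks matched data for each of the
# five behavior channels (head, gaze, face, body, limb gestures).

def silent_scene_behavior(current_state):
    if current_state == "listening":
        return {
            "head": "lean_forward",
            "gaze": "follow_user",           # eyes fixed on, and tracking, the user
            "face": "attentive",
            "body": "tilt_ear_toward_user",  # conveys focused listening
            "limb": "still",
        }
    # Default idle behavior for any other silent state.
    return {"head": "idle", "gaze": "wander", "face": "neutral",
            "body": "relaxed", "limb": "still"}
```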
In addition, the visual interaction system based on a virtual human provided by the invention can also cooperate with a program product containing a series of instructions for executing the steps of the interaction method based on virtual-human behavioral standards. The program product runs computer instructions; these include computer program code, which may be in source-code form, object-code form, an executable file, certain intermediate forms, and so on.
The program product may include any entity or device capable of carrying the computer program code: recording media, USB flash drives, removable hard disks, magnetic disks, optical discs, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunication signals, software distribution media, and so on.
It should be noted that the content included in the program product may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions the program product does not, by legislation and patent practice, include electrical carrier signals or telecommunication signals.
Fig. 6 shows a flow chart of generating the virtual-human behavior expression data in the interaction method based on virtual-human behavioral standards according to an embodiment of the invention.
As shown in Fig. 6, in step S601 the virtual-human language response data is parsed, and the virtual-human emotion information it contains is extracted.
According to one embodiment of the invention, the virtual-human emotion information includes positive emotions, negative emotions, attitude emotions and communicative emotions, where positive emotions include happiness, confidence, expectation and surprise; negative emotions include anger, fear, sadness and disgust; attitude emotions include approval and disapproval; communicative emotions include greeting and farewell.
In step S602, virtual-human head movement data, gaze control data and facial expression data, body movement data and limb gesture data matching the emotion information are obtained.
According to one embodiment of the invention, classification is carried out according to the virtual-human emotion information, and the virtual human's head movement data, gaze control data and facial expression data, body movement data and limb gesture data are checked for conflicts among the emotions they respectively represent; conflicting data are replaced.
In step S603, the virtual-human behavior expression data is generated from the matching result.
According to one embodiment of the invention, the virtual human has a specific avatar and preset attributes. When the behavior expression data is generated, any head movement data, gaze control data and facial expression data, body movement data or limb gesture data of the virtual human that are inconsistent with the preset attributes take no part in decision-making or output.
Fig. 7 shows a schematic diagram of emotion parameter classification according to an embodiment of the invention. As shown in Fig. 7, the virtual-human emotion information includes positive emotions, negative emotions, attitude emotions and communicative emotions: positive emotions include happiness, confidence, expectation and surprise; negative emotions include anger, fear, sadness and disgust; attitude emotions include approval and disapproval; communicative emotions include greeting and farewell.
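The classification of Fig. 7 can be encoded as a simple lookup table. This is an illustrative data structure with English label names chosen here; the patent does not define any particular encoding.

```python
# Emotion parameter classification from Fig. 7: four top-level
# categories, each with its sub-emotions.

EMOTION_TAXONOMY = {
    "positive":      ["happy", "confident", "expectant", "surprised"],
    "negative":      ["angry", "fearful", "sad", "disgusted"],
    "attitude":      ["approving", "disapproving"],
    "communicative": ["greeting", "farewell"],
}

def classify_emotion(emotion):
    # Map a sub-emotion extracted from the language response data to
    # its top-level category.
    for category, members in EMOTION_TAXONOMY.items():
        if emotion in members:
            return category
    raise ValueError("unknown emotion: " + emotion)
```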
In one example, the virtual human is configured as a Han Chinese from Beijing, China, named "Clever Youngster". When the virtual human needs to express anger, its facial expression may be a glare, accompanied by the body drawing inward and small limb movements. When the virtual human needs to express emotions such as fear, sadness or disgust, its mood is negative, and its behavior expression is similar to that of anger.
When the virtual human needs to express happiness, its facial expression may open up into a smile or laughter, and its body language may be outward-oriented movement, for example opening the arms into an embracing gesture.
When the virtual human needs to express emotions such as confidence, expectation or surprise, its mood is positive, and its behavior expression is similar to that of happiness.
When the virtual human needs to express approval, its head movement may be a nod and its facial expression a smile. When it needs to express disapproval, its head movement may be a shake of the head and its facial expression serious.
When the virtual human needs to express a greeting, its behavior may pair the body movements of happiness with a waving arm. When it needs to express a farewell, its hand movement may be a goodbye gesture.
According to one embodiment of the invention, the virtual human's emotions must be rational. Classification is carried out according to the virtual-human emotion information, and the virtual human's head movement data, gaze control data and facial expression data, body movement data and limb gesture data are checked for conflicts among the emotions they respectively represent; conflicting data are replaced. For example, the virtual human will never simultaneously display a wailing and a laughing mood.
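One way this rationality check might look in code, sketched with assumed channel names and a deliberately minimal conflict table (only the positive and negative categories conflict in this toy version):

```python
# Hypothetical conflict check over the behavior channels: if the
# emotion category implied by a channel's data conflicts with the
# target category (e.g. a laughing face paired with a wailing body),
# that channel's data is replaced to match the target.

CONFLICTING = {("positive", "negative"), ("negative", "positive")}

def repair_channels(implied, target_category):
    # implied: dict mapping channel name -> implied emotion category
    repaired = {}
    for channel, category in implied.items():
        if (category, target_category) in CONFLICTING:
            repaired[channel] = target_category  # replace the conflicting data
        else:
            repaired[channel] = category
    return repaired
```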
In addition, the virtual human has a specific avatar and preset attributes. When the behavior expression data is generated, any head movement data, gaze control data and facial expression data, body movement data or limb gesture data inconsistent with the preset attributes take no part in decision-making or output. For example, if the virtual human's personality attribute is introverted, then as a rule it will never show the behavior of rocking with laughter.
In addition, the virtual human's attribute characteristics can be grouped and inherited; the behavior expression data of virtual humans within the same group or line of inheritance are similar.
When the behavior expression data is output, the output moment, degree of performance and duration of the virtual human's head movement data, gaze control data and facial expression data, body movement data and limb gesture data are determined.
In general, the behavior expression data needs to be output in coordination with the language response data, so the output moment and degree of the behavior expression data must be determined at output time. For example, an expression is usually a long-duration action, so it often persists through the utterance of a whole sentence, whereas a body movement is highly correlated with the corresponding language data and often fades in and out near the corresponding position in the speech.
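A toy scheduler illustrating this coordination, with all timing constants invented for the example: the facial expression spans the whole utterance, while a body movement fades in shortly before the word it accompanies.

```python
# Hypothetical timing of behavior channels against one utterance.
# All times are in seconds; the fade-in lead and gesture length are
# made-up defaults, not values from the patent.

def schedule_channels(utterance_duration, gesture_word_time,
                      fade_in=0.3, gesture_length=1.0):
    # Expression: a long-duration action held for the whole sentence.
    expression = (0.0, utterance_duration)
    # Body movement: tightly coupled to its word, fading in just
    # before it and out shortly after.
    start = max(0.0, gesture_word_time - fade_in)
    gesture = (start, gesture_length)
    return {"face": expression, "body": gesture}
```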
In fact, the virtual human's emotion parameters are not limited to the four enumerated above; richer emotion parameters may be included, and the emotional behaviors corresponding to an emotion parameter are not limited to those shown in Fig. 7. Under a given emotion parameter, the virtual human may have many finer-grained head movement data, gaze control data and facial expression data, body movement data and limb gesture data. Any form of expression that can display the virtual human's emotion can be applied in the embodiments of the invention; the invention places no limitation on this.
Fig. 8 shows another flow chart of the interaction method based on virtual-human behavioral standards according to an embodiment of the invention.
As shown in Fig. 8, in step S801 the smart device 102 issues a request to the cloud brain 104. Afterwards, in step S802, the smart device 102 remains in a state of waiting for the cloud brain 104 to reply, and while waiting it times how long the returned data takes.
In step S803, if no reply data is returned for too long, for example beyond a predetermined time span of 5 s, the smart device 102 can choose to reply locally and generate local common reply data. Then, in step S804, the animation matching the local common reply is output, and the voice playback equipment is called to play the speech.
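A sketch of this timeout-and-fallback logic. The 5-second threshold comes from the embodiment; the cloud-call function, the reply text and the executor-based timing are assumptions made for illustration.

```python
# Hypothetical fallback: time the cloud-brain request and, past the
# threshold, reply locally with a common reply instead.

import concurrent.futures

LOCAL_COMMON_REPLY = "Sorry, let me think about that for a moment."

def ask_cloud_brain(request):
    # Stand-in for the network call; here it simulates an unreachable
    # cloud brain by raising immediately.
    raise TimeoutError("cloud brain unreachable")

def reply_with_fallback(request, timeout_s=5.0):
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(ask_cloud_brain, request)
        try:
            return future.result(timeout=timeout_s)
        except (concurrent.futures.TimeoutError, TimeoutError):
            # Steps S803/S804: local common reply, to be played with
            # its matching animation and voice output.
            return LOCAL_COMMON_REPLY
```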
Fig. 9 shows a flow chart of the communication among the three parties, namely the user, the smart device and the cloud brain, according to an embodiment of the invention.
To realize multi-modal interaction between the smart device 102 and the user 101, communication connections need to be established among the user 101, the smart device 102 and the cloud brain 104. These connections should be real-time and unobstructed, to guarantee that the interaction is unaffected.
Certain conditions or premises are needed before the interaction can take place: a virtual human is loaded and running on the smart device 102, and the smart device 102 has hardware facilities with perception and control functions. The virtual human starts its voice, emotion, vision and perception capabilities when it is in the interactive state.
After these preparations are complete, the smart device 102 begins to interact with the user 101. First, the smart device 102 obtains multi-modal interaction data, which may include data in many forms, for example text data, voice data, perception data and action data. The smart device 102 is configured with devices for receiving the multi-modal interaction data sent by the user 101. At this point, the two parties of the data transfer are the user 101 and the smart device 102, and the data flows from the user 101 to the smart device 102.
Then the smart device 102 sends a request to the cloud brain 104, asking it to perform semantic understanding, visual recognition, cognitive computing and affective computing on the multi-modal interaction data so as to help with decision-making. At this point, the multi-modal interaction data is parsed to obtain the user's interaction intent, and virtual-human language response data and corresponding behavior expression data are generated according to that intent, where the behavior expression data include the virtual human's head movement data, gaze control data and facial expression data, body movement data and limb gesture data. The cloud brain 104 then transmits the reply data to the smart device 102. At this point, the two communicating parties are the smart device 102 and the cloud brain 104.
Finally, after the smart device 102 receives the data transmitted by the cloud brain 104, it can output the virtual-human language response data in coordination with the behavior expression data. At this point, the two communicating parties are the smart device 102 and the user 101.
The interaction method and system based on virtual-human behavioral standards provided by the invention supply a virtual human that has a default avatar and preset attributes and can carry out multi-modal interaction with the user. Moreover, when outputting multi-modal response data, the method and system can also output the virtual-human behavior expression data in coordination, expressing the virtual human's emotions, so that exchanges between the user and the virtual human proceed smoothly and the user enjoys a human-like interactive experience.
It should be understood that the disclosed embodiments of the invention are not limited to the specific structures, processing steps or materials disclosed herein, but extend to their equivalents as understood by those of ordinary skill in the relevant art. It should also be understood that the terminology used herein serves only to describe specific embodiments and is not intended to be limiting.
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. The appearances of the phrase "one embodiment" or "an embodiment" in various places throughout the specification do not necessarily all refer to the same embodiment.
Although the embodiments are disclosed as above, the content described is adopted only to facilitate understanding of the invention and is not intended to limit it. Any person skilled in the art to which the invention pertains may make modifications and changes in form and detail without departing from the spirit and scope disclosed by the invention; however, the scope of patent protection of the invention shall still be subject to the scope defined by the appended claims.
Claims (10)
1. An interaction method based on virtual-human behavioral standards, characterized in that the virtual human is displayed on a smart device and starts its voice, emotion, vision and perception capabilities when in an interactive state, the method comprising the following steps:
obtaining multi-modal interaction data and parsing the multi-modal interaction data to obtain the user's interaction intent;
generating virtual-human language response data and corresponding virtual-human behavior expression data according to the interaction intent, wherein the virtual-human behavior expression data comprise the virtual human's head movement data, gaze control data and facial expression data, body movement data and limb gesture data;
outputting the virtual-human language response data in coordination with the virtual-human behavior expression data.
2. The method of claim 1, characterized in that the step of generating virtual-human language response data and corresponding virtual-human behavior expression data according to the interaction intent further comprises the following steps:
parsing the virtual-human language response data and extracting the virtual-human emotion information it contains;
obtaining virtual-human head movement data, gaze control data and facial expression data, body movement data and limb gesture data matching the virtual-human emotion information;
generating the virtual-human behavior expression data from the matching result.
3. The method of claim 2, characterized in that the virtual-human emotion information comprises positive emotions, negative emotions, attitude emotions and communicative emotions, wherein positive emotions include happiness, confidence, expectation and surprise; negative emotions include anger, fear, sadness and disgust; attitude emotions include approval and disapproval; communicative emotions include greeting and farewell.
4. The method of claim 3, characterized in that the step of generating the virtual-human behavior expression data from the matching result further comprises the following step:
classifying according to the virtual-human emotion information, checking the virtual human's head movement data, gaze control data and facial expression data, body movement data and limb gesture data for conflicts among the emotions they respectively represent, and replacing conflicting data.
5. The method of claim 1, characterized in that the virtual human has a specific avatar and preset attributes, and when the virtual-human behavior expression data is generated, head movement data, gaze control data and facial expression data, body movement data and limb gesture data of the virtual human that are inconsistent with the preset attributes take no part in decision-making or output.
6. The method of claim 1, characterized in that the step of outputting the virtual-human language response data in coordination with the virtual-human behavior expression data comprises the following step:
determining the output moment, degree of performance and duration of the virtual human's head movement data, gaze control data and facial expression data, body movement data and limb gesture data.
7. The method of claim 1, characterized in that, if the current scene is one without voice output, the matched head movement data, gaze control data and facial expression data, body movement data and limb gesture data of the virtual human are output according to the virtual human's current state.
8. An interaction apparatus based on virtual-human behavioral standards, characterized in that the apparatus comprises:
an interaction-intent obtaining module for obtaining multi-modal interaction data and parsing the multi-modal interaction data to obtain the user's interaction intent;
a virtual-human behavior expression data generation module for generating virtual-human language response data and corresponding virtual-human behavior expression data according to the interaction intent, wherein the virtual-human behavior expression data comprise the virtual human's head movement data, gaze control data and facial expression data, body movement data and limb gesture data;
an output module for outputting the virtual-human language response data in coordination with the virtual-human behavior expression data.
9. A program product for running a virtual-human program, containing a series of instructions for executing the method steps of any one of claims 1-7.
10. An interaction system based on virtual-human behavioral standards, characterized in that the system comprises:
a smart device on which a virtual human is loaded, for obtaining multi-modal interaction data and having the capability to output voice, emotion, expression and action, the smart device comprising holographic equipment;
a cloud brain for performing semantic understanding, visual recognition, cognitive computing and affective computing on the multi-modal interaction data, so as to decide and output the virtual human's behavior expression data and language response data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810953818.3A CN109324688A (en) | 2018-08-21 | 2018-08-21 | Exchange method and system based on visual human's behavioral standard |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109324688A true CN109324688A (en) | 2019-02-12 |
Family
ID=65264278
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810953818.3A Pending CN109324688A (en) | 2018-08-21 | 2018-08-21 | Exchange method and system based on visual human's behavioral standard |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109324688A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105427856A (en) * | 2016-01-12 | 2016-03-23 | 北京光年无限科技有限公司 | Invitation data processing method and system for intelligent robot |
CN105843381A (en) * | 2016-03-18 | 2016-08-10 | 北京光年无限科技有限公司 | Data processing method for realizing multi-modal interaction and multi-modal interaction system |
CN107009362A (en) * | 2017-05-26 | 2017-08-04 | 深圳市阿西莫夫科技有限公司 | Robot control method and device |
CN107301168A (en) * | 2017-06-01 | 2017-10-27 | 深圳市朗空亿科科技有限公司 | Intelligent robot and its mood exchange method, system |
CN107765852A (en) * | 2017-10-11 | 2018-03-06 | 北京光年无限科技有限公司 | Multi-modal interaction processing method and system based on visual human |
CN107894831A (en) * | 2017-10-17 | 2018-04-10 | 北京光年无限科技有限公司 | A kind of interaction output intent and system for intelligent robot |
CN108416420A (en) * | 2018-02-11 | 2018-08-17 | 北京光年无限科技有限公司 | Limbs exchange method based on visual human and system |
- 2018-08-21 CN CN201810953818.3A patent/CN109324688A/en active Pending
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110070944A (en) * | 2019-05-17 | 2019-07-30 | 段新 | Training system is assessed based on virtual environment and the social function of virtual role |
CN110070944B (en) * | 2019-05-17 | 2023-12-08 | 段新 | Social function assessment training system based on virtual environment and virtual roles |
CN112667068A (en) * | 2019-09-30 | 2021-04-16 | 北京百度网讯科技有限公司 | Virtual character driving method, device, equipment and storage medium |
CN110956142A (en) * | 2019-12-03 | 2020-04-03 | 中国太平洋保险(集团)股份有限公司 | Intelligent interactive training system |
CN111515970A (en) * | 2020-04-27 | 2020-08-11 | 腾讯科技(深圳)有限公司 | Interaction method, mimicry robot and related device |
WO2023216765A1 (en) * | 2022-05-09 | 2023-11-16 | 阿里巴巴(中国)有限公司 | Multi-modal interaction method and apparatus |
CN115016648A (en) * | 2022-07-15 | 2022-09-06 | 大爱全息(北京)科技有限公司 | Holographic interaction device and processing method thereof |
CN115016648B (en) * | 2022-07-15 | 2022-12-20 | 大爱全息(北京)科技有限公司 | Holographic interaction device and processing method thereof |
CN115129163A (en) * | 2022-08-30 | 2022-09-30 | 环球数科集团有限公司 | Virtual human behavior interaction system |
CN115390678A (en) * | 2022-10-27 | 2022-11-25 | 科大讯飞股份有限公司 | Virtual human interaction method and device, electronic equipment and storage medium |
CN115390678B (en) * | 2022-10-27 | 2023-03-31 | 科大讯飞股份有限公司 | Virtual human interaction method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109324688A (en) | Exchange method and system based on visual human's behavioral standard | |
CN109271018A (en) | Exchange method and system based on visual human's behavioral standard | |
WO2021043053A1 (en) | Animation image driving method based on artificial intelligence, and related device | |
CN110286756A (en) | Method for processing video frequency, device, system, terminal device and storage medium | |
CN111833418B (en) | Animation interaction method, device, equipment and storage medium | |
CN107340859A (en) | The multi-modal exchange method and system of multi-modal virtual robot | |
CN110400251A (en) | Method for processing video frequency, device, terminal device and storage medium | |
CN112162628A (en) | Multi-mode interaction method, device and system based on virtual role, storage medium and terminal | |
CN107294837A (en) | Engaged in the dialogue interactive method and system using virtual robot | |
CN108942919B (en) | Interaction method and system based on virtual human | |
CN110413841A (en) | Polymorphic exchange method, device, system, electronic equipment and storage medium | |
CN107340865A (en) | Multi-modal virtual robot exchange method and system | |
CN109343695A (en) | Exchange method and system based on visual human's behavioral standard | |
CN106710590A (en) | Voice interaction system with emotional function based on virtual reality environment and method | |
CN108416420A (en) | Limbs exchange method based on visual human and system | |
CN107632706A (en) | The application data processing method and system of multi-modal visual human | |
CN109871450A (en) | Based on the multi-modal exchange method and system for drawing this reading | |
CN107679519A (en) | A kind of multi-modal interaction processing method and system based on visual human | |
CN109086860A (en) | A kind of exchange method and system based on visual human | |
CN107784355A (en) | The multi-modal interaction data processing method of visual human and system | |
CN109032328A (en) | A kind of exchange method and system based on visual human | |
CN108595012A (en) | Visual interactive method and system based on visual human | |
WO2022252866A1 (en) | Interaction processing method and apparatus, terminal and medium | |
CN108052250A (en) | Virtual idol deductive data processing method and system based on multi-modal interaction | |
CN107808191A (en) | The output intent and system of the multi-modal interaction of visual human |
Legal Events

Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190212 |