CN105843118A - Robot interacting method and robot system - Google Patents
- Publication number
- CN105843118A CN105843118A CN201610179223.8A CN201610179223A CN105843118A CN 105843118 A CN105843118 A CN 105843118A CN 201610179223 A CN201610179223 A CN 201610179223A CN 105843118 A CN105843118 A CN 105843118A
- Authority
- CN
- China
- Prior art keywords
- information
- interactive object
- image
- robot
- interactive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B19/00—Programme-control systems
- G05B19/02—Programme-control systems electric
- G05B19/04—Programme control other than numerical control, i.e. in sequence controllers or logic controllers
- G05B19/042—Programme control other than numerical control, i.e. in sequence controllers or logic controllers using digital processors
- G05B19/0423—Input/output
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05B—CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
- G05B2219/00—Program-control systems
- G05B2219/20—Pc systems
- G05B2219/23—Pc programming
- G05B2219/23067—Control, human or man machine interface, interactive, HMI, MMI
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Automation & Control Theory (AREA)
- Manipulator (AREA)
Abstract
The invention discloses a robot interaction method and a robot system. The robot interaction method includes the steps of: acquiring multi-modal external input information, including text information, image information, sound information, robot self-inspection information, and sensing information; analyzing the external input information to obtain interaction input information, interaction object characteristic information, and interaction environment characteristic information; analyzing the interaction object characteristic information and the interaction environment characteristic information to obtain matching interaction scene constraints; performing semantic parsing on the interaction input information to obtain the interaction intention of the interaction object; and outputting multi-modal interaction behavior according to the interaction intention under the interaction scene constraints. Compared with the prior art, the method and system better simulate how humans analyze and generate interaction behavior during human-to-human interaction, so that more natural and vivid interaction output is obtained and the robot user experience is substantially improved.
Description
Technical field
The present invention relates to the field of robots, and in particular to a robot interaction method and a robot system.
Background art
With the development of computer technology and the continuous progress of artificial intelligence technology, the application of small intelligent robots in domestic environments has become more widespread, and small household intelligent robots are developing rapidly.
Most existing small household robots offer only a single interaction mode of voice or text; some robots also interact with the user through limb movements. Although this enriches the forms of interaction to a certain extent, the response mechanism is fixed, so the robot's responses are stereotyped: over repeated interactions, the robot often uses a single response to answer the user's requests under different conditions. This easily bores the user and greatly degrades the user experience.
Therefore, in order to make the robot's responses more natural and vivid and to improve the user experience, a new robot interaction method is needed.
Summary of the invention
In order to make the robot's responses more natural and vivid and to improve the user experience, the invention provides a robot interaction method comprising the following steps:
collecting multi-modal external input information, the external input information comprising text information, image information, sound information, robot self-inspection information, and sensing information;
analyzing the external input information to obtain interaction input information, interaction object characteristic information, and interaction environment characteristic information;
analyzing the interaction object characteristic information and the interaction environment characteristic information to obtain matching interaction scene constraints;
performing semantic parsing on the interaction input information to obtain the interaction intention of the interaction object;
outputting multi-modal interaction behavior according to the interaction intention under the interaction scene constraints.
In one embodiment, analyzing the external input information to determine the interaction object characteristic information includes: monitoring whether the image information contains a human shape to determine whether an interactable object is present.
In one embodiment, analyzing the external input information to determine the interaction object characteristic information includes: parsing an interaction object face image from the image information when the image information contains a human shape; and locating the interaction object face image.
In one embodiment, analyzing the external input information to determine the interaction object characteristic information includes: parsing an interaction object face image from the image information when the image information contains a human shape; extracting and analyzing facial feature information from the interaction object face image; and determining the interaction object emotion or interaction object identity characterized by the facial feature information.
In one embodiment, analyzing the external input information to determine the interaction object characteristic information includes: monitoring whether the sound information contains interaction object speech to determine whether an interactable object is present.
In one embodiment, analyzing the external input information to determine the interaction object characteristic information includes: separating the interaction object speech when the sound information contains interaction object speech; and parsing the interaction object speech to determine the interaction object emotion or user identity characterized by the speech.
The invention also proposes a robot system, the system comprising:
an acquisition module configured to collect multi-modal external input information, the acquisition module comprising a text information acquisition device, an image information acquisition device, a sound information acquisition device, a robot self-inspection information acquisition device, and a sensing information acquisition device;
an input analysis module configured to analyze the external input information to obtain interaction input information, interaction object characteristic information, and interaction environment characteristic information;
an interaction scene generation module configured to analyze the interaction object characteristic information and the interaction environment characteristic information to obtain matching interaction scene constraints;
a semantic analysis module configured to perform semantic parsing on the interaction input information to obtain the interaction intention of the interaction object;
an interaction output module configured to output multi-modal interaction behavior according to the interaction intention under the interaction scene constraints.
In one embodiment, the input analysis module comprises a human-shape confirmation device configured to monitor whether the image information contains a human shape to determine whether an interaction object is present.
In one embodiment, the input analysis module further comprises a face image locating device configured to: parse an interaction object face image from the image information when the image information contains a human shape; and locate the interaction object face image.
In one embodiment, the input analysis module further comprises a face image parsing device configured to: extract and analyze facial feature information from the interaction object face image; and determine the interaction object emotion or interaction object identity characterized by the facial feature information.
Compared with the prior art, the method and system of the present invention better simulate how humans analyze and generate interaction behavior in human-to-human interaction, thereby producing more natural and vivid interaction output and substantially improving the robot user experience.
Further features or advantages of the present invention will be set forth in the following description; some will become apparent from the description or be understood by implementing the present invention. The objects and certain advantages of the present invention can be realized or obtained through the steps particularly pointed out in the description, claims, and accompanying drawings.
Brief description of the drawings
The accompanying drawings provide a further understanding of the present invention and constitute a part of the description; together with the embodiments, they serve to explain the present invention and do not limit it. In the drawings:
Fig. 1 is a method flow chart according to an embodiment of the invention;
Fig. 2, Fig. 4, and Fig. 6 are flow charts of obtaining interaction object characteristic information according to different embodiments of the invention;
Fig. 3 and Fig. 5 are flow charts of obtaining interaction environment characteristic information according to different embodiments of the invention;
Fig. 7 is a schematic diagram of the system structure according to an embodiment of the invention.
Detailed description of the invention
Embodiments of the present invention are described in detail below with reference to the drawings and examples, so that practitioners can fully understand how the present invention applies technical means to solve technical problems and achieve technical effects, and can implement the present invention accordingly. It should be noted that, as long as no conflict arises, the embodiments of the present invention and the features in the embodiments may be combined with each other, and the resulting technical solutions all fall within the protection scope of the present invention.
The robot described in this specification is composed of an actuator, a driving device, a control system, and a perception system. The actuator mainly includes a head, upper limbs, a trunk, and lower limbs; the driving device includes electric, hydraulic, and pneumatic drive devices. The control system, as the core component of the robot and analogous to the human brain, mainly includes a processor and joint servo controllers.
The perception system includes internal and external sensors. The external sensors include a camera, a microphone, and ultrasonic (or laser radar, or infrared) devices for perceiving various kinds of external information. The camera may be arranged on the head, analogous to human eyes. The ultrasonic (or laser radar, or infrared) devices may be arranged at any position on the trunk, or elsewhere, to assist the camera in sensing the presence of objects or the external environment. The robot thus has auditory and visual acquisition capabilities.
It should be noted here that the specific structure of the robot involved in the present invention is not limited to the foregoing description. According to actual needs, the robot may adopt any other hardware structure, provided the method of the present invention can still be implemented.
Further, the method of the present invention is implemented in a computer system, for example one arranged in the control core processor of the robot. For instance, the methods described herein may be implemented as software executable by control logic, executed by the CPU in the robot control system. The functions described herein may be implemented as a set of program instructions stored in a non-transitory tangible computer-readable medium. When implemented in this manner, the computer program comprises a set of instructions which, when run by a computer, cause the computer to perform the method implementing the above functions. The program logic may be installed temporarily or permanently in a non-transitory tangible computer-readable medium, such as a ROM chip, computer memory, disk, or other storage medium. In addition to being realized in software, the logic described herein may be embodied using discrete components, integrated circuits, programmable logic used in combination with a programmable logic device (for example a field programmable gate array (FPGA) or microprocessor), or any other device comprising any combination thereof. All such embodiments are intended to fall within the scope of the present invention.
Most existing small household robots offer only a single interaction mode of voice or text, which easily bores people. To improve the user experience, some robots now interact with the user through limb movements. Although this enriches the forms of interaction to a certain extent, the response mechanism is fixed, the robot's responses are stereotyped, and over repeated interactions the robot often uses a single response to answer the user's requests under different conditions. This easily bores the user and greatly degrades the user experience.
In order to make the robot's responses more natural and vivid and to improve the user experience, the present invention proposes a robot interaction method. The method according to the invention first analyzes human-to-human interaction in depth. In human-to-human interaction, the simplest and most common form is spoken conversation; beyond that, people converse through text, images, and body language. Simulating these interaction modes, prior-art robots use voice, text, image, or limb-movement interaction.
Considering the more complex situations of interpersonal communication: during a conversation, a participant first performs semantic understanding of the interaction information from the counterpart (speech, text, images, and body language) to understand what the counterpart actually means, and then makes a corresponding reply. In this process, the participant does not simply respond directly to the counterpart's semantics, but tailors and adjusts the response according to the current interaction environment and the specific condition of the counterpart.
For example, for the question "How have you been lately?", if the counterpart is a colleague at work (not a close one), a simple "Very busy, tied up with company matters every day" counts as a suitable answer. But if the counterpart is a close friend or a family member, a more detailed answer is needed, such as "So busy, buried in work every day, my head has started spinning; today ...", which comes across as closer.
As another example, for the same question "How have you been lately?", if the two meet by chance on the road and the counterpart is clearly busy, a simple "Tied up with company matters every day; let's find a time for a meal together when we are both free" can close the conversation. But if the two are currently strolling in a park and the counterpart is clearly at leisure, then a detailed chat is possible: "So busy, dealing with business every day, my head has started spinning; today ...".
That is to say, in human-to-human interaction, a participant analyzes the current situation to obtain interaction scene constraints that assist the interaction (such as the counterpart's identity, the counterpart's state, and the current environment), and adjusts his or her interaction mode and interaction content according to these scene constraints.
Based on the above analysis, in order to make the robot's responses more natural and vivid, the robot according to the method of the invention analyzes the current interaction scene constraints during the interaction and generates matching interaction behavior output according to those constraints.
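For illustration only (a minimal Python sketch, not part of the patent disclosure; the table contents and all identifiers are hypothetical), scene constraints can be modeled as keys that select among candidate responses:

```python
# Minimal sketch: scene constraints select among candidate responses.
# The response table and all identifiers are hypothetical examples.
RESPONSES = {
    # (identity, environment) -> reply to "How have you been lately?"
    ("colleague", "on_the_road"): "Very busy, tied up with company matters every day.",
    ("close_friend", "park"): "So busy, buried in work every day, my head has started spinning...",
}

def select_response(identity: str, environment: str) -> str:
    """Pick a reply matching the current interaction scene constraints."""
    return RESPONSES.get((identity, environment),
                         "Quite busy lately, thanks for asking.")

print(select_response("close_friend", "park"))
```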
The following describes the specific implementation steps of the method according to embodiments of the present invention in detail based on the flow charts. The steps shown in the flow charts of the drawings may be performed in a computer system containing, for example, a set of computer-executable instructions. Although a logical order of the steps is shown in the flow charts, in some cases the steps shown or described may be performed in a different order than described herein.
As shown in Fig. 1, in an embodiment of the method of the invention, step S100 is performed first: collecting external input information. To obtain accurate and effective scene constraints, the external input information collected in step S100 comprises not only the user's (the robot's interaction object's) interaction input (speech input, action instructions, text input, and the like), but also other information related to the user and the current interaction environment. Specifically, in this embodiment, the robot collects multi-modal external input information comprising text information, image information (including the user's action information), sound information, robot self-inspection information (for example the robot's own posture information), and sensing information (for example infrared ranging information).
Next, step S110 is performed: analyzing the external input information. Step S110 mainly analyzes the concrete meaning contained in the external input information (the meaning relevant to the interaction behavior); based on the analysis results, steps S111 (obtaining interaction input information), S112 (obtaining interaction object characteristic information), and S113 (obtaining interaction environment characteristic information) are then performed.
In steps S111, S112, and S113 of this embodiment:
the interaction input information refers to the interaction content that the user sends to the robot (speech input, action instructions, text input, and the like) and is the basis on which the robot makes its interaction response;
the interaction object characteristic information mainly describes the user's characteristic attributes (user identity, user emotion, the user's current condition, and the like);
the interaction environment characteristic information mainly describes the environment in which the robot and the user are currently located.
After the interaction object characteristic information and the interaction environment characteristic information are obtained (steps S112 and S113), step S130 can be performed: the interaction scene constraint acquisition step, in which the interaction object characteristic information and the interaction environment characteristic information are analyzed to obtain matching interaction scene constraints. After the interaction input information is obtained (step S111), step S120 can be performed: the semantic parsing step, in which the interaction input information is semantically parsed to obtain the interaction intention of the interaction object. Finally, step S140 is performed: the interaction behavior output step, in which multi-modal interaction behavior is output according to the interaction intention under the interaction scene constraints.
It should be noted that, in the ideal case, all of the above information can be determined from the analysis of the external input information. In some cases, however, due to missing external input or because the robot is in a specific state or environment, the analysis can determine only part of the above information (for example, when no user is currently present, there is no interaction input information, and the interaction object characteristic information contains only a label indicating that no user is present). In such cases, steps S120 and S130 are performed on the information that is available, and the corresponding interaction behavior output (S140) is produced based on the results.
For example, when there is no interaction input information and the interaction object characteristic information contains only a label indicating that no user is present, the robot enters a standby state and outputs the interaction behavior corresponding to the standby state.
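For illustration, a minimal Python sketch of the S100-S140 flow, under the assumption that each step is a separate function; every name below is a hypothetical placeholder, not part of the patent:

```python
# Minimal sketch of the S100-S140 flow. Every robot.* method is a
# hypothetical placeholder for the corresponding step in Fig. 1.
def interaction_cycle(robot):
    raw = robot.collect_inputs()              # S100: multi-modal external input
    parsed = robot.analyze_inputs(raw)        # S110: analyze external input
    interaction_input = parsed["input"]       # S111 (may be None)
    object_features = parsed["object"]        # S112
    env_features = parsed["environment"]      # S113

    # S130: derive scene constraints from object + environment features.
    scene = robot.build_scene_constraints(object_features, env_features)

    if interaction_input is None and object_features.get("no_user"):
        return robot.standby_behavior()       # no user present: standby output

    intention = robot.parse_semantics(interaction_input)  # S120
    return robot.output_behavior(intention, scene)        # S140
```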
In summary, the method according to the invention better simulates how humans analyze and generate interaction behavior in human-to-human interaction, thereby producing more natural and vivid interaction output and substantially improving the robot user experience. Implementation details of the method of the present invention are further described below based on specific embodiments.
Among the external input information obtained by the robot, image information is a very important kind. In an embodiment of the invention, image information can be analyzed not only to obtain the user's interaction input (for example the user's gesture instructions), but also to obtain interaction object characteristic information and interaction environment characteristic information.
As shown in Fig. 2, image information is collected in step S200 and then analyzed. In this embodiment, the image information is first monitored to determine whether an interactable object (a user) is currently present. That is, step S210, the human-shape detection step, is performed to detect whether a human shape exists in the acquired image information. Further, to prevent similar objects such as pictures and mannequins from affecting the human-shape detection (a portrait in a picture or a mannequin being detected as a human shape), step S210 also includes a liveness detection step that checks whether the human shape in the image information is a living body.
If no human shape is present, no interactable object exists within the robot's current visual range; step S240 is then performed to output interaction object characteristic information which, in this case, is labeled to indicate that no interaction object is currently present. In the subsequent interaction behavior output step, the robot can then output the interaction behavior preset for the no-interaction-object case according to the interaction object characteristic information (the no-interaction-object label).
When an interaction object is present, its characteristics are analyzed further. Step S220, the face image separation step, is performed first: the interaction object face image is parsed from the human-shape image confirmed in step S210. Next, step S231, the face image locating step, is performed to locate the interaction object face image (that is, to locate the face/head of the current interaction object). After step S231 completes, step S240 can be performed to output interaction object characteristic information containing the interaction object face image location information (interaction object face/head position information). In the subsequent interaction behavior output step, the robot can output corresponding interaction behavior according to the face image location information (for example, rotating the robot head so that the robot's face/eyes directly face the interaction object).
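As an illustrative sketch only (assuming OpenCV and its bundled Haar cascades; this is one possible realization of the S210/S220/S231 steps, not the patent's prescribed one), face detection and locating might look like:

```python
import cv2

# Illustrative sketch of S210/S220/S231 using OpenCV's bundled Haar cascade.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def locate_face(frame):
    """Return (x, y, w, h) of the largest detected face, or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None  # no human shape: label "no interaction object"
    return max(faces, key=lambda box: box[2] * box[3])
```

The returned box center could then be fed to a (hypothetical) head-rotation API so the robot's face/eyes directly face the interaction object.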
In this embodiment, the interaction object face image is also analyzed further. After step S220, step S232 may be performed: parsing the face image to determine the interaction object identity. Specifically:
the interaction object face image is parsed from the image information when the image information contains a human shape;
facial feature information is extracted from the interaction object face image and analyzed;
the interaction object identity characterized by the facial feature information is determined.
After step S232, step S240 can be performed to output interaction object characteristic information containing the interaction object identity information. In the subsequent interaction behavior output step, the robot can output corresponding interaction behavior according to the interaction object identity (for example, adopting different interaction strategies for the robot's owner and for a stranger).
Further, after step S220, step S233 may also be performed: parsing the face image to determine the interaction object emotion. Specifically:
the interaction object face image is parsed from the image information when the image information contains a human shape;
facial feature information is extracted from the interaction object face image and analyzed;
the interaction object emotion characterized by the facial feature information is determined.
After step S233, step S240 can be performed to output interaction object characteristic information containing the interaction object emotion information. In the subsequent interaction behavior output step, the robot can output corresponding interaction behavior according to the interaction object emotion (for example, adopting different interaction strategies when the interaction object is angry and when it is sad).
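For illustration only, an S232-style identity match can be sketched under the assumption that some face-feature model provides fixed-length embeddings; the `extract_embedding` callable, the enrolled gallery, and the threshold below are all hypothetical:

```python
import numpy as np

def identify(face_img, gallery, extract_embedding, threshold=0.6):
    """Sketch of S232: match a face embedding against enrolled identities.

    gallery maps names to enrolled embedding vectors; extract_embedding is
    a hypothetical stand-in for the robot's facial-feature model.
    """
    emb = extract_embedding(face_img)

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    name, score = max(((n, cosine(emb, g)) for n, g in gallery.items()),
                      key=lambda t: t[1])
    return name if score >= threshold else "stranger"
```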
Further, in an embodiment of the method of the invention, interaction environment characteristic information can also be obtained from the analysis of the image information. As shown in Fig. 3, step S300 is performed first to collect image information. Step S310 is then performed to separate out the background image apart from the interaction object image. The background image information is then parsed to confirm the interaction environment of the interaction object/robot (whether they are indoors, the current weather conditions, the lighting conditions, whether there are other objects or human shapes nearby, and the like) (step S320). Finally, interaction environment characteristic information containing the analysis results is output (step S330). In the subsequent interaction behavior output step, the robot can output corresponding interaction behavior according to the current interaction environment (for example, when the sunlight outside the room is too strong, reminding the interaction object to take sun protection measures before going out).
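Purely as a sketch (assuming OpenCV/NumPy; the brightness thresholds are illustrative guesses, not values from the patent), a crude S320-style lighting estimate could be:

```python
import cv2
import numpy as np

# Crude sketch of S320: estimate lighting conditions from the mean
# brightness of the background image. Thresholds are illustrative only.
def lighting_condition(background_bgr):
    gray = cv2.cvtColor(background_bgr, cv2.COLOR_BGR2GRAY)
    level = float(np.mean(gray))
    if level > 180:
        return "very bright"  # e.g. strong sunlight: suggest sun protection
    if level < 60:
        return "dim"
    return "normal"
```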
Among the external input information obtained by the robot, sound information is also a very important kind. In an embodiment of the invention, sound information can be analyzed not only to obtain the user's interaction input (for example the user's voice interaction or sound instructions), but also to obtain interaction object characteristic information and interaction environment characteristic information.
As shown in Fig. 4, sound information is collected in step S400 and then analyzed. In this embodiment, the sound information is first monitored to determine whether a user with an interaction demand (an interaction object) is currently present. That is, step S410, the interaction object speech detection step, is performed to detect whether the acquired sound information contains interaction object speech.
If no interaction object speech is contained, no user emitting interaction speech exists within the robot's current sound collection range; step S440 is then performed to output interaction object characteristic information which, in this case, is labeled to indicate that no interaction object is currently present. In the subsequent interaction behavior output step, the robot can then output the interaction behavior preset for the no-interaction-object case according to the interaction object characteristic information (the no-interaction-object label).
When interaction object speech is present (an interaction object exists), the interaction object's characteristics are analyzed further. Step S420, the interaction object speech separation step, is performed first: the interaction object speech is parsed out of the sound information. Next, step S431, the interaction object locating step, is performed: sound source analysis is applied to the interaction object speech to locate the position from which it was emitted (the interaction object position). After step S431 completes, step S440 can be performed to output interaction object characteristic information containing the interaction object position information. In the subsequent interaction behavior output step, the robot can output corresponding interaction behavior according to the interaction object position information (for example, rotating the robot head so that the robot's face/eyes directly face the interaction object).
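As an illustrative sketch (plain NumPy; the frame length and energy threshold are assumptions, not values from the patent), S410-style speech detection can be approximated by short-time energy:

```python
import numpy as np

# Sketch of S410: energy-based voice activity detection.
# Frame size and threshold are illustrative assumptions.
def contains_speech(samples, rate=16000, frame_ms=30, threshold=0.01):
    frame = int(rate * frame_ms / 1000)
    n = len(samples) // frame
    frames = samples[: n * frame].reshape(n, frame)
    energy = np.mean(frames.astype(np.float64) ** 2, axis=1)
    return bool(np.any(energy > threshold))
```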
In this embodiment, the interaction object speech is also analyzed further. After step S420, step S432 may be performed: parsing the interaction object speech to determine the interaction object identity. Specifically, voiceprint analysis is applied to the interaction object speech to determine the user identity corresponding to the speech.
After step S432, step S440 can be performed to output interaction object characteristic information containing the interaction object identity information. In the subsequent interaction behavior output step, the robot can output corresponding interaction behavior according to the interaction object identity (for example, adopting different interaction strategies for the robot's owner and for a stranger).
Further, after step S420, step S433 may also be performed: parsing the interaction object speech to determine the interaction object emotion. Specifically, voiceprint analysis is applied to the interaction object speech to determine the interaction object emotion it characterizes.
After step S433, step S440 can be performed to output interaction object characteristic information containing the interaction object emotion information. In the subsequent interaction behavior output step, the robot can output corresponding interaction behavior according to the interaction object emotion (for example, adopting different interaction strategies when the interaction object is angry and when it is sad).
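For illustration only (assuming librosa for MFCC extraction; the enrolled voiceprints and distance threshold are hypothetical, and real voiceprint analysis is considerably more involved), an S432-style comparison could be sketched as:

```python
import numpy as np
import librosa

# Sketch of S432: compare a mean-MFCC "voiceprint" against enrolled ones.
# The enrolled gallery and max_dist threshold are hypothetical.
def voiceprint(samples, rate=16000):
    mfcc = librosa.feature.mfcc(y=samples, sr=rate, n_mfcc=13)
    return mfcc.mean(axis=1)

def identify_speaker(samples, enrolled, rate=16000, max_dist=25.0):
    vp = voiceprint(samples, rate)
    name, dist = min(((n, float(np.linalg.norm(vp - e)))
                      for n, e in enrolled.items()), key=lambda t: t[1])
    return name if dist <= max_dist else "stranger"
```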
Further, in an embodiment of the method of the invention, interaction environment characteristic information can also be obtained from the analysis of the sound information. As shown in Fig. 5, step S500 is performed first to collect sound information. Step S510 is then performed to separate out the background sound apart from the interaction object speech. The background sound information is then parsed to confirm the interaction environment of the interaction object/robot (whether they are in a quiet room or near a busy road, whether other people are nearby, and the like) (step S520). Finally, interaction environment characteristic information containing the analysis results is output (step S530). In the subsequent interaction behavior output step, the robot can output corresponding interaction behavior according to the current interaction environment (for example, when near a busy road, that is, when the background sound contains extensive vehicle and horn noise, reminding the user to mind traffic safety before going out).
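A crude sketch (plain NumPy; the RMS threshold is an illustrative assumption) of an S520-style quiet-room versus busy-road discrimination:

```python
import numpy as np

# Sketch of S520: classify the acoustic environment by background loudness.
# The RMS threshold is an illustrative assumption.
def acoustic_environment(background, threshold=0.05):
    rms = float(np.sqrt(np.mean(background.astype(np.float64) ** 2)))
    return "noisy (e.g. busy road)" if rms > threshold else "quiet room"
```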
It should be noted that in the embodiments shown in Fig. 2 and Fig. 4, the interaction object characteristic information finally output contains an indication of whether an interaction object is present, the interaction object face location information, the interaction object identity information, and the interaction object emotion information. Of course, in a concrete implementation, one or more of these items may be empty depending on the actually collected information. Moreover, in other embodiments of the invention, the steps may be simplified according to the concrete interaction demands, and the generation step of one or more of these items may be omitted (for example, in a setting where different interaction strategies need not be adopted for different user identities, step S232 or step S432 can be omitted).
Moreover, in the embodiments shown in Figs. 2 to 5, detailed interaction object characteristic information and interaction environment characteristic information are obtained based on the analysis of image information and of sound information respectively. It should be pointed out, however, that in an actual implementation each of the embodiments shown in Figs. 2 to 5 executes only part of an overall flow; relying on just one of them cannot yield fully accurate and reliable information.
For example, step S210 judges whether an interactable object is present by detecting whether the image information contains a human shape, but when no human shape exists in the image information, the user may simply be outside the robot's line of sight and can still interact with the robot by voice. Likewise, step S410 judges whether an interactable object is present by detecting whether the sound information contains interaction object speech, but when no interaction object speech exists in the sound information, the user may simply not be making any sound while still being right in front of the robot.
Therefore, the judgment result of step S210 or S410 is not a definitive result; it can only assist the robot in making a further determination. Based on this, an embodiment of the invention performs step S210 and step S410 simultaneously and combines the detection results of human-shape detection and interaction object speech detection to judge whether an interaction object is currently present.
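A sketch of this fusion (Python; `detect_human_shape` and `detect_speech` stand in for the S210 and S410 detectors and are hypothetical callables):

```python
# Sketch of the combined S210 + S410 judgment. Both detectors are
# hypothetical callables passed in by the caller.
def interaction_object_present(frame, audio, detect_human_shape, detect_speech):
    seen = detect_human_shape(frame)   # S210, including the liveness check
    heard = detect_speech(audio)       # S410
    # Either modality alone is inconclusive; presence in either suffices.
    return bool(seen or heard)
```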
As shown in Fig. 6, step S600 is performed first: collecting image information and sound information. Step S610, the human-shape detection step (including the liveness detection step), is then performed to detect whether a human shape exists in the acquired image information.
If a human shape exists, step S611, the face image separation step, is performed: the interaction object face image is parsed from the human-shape image confirmed in step S610. Next, step S612 is performed to parse the face image and obtain the face image analysis results. Meanwhile, step S613, the face image locating step, is performed to locate the interaction object face image.
While the interaction object face image is being processed (or before/after), the robot processes the sound information. Step S630, the interaction object speech detection step, is performed to detect whether the acquired sound information contains interaction object speech.
If the sound information contains interaction object speech, step S631 is performed to separate the interaction object speech, and then step S632 is performed to parse the interaction object speech.
After the interaction object speech and the interaction object face image have been parsed (steps S632 and S612 have finished), step S640 is performed: combining the image analysis results and the speech analysis results to determine the interaction object characteristic information (interaction object identity, emotion, and the like). In particular, when the result of step S630 is that the sound information contains no interaction object speech, step S640 analyzes only the image analysis results.
After step S613 and/or step S640 are completed, step S650 can be performed to output the interaction object characteristic information (containing the interaction object face location information, interaction object identity, and/or interaction object emotion).
In particular, in the above process, if the detection result of step S610 is that no human shape exists, step S620, the interaction object speech detection step, is performed to detect whether the acquired sound information contains interaction object speech. If the sound information contains no interaction object speech (and no human shape exists in the image information), no interaction object is currently present; step S650 is then performed to output interaction object characteristic information which, in this case, is labeled to indicate that no interaction object is currently present.
When interaction object speech is present (but no human shape exists in the image information), the interaction object speech is analyzed further. Step S621, the interaction object speech separation step, is performed first: the interaction object speech is parsed out of the sound information. Next, step S622, the interaction object locating step, is performed: sound source analysis is applied to the interaction object speech to locate the position from which it was emitted (the interaction object position).
After step S622 completes, interaction object characteristic information containing the interaction object position information could already be output. Instead, the robot rotates its head according to the interaction object position information so that the robot's eyes (the camera collecting image information) directly face the interaction object (step S623). In this way, when human-shape detection is performed again (step S624), the image information obtained by the robot's eyes (the camera collecting image information) now contains the human shape (the user image).
Next, step S661, the face image separation step (as in step S611), is performed; then step S662 locates the face image (as in step S613) and step S663 parses the face image (as in step S612), while step S625 parses the interaction object speech (as in step S632); step S664 then combines the image/speech analysis results (as in step S640). Finally, step S650 outputs the interaction object characteristic information.
In particular, if, when human-shape detection is performed again (step S624), the image information obtained by the robot still contains no human shape (user image) (for example because the line of sight is blocked), step S625 is performed directly: the current user speech is parsed (as in step S632), and the speech analysis results are finally output through step S650.
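A sketch of this voice-first fallback path of Fig. 6 (Python; every `robot.*` call is a hypothetical API standing in for the labeled step):

```python
# Sketch of the Fig. 6 fallback: speech is heard but no human shape is seen.
# All robot.* calls are hypothetical APIs.
def handle_voice_only(robot, audio):
    speech = robot.separate_speech(audio)          # S621
    position = robot.locate_sound_source(speech)   # S622
    robot.turn_head_towards(position)              # S623: face the sound source
    frame = robot.capture_frame()
    if robot.detect_human_shape(frame):            # S624: re-run vision
        face = robot.separate_face(frame)          # S661
        return robot.fuse(robot.parse_face(face),  # S663 + S664
                          robot.parse_speech(speech))
    return robot.parse_speech(speech)              # S625: audio-only result
```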
Further, in an embodiment of the invention, when the interaction environment characteristic information is analyzed and obtained, the analysis results of the background image (the embodiment shown in Fig. 3) and of the background sound (the embodiment shown in Fig. 5) are likewise combined.
Based on the interaction method of the present invention, the invention also proposes a robot system. As shown in Fig. 7, in an embodiment of the invention the robot system includes an acquisition module 700, an input analysis module 710, an interaction scene generation module 730, a semantic analysis module 720, and an interaction output module 740.
The acquisition module 700 is configured to collect multi-modal external input information. In this embodiment, the acquisition module 700 comprises a text information acquisition device 701, an image information acquisition device 702, a sound information acquisition device 703, and a sensing information acquisition device 704. Further, the acquisition module 700 also includes a robot self-inspection information acquisition device, which may be realized by the self-inspection components among the robot's hardware components, or by a combination of software and hardware; this is not limited.
It should be pointed out that, according to actual needs, in other embodiments of the invention one or several of the above devices may be constructed in the acquisition module 700, or devices with other acquisition functions may be constructed.
The input analysis module 710 is configured to analyze the external input information to determine the interaction input information, interaction object characteristic information, and interaction environment characteristic information. The interaction scene generation module 730 is configured to analyze the interaction object characteristic information and the interaction environment characteristic information to obtain matching interaction scene constraints. The semantic analysis module 720 is configured to perform semantic parsing on the interaction input information to obtain the interaction intention of the interaction object. The interaction output module 740 is configured to output multi-modal interaction behavior according to the interaction intention under the interaction scene constraints.
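For illustration only (a Python sketch; the class and method names are hypothetical and not drawn from the patent), the Fig. 7 module composition could be outlined as:

```python
# Sketch of the Fig. 7 module layout. All names are hypothetical.
class RobotSystem:
    def __init__(self, acquisition, input_analysis, scene_gen,
                 semantics, output):
        self.acquisition = acquisition        # module 700
        self.input_analysis = input_analysis  # module 710
        self.scene_gen = scene_gen            # module 730
        self.semantics = semantics            # module 720
        self.output = output                  # module 740

    def step(self):
        raw = self.acquisition.collect()
        inp, obj, env = self.input_analysis.analyze(raw)
        scene = self.scene_gen.build(obj, env)
        intention = self.semantics.parse(inp)
        return self.output.act(intention, scene)
```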
Compared with the prior art, the system of the present invention better simulates how humans analyze and generate interaction behavior in human-to-human interaction, thereby producing more natural and vivid interaction output and substantially improving the robot user experience.
Specifically, in this embodiment, the input analysis module 710 comprises, for the image information, a human-shape confirmation device 711, a face image locating device 712, and a face image parsing device 713. The human-shape confirmation device 711 is configured to monitor whether the image information contains a human shape to determine whether an interaction object is present. The face image locating device 712 is configured to parse the interaction object face image from the image information when the image information contains a human shape, and to locate the interaction object face image. The face image parsing device 713 is configured to extract and analyze facial feature information from the interaction object face image, and to determine the interaction object emotion or interaction object identity characterized by the facial feature information.
The input analysis module 710 also comprises, for the sound information, an interaction object speech confirmation device 714, a speech locating device 715, and an interaction object speech parsing device 716. The interaction object speech confirmation device 714 is configured to monitor whether the sound information contains interaction object speech to determine whether an interaction object is present. The speech locating device 715 is configured to perform sound source locating on the interaction object speech when the sound information contains interaction object speech, in order to determine the interaction object position. The interaction object speech parsing device 716 is configured to perform voiceprint parsing on the interaction object speech to determine the interaction object emotion or interaction object identity it characterizes.
It should be noted here that, according to actual needs, in other embodiments of the invention one or several of the above devices may be constructed in the input analysis module 710, or devices with other analysis functions may be constructed.
While embodiments of the present invention have been disclosed above, the described content is merely an implementation adopted to facilitate understanding of the present invention and is not intended to limit it. The method of the present invention may also have various other embodiments. Without departing from the spirit of the present invention, those of ordinary skill in the art may make various corresponding changes or variations, all of which shall fall within the protection scope of the claims of the present invention.
Claims (10)
1. A robot interaction method, characterized in that the method comprises the following steps:
collecting multi-modal external input information, the external input information comprising text information, image information, sound information, robot self-inspection information, and sensing information;
analyzing the external input information to obtain interaction input information, interaction object characteristic information, and interaction environment characteristic information;
analyzing the interaction object characteristic information and the interaction environment characteristic information to obtain matching interaction scene constraints;
performing semantic parsing on the interaction input information to obtain the interaction intention of the interaction object;
outputting multi-modal interaction behavior according to the interaction intention under the interaction scene constraints.
2. The method according to claim 1, characterized in that analyzing the external input information to determine the interaction object characteristic information includes:
monitoring whether the image information contains a human shape to determine whether an interactable object is present.
3. The method according to claim 2, characterized in that analyzing the external input information to determine the interaction object characteristic information includes:
parsing an interaction object face image from the image information when the image information contains a human shape;
locating the interaction object face image.
4. The method according to claim 2 or 3, characterized in that analyzing the external input information to determine the interaction object characteristic information includes:
parsing an interaction object face image from the image information when the image information contains a human shape;
extracting and analyzing facial feature information from the interaction object face image;
determining the interaction object emotion or interaction object identity characterized by the facial feature information.
5. The method according to any one of claims 1-4, characterized in that analyzing the external input information to determine the interaction object characteristic information includes:
monitoring whether the sound information contains interaction object speech to determine whether an interactable object is present.
6. The method according to claim 5, characterized in that analyzing the external input information to determine the interaction object characteristic information includes:
separating the interaction object speech when the sound information contains interaction object speech;
parsing the interaction object speech to determine the interaction object emotion or user identity characterized by the speech.
7. A robot system, characterized in that the system includes:
an acquisition module configured to collect multi-modal external input information, the acquisition module comprising a text information acquisition device, an image information acquisition device, a sound information acquisition device, a robot self-inspection information acquisition device, and a sensing information acquisition device;
an input analysis module configured to analyze the external input information to obtain interaction input information, interaction object characteristic information, and interaction environment characteristic information;
an interaction scene generation module configured to analyze the interaction object characteristic information and the interaction environment characteristic information to obtain matching interaction scene constraints;
a semantic analysis module configured to perform semantic parsing on the interaction input information to obtain the interaction intention of the interaction object;
an interaction output module configured to output multi-modal interaction behavior according to the interaction intention under the interaction scene constraints.
8. The system according to claim 7, characterized in that the input analysis module comprises a human-shape confirmation device configured to monitor whether the image information contains a human shape to determine whether an interaction object is currently present.
9. The system according to claim 8, characterized in that the input analysis module further comprises a face image locating device configured to:
parse an interaction object face image from the image information when the image information contains a human shape;
locate the interaction object face image.
10. The system according to claim 9, characterized in that the input analysis module further comprises a face image parsing device configured to:
extract and analyze facial feature information from the interaction object face image;
determine the interaction object emotion or interaction object identity characterized by the facial feature information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610179223.8A CN105843118B (en) | 2016-03-25 | 2016-03-25 | A kind of robot interactive method and robot system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105843118A true CN105843118A (en) | 2016-08-10 |
CN105843118B CN105843118B (en) | 2018-07-27 |
Family
ID=56583915
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610179223.8A Active CN105843118B (en) | 2016-03-25 | 2016-03-25 | A kind of robot interactive method and robot system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105843118B (en) |
- 2016-03-25: CN application CN201610179223.8A filed; patent CN105843118B (en) - status Active
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1956528A2 (en) * | 2007-02-08 | 2008-08-13 | Samsung Electronics Co., Ltd. | Apparatus and Method for Expressing Behavior of Software Robot |
US20090198380A1 (en) * | 2008-01-28 | 2009-08-06 | Seegrid Corporation | Methods for real-time and near real-time interactions with robots that service a facility |
CN101618280A (en) * | 2009-06-30 | 2010-01-06 | 哈尔滨工业大学 | Humanoid-head robot device with human-computer interaction function and behavior control method thereof |
CN101604204A (en) * | 2009-07-09 | 2009-12-16 | 北京科技大学 | Distributed cognitive technology for intelligent emotional robot |
CN101661569A (en) * | 2009-09-18 | 2010-03-03 | 北京科技大学 | Intelligent emotional robot multi-modal behavioral associative expression system |
US20120185090A1 (en) * | 2011-01-13 | 2012-07-19 | Microsoft Corporation | Multi-state Model for Robot and User Interaction |
US8965576B2 (en) * | 2012-06-21 | 2015-02-24 | Rethink Robotics, Inc. | User interfaces for robot training |
CN103413113A (en) * | 2013-01-15 | 2013-11-27 | 上海大学 | Intelligent emotional interaction method for service robot |
CN103679203A (en) * | 2013-12-18 | 2014-03-26 | 江苏久祥汽车电器集团有限公司 | Robot system and method for detecting human face and recognizing emotion |
CN104023063A (en) * | 2014-06-11 | 2014-09-03 | 合肥工业大学 | Robot cloud system |
CN104965552A (en) * | 2015-07-03 | 2015-10-07 | 北京科技大学 | Intelligent home environment cooperative control method and system based on emotion robot |
CN105058389A (en) * | 2015-07-15 | 2015-11-18 | 深圳乐行天下科技有限公司 | Robot system, robot control method, and robot |
CN105082150A (en) * | 2015-08-25 | 2015-11-25 | 国家康复辅具研究中心 | Robot man-machine interaction method based on user mood and intension recognition |
Non-Patent Citations (1)
Title |
---|
ZHAO, QIJIE et al.: "Interaction intention detection method based on head-eye behavior", Chinese Journal of Scientific Instrument (《仪器仪表学报》) *
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106773820A (en) * | 2016-12-02 | 2017-05-31 | 北京奇虎科技有限公司 | Robot interactive approach, device and robot |
CN106773820B (en) * | 2016-12-02 | 2019-07-19 | 北京奇虎科技有限公司 | Robot interactive approach, device and robot |
CN108614987A (en) * | 2016-12-13 | 2018-10-02 | 深圳光启合众科技有限公司 | The method, apparatus and robot of data processing |
WO2018107731A1 (en) * | 2016-12-13 | 2018-06-21 | 深圳光启合众科技有限公司 | Data processing method and device, and robot |
CN108227906A (en) * | 2016-12-22 | 2018-06-29 | 深圳大森智能科技有限公司 | A kind of man-machine interaction method and device |
US11858118B2 (en) | 2016-12-31 | 2024-01-02 | Huawei Technologies Co., Ltd. | Robot, server, and human-machine interaction method |
CN107016402A (en) * | 2017-02-20 | 2017-08-04 | 北京光年无限科技有限公司 | A kind of man-machine interaction method and device for intelligent robot |
CN106959839A (en) * | 2017-03-22 | 2017-07-18 | 北京光年无限科技有限公司 | A kind of human-computer interaction device and method |
CN107180115A (en) * | 2017-06-28 | 2017-09-19 | 上海与德通讯技术有限公司 | The exchange method and system of robot |
CN107221332A (en) * | 2017-06-28 | 2017-09-29 | 上海与德通讯技术有限公司 | The exchange method and system of robot |
CN107704169A (en) * | 2017-09-26 | 2018-02-16 | 北京光年无限科技有限公司 | The method of state management and system of visual human |
CN108052506B (en) * | 2017-12-28 | 2021-06-29 | Oppo广东移动通信有限公司 | Natural language processing method, device, storage medium and electronic equipment |
CN108052506A (en) * | 2017-12-28 | 2018-05-18 | 广东欧珀移动通信有限公司 | Natural language processing method, apparatus, storage medium and electronic equipment |
CN108334583A (en) * | 2018-01-26 | 2018-07-27 | 上海智臻智能网络科技股份有限公司 | Affective interaction method and device, computer readable storage medium, computer equipment |
US11226673B2 (en) | 2018-01-26 | 2022-01-18 | Institute Of Software Chinese Academy Of Sciences | Affective interaction systems, devices, and methods based on affective computing user interface |
CN108780361A (en) * | 2018-02-05 | 2018-11-09 | 深圳前海达闼云端智能科技有限公司 | Human-computer interaction method and device, robot and computer readable storage medium |
WO2019148491A1 (en) * | 2018-02-05 | 2019-08-08 | 深圳前海达闼云端智能科技有限公司 | Human-computer interaction method and device, robot, and computer readable storage medium |
WO2019179442A1 (en) * | 2018-03-21 | 2019-09-26 | 北京猎户星空科技有限公司 | Interaction target determination method and apparatus for intelligent device |
CN110969053A (en) * | 2018-09-29 | 2020-04-07 | 深圳市神州云海智能科技有限公司 | Lottery buyer classification method and device and lottery robot |
CN110969053B (en) * | 2018-09-29 | 2023-12-22 | 深圳市神州云海智能科技有限公司 | Method and device for classifying players and lottery robot |
Also Published As
Publication number | Publication date |
---|---|
CN105843118B (en) | 2018-07-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105843118A (en) | Robot interacting method and robot system | |
Rossi et al. | User profiling and behavioral adaptation for HRI: A survey | |
JP6761598B2 (en) | Emotion estimation system, emotion estimation model generation system | |
CN105868827B (en) | A kind of multi-modal exchange method of intelligent robot and intelligent robot | |
Saini et al. | Kinect sensor-based interaction monitoring system using the BLSTM neural network in healthcare | |
KR20190030731A (en) | Command processing using multimode signal analysis | |
CN107030691A (en) | A kind of data processing method and device for nursing robot | |
KR20180072978A (en) | Operation Method for activation of Home robot device and Home robot device supporting the same | |
CN109765991A (en) | Social interaction system is used to help system and non-transitory computer-readable storage media that user carries out social interaction | |
US20180025283A1 (en) | Information processing apparatus, information processing method, and program | |
CN111081371A (en) | Virtual reality-based early autism screening and evaluating system and method | |
US20190134812A1 (en) | Electronic device capable of moving and operating method thereof | |
KR102351008B1 (en) | Apparatus and method for recognizing emotions | |
JP2004310034A (en) | Interactive agent system | |
CN109272994A (en) | Speech data processing method and the electronic device for supporting the speech data processing method | |
CN106575504A (en) | Executing software applications on a robot | |
CN106502382B (en) | Active interaction method and system for intelligent robot | |
WO2023124026A1 (en) | Robot control method and system, computer device, storage medium and computer program product | |
McColl et al. | Affect detection from body language during social HRI | |
CN116061210A (en) | Information processing apparatus, information processing method, and computer storage medium | |
CN111540383A (en) | Voice conversation device, control program, and control method thereof | |
Maciel et al. | Shared control methodology based on head positioning and vector fields for people with quadriplegia | |
Holthaus et al. | Communicative robot signals: presenting a new typology for human-robot interaction | |
JP2023537233A (en) | Method, system and non-transitory computer-readable recording medium for authoring animation | |
JP7480706B2 (en) | Information processing device, information processing method, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||