CN105843118B - Robot interaction method and robot system - Google Patents

Robot interaction method and robot system

Info

Publication number
CN105843118B
Authority
CN
China
Prior art keywords
information
interactive object
interactive
image
robot
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610179223.8A
Other languages
Chinese (zh)
Other versions
CN105843118A (en)
Inventor
郭家
石琰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Guangnian Wuxian Technology Co Ltd
Original Assignee
Beijing Guangnian Wuxian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Guangnian Wuxian Technology Co Ltd filed Critical Beijing Guangnian Wuxian Technology Co Ltd
Priority to CN201610179223.8A priority Critical patent/CN105843118B/en
Publication of CN105843118A publication Critical patent/CN105843118A/en
Application granted granted Critical
Publication of CN105843118B publication Critical patent/CN105843118B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G05 - CONTROLLING; REGULATING
    • G05B - CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B19/00 - Programme-control systems
    • G05B19/02 - Programme-control systems electric
    • G05B19/04 - Programme control other than numerical control, i.e. in sequence controllers or logic controllers
    • G05B19/042 - Programme control other than numerical control, i.e. in sequence controllers or logic controllers using digital processors
    • G05B19/0423 - Input/output
    • G - PHYSICS
    • G05 - CONTROLLING; REGULATING
    • G05B - CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B2219/00 - Program-control systems
    • G05B2219/20 - Pc systems
    • G05B2219/23 - Pc programming
    • G05B2219/23067 - Control, human or man machine interface, interactive, HMI, MMI

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Automation & Control Theory (AREA)
  • Manipulator (AREA)

Abstract

The invention discloses a robot interaction method and a robot system. The method of the invention comprises: collecting multimodal external input information, the external input information including text information, image information, sound information, robot self-test information and sensor information; analyzing the external input information to obtain interactive input information, interactive-object feature information and interactive-environment feature information; analyzing the interactive-object feature information and the interactive-environment feature information to obtain a matching interaction-scenario constraint; performing semantic parsing on the interactive input information to obtain the interaction intention of the interactive object; and, under the interaction-scenario constraint, producing multimodal interactive behavior output according to the interaction intention. Compared with the prior art, the method and system of the invention better simulate the analysis and generation process of human interactive behavior in person-to-person interaction, thereby producing more natural and lively interactive output and greatly improving the user experience of the robot.

Description

Robot interaction method and robot system
Technical field
The present invention relates to the field of robotics, and in particular to a robot interaction method and a robot system.
Background art
With the continuous development of computer technology and the steady progress of artificial intelligence technology, small intelligent robots are being applied ever more widely in domestic environments, and household-oriented small intelligent robots are developing rapidly.
Most existing household-oriented small robots use a single interaction mode of voice or text, and certain robots also interact with the user through limb movements. Although this enriches the form of interaction to some extent, the response mechanism is fixed, so the robot's responses are stereotyped: over repeated interactions the robot often answers the user's requests under different states with a single, unchanging response. This easily bores the user and greatly degrades the user experience.
Therefore, in order to make the robot's responses more natural and lively and to improve the user experience of the robot, a new robot interaction method is needed.
Summary of the invention
In order to make the robot's responses more natural and lively and to improve the user experience of the robot, the present invention provides a robot interaction method comprising the following steps:
collecting multimodal external input information, the external input information including text information, image information, sound information, robot self-test information and sensor information;
analyzing the external input information to obtain interactive input information, interactive-object feature information and interactive-environment feature information;
analyzing the interactive-object feature information and the interactive-environment feature information to obtain a matching interaction-scenario constraint;
performing semantic parsing on the interactive input information to obtain the interaction intention of the interactive object;
under the interaction-scenario constraint, producing multimodal interactive behavior output according to the interaction intention.
In one embodiment, analyzing the external input information to determine the interactive-object feature information includes:
monitoring whether the image information contains a human figure, to determine whether an object capable of interaction is present.
In one embodiment, analyzing the external input information to determine the interactive-object feature information includes:
parsing an interactive-object face image from the image information when the image information contains a human figure;
locating the interactive-object face image.
In one embodiment, analyzing the external input information to determine the interactive-object feature information includes:
parsing an interactive-object face image from the image information when the image information contains a human figure;
extracting and analyzing facial feature information from the interactive-object face image;
determining the interactive-object mood or interactive-object identity characterized by the facial feature information.
In one embodiment, analyzing the external input information to determine the interactive-object feature information includes:
monitoring whether the sound information contains interactive-object speech, to determine whether an object capable of interaction is present.
In one embodiment, analyzing the external input information to determine the interactive-object feature information includes:
separating the interactive-object speech when the sound information contains interactive-object speech;
parsing the interactive-object speech to determine the interactive-object mood or user identity characterized by the interactive-object speech.
The invention also provides a robot system, the system comprising:
an acquisition module configured to collect multimodal external input information, the acquisition module including a text information acquisition device, an image information acquisition device, a sound information acquisition device, a robot self-test information acquisition device and a sensor information acquisition device;
an input analysis module configured to analyze the external input information to obtain interactive input information, interactive-object feature information and interactive-environment feature information;
an interaction-scenario generation module configured to analyze the interactive-object feature information and the interactive-environment feature information to obtain a matching interaction-scenario constraint;
a semantic parsing module configured to perform semantic parsing on the interactive input information to obtain the interaction intention of the interactive object;
an interaction output module configured to produce, under the interaction-scenario constraint, multimodal interactive behavior output according to the interaction intention.
In one embodiment, the input analysis module includes a human-figure confirmation device configured to monitor whether the image information contains a human figure, to determine whether an interactive object is present.
In one embodiment, the input analysis module further includes a face-image locating device configured to:
parse an interactive-object face image from the image information when the image information contains a human figure;
locate the interactive-object face image.
In one embodiment, the input analysis module further includes a face-image parsing device configured to:
extract and analyze facial feature information from the interactive-object face image;
determine the interactive-object mood or interactive-object identity characterized by the facial feature information.
Compared with the prior art, the method and system of the invention better simulate the analysis and generation process of human interactive behavior in person-to-person interaction, thereby producing more natural and lively interactive output and greatly improving the user experience of the robot.
Other features and advantages of the present invention will be set forth in the following description, will in part be apparent from the description, or may be learned by practicing the invention. The objects and further advantages of the invention can be realized and attained by the steps particularly pointed out in the description, the claims and the accompanying drawings.
Description of the drawings
The accompanying drawings provide a further understanding of the invention and form a part of the specification. Together with the embodiments of the invention they serve to explain the invention and are not to be construed as limiting it. In the drawings:
Fig. 1 is a method flowchart according to an embodiment of the invention;
Fig. 2, Fig. 4 and Fig. 6 are flowcharts of obtaining interactive-object feature information according to different embodiments of the invention;
Fig. 3 and Fig. 5 are flowcharts of obtaining interactive-environment feature information according to different embodiments of the invention;
Fig. 7 is a schematic diagram of a system structure according to an embodiment of the invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below with reference to the drawings and examples, so that practitioners of the invention can fully understand how the invention applies technical means to solve technical problems and achieve its technical effects, and can implement the invention accordingly. It should be noted that, as long as no conflict arises, the embodiments of the invention and the features of the embodiments may be combined with one another, and the resulting technical solutions all fall within the scope of protection of the present invention.
The robot described in this specification consists of an actuating mechanism, a driving device, a control system and a perception system. The actuating mechanism mainly includes a head, upper limbs, a trunk and lower limbs; the driving device includes electric, hydraulic and pneumatic drive units. The control system, which is the core of the robot and analogous to the human brain, mainly includes a processor and joint servo controllers.
The perception system includes internal sensors and external sensors. The external sensors include a camera, a microphone and ultrasonic (or lidar, or infrared) devices for perceiving various kinds of external information. The camera may be arranged on the head, similarly to human eyes. The ultrasonic (or lidar, or infrared) devices may be arranged on any part of the trunk or elsewhere to assist the camera in sensing the presence of objects or the external environment. The robot thus has auditory and visual acquisition capabilities.
It should be noted that the concrete structure of the robot according to the invention is not limited to the above description. According to actual needs, the robot may adopt any other hardware configuration on which the method of the invention can be carried out.
Further, the method of the invention is described as being implemented in a computer system. The computer system may, for example, be arranged in the control core processor of the robot. For example, the method described herein may be implemented as software executed with control logic by the CPU in the robot control system. The functions described herein may be implemented as a set of program instructions stored in a non-transitory tangible computer-readable medium. When implemented in this manner, the computer program comprises a set of instructions which, when run by a computer, cause the computer to carry out a method capable of performing the above functions. Programmable logic may be installed, temporarily or permanently, in a non-transitory tangible computer-readable medium such as a ROM chip, computer memory, a disk or another storage medium. In addition to a software implementation, the logic described herein may be embodied using discrete components, integrated circuits, programmable logic used in combination with programmable logic devices (such as field-programmable gate arrays (FPGAs) or microprocessors), or any other device comprising any combination thereof. All such embodiments are intended to fall within the scope of the present invention.
Most existing household-oriented small robots use a single interaction mode of voice or text, which easily becomes tiresome. To improve the user experience, some current robots interact with the user through limb movements. Although this enriches the form of interaction to some extent, the response mechanism is fixed, so the robot's responses are stereotyped: over repeated interactions the robot often answers the user's requests under different states with a single, unchanging response. This easily bores the user and greatly degrades the user experience.
In order to make the robot's responses more natural and lively and to improve the user experience of the robot, the present invention proposes a robot interaction method. The method of the invention first analyzes person-to-person interactive behavior in depth. In interaction between people, the most direct and simplest form is conversation through spoken language, followed by conversation through text, images and body language. Correspondingly, prior-art robots simulate these interaction modes using voice, text, images or limb movements.
Considering the more complex aspects of interpersonal interaction further: during a conversation, an interaction participant first performs semantic understanding of the interactive information coming from the interaction partner (speech, text, images and body language) to understand what the other party actually means, and then formulates a corresponding answer. In this process, the participant does not simply produce a direct response to the partner's semantics; rather, the response is adjusted in a targeted way according to the current interaction environment and the specific situation of the interaction partner.
For example, to the question "How have you been lately?", if the interaction partner is a work acquaintance (not a close one), the simple answer "Very busy, rushing about on company business every day" is adequate. If the partner is a close friend or a family member, a more specific answer is needed, such as "Busy dealing with business every day, lately I have started to feel dizzy, and today ...", which sounds much more intimate.
As another example, again for the question "How have you been lately?", if the two meet by chance on the road and the partner obviously has something to attend to, a simple "Busy with company business every day; let's find a time to have a meal together when we are both free" is enough to complete the exchange. If, however, they are currently strolling in a park and the partner is obviously at leisure, one can talk in detail: "Very busy lately, dealing with business every day, I've even started to feel dizzy, and today ...".
In other words, in person-to-person interaction, a participant analyzes the current situation to obtain interaction-scenario constraints that assist the interaction (such as the partner's identity, the partner's state and the current environment), and adjusts his or her own interaction manner and interaction content according to these constraints.
Based on the above analysis, in order to make the robot's responses more natural and lively, a robot according to the method of the invention analyzes the current interaction-scenario constraint during interaction and generates matching interactive behavior output under that constraint.
The specific implementation steps of the method according to embodiments of the invention are described in detail below with reference to the flowcharts. The steps shown in the flowcharts of the drawings may be executed in a computer system containing, for example, a set of computer-executable instructions. Although the flowcharts show a logical order of the steps, in some cases the steps may be executed in an order different from that shown or described here.
As shown in Fig. 1, in an embodiment of the method of the invention, step S100 is executed first to collect external input information. In order to obtain an accurate and effective scenario constraint, the external input collected in step S100 includes not only the interactive input of the user (the robot's interaction partner), such as voice input, action instructions and text input, but also other information related to the user and the current interaction environment. Specifically, in this embodiment the robot collects multimodal external input information including text information, image information (which contains the user's motion information), sound information, robot self-test information (such as the robot's own posture information) and sensor information (such as infrared ranging information).
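As an illustration only (the patent does not prescribe any concrete data structure), the multimodal bundle collected in step S100 can be pictured as a record with one optional slot per modality; the Python field names below are assumptions, not terms from the specification:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class MultimodalInput:
    """One sampling cycle of the robot's external input; every field may be empty."""
    text: Optional[str] = None                      # typed or transcribed text input
    image: Optional[bytes] = None                   # raw camera frame
    audio: Optional[bytes] = None                   # raw microphone buffer
    self_test: dict = field(default_factory=dict)   # robot's own posture/health readings
    sensors: dict = field(default_factory=dict)     # e.g. infrared ranging values

# Example: a cycle in which only the camera and the infrared sensor returned data
sample = MultimodalInput(image=b"\x00" * 10, sensors={"ir_distance_cm": 82.0})
```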
Next, step S110 is executed to analyze the external input information. Step S110 mainly analyzes the concrete meaning contained in the external input information (the meaning relevant to interactive behavior); based on the analysis result, step S111 (obtaining interactive input information), step S112 (obtaining interactive-object feature information) and step S113 (obtaining interactive-environment feature information) are then executed.
In steps S111, S112 and S113 of this embodiment:
the interactive input information refers to the interaction content the user directs at the robot (voice input, action instructions, text input, etc.) and is the basis on which the robot makes the corresponding interactive response;
the interactive-object feature information mainly describes the user's characteristic attributes (user identity, user mood, the user's current physical state, etc.);
the interactive-environment feature information mainly describes the environment in which the robot and the user are currently located.
After the interactive-object feature information and the interactive-environment feature information have been obtained (steps S112 and S113), step S130, the step of obtaining the interaction-scenario constraint, can be executed: the interactive-object feature information and the interactive-environment feature information are analyzed to obtain the matching interaction-scenario constraint. After the interactive input information has been obtained (step S111), step S120, the semantic parsing step, can be executed: semantic parsing is performed on the interactive input information to obtain the interaction intention of the interactive object. Finally, step S140, the interactive behavior output step, is executed: under the interaction-scenario constraint, multimodal interactive behavior output is produced according to the interaction intention.
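A minimal sketch of the S110-S140 flow follows, with trivial placeholder logic standing in for the real analysis, scenario-generation and semantic-parsing modules; every function body here is an assumption made purely to show how the outputs of steps S111-S113 feed steps S120, S130 and S140:

```python
def analyze_input(raw):
    """Steps S110-S113: split raw multimodal input into the three information streams.
    The splitting rules here are placeholders; the patent leaves them implementation-defined."""
    interactive_input = raw.get("speech_text")                   # what the user said/typed
    obj_features = {"present": raw.get("face") is not None}      # interactive-object features
    env_features = {"indoors": raw.get("light_lux", 0) < 500}    # interactive-environment features
    return interactive_input, obj_features, env_features

def build_scenario(obj_features, env_features):
    """Step S130: merge object and environment features into a scenario constraint."""
    return {"object": obj_features, "environment": env_features}

def parse_intent(interactive_input):
    """Step S120: semantic parsing; a single keyword rule stands in for a real NLU module."""
    if interactive_input and "how" in interactive_input.lower():
        return "smalltalk_greeting"
    return "unknown"

def respond(intent, scenario):
    """Step S140: choose a multimodal behavior under the scenario constraint."""
    if not scenario["object"]["present"]:
        return "enter_standby"
    return f"reply:{intent} (indoors={scenario['environment']['indoors']})"

# One pass through the pipeline
raw = {"speech_text": "How is everything going?", "face": object(), "light_lux": 300}
interactive_input, obj_features, env_features = analyze_input(raw)
print(respond(parse_intent(interactive_input), build_scenario(obj_features, env_features)))
```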
It should be noted that, in the ideal case, all of the above information can be determined from the analysis of the external input information. In some cases, however, because external input is missing or the robot is in a particular state or environment, only part of the above information can be determined from the analysis (for example, if no user is currently present, no interactive input information exists and the interactive-object feature information only contains a flag indicating that no user is present). In that case, steps S120 and S130 are executed with whatever information is available, and the corresponding interactive behavior output (S140) is produced on that basis.
For example, when no interactive input information exists and the interactive-object feature information contains only the flag indicating that no user is present, the robot enters standby mode and outputs the interactive behavior corresponding to standby mode.
In summary, the method of the invention better simulates the analysis and generation process of human interactive behavior in person-to-person interaction, thereby producing more natural and lively interactive output and greatly improving the user experience of the robot. The implementation details of the method of the invention are further described below on the basis of specific embodiments.
Among the external input information obtained by the robot, image information is a very important kind. In one embodiment of the invention, the image information is analyzed not only to obtain the user's interactive input (such as the user's gesture instructions), but also to obtain interactive-object feature information and interactive-environment feature information.
As shown in Fig. 2, image information is collected in step S200 and is then analyzed. In this embodiment, the image information is first monitored to determine whether an object capable of interaction (a user) is currently present. Step S210, the human-figure detection step, detects whether a human figure is present in the collected image information. Further, in order to prevent similar objects such as pictures or mannequins from disturbing the detection (a portrait in a picture or a mannequin being detected as a human figure), step S210 also contains a liveness-detection step that checks whether the human figure in the image information is a living body.
If no human figure is present, there is no object capable of interaction within the robot's current field of view; step S240 is then executed to output the interactive-object feature information. In this case the output interactive-object feature information is flagged to indicate that no interactive object is currently present. In the subsequent interactive behavior output step, the robot can then output the interactive behavior preset for the no-interactive-object case according to this feature information (the no-interactive-object flag).
If an interactive object is present, its interactive-object features are analyzed further. Step S220, the face-image separation step, is executed first: the interactive-object face image is parsed out of the human-figure image confirmed in step S210. Step S231, the face-image locating step, is executed next: the interactive-object face image is located (that is, the face/head of the current interactive object is located). After step S231 is completed, step S240 can be executed to output interactive-object feature information containing the face-image location information (the interactive object's face/head position). In the subsequent interactive behavior output step, the robot can output the corresponding interactive behavior according to this location information (for example, turning its head so that the robot's face/eyes face the interactive object).
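One way the human-figure detection and face-locating steps (S210, S220, S231) could be realized is sketched below with OpenCV's stock Haar-cascade face detector; the patent does not mandate OpenCV, and the liveness check of step S210 is left as a placeholder comment:

```python
import cv2  # OpenCV is one possible implementation choice; the patent does not mandate it

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def locate_interactive_object(frame_bgr):
    """Steps S210/S220/S231 sketch: find a face region and return its bounding box and
    center. Returns None when nothing is found, which the caller records as 'no
    interactive object'. The liveness check of step S210 (telling a real person from a
    poster or mannequin) is not implemented here; depth, motion or blink cues could serve."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])     # largest face = nearest person
    return {"bbox": (int(x), int(y), int(w), int(h)),
            "center": (int(x + w / 2), int(y + h / 2))}    # used to aim the head/eyes
```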
In this embodiment the interactive-object face image is also analyzed further. After step S220, step S232 may also be executed to parse the face image and determine the interactive-object identity, specifically:
parsing the interactive-object face image from the image information when the image information contains a human figure;
extracting and analyzing facial feature information from the interactive-object face image;
determining the interactive-object identity characterized by the facial feature information.
Step S240 can be executed after step S232 to output interactive-object feature information containing the interactive-object identity information. In the subsequent interactive behavior output step, the robot can then output the corresponding interactive behavior according to the identity information (for example, applying different interaction strategies to the robot's owner and to a stranger).
Further, after step S220, step S233 may also be executed to parse the face image and determine the interactive-object mood, specifically:
parsing the interactive-object face image from the image information when the image information contains a human figure;
extracting and analyzing facial feature information from the interactive-object face image;
determining the interactive-object mood characterized by the facial feature information.
Step S240 can be executed after step S233 to output interactive-object feature information containing the interactive-object mood information. In the subsequent interactive behavior output step, the robot can then output the corresponding interactive behavior according to the mood information (for example, adopting different interaction strategies when the interactive object is angry or sad).
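The identity and mood determination of steps S232/S233 could, for instance, be built on a face-embedding model plus a nearest-neighbour match against enrolled users; the sketch below only fixes the interfaces, and the embedding function, distance metric and thresholds are explicit placeholders:

```python
import numpy as np

def face_embedding(face_crop) -> np.ndarray:
    """Hypothetical embedding hook: any face-embedding model (e.g. a CNN) could supply this."""
    raise NotImplementedError("plug in a real face-embedding model here")

KNOWN_FACES = {}   # name -> stored embedding, enrolled beforehand (e.g. the robot's owner)

def identify(face_crop, threshold=0.6):
    """Step S232 sketch: nearest enrolled embedding wins if it is close enough,
    otherwise the person is treated as a stranger."""
    emb = face_embedding(face_crop)
    best_name, best_dist = "stranger", float("inf")
    for name, ref in KNOWN_FACES.items():
        dist = float(np.linalg.norm(emb - ref))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else "stranger"

def estimate_emotion(face_crop):
    """Step S233 sketch: only the label set is suggested here; a trained expression
    classifier would produce the actual prediction."""
    labels = ["neutral", "happy", "angry", "sad"]
    return labels[0]   # placeholder prediction
```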
Further, in an embodiment of the method of the invention, interactive-environment feature information is also obtained from the analysis of the image information. As shown in Fig. 3, step S300 is executed first to collect image information. Step S310 is then executed to separate, from the image information, the background image outside the interactive-object image. The background image is then parsed to determine the interaction environment in which the interactive object/robot is located (whether it is indoors, the current weather and lighting conditions, whether other objects or people are nearby, etc.) (step S320). Finally, interactive-environment feature information containing the analysis result is output (step S330). In the subsequent interactive behavior output step, the robot can output the corresponding interactive behavior according to the interaction environment (for example, reminding the interactive object to take sun-protection measures before going out when the sunlight outside is too strong).
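A rough illustration of the background-image analysis of steps S310-S330, using nothing more than a brightness statistic over the non-person region; the thresholds and the returned cues are assumptions chosen to mirror the sunlight example above:

```python
import numpy as np

def environment_from_background(frame_gray, face_bbox=None):
    """Steps S310-S330 sketch: mask out the interactive-object region and read coarse
    cues from what remains; the brightness thresholds are illustrative only."""
    bg = np.asarray(frame_gray, dtype=float).copy()
    if face_bbox is not None:
        x, y, w, h = face_bbox
        bg[y:y + h, x:x + w] = np.nan          # exclude the person from the statistics
    mean_brightness = float(np.nanmean(bg))
    return {
        "bright_scene": mean_brightness > 170,  # e.g. strong sunlight: suggest sun protection
        "dim_scene": mean_brightness < 60,      # e.g. lights off
    }
```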
Among the external input information obtained by the robot, sound information is also a very important kind. In one embodiment of the invention, the sound information is analyzed not only to obtain the user's interactive input (such as the user's voice interaction or sound instructions), but also to obtain interactive-object feature information and interactive-environment feature information.
As shown in Fig. 4, sound information is collected in step S400 and is then analyzed. In this embodiment, the sound information is first monitored to determine whether a user (interactive object) with an interaction demand is currently present. Step S410, the interactive-object speech detection step, detects whether the collected sound information contains interactive-object speech.
If no interactive-object speech is contained, no user is issuing voice interaction within the robot's current sound-collection range; step S440 is then executed to output the interactive-object feature information. In this case the output interactive-object feature information is flagged to indicate that no interactive object is currently present. In the subsequent interactive behavior output step, the robot can then output the interactive behavior preset for the no-interactive-object case according to this feature information (the no-interactive-object flag).
If interactive-object speech is present (an interactive object exists), the interactive-object features are analyzed further. Step S420, the interactive-object speech separation step, is executed first: the interactive-object speech is parsed out of the sound information. Step S431, the interactive-object locating step, is executed next: sound-source analysis is performed on the interactive-object speech to locate the position from which it was uttered (the interactive object's position). After step S431 is completed, step S440 can be executed to output interactive-object feature information containing the interactive-object location information. In the subsequent interactive behavior output step, the robot can output the corresponding interactive behavior according to this location information (for example, turning its head so that the robot's face/eyes face the interactive object).
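Steps S410 and S431 could be approximated, for example, by an energy-based voice-activity check and a classic two-microphone time-delay (TDOA) bearing estimate; all constants below (frame length, energy threshold, microphone spacing) are illustrative:

```python
import numpy as np

def contains_speech(mono, frame_len=400, energy_thresh=0.01):
    """Step S410 sketch: a crude frame-energy voice-activity check on a mono signal
    scaled to [-1, 1]; real systems would use a trained VAD model."""
    mono = np.asarray(mono, dtype=float)
    if len(mono) < frame_len:
        return False
    frames = mono[: len(mono) // frame_len * frame_len].reshape(-1, frame_len)
    return bool((frames ** 2).mean(axis=1).max() > energy_thresh)

def direction_of_arrival(left, right, sample_rate=16000, mic_spacing_m=0.1):
    """Step S431 sketch: bearing of the speaker from the inter-microphone time delay
    (classic TDOA); microphone spacing and geometry are assumptions."""
    corr = np.correlate(np.asarray(left, float), np.asarray(right, float), mode="full")
    delay_s = (corr.argmax() - (len(right) - 1)) / sample_rate
    speed_of_sound = 343.0
    sin_theta = np.clip(delay_s * speed_of_sound / mic_spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))   # angle to turn the head toward
```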
In this embodiment the interactive-object speech is also analyzed further. After step S420, step S432 may also be executed to parse the interactive-object speech and determine the interactive-object identity; specifically, voiceprint analysis is performed on the interactive-object speech to determine the user identity corresponding to the speech.
Step S440 can be executed after step S432 to output interactive-object feature information containing the interactive-object identity information. In the subsequent interactive behavior output step, the robot can then output the corresponding interactive behavior according to the identity information (for example, applying different interaction strategies to the robot's owner and to a stranger).
Further, after step S420, step S433 may also be executed to parse the interactive-object speech and determine the interactive-object mood; specifically, voiceprint analysis is performed on the interactive-object speech to determine the interactive-object mood it characterizes.
Step S440 can be executed after step S433 to output interactive-object feature information containing the interactive-object mood information. In the subsequent interactive behavior output step, the robot can then output the corresponding interactive behavior according to the mood information (for example, adopting different interaction strategies when the interactive object is angry or sad).
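The voiceprint-based identity and mood determination of steps S432/S433 can be sketched in the same pattern as the face case: an embedding hook (left unimplemented here) plus a cosine-similarity match, and a crude energy cue standing in for a trained emotion classifier:

```python
import numpy as np

def voiceprint(samples) -> np.ndarray:
    """Hypothetical speaker-embedding hook (e.g. an i-vector/x-vector model)."""
    raise NotImplementedError("plug in a real speaker-embedding model here")

ENROLLED_VOICES = {}   # name -> reference voiceprint, enrolled beforehand

def speaker_identity(samples, threshold=0.7):
    """Step S432 sketch: cosine similarity against enrolled voiceprints; below the
    (illustrative) threshold the speaker is treated as a stranger."""
    emb = voiceprint(samples)
    best_name, best_score = "stranger", -1.0
    for name, ref in ENROLLED_VOICES.items():
        score = float(emb @ ref / (np.linalg.norm(emb) * np.linalg.norm(ref)))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else "stranger"

def speaker_emotion(samples):
    """Step S433 sketch: mean energy as a crude arousal cue; a trained classifier
    over prosodic features would replace this in practice."""
    energy = float(np.mean(np.asarray(samples, dtype=float) ** 2))
    return "agitated" if energy > 0.05 else "calm"
```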
Further, in an embodiment of the method of the invention, interactive-environment feature information is also obtained from the analysis of the sound information. As shown in Fig. 5, step S500 is executed first to collect sound information. Step S510 is then executed to separate, from the sound information, the background sound outside the interactive-object speech. The background sound information is then parsed to determine the interaction environment in which the interactive object/robot is located (whether it is in a quiet room or near a road with heavy traffic, whether other people are nearby, etc.) (step S520). Finally, interactive-environment feature information containing the analysis result is output (step S530). In the subsequent interactive behavior output step, the robot can output the corresponding interactive behavior according to the interaction environment (for example, when the background sound contains heavy traffic and horn noise, indicating a busy road nearby, reminding the user to mind traffic safety before leaving).
It should be noted that, in the embodiments shown in Fig. 2 or Fig. 4, the interactive-object feature information finally output contains an indication of whether an interactive object is present, the interactive-object face location information, the interactive-object identity information and the interactive-object mood information. Of course, in a concrete implementation, one or more of these items may be empty depending on the specific input information. Furthermore, in other embodiments of the invention, steps may be simplified according to the specific interaction demand by cancelling the generation of one or more of the above items (for example, step S232 or step S432 may be cancelled in situations where different interaction strategies need not be applied for different user identities).
In addition, in the embodiments shown in Fig. 2 to Fig. 5, the interactive-object feature information and the interactive-environment feature information are obtained from the analysis of the image information and of the sound information, respectively. It should be pointed out, however, that in a concrete implementation the embodiments of Fig. 2 to Fig. 5 are each only part of an overall execution flow. Relying solely on any single one of the embodiments of Fig. 2 to Fig. 5 cannot yield sufficiently accurate and reliable information.
For example, step S210 judges whether an object capable of interaction exists by detecting whether a human figure is present in the image information; but when no human figure is present in the image information, the user may simply be outside the robot's line of sight and may still interact with the robot by voice. Likewise, step S410 judges whether an object capable of interaction exists by detecting whether the sound information contains interactive-object speech; but when no interactive-object speech is present in the sound information, the user may simply not have made any sound while still standing in front of the robot.
Therefore, the judgment of step S210 or S410 is not a definitive result; it can only assist the robot in making a further determination. In view of this, in one embodiment of the invention steps S210 and S410 are executed simultaneously, and the results of human-figure detection and interactive-object speech detection are combined to judge whether an interactive object is currently present.
As shown in Fig. 6, step S600 is executed first to collect image information and sound information. Step S610, the human-figure detection step (including the liveness-detection step), is then executed to detect whether a human figure is present in the collected image information.
If a human figure is present, step S611, the face-image separation step, is executed to parse the interactive-object face image out of the human-figure image confirmed in step S610. Step S612 is executed next to parse the face image and obtain the face-image parsing result. At the same time, step S613, the face-image locating step, is executed to locate the interactive-object face image.
While the interactive-object face image is being processed (or before/after), the robot processes the sound information. Step S630, the interactive-object speech detection step, is executed to detect whether the collected sound information contains interactive-object speech.
If the sound information contains interactive-object speech, step S631 is executed to separate the interactive-object speech, and step S632 is then executed to parse the interactive-object speech.
After the interactive-object speech and the interactive-object face image have been parsed (steps S632 and S612 have both finished), step S640 is executed to analyze the image analysis result and the speech analysis result together and determine the interactive-object feature information (the interactive object's identity, mood, etc.). In particular, when the result of step S630 is that the sound information contains no interactive-object speech, only the image analysis result is analyzed in step S640.
After step S613 and/or step S640 is completed, step S650 can be executed to output the interactive-object feature information (including the interactive-object face location information, the interactive-object identity and/or the interactive-object mood).
In particular, in the above flow, if the detection result of step S610 is that no human figure is present, step S620, the interactive-object speech detection step, is executed to detect whether the collected sound information contains interactive-object speech. If the sound information contains no interactive-object speech (and no human figure is present in the image information either), no interactive object is currently present; step S650 is then executed to output the interactive-object feature information, which in this case is flagged to indicate that no interactive object is currently present.
If interactive-object speech is present (but no human figure is present in the image information), the interactive-object speech is analyzed further. Step S621, the interactive-object speech separation step, is executed first to parse the interactive-object speech out of the sound information. Step S622, the interactive-object locating step, is executed next: sound-source analysis is performed on the interactive-object speech to locate the position from which it was uttered (the interactive object's position).
After step S622 is completed, interactive-object feature information containing the interactive-object location information can be output. The robot can turn its head according to the interactive-object location information so that the robot's eyes (the camera that acquires the image information) face the interactive object (step S623). Human-figure detection is then performed again (step S624); the image information obtained by the robot's eyes (the image-acquisition camera) now contains the human figure (the user image).
Next, step S661 is executed to separate the face image (as in step S611); step S662 is further executed to locate the face image (as in step S613) and step S663 to parse the face image (as in step S612), while step S625 is executed simultaneously to parse the interactive-object speech (as in step S632); step S664 is then executed to analyze the image/speech analysis results together (as in step S640). Finally, the interactive-object feature information is output in step S650.
In particular, if after the renewed human-figure detection (step S624) the image information obtained by the robot still contains no human figure (the user image), for example because the line of sight is blocked, step S625 is executed directly to parse the current user speech (as in step S632), and the result of the speech analysis is finally output in step S650.
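The fusion logic of Fig. 6 can be summarized in a single decision function; the dictionaries below are assumed to come from the image- and sound-analysis sketches earlier, and `turn_head_to` stands for whatever head-actuation interface the robot exposes:

```python
def fuse_perception(image_result, audio_result, turn_head_to=None):
    """Fig. 6 sketch: neither the camera nor the microphone alone decides whether an
    interactive object exists. image_result is the face dict (or None) from the image
    branch; audio_result is a dict such as {"speech": True, "angle_deg": 30.0}."""
    if image_result is not None:                       # human figure seen (S610-S613)
        features = {"present": True, "face": image_result}
        if audio_result and audio_result.get("speech"):
            features["voice"] = audio_result           # fused with the image result (S640)
        return features
    if audio_result and audio_result.get("speech"):    # voice only (S620-S622)
        if turn_head_to is not None:
            turn_head_to(audio_result["angle_deg"])    # aim the camera, then re-detect (S623/S624)
        return {"present": True, "voice": audio_result, "face": None}
    return {"present": False}                          # neither channel found anyone
```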
Further, in an embodiment of the invention, the analysis results of the background image (the embodiment shown in Fig. 3) and of the background sound (the embodiment shown in Fig. 5) are likewise combined when the interactive-environment feature information is obtained.
The invention also provides a robot system based on the interaction method of the invention. As shown in Fig. 7, in one embodiment of the invention the robot system includes an acquisition module 700, an input analysis module 710, an interaction-scenario generation module 730, a semantic parsing module 720 and an interaction output module 740.
The acquisition module 700 is configured to collect multimodal external input information. In this embodiment, the acquisition module 700 includes a text information acquisition device 701, an image information acquisition device 702, a sound information acquisition device 703 and a sensor information acquisition device 704. Further, the acquisition module 700 also includes a robot self-test information acquisition device, which may be implemented by the self-test components of the robot hardware or in a combined hardware/software manner; no limitation is imposed here.
It should be pointed out that, according to specific requirements, in other embodiments of the invention the acquisition module 700 may contain only one or several of the above devices, or may contain devices with other acquisition functions.
The input analysis module 710 is configured to analyze the external input information to determine the interactive input information, the interactive-object feature information and the interactive-environment feature information. The interaction-scenario generation module 730 is configured to analyze the interactive-object feature information and the interactive-environment feature information to obtain a matching interaction-scenario constraint. The semantic parsing module 720 is configured to perform semantic parsing on the interactive input information to obtain the interaction intention of the interactive object. The interaction output module 740 is configured to produce, under the interaction-scenario constraint, multimodal interactive behavior output according to the interaction intention.
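A minimal wiring of the five modules of Fig. 7 follows, treating each module as an injected callable so that any concrete implementation (including the sketches above) can be plugged in; the class and parameter names are illustrative:

```python
class RobotSystem:
    """Skeleton of the Fig. 7 architecture; each module is supplied as a callable."""

    def __init__(self, acquire, analyze, make_scenario, parse_semantics, act):
        self.acquire = acquire                  # acquisition module 700
        self.analyze = analyze                  # input analysis module 710
        self.make_scenario = make_scenario      # interaction-scenario generation module 730
        self.parse_semantics = parse_semantics  # semantic parsing module 720
        self.act = act                          # interaction output module 740

    def step(self):
        raw = self.acquire()
        interactive_input, obj_features, env_features = self.analyze(raw)
        scenario = self.make_scenario(obj_features, env_features)
        intent = self.parse_semantics(interactive_input)
        return self.act(intent, scenario)
```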
Compared with the prior art, the system of the invention better simulates the analysis and generation process of human interactive behavior in person-to-person interaction, thereby producing more natural and lively interactive output and greatly improving the user experience of the robot.
Specifically, in this embodiment the input analysis module 710 includes, for the image information, a human-figure confirmation device 711, a face-image locating device 712 and a face-image parsing device 713. The human-figure confirmation device 711 is configured to monitor whether the image information contains a human figure, to determine whether an interactive object is present. The face-image locating device 712 is configured to parse the interactive-object face image from the image information when the image information contains a human figure, and to locate the interactive-object face image. The face-image parsing device 713 is configured to extract and analyze facial feature information from the interactive-object face image, and to determine the interactive-object mood or interactive-object identity characterized by the facial feature information.
The input analysis module 710 also includes, for the sound information, an interactive-object speech confirmation device 714, a speech locating device 715 and an interactive-object speech parsing device 716. The interactive-object speech confirmation device 714 is configured to monitor whether the sound information contains interactive-object speech, to determine whether an interactive object is present. The speech locating device 715 is configured to perform sound-source localization on the interactive-object speech, when the sound information contains interactive-object speech, to determine the interactive object's position. The interactive-object speech parsing device 716 is configured to perform voiceprint parsing on the interactive-object speech to determine the interactive-object mood or interactive-object identity it characterizes.
It should be noted that, according to specific requirements, in other embodiments of the invention the input analysis module 710 may contain only one or several of the above devices, or may contain devices with other analysis functions.
Although the embodiments of the present invention are disclosed as above, the content described is only an embodiment adopted to facilitate understanding of the invention and is not intended to limit it. The method of the invention may also have various other embodiments. Without departing from the essence of the invention, those skilled in the art may make various corresponding changes or modifications in accordance with the invention, and all such changes or modifications shall fall within the scope of protection of the claims of the invention.

Claims (10)

1. A robot interaction method, characterized in that the method comprises the following steps:
collecting multimodal external input information, the external input information including text information, image information, sound information, robot self-test information and sensor information;
analyzing the external input information to obtain interactive input information, interactive-object feature information and interactive-environment feature information;
analyzing the interactive-object feature information and the interactive-environment feature information to obtain a matching interaction-scenario constraint;
performing semantic parsing on the interactive input information to obtain the interaction intention of the interactive object;
under the interaction-scenario constraint, producing multimodal interactive behavior output according to the interaction intention.
2. The method according to claim 1, characterized in that analyzing the external input information to determine the interactive-object feature information comprises:
monitoring whether the image information contains a human figure, to determine whether an object capable of interaction is present.
3. The method according to claim 2, characterized in that analyzing the external input information to determine the interactive-object feature information comprises:
parsing an interactive-object face image from the image information when the image information contains a human figure;
locating the interactive-object face image.
4. The method according to claim 3, characterized in that analyzing the external input information to determine the interactive-object feature information comprises:
parsing an interactive-object face image from the image information when the image information contains a human figure;
extracting and analyzing facial feature information from the interactive-object face image;
determining the interactive-object mood or interactive-object identity characterized by the facial feature information.
5. The method according to any one of claims 1-4, characterized in that analyzing the external input information to determine the interactive-object feature information comprises:
monitoring whether the sound information contains interactive-object speech, to determine whether an object capable of interaction is present.
6. The method according to claim 5, characterized in that analyzing the external input information to determine the interactive-object feature information comprises:
separating the interactive-object speech when the sound information contains interactive-object speech;
parsing the interactive-object speech to determine the interactive-object mood or user identity characterized by the interactive-object speech.
7. A robot system, characterized in that the system comprises:
an acquisition module configured to collect multimodal external input information, the acquisition module including a text information acquisition device, an image information acquisition device, a sound information acquisition device, a robot self-test information acquisition device and a sensor information acquisition device;
an input analysis module configured to analyze the external input information to obtain interactive input information, interactive-object feature information and interactive-environment feature information;
an interaction-scenario generation module configured to analyze the interactive-object feature information and the interactive-environment feature information to obtain a matching interaction-scenario constraint;
a semantic parsing module configured to perform semantic parsing on the interactive input information to obtain the interaction intention of the interactive object;
an interaction output module configured to produce, under the interaction-scenario constraint, multimodal interactive behavior output according to the interaction intention.
8. The system according to claim 7, characterized in that the input analysis module includes a human-figure confirmation device configured to monitor whether the image information contains a human figure, to determine whether an interactive object is present.
9. The system according to claim 8, characterized in that the input analysis module further includes a face-image locating device configured to:
parse an interactive-object face image from the image information when the image information contains a human figure;
locate the interactive-object face image.
10. The system according to claim 9, characterized in that the input analysis module further includes a face-image parsing device configured to:
extract and analyze facial feature information from the interactive-object face image;
determine the interactive-object mood or interactive-object identity characterized by the facial feature information.
CN201610179223.8A 2016-03-25 2016-03-25 Robot interaction method and robot system Active CN105843118B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610179223.8A CN105843118B (en) 2016-03-25 2016-03-25 Robot interaction method and robot system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610179223.8A CN105843118B (en) 2016-03-25 2016-03-25 Robot interaction method and robot system

Publications (2)

Publication Number Publication Date
CN105843118A CN105843118A (en) 2016-08-10
CN105843118B true CN105843118B (en) 2018-07-27

Family

ID=56583915

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610179223.8A Active CN105843118B (en) 2016-03-25 2016-03-25 Robot interaction method and robot system

Country Status (1)

Country Link
CN (1) CN105843118B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106773820B (en) * 2016-12-02 2019-07-19 北京奇虎科技有限公司 Robot interactive approach, device and robot
CN108614987A (en) * 2016-12-13 2018-10-02 深圳光启合众科技有限公司 The method, apparatus and robot of data processing
CN108227906B (en) * 2016-12-22 2021-04-23 深圳大森智能科技有限公司 Man-machine interaction method and device
CN107053191B (en) 2016-12-31 2020-05-08 华为技术有限公司 Robot, server and man-machine interaction method
CN107016402A (en) * 2017-02-20 2017-08-04 北京光年无限科技有限公司 A kind of man-machine interaction method and device for intelligent robot
CN106959839A (en) * 2017-03-22 2017-07-18 北京光年无限科技有限公司 A kind of human-computer interaction device and method
CN107221332A (en) * 2017-06-28 2017-09-29 上海与德通讯技术有限公司 The exchange method and system of robot
CN107180115A (en) * 2017-06-28 2017-09-19 上海与德通讯技术有限公司 The exchange method and system of robot
CN107704169B (en) * 2017-09-26 2020-11-17 北京光年无限科技有限公司 Virtual human state management method and system
CN108052506B (en) * 2017-12-28 2021-06-29 Oppo广东移动通信有限公司 Natural language processing method, device, storage medium and electronic equipment
CN108334583B (en) * 2018-01-26 2021-07-09 上海智臻智能网络科技股份有限公司 Emotion interaction method and device, computer readable storage medium and computer equipment
WO2019144542A1 (en) 2018-01-26 2019-08-01 Institute Of Software Chinese Academy Of Sciences Affective interaction systems, devices, and methods based on affective computing user interface
CN108780361A (en) * 2018-02-05 2018-11-09 深圳前海达闼云端智能科技有限公司 Human-computer interaction method and device, robot and computer readable storage medium
CN108733208A (en) * 2018-03-21 2018-11-02 北京猎户星空科技有限公司 The I-goal of smart machine determines method and apparatus
CN110969053B (en) * 2018-09-29 2023-12-22 深圳市神州云海智能科技有限公司 Method and device for classifying players and lottery robot

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1956528A2 (en) * 2007-02-08 2008-08-13 Samsung Electronics Co., Ltd. Apparatus and Method for Expressing Behavior of Software Robot
CN101604204A (en) * 2009-07-09 2009-12-16 北京科技大学 Distributed cognitive technology for intelligent emotional robot
CN101618280A (en) * 2009-06-30 2010-01-06 哈尔滨工业大学 Humanoid-head robot device with human-computer interaction function and behavior control method thereof
CN101661569A (en) * 2009-09-18 2010-03-03 北京科技大学 Intelligent emotional robot multi-modal behavioral associative expression system
CN103413113A (en) * 2013-01-15 2013-11-27 上海大学 Intelligent emotional interaction method for service robot
CN103679203A (en) * 2013-12-18 2014-03-26 江苏久祥汽车电器集团有限公司 Robot system and method for detecting human face and recognizing emotion
CN104023063A (en) * 2014-06-11 2014-09-03 合肥工业大学 Robot cloud system
US8965576B2 (en) * 2012-06-21 2015-02-24 Rethink Robotics, Inc. User interfaces for robot training
CN104965552A (en) * 2015-07-03 2015-10-07 北京科技大学 Intelligent home environment cooperative control method and system based on emotion robot
CN105058389A (en) * 2015-07-15 2015-11-18 深圳乐行天下科技有限公司 Robot system, robot control method, and robot
CN105082150A (en) * 2015-08-25 2015-11-25 国家康复辅具研究中心 Robot man-machine interaction method based on user mood and intension recognition

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8892256B2 (en) * 2008-01-28 2014-11-18 Seegrid Corporation Methods for real-time and near real-time interactions with robots that service a facility
US8818556B2 (en) * 2011-01-13 2014-08-26 Microsoft Corporation Multi-state model for robot and user interaction

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1956528A2 (en) * 2007-02-08 2008-08-13 Samsung Electronics Co., Ltd. Apparatus and Method for Expressing Behavior of Software Robot
CN101618280A (en) * 2009-06-30 2010-01-06 哈尔滨工业大学 Humanoid-head robot device with human-computer interaction function and behavior control method thereof
CN101604204A (en) * 2009-07-09 2009-12-16 北京科技大学 Distributed cognitive technology for intelligent emotional robot
CN101661569A (en) * 2009-09-18 2010-03-03 北京科技大学 Intelligent emotional robot multi-modal behavioral associative expression system
US8965576B2 (en) * 2012-06-21 2015-02-24 Rethink Robotics, Inc. User interfaces for robot training
CN103413113A (en) * 2013-01-15 2013-11-27 上海大学 Intelligent emotional interaction method for service robot
CN103679203A (en) * 2013-12-18 2014-03-26 江苏久祥汽车电器集团有限公司 Robot system and method for detecting human face and recognizing emotion
CN104023063A (en) * 2014-06-11 2014-09-03 合肥工业大学 Robot cloud system
CN104965552A (en) * 2015-07-03 2015-10-07 北京科技大学 Intelligent home environment cooperative control method and system based on emotion robot
CN105058389A (en) * 2015-07-15 2015-11-18 深圳乐行天下科技有限公司 Robot system, robot control method, and robot
CN105082150A (en) * 2015-08-25 2015-11-25 国家康复辅具研究中心 Robot man-machine interaction method based on user mood and intension recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Interaction intention detection method based on head-eye behavior (基于头眼行为的交互意图检测方法); Zhao Qijie et al.; Chinese Journal of Scientific Instrument (仪器仪表学报); 2014-10-15; Vol. 35, No. 10; full text *

Also Published As

Publication number Publication date
CN105843118A (en) 2016-08-10

Similar Documents

Publication Publication Date Title
CN105843118B (en) Robot interaction method and robot system
CN105868827B (en) A kind of multi-modal exchange method of intelligent robot and intelligent robot
CN107030691B (en) Data processing method and device for nursing robot
McColl et al. A survey of autonomous human affect detection methods for social robots engaged in natural HRI
CN107894833B (en) Multi-modal interaction processing method and system based on virtual human
CN106956271B (en) Predict the method and robot of affective state
CN112034977B (en) Method for MR intelligent glasses content interaction, information input and recommendation technology application
Zhang et al. Intelligent facial emotion recognition and semantic-based topic detection for a humanoid robot
CN106997243B (en) Speech scene monitoring method and device based on intelligent robot
CN108942919B (en) Interaction method and system based on virtual human
CN106361356A (en) Emotion monitoring and early warning method and system
CN109086860B (en) Interaction method and system based on virtual human
CN106933345A (en) For the multi-modal exchange method and device of intelligent robot
CN106502382B (en) Active interaction method and system for intelligent robot
CN105912530A (en) Intelligent robot-oriented information processing method and system
JP2021146214A (en) Techniques for separating driving emotion from media induced emotion in driver monitoring system
Qi-rong Research on intelligent tutoring system based on affective model
CN108960191B (en) Multi-mode fusion emotion calculation method and system for robot
WO2020175969A1 (en) Emotion recognition apparatus and emotion recognition method
US20230129746A1 (en) Cognitive load predictor and decision aid
JP7469467B2 (en) Digital human-based vehicle interior interaction method, device, and vehicle
CA3206212A1 (en) Methods and systems enabling natural language processing, understanding and generation
Zhang et al. ECMER: Edge-Cloud Collaborative Personalized Multimodal Emotion Recognition Framework in the Internet of Vehicles
US20220009082A1 (en) Method for controlling a plurality of robot effectors
Mansouri Benssassi et al. Wearable assistive technologies for autism: opportunities and challenges

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant