WO2018000268A1 - Method and system for generating robot interaction content, and robot - Google Patents
- Publication number
- WO2018000268A1 (PCT/CN2016/087753)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- robot
- time axis
- signal
- life
- generating
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J11/00—Manipulators not otherwise provided for
- B25J11/0005—Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J11/00—Manipulators not otherwise provided for
- B25J11/0005—Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means
- B25J11/001—Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means with emotions simulating means
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1628—Programme controls characterised by the control loop
- B25J9/163—Programme controls characterised by the control loop learning, adaptive, model based, rule based expert control
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1694—Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J9/00—Programme-controlled manipulators
- B25J9/16—Programme controls
- B25J9/1694—Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
- B25J9/1697—Vision controlled systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
Definitions
- The invention relates to the field of robot interaction technology, and in particular to a method, a system and a robot for generating robot interaction content.
- During human interaction, an expression is usually made after the eyes see or the ears hear something and the brain has analyzed it, producing reasonable expression feedback. A person's life scenes on the time axis of a given day, such as eating, sleeping and exercising, vary, and changes in these scene values affect the feedback of human expressions.
- For robots, expression feedback is currently produced mainly through pre-designed methods and corpora used for deep-learning training. Feedback produced in this way has the following disadvantage: the output expression depends on the user's text, so that, much like a question-and-answer machine, different user utterances trigger different expressions, and the robot still outputs expressions according to a human pre-designed interaction mode, which keeps the robot from being truly anthropomorphic.
- The object of the present invention is to provide a method, a system and a robot for generating robot interaction content, so that the robot itself has a human lifestyle within its actively interactive variable parameters, improving the anthropomorphism of robot interaction content generation, the human-computer interaction experience, and intelligence.
- A method for generating robot interaction content comprises: acquiring a multi-modal signal; determining a user intention according to the multi-modal signal; and generating robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis.
- The method for generating the parameters of the robot life time axis includes: expanding the self-cognition of the robot; obtaining the parameters of the life time axis; and fitting the self-cognition parameters of the robot to the parameters in the life time axis to generate the robot life time axis.
- The step of expanding the self-cognition of the robot specifically comprises: combining life scenes with the robot's self-cognition to form a self-cognition curve based on the life time axis.
- The step of fitting the self-cognition parameters of the robot to the parameters in the life time axis comprises: using a probability algorithm to calculate the probability of change of each robot parameter on the life time axis after a time-axis scene parameter changes, forming a fitted curve.
- The life time axis refers to a time axis covering the 24 hours of a day; the parameters in the life time axis include at least the daily life behaviors performed by the user on the life time axis and parameter values representing those behaviors, as pictured in the sketch below.
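To make this data layout concrete, the following Python sketch (illustrative only; the class and method names are assumptions, not part of the patent) shows one way to hold the life time axis as 24 hourly slots, each mapping a daily behavior to a parameter value representing it.

```python
from dataclasses import dataclass, field

@dataclass
class LifeTimeAxis:
    # hour (0-23) -> {behavior name: parameter value representing the behavior}
    slots: dict = field(default_factory=dict)

    def set_behavior(self, hour, behavior, value):
        # record a daily-life behavior and its representative parameter value
        self.slots.setdefault(hour % 24, {})[behavior] = value

    def behaviors_at(self, hour):
        # behaviors expected in the given hourly slot
        return self.slots.get(hour % 24, {})

axis = LifeTimeAxis()
axis.set_behavior(7, "get_up", 0.9)
axis.set_behavior(12, "eat_lunch", 0.8)
axis.set_behavior(22, "sleep", 0.95)
print(axis.behaviors_at(12))  # {'eat_lunch': 0.8}
```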
- Where the multi-modal signal includes at least an image signal, the step of generating the robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis specifically includes: generating the robot interaction content according to the image signal and the user intention in conjunction with the current robot life time axis.
- Where the multi-modal signal includes at least a voice signal, the step of generating the robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis specifically includes: generating the robot interaction content according to the voice signal and the user intention in conjunction with the current robot life time axis.
- Where the multi-modal signal includes at least a gesture signal, the step of generating the robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis specifically includes: generating the robot interaction content according to the gesture signal and the user intention in conjunction with the current robot life time axis.
- The invention also discloses a system for generating robot interaction content, comprising: an acquisition module configured to acquire a multi-modal signal; an intent recognition module configured to determine a user intention according to the multi-modal signal; and a content generation module configured to generate robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis.
- The system comprises a time-axis-based artificial intelligence cloud processing module configured to: expand the self-cognition of the robot; obtain the parameters of the life time axis; and fit the self-cognition parameters of the robot to the parameters in the life time axis to generate the robot life time axis.
- The time-axis-based artificial intelligence cloud processing module is further configured to combine life scenes with the robot's self-cognition to form a self-cognition curve based on the life time axis.
- The time-axis-based artificial intelligence cloud processing module is further configured to use a probability algorithm to calculate the probability of change of each robot parameter on the life time axis after a time-axis scene parameter changes, forming a fitted curve.
- The life time axis refers to a time axis covering the 24 hours of a day; the parameters in the life time axis include at least the daily life behaviors performed by the user on the life time axis and parameter values representing those behaviors.
- Where the multi-modal signal includes at least an image signal, the content generation module is specifically configured to generate robot interaction content according to the image signal and the user intention in conjunction with the current robot life time axis.
- Where the multi-modal signal includes at least a voice signal, the content generation module is specifically configured to generate robot interaction content according to the voice signal and the user intention in conjunction with the current robot life time axis.
- Where the multi-modal signal includes at least a gesture signal, the content generation module is specifically configured to generate robot interaction content according to the gesture signal and the user intention in conjunction with the current robot life time axis.
- The invention further discloses a robot comprising a system for generating robot interaction content as described above.
- Compared with the prior art, the invention has the following advantages. The method for generating robot interaction content includes: acquiring a multi-modal signal; determining a user intention according to the multi-modal signal; and generating robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis. Multi-modal signals such as image signals and voice signals can thus be combined with the robot's variable parameters to generate robot interaction content more accurately, so the robot interacts and communicates with people more accurately and anthropomorphically. Human daily life has a certain regularity, and the invention adds the life time axis in which the robot lives to the generation of the robot's interaction content, making the robot more humanized when interacting with people and giving the robot a human lifestyle within the life time axis; this improves the anthropomorphism of robot interaction content generation, the human-computer interaction experience, and intelligence.
- FIG. 1 is a flowchart of a method for generating robot interaction content according to Embodiment 1 of the present invention.
- FIG. 2 is a schematic diagram of a system for generating robot interaction content according to Embodiment 2 of the present invention.
- Computer devices include user devices and network devices. User devices or clients include, but are not limited to, computers, smartphones, PDAs, and the like; network devices include, but are not limited to, a single network server, a server group composed of multiple network servers, or a cloud of computers or network servers based on cloud computing. A computer device can operate alone to carry out the invention, or can access a network and carry out the invention through interoperation with other computer devices in the network. The network in which the computer device is located includes, but is not limited to, the Internet, wide area networks, metropolitan area networks, local area networks, VPN networks, and the like.
- The terms "first," "second," and the like may be used herein to describe various units, but the units should not be limited by these terms; the terms are used only to distinguish one unit from another. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items. When a unit is referred to as being "connected" or "coupled" to another unit, it can be directly connected or coupled to the other unit, or intermediate units may be present.
- This embodiment discloses a method for generating robot interaction content, including: acquiring a multi-modal signal; determining a user intention according to the multi-modal signal; and generating robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis. Multi-modal signals such as image signals and voice signals can thus be combined with the robot's variable parameters to generate robot interaction content more accurately, so the robot interacts and communicates with people more accurately and anthropomorphically. Because human daily life has a certain regularity, adding the life time axis in which the robot lives to the generation of its interaction content makes the robot more humanized when interacting with people, improving the anthropomorphism of robot interaction content generation, the human-computer interaction experience, and intelligence.
- The interaction content can be an expression, text, voice, or the like. The robot life time axis 300 is fitted and set in advance; specifically, it is a collection of parameters that is transmitted to the system to generate the interaction content.
- The multi-modal information in this embodiment may be one or more of user expression, voice information, gesture information, scene information, image information, video information, face information, pupil and iris information, light-sensing information, and fingerprint information. In this embodiment, a picture signal plus a voice signal plus a gesture signal is preferred, which makes recognition both accurate and efficient.
- The life time axis works as follows: the robot is fitted to the time axis of human daily life, and the robot's behavior follows this fitted line; that is, the robot's own behavior over a day is obtained, so the robot can carry out its own behavior based on the life time axis, such as generating interaction content and communicating with humans. If the robot stays awake, it acts according to the behaviors on this time axis, and the robot's self-cognition is updated accordingly along the same axis.
- The life time axis and the variable parameters can change attributes in the self-cognition, such as the mood value and the fatigue value, and can also automatically add new self-cognition information. For example, if there was previously no anger value, a scene based on the life time axis and variable factors is automatically added to the robot's self-cognition according to scenes that previously simulated human self-cognition, as illustrated in the sketch below.
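As an illustration of this self-cognition update (all attribute names and magnitudes below are assumptions for illustration, not values from the patent), the following Python sketch shows a state whose existing attributes are changed by time-axis scenes and which gains new attributes, such as an anger value, when a scene first introduces them.

```python
class SelfCognition:
    def __init__(self):
        # a few of the attributes named in this patent, with assumed start values
        self.attributes = {"mood": 0.5, "fatigue": 0.2, "intimacy": 0.1}

    def apply_scene(self, scene_effects):
        # scene_effects: {attribute name: delta} contributed by a time-axis scene
        for name, delta in scene_effects.items():
            # attributes not seen before are added automatically, as described above
            self.attributes[name] = self.attributes.get(name, 0.0) + delta

robot = SelfCognition()
robot.apply_scene({"fatigue": 0.3})   # e.g. an "exercise" scene raises fatigue
robot.apply_scene({"anger": 0.1})     # a new attribute is added on the fly
print(robot.attributes)
```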
- For example, suppose the multi-modal signal is the user saying to the robot by voice, "I'm so sleepy," possibly together with a picture signal. The robot makes a comprehensive judgment from the voice signal plus the picture signal and recognizes the user's intention as being very sleepy. Combined with the robot life time axis, if the current time is 9 a.m., the robot knows the owner has just gotten up, so it should greet the owner, for example answering "Good morning" as a reply, possibly accompanied by an expression, a picture, and so on; the interaction content in the present invention can be understood as the robot's reply. If instead the current time is 9 p.m., the robot knows the owner needs to sleep, so it replies with words such as "Good night, master, sleep well," likewise possibly accompanied by expressions or pictures. This approach is more anthropomorphic, and closer to people's lives, than generating replies and expressions purely from scene recognition.
- The multi-modal signal is generally a combination of several signals, such as a picture signal plus a voice signal, or a picture signal plus a voice signal plus a gesture signal; a sketch of this combined behavior follows.
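The following Python sketch illustrates, under assumed rules and function names (none of them from the patent), how such a combined signal could be mapped to an intention, and how the reply could then depend on the hour on the life time axis, reproducing the "Good morning" / "Good night" behavior described above.

```python
import datetime

def recognize_intent(voice, image_tag=None, gesture=None):
    # A real system would fuse all modalities; in this toy rule set the voice
    # dominates and the picture/gesture signals merely corroborate it.
    if "sleepy" in voice:
        return "user_sleepy"
    if "hungry" in voice:
        return "user_hungry"
    return "unknown"

def generate_reply(intent, now):
    # The same intention yields different replies at different points on the
    # robot life time axis, as in the 9 a.m. / 9 p.m. example above.
    if intent == "user_sleepy":
        return "Good morning!" if now.hour < 12 else "Good night, sleep well."
    if intent == "user_hungry":
        return "Time for breakfast!" if now.hour < 12 else "It's late, eat lightly."
    return "I see."

intent = recognize_intent("I'm so sleepy", image_tag="yawning_face")
print(generate_reply(intent, datetime.datetime(2016, 6, 29, 9, 0)))   # morning reply
print(generate_reply(intent, datetime.datetime(2016, 6, 29, 21, 0)))  # evening reply
```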
- According to one example, the method for generating the parameters of the robot life time axis includes: expanding the self-cognition of the robot; obtaining the parameters of the life time axis; and fitting the self-cognition parameters of the robot to the parameters in the life time axis to generate the robot life time axis. In this way the life time axis is added to the robot's own self-cognition, giving the robot an anthropomorphic life; for example, the cognition of eating lunch at noon is added to the robot.
- According to another example, the step of expanding the self-cognition of the robot specifically includes combining life scenes with the robot's self-cognition to form a self-cognition curve based on the life time axis. In this way the life time axis can be concretely added to the robot's own parameters.
- According to another example, the step of fitting the self-cognition parameters of the robot to the parameters in the life time axis specifically includes: using a probability algorithm to calculate the probability of change of each robot parameter on the life time axis after a time-axis scene parameter changes, forming a fitted curve. In this way the robot's self-cognition parameters can be concretely fitted to the parameters in the life time axis. The probability algorithm may be a Bayesian probability algorithm.
- For example, over the 24 hours of a day, the robot is given actions such as sleeping, exercising, eating, dancing, reading and putting on makeup. Each action affects the robot's own self-cognition; after the parameters on the life time axis are combined and fitted with that self-cognition, the robot's self-cognition comes to include mood, fatigue value, intimacy, favorability, number of interactions, the robot's three-dimensional cognition, age, height, weight, game scene value, game object value, location scene value, location object value, and so on, allowing the robot to identify the scene of its location by itself, such as a cafe or a bedroom. The machine performs different actions within the day's time axis, such as sleeping at night, eating at noon, and exercising during the day, and all of these scenes on the life time axis affect self-cognition. The changes in these values are dynamically fitted with a probability model, fitting the probability that each of these actions occurs at each point on the time axis, as sketched below.
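As a hedged illustration of this dynamic fitting (the patent names only "a probability algorithm", possibly Bayesian, without specifying it further), the following Python sketch estimates, from logged observations, the probability of each action in each hour and uses the per-hour distributions as the fitted curve; a Bayesian treatment would additionally place a prior over actions and update it with these counts.

```python
from collections import Counter, defaultdict

def fit_action_curve(observations):
    """observations: iterable of (hour, action) pairs logged over many days."""
    counts = defaultdict(Counter)
    for hour, action in observations:
        counts[hour % 24][action] += 1
    # normalize counts into a per-hour probability distribution over actions
    curve = {}
    for hour, counter in counts.items():
        total = sum(counter.values())
        curve[hour] = {action: n / total for action, n in counter.items()}
    return curve

log = [(23, "sleep"), (23, "sleep"), (23, "read"), (12, "eat"), (12, "eat")]
curve = fit_action_curve(log)
print(curve[23])  # roughly {'sleep': 0.67, 'read': 0.33}
```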
- According to another example, the multi-modal signal includes at least an image signal, and the step of generating the robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis specifically includes: generating the robot interaction content according to the image signal and the user intention in conjunction with the current robot life time axis. With at least an image signal the robot can grasp the user's intention; to understand the intention better, other signals such as a voice signal or a gesture signal are generally added, so the robot can tell more accurately whether the user genuinely means what is expressed or is joking.
- According to another example, the multi-modal signal includes at least a voice signal, and the step of generating the robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis specifically includes: generating the robot interaction content according to the voice signal and the user intention in conjunction with the current robot life time axis.
- According to another example, the multi-modal signal includes at least a gesture signal, and the step of generating the robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis specifically includes: generating the robot interaction content according to the gesture signal and the user intention in conjunction with the current robot life time axis. For example, suppose the multi-modal signal is the user saying to the robot by voice, "I'm hungry," possibly together with a picture signal. The robot makes a comprehensive judgment from the voice signal plus the picture signal and recognizes the user's intention as being very hungry. Combined with the robot life time axis, if the current time is 9 a.m., the robot replies by suggesting that the user go have breakfast, accompanied by a cute expression.
- If instead the current time is 9 p.m., the robot recognizes the same intention but replies that it is too late and the user should eat less, again accompanied by a cute expression.
- In this embodiment, the voice signal and the picture signal are generally sufficient to understand the user's meaning fairly accurately and thus reply more accurately; adding other signals, such as gesture signals and video signals, makes the result even more accurate.
- A system for generating robot interaction content includes: an acquisition module 201 configured to acquire a multi-modal signal; an intent recognition module 202 configured to determine a user intention according to the multi-modal signal; and a content generation module 203 configured to generate robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis sent by the robot life time axis module 301, as in the sketch below.
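A minimal sketch of this module structure follows (Python; the class names, signatures, and rules are assumptions for illustration, not the patent's implementation): the acquisition module collects the signals, the intent recognition module maps them to an intention, and the content generation module combines the intention with the current point on the life time axis to produce the reply.

```python
class AcquisitionModule:
    def acquire(self):
        # stand-in for camera/microphone capture
        return {"voice": "I'm hungry", "image": "kitchen"}

class IntentRecognitionModule:
    def recognize(self, signals):
        return "user_hungry" if "hungry" in signals["voice"] else "unknown"

class ContentGenerationModule:
    def __init__(self, life_time_axis):
        self.axis = life_time_axis  # e.g. the LifeTimeAxis sketched earlier

    def generate(self, signals, intent, hour):
        # the reply depends on both the intention and the time on the axis
        if intent == "user_hungry":
            return "Go have breakfast!" if hour < 12 else "It's late, eat lightly."
        return "I see."

signals = AcquisitionModule().acquire()
intent = IntentRecognitionModule().recognize(signals)
print(ContentGenerationModule(life_time_axis=None).generate(signals, intent, 9))
```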
- In this way, multi-modal signals such as image signals and voice signals can be combined with the robot's variable parameters to generate robot interaction content more accurately, so the robot interacts and communicates with people more accurately and anthropomorphically. Human daily life has a certain regularity; adding the life time axis in which the robot lives to the generation of its interaction content makes the robot more humanized when interacting with people and gives the robot a human lifestyle within the life time axis, improving the anthropomorphism of robot interaction content generation, the human-computer interaction experience, and intelligence. The interaction content can be an expression, text, voice, or the like.
- For example, suppose the multi-modal signal is the user saying to the robot by voice, "I'm so sleepy," possibly together with a picture signal. The robot recognizes from these signals that the user is very sleepy. Combined with the robot life time axis, if the current time is 9 a.m., the robot knows the owner has just gotten up and greets the owner, for example with "Good morning" as a reply, possibly accompanied by expressions, pictures, and so on; the interaction content in the present invention can be understood as the robot's reply. If instead the current time is 9 p.m., the robot knows the owner needs to sleep and replies with words such as "Good night, master, sleep well," likewise possibly accompanied by expressions or pictures. This approach is more anthropomorphic, and closer to people's lives, than generating replies and expressions purely from scene recognition.
- The multi-modal signal is generally a combination of several signals, such as a picture signal plus a voice signal, or a picture signal plus a voice signal plus a gesture signal.
- The system includes a time-axis-based artificial intelligence cloud processing module configured to: expand the self-cognition of the robot; obtain the parameters of the life time axis; and fit the self-cognition parameters of the robot to the parameters in the life time axis to generate the robot life time axis.
- In this way the life time axis is added to the robot's own self-cognition, giving the robot an anthropomorphic life; for example, the cognition of eating lunch at noon is added to the robot.
- The time-axis-based artificial intelligence cloud processing module is further configured to combine life scenes with the robot's self-cognition to form a self-cognition curve based on the life time axis, so that the life time axis can be concretely added to the robot's own parameters.
- The time-axis-based artificial intelligence cloud processing module is further configured to use a probability algorithm to calculate the probability of change of each robot parameter on the life time axis after a time-axis scene parameter changes, forming a fitted curve. The probability algorithm may be a Bayesian probability algorithm.
- Over the 24 hours of a day, the robot is given actions such as sleeping, exercising, eating, dancing, reading and putting on makeup. Each action affects the robot's own self-cognition; after the parameters on the life time axis are combined and fitted with that self-cognition, the robot's self-cognition includes mood, fatigue value, intimacy, favorability, number of interactions, the robot's three-dimensional cognition, age, height, weight, game scene value, game object value, location scene value, location object value, and so on, allowing the robot to identify the scene of its location by itself, such as a cafe or a bedroom. The machine performs different actions within the day's time axis, such as sleeping at night, eating at noon, and exercising during the day; all of these scenes on the life time axis affect self-cognition, and the changes in these values are dynamically fitted with a probability model, fitting the probability that each of these actions occurs at each point on the time axis.
- Where the multi-modal signal includes at least an image signal, the content generation module is specifically configured to generate the robot interaction content according to the image signal and the user intention in conjunction with the current robot life time axis. With at least an image signal the robot can grasp the user's intention; to understand the intention better, other signals such as voice signals and gesture signals are generally added, so the robot can tell more accurately whether the user genuinely means what is expressed or is joking.
- Where the multi-modal signal includes at least a voice signal, the content generation module is specifically configured to generate the robot interaction content according to the voice signal and the user intention in conjunction with the current robot life time axis.
- Where the multi-modal signal includes at least a gesture signal, the content generation module is specifically configured to generate the robot interaction content according to the gesture signal and the user intention in conjunction with the current robot life time axis.
- For example, suppose the multi-modal signal is the user saying to the robot by voice, "I'm hungry," possibly together with a picture signal. The robot makes a comprehensive judgment from the voice signal plus the picture signal and recognizes the user's intention as being very hungry. Combined with the robot life time axis, if the current time is 9 a.m., the robot replies by suggesting that the user go have breakfast, accompanied by a cute expression; if the current time is 9 p.m., the robot replies that it is too late and the user should eat less, again with a cute expression.
- In this embodiment, the voice signal and the picture signal are generally sufficient to understand the user's meaning fairly accurately and thus reply more accurately; adding other signals, such as gesture signals and video signals, makes the result even more accurate.
- The invention also discloses a robot comprising a system for generating robot interaction content as described above.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Mechanical Engineering (AREA)
- Robotics (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Algebra (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Analysis (AREA)
- Computational Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Linguistics (AREA)
- Manipulator (AREA)
- Toys (AREA)
Abstract
A method for generating robot interaction content, comprising: obtaining a multi-modal signal (S101); determining a user intention according to the multi-modal signal (S102); and generating robot interaction content by combining a current life timeline of a robot according to the multi-modal signal and the user intention (S103). By means of the method, the life timeline where the robot is located is added to the generation of the robot interaction content, such that the robot is more humanized when interacting with humans and has a human lifestyle within the life timeline, improving the humanization of robot interaction content generation, the human-robot interaction experience, and intelligence.
Description
The invention relates to the field of robot interaction technology, and in particular to a method, a system and a robot for generating robot interaction content.
During human interaction, an expression is usually made after the eyes see or the ears hear something and the brain has analyzed it, producing reasonable expression feedback. A person's life scenes on the time axis of a given day, such as eating, sleeping and exercising, vary, and changes in these scene values affect the feedback of human expressions. For robots, expression feedback is currently produced mainly through pre-designed methods and corpora used for deep-learning training. Expression feedback obtained through pre-designed programs and corpus training has the following disadvantages: the output expression depends on the user's text, so that, much like a question-and-answer machine, different user utterances trigger different expressions. In this case the robot still outputs expressions according to a human pre-designed interaction mode, so it cannot be more anthropomorphic and cannot, like a human, show different expressions in the life scenes of different time points. In other words, the robot's interaction content is generated entirely passively, so generating expressions requires a great deal of human-computer interaction, and the robot's intelligence is poor.
Therefore, how to give the robot itself a human lifestyle within the life time axis and improve the anthropomorphism of robot interaction content generation is a technical problem that urgently needs to be solved in this field.
Summary of the Invention
The object of the present invention is to provide a method, a system and a robot for generating robot interaction content, so that the robot itself has a human lifestyle within its actively interactive variable parameters, improving the anthropomorphism of robot interaction content generation, the human-computer interaction experience, and intelligence.
The object of the present invention is achieved through the following technical solutions.
A method for generating robot interaction content comprises:
obtaining a multi-modal signal;
determining a user intention according to the multi-modal signal; and
generating robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis.
Preferably, the method for generating the parameters of the robot life time axis includes:
expanding the self-cognition of the robot;
obtaining the parameters of the life time axis; and
fitting the self-cognition parameters of the robot to the parameters in the life time axis to generate the robot life time axis.
Preferably, the step of expanding the self-cognition of the robot specifically includes: combining life scenes with the robot's self-cognition to form a self-cognition curve based on the life time axis.
Preferably, the step of fitting the self-cognition parameters of the robot to the parameters in the life time axis specifically includes: using a probability algorithm to calculate the probability of change of each robot parameter on the life time axis after a time-axis scene parameter changes, forming a fitted curve.
Preferably, the life time axis refers to a time axis covering the 24 hours of a day, and the parameters in the life time axis include at least the daily life behaviors performed by the user on the life time axis and parameter values representing those behaviors.
Preferably, the multi-modal signal includes at least an image signal, and the step of generating the robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis specifically includes: generating the robot interaction content according to the image signal and the user intention in conjunction with the current robot life time axis.
Preferably, the multi-modal signal includes at least a voice signal, and the step of generating the robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis specifically includes: generating the robot interaction content according to the voice signal and the user intention in conjunction with the current robot life time axis.
Preferably, the multi-modal signal includes at least a gesture signal, and the step of generating the robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis specifically includes: generating the robot interaction content according to the gesture signal and the user intention in conjunction with the current robot life time axis.
The invention also discloses a system for generating robot interaction content, comprising:
an acquisition module configured to acquire a multi-modal signal;
an intent recognition module configured to determine a user intention according to the multi-modal signal; and
a content generation module configured to generate robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis.
Preferably, the system includes a time-axis-based artificial intelligence cloud processing module configured to:
expand the self-cognition of the robot;
obtain the parameters of the life time axis; and
fit the self-cognition parameters of the robot to the parameters in the life time axis to generate the robot life time axis.
Preferably, the time-axis-based artificial intelligence cloud processing module is further configured to combine life scenes with the robot's self-cognition to form a self-cognition curve based on the life time axis.
Preferably, the time-axis-based artificial intelligence cloud processing module is further configured to use a probability algorithm to calculate the probability of change of each robot parameter on the life time axis after a time-axis scene parameter changes, forming a fitted curve.
Preferably, the life time axis refers to a time axis covering the 24 hours of a day, and the parameters in the life time axis include at least the daily life behaviors performed by the user on the life time axis and parameter values representing those behaviors.
Preferably, the multi-modal signal includes at least an image signal, and the content generation module is specifically configured to generate robot interaction content according to the image signal and the user intention in conjunction with the current robot life time axis.
Preferably, the multi-modal signal includes at least a voice signal, and the content generation module is specifically configured to generate robot interaction content according to the voice signal and the user intention in conjunction with the current robot life time axis.
Preferably, the multi-modal signal includes at least a gesture signal, and the content generation module is specifically configured to generate robot interaction content according to the gesture signal and the user intention in conjunction with the current robot life time axis.
The invention further discloses a robot comprising a system for generating robot interaction content as described in any of the above.
Compared with the prior art, the present invention has the following advantages. For their application scenes, existing robots generally generate interaction content through question-and-answer interaction in fixed scenes and cannot generate the robot's expressions more accurately based on the current scene. The method for generating robot interaction content of the present invention includes: obtaining a multi-modal signal; determining a user intention according to the multi-modal signal; and generating robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis. In this way, multi-modal signals such as image signals and voice signals can be combined with the robot's variable parameters to generate robot interaction content more accurately, so the robot interacts and communicates with people more accurately and anthropomorphically. Human daily life has a certain regularity; to make the robot more anthropomorphic when communicating with people, over the 24 hours of a day the robot is also given actions such as sleeping, exercising, eating, dancing, reading and putting on makeup. The present invention therefore adds the life time axis in which the robot lives to the generation of the robot's interaction content, making the robot more humanized when interacting with people and giving the robot a human lifestyle within the life time axis; this improves the anthropomorphism of robot interaction content generation, the human-computer interaction experience, and intelligence.
FIG. 1 is a flowchart of a method for generating robot interaction content according to Embodiment 1 of the present invention.
FIG. 2 is a schematic diagram of a system for generating robot interaction content according to Embodiment 2 of the present invention.
Although the flowcharts describe the operations as sequential processing, many of the operations can be implemented in parallel, concurrently, or simultaneously, and the order of the operations can be rearranged. Processing may be terminated when its operations are completed, but may also include additional steps not shown in the figures. Processing can correspond to methods, functions, procedures, subroutines, subprograms, and the like.
Computer devices include user devices and network devices. User devices or clients include, but are not limited to, computers, smartphones, PDAs, and the like; network devices include, but are not limited to, a single network server, a server group composed of multiple network servers, or a cloud of computers or network servers based on cloud computing. A computer device can operate alone to carry out the invention, or can access a network and carry out the invention through interoperation with other computer devices in the network. The network in which the computer device is located includes, but is not limited to, the Internet, wide area networks, metropolitan area networks, local area networks, VPN networks, and the like.
The terms "first," "second," and the like may be used herein to describe various units, but the units should not be limited by these terms; the terms are used only to distinguish one unit from another. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items. When a unit is referred to as being "connected" or "coupled" to another unit, it can be directly connected or coupled to the other unit, or intermediate units may be present.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments. Unless the context clearly indicates otherwise, the singular forms "a" and "an" as used herein are intended to include the plural as well. It should also be understood that the terms "comprising" and/or "including" as used herein specify the presence of the stated features, integers, steps, operations, units, and/or components, without excluding the presence or addition of one or more other features, integers, steps, operations, units, components, and/or combinations thereof.
The invention is further described below with reference to the drawings and preferred embodiments.
Embodiment 1
As shown in FIG. 1, this embodiment discloses a method for generating robot interaction content, including:
S101. Acquire a multi-modal signal.
S102. Determine a user intention according to the multi-modal signal.
S103. Generate robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis 300.
For their application scenes, existing robots generally generate interaction content through question-and-answer interaction in fixed scenes and cannot generate the robot's expressions more accurately based on the current scene. With the method of this embodiment, multi-modal signals such as image signals and voice signals can be combined with the robot's variable parameters to generate robot interaction content more accurately, so the robot interacts and communicates with people more accurately and anthropomorphically. Human daily life has a certain regularity; to make the robot more anthropomorphic when communicating with people, over the 24 hours of a day the robot is also given actions such as sleeping, exercising, eating, dancing, reading and putting on makeup. The life time axis in which the robot lives is therefore added to the generation of the robot's interaction content, making the robot more humanized when interacting with people and giving the robot a human lifestyle within the life time axis; this improves the anthropomorphism of robot interaction content generation, the human-computer interaction experience, and intelligence. The interaction content can be an expression, text, voice, or the like. The robot life time axis 300 is fitted and set in advance; specifically, it is a collection of parameters that is transmitted to the system to generate the interaction content, as pictured in the sketch below.
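Steps S101 to S103 can be pictured as a single pipeline. The following sketch is illustrative only: every helper and interface is assumed rather than taken from the patent, and the life time axis is assumed to expose behaviors_at(hour) as in the LifeTimeAxis sketch given earlier.

```python
def generate_interaction_content(acquire_signal, determine_intent,
                                 life_time_axis, current_hour):
    signal = acquire_signal()            # S101: acquire the multi-modal signal
    intent = determine_intent(signal)    # S102: determine the user intention
    context = life_time_axis.behaviors_at(current_hour)
    # S103: the reply depends on both the intention and the timeline context
    if intent == "user_sleepy" and "get_up" in context:
        return "Good morning!"
    if intent == "user_sleepy" and "sleep" in context:
        return "Good night, master, sleep well."
    return "I see."

class _StubAxis:
    # stand-in for the LifeTimeAxis sketched earlier
    def behaviors_at(self, hour):
        return {"get_up": 0.9} if 6 <= hour < 10 else {"sleep": 0.95}

print(generate_interaction_content(
    lambda: {"voice": "I'm so sleepy"},   # S101 stand-in
    lambda s: "user_sleepy",              # S102 stand-in
    _StubAxis(), current_hour=7))         # prints: Good morning!
```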
The multi-modal information in this embodiment may be one or more of user expression, voice information, gesture information, scene information, image information, video information, face information, pupil and iris information, light-sensing information, and fingerprint information. In this embodiment, a picture signal plus a voice signal plus a gesture signal is preferred, which makes recognition both accurate and efficient.
In this embodiment, being based on the life time axis specifically means that the robot is fitted to the time axis of human daily life and the robot's behavior follows this fitted line; that is, the robot's own behavior over a day is obtained, so the robot can carry out its own behavior based on the life time axis, such as generating interaction content and communicating with humans. If the robot stays awake, it acts according to the behaviors on this time axis, and the robot's self-cognition is updated accordingly along the same axis. The life time axis and the variable parameters can change attributes in the self-cognition, such as the mood value and the fatigue value, and can also automatically add new self-cognition information; for example, if there was previously no anger value, a scene based on the life time axis and variable factors is automatically added to the robot's self-cognition according to scenes that previously simulated human self-cognition.
For example, suppose the multi-modal signal is the user saying to the robot by voice, "I'm so sleepy," possibly together with a picture signal. The robot makes a comprehensive judgment from the voice signal plus the picture signal and recognizes the user's intention as being very sleepy. Combined with the robot life time axis, if the current time is 9 a.m., the robot knows the owner has just gotten up, so it should greet the owner, for example answering "Good morning" as a reply, possibly accompanied by an expression, a picture, and so on; the interaction content in the present invention can be understood as the robot's reply. If instead the current time is 9 p.m., the robot knows the owner needs to sleep, so it replies with words such as "Good night, master, sleep well," likewise possibly accompanied by expressions or pictures. This approach is more anthropomorphic, and closer to people's lives, than generating replies and expressions purely from scene recognition. The multi-modal signal is generally a combination of several signals, such as a picture signal plus a voice signal, or a picture signal plus a voice signal plus a gesture signal.
According to one example, the method for generating the parameters of the robot life time axis includes:
expanding the self-cognition of the robot;
obtaining the parameters of the life time axis; and
fitting the self-cognition parameters of the robot to the parameters in the life time axis to generate the robot life time axis.
In this way the life time axis is added to the robot's own self-cognition, giving the robot an anthropomorphic life; for example, the cognition of eating lunch at noon is added to the robot.
According to another example, the step of expanding the self-cognition of the robot specifically includes: combining life scenes with the robot's self-cognition to form a self-cognition curve based on the life time axis. In this way the life time axis can be concretely added to the robot's own parameters.
According to another example, the step of fitting the self-cognition parameters of the robot to the parameters in the life time axis specifically includes: using a probability algorithm to calculate the probability of change of each robot parameter on the life time axis after a time-axis scene parameter changes, forming a fitted curve. In this way the robot's self-cognition parameters can be concretely fitted to the parameters in the life time axis. The probability algorithm may be a Bayesian probability algorithm.
For example, over the 24 hours of a day, the robot is given actions such as sleeping, exercising, eating, dancing, reading and putting on makeup. Each action affects the robot's own self-cognition; after the parameters on the life time axis are combined and fitted with that self-cognition, the robot's self-cognition includes mood, fatigue value, intimacy, favorability, number of interactions, the robot's three-dimensional cognition, age, height, weight, game scene value, game object value, location scene value, location object value, and so on, allowing the robot to identify the scene of its location by itself, such as a cafe or a bedroom.
The machine performs different actions within the day's time axis, such as sleeping at night, eating at noon, and exercising during the day; all of these scenes on the life time axis affect self-cognition. The changes in these values are dynamically fitted with a probability model, fitting the probability that each of these actions occurs at each point on the time axis.
According to another example, the multi-modal signal includes at least an image signal, and the step of generating the robot interaction content according to the multi-modal signal and the user intention in conjunction with the current robot life time axis specifically includes: generating the robot interaction content according to the image signal and the user intention in conjunction with the current robot life time axis. With at least an image signal the robot can grasp the user's intention; to understand the intention better, other signals such as a voice signal or a gesture signal are generally added, so the robot can tell more accurately whether the user genuinely means what is expressed or is joking.
According to another example, the multimodal signal at least includes a voice signal, and the step of generating robot interaction content according to the multimodal signal and the user intent, in combination with the current robot life time axis, specifically includes:
generating robot interaction content according to the voice signal and the user intent, in combination with the current robot life time axis.
According to another example, the multimodal signal at least includes a gesture signal, and the step of generating robot interaction content according to the multimodal signal and the user intent, in combination with the current robot life time axis, specifically includes:
generating robot interaction content according to the gesture signal and the user intent, in combination with the current robot life time axis.
For example, suppose the multimodal signal is the user saying to the robot by voice, "I'm hungry," together with an image signal. The robot judges the voice signal and the image signal together, recognizes that the user's intent is that the user is hungry, and consults the robot life time axis. If the current time is 9 a.m., the robot replies by telling the user to go and have breakfast, accompanied by a cute expression. If instead the current time is 9 p.m., the robot replies that it is too late and the user should eat only a little, again with a cute expression.
In this embodiment, the voice signal and the image signal are generally sufficient to understand the user's meaning fairly accurately and thus to reply more accurately. Adding other signals, such as gesture signals or video signals, naturally makes the judgment more accurate still.
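A minimal sketch of the behaviour in this example follows; the function names, intent rules, and reply table are stand-ins invented for illustration, since the patent does not prescribe them.

```python
from datetime import datetime

def recognize_intent(voice: str, image_tags: list) -> str:
    """Fuse the voice signal with image-signal tags into a coarse intent."""
    if "hungry" in voice.lower() and "person" in image_tags:
        return "hungry"
    return "unknown"

def generate_interaction_content(intent: str, now: datetime) -> dict:
    """Choose a reply and expression from the intent and the life time axis."""
    if intent == "hungry":
        if 6 <= now.hour < 10:
            return {"text": "Go and have some breakfast!", "expression": "cute"}
        if now.hour >= 21:
            return {"text": "It's late; better eat just a little.", "expression": "cute"}
        return {"text": "Time for a meal?", "expression": "smile"}
    return {"text": "Tell me more.", "expression": "neutral"}

intent = recognize_intent("I'm hungry", ["person"])
print(generate_interaction_content(intent, datetime(2016, 6, 29, 9)))   # morning
print(generate_interaction_content(intent, datetime(2016, 6, 29, 21)))  # evening
```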
Embodiment 2
As shown in FIG. 2, this embodiment discloses a system for generating robot interaction content, including:
an acquisition module 201, configured to acquire a multimodal signal;
an intent recognition module 202, configured to determine user intent according to the multimodal signal;
a content generation module 203, configured to generate robot interaction content according to the multimodal signal and the user intent, in combination with the current robot life time axis sent by a robot life time axis module 301.
In this way, robot interaction content can be generated more accurately from multimodal signals such as image signals and voice signals, combined with the robot's variable parameters, so that the robot interacts and communicates with people more accurately and in a more anthropomorphic way. A person's daily life has a certain regularity, and to make the robot more human-like in communication, over the 24 hours of a day the robot too is made to sleep, exercise, eat, dance, read, put on make-up, and so on. The invention therefore adds the life time axis in which the robot is located to the generation of the robot's interaction content, making the robot more anthropomorphic when interacting with people and giving it a human lifestyle within the life time axis; this improves the anthropomorphism of the generated interaction content, the human-computer interaction experience, and the intelligence of the robot. The interaction content may be an expression, text, speech, or the like.
For example, suppose the multimodal signal is the user saying to the robot by voice, "I'm so sleepy," together with an image signal. The robot judges the voice signal and the image signal together, recognizes that the user's intent is that the user is sleepy, and consults the robot life time axis. If the current time is 9 a.m., the robot knows the owner has just got up and should be greeted, for example by answering "Good morning" as a reply, optionally accompanied by expressions, pictures, and the like; the interaction content in the invention can be understood as the robot's reply. If instead the current time is 9 p.m., the robot knows the owner needs to sleep, and it replies with something like "Good night, master, sleep well," again optionally with expressions or pictures. This approach is closer to human life and more anthropomorphic than generating replies and expressions from scene recognition alone. A multimodal signal is generally a combination of several signals, for example an image signal plus a voice signal, or an image signal plus a voice signal plus a gesture signal.
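The sketch below shows, under assumed class and method names, one way modules 201, 202, and 203 could be wired to a life time axis module 301 to produce the replies in this example.

```python
class AcquisitionModule:                       # module 201
    def acquire(self) -> dict:
        # A real module would read microphone and camera; this is canned.
        return {"voice": "I'm so sleepy", "image": ["person", "yawning"]}

class IntentRecognitionModule:                 # module 202
    def determine(self, signal: dict) -> str:
        return "sleepy" if "sleepy" in signal["voice"].lower() else "unknown"

class LifeTimelineModule:                      # module 301
    def current_hour(self) -> int:
        return 9   # fixed for the example; a real module would read the clock

class ContentGenerationModule:                 # module 203
    def __init__(self, timeline: LifeTimelineModule):
        self.timeline = timeline

    def generate(self, signal: dict, intent: str) -> str:
        hour = self.timeline.current_hour()
        if intent == "sleepy":
            # Same words, opposite replies depending on the life time axis.
            return "Good morning!" if hour < 12 else "Good night, sleep well."
        return "Tell me more."

signal = AcquisitionModule().acquire()
intent = IntentRecognitionModule().determine(signal)
print(ContentGenerationModule(LifeTimelineModule()).generate(signal, intent))
```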
According to one example, the system includes a time-axis-based artificial intelligence cloud processing module, configured to:
expand the robot's self-cognition;
acquire the parameters of the life time axis;
fit the parameters of the robot's self-cognition to the parameters in the life time axis to generate the robot life time axis.
In this way, the life time axis is added to the robot's own self-cognition, giving the robot an anthropomorphic daily life. For example, the knowledge that one eats lunch at noon can be added to the robot.
According to another example, the time-axis-based artificial intelligence cloud processing module is further configured to: combine life scenes with the robot's self-awareness to form a self-cognition curve based on the life time axis. In this way, the life time axis can be concretely added to the robot's own parameters.
According to another example, the time-axis-based artificial intelligence cloud processing module is further configured to: use a probability algorithm to calculate, for the robot on the life time axis, the probability of each parameter changing after a time-axis scene parameter changes, thereby forming a fitting curve. In this way, the robot's self-cognition parameters can be concretely fitted to the parameters in the life time axis. The probability algorithm may be a Bayesian probability algorithm.
For example, over the 24 hours of a day, the robot is made to sleep, exercise, eat, dance, read, put on make-up, and so on. Each action affects the robot's own self-cognition. The parameters on the life time axis are combined with the robot's self-cognition, and after fitting, the robot's self-cognition includes mood, fatigue value, intimacy, affinity, number of interactions, the robot's three-dimensional cognition, age, height, weight, game scene value, game object value, location scene value, location object value, and so on, so that the robot can identify the scene in which it is located, such as a café or a bedroom.
Within the time axis of a day the machine performs different actions, such as sleeping at night, eating at noon, and exercising during the day. Every one of these scenes in the life time axis affects self-cognition. The changes in these values are fitted dynamically with a probability model, which fits the probability of each of these actions occurring at each point on the time axis.
According to another example, the multimodal signal at least includes an image signal, and the content generation module is specifically configured to: generate robot interaction content according to the image signal and the user intent, in combination with the current robot life time axis.
Because the multimodal signal at least includes an image signal, the robot can grasp the user's intent; to understand that intent better, other signals such as voice signals and gesture signals are generally added, which makes it possible to judge more accurately whether the user genuinely means what is said or is merely joking or testing.
According to another example, the multimodal signal at least includes a voice signal, and the content generation module is specifically configured to: generate robot interaction content according to the voice signal and the user intent, in combination with the current robot life time axis.
According to another example, the multimodal signal at least includes a gesture signal, and the content generation module is specifically configured to: generate robot interaction content according to the gesture signal and the user intent, in combination with the current robot life time axis.
For example, suppose the multimodal signal is the user saying to the robot by voice, "I'm hungry," together with an image signal. The robot judges the voice signal and the image signal together, recognizes that the user's intent is that the user is hungry, and consults the robot life time axis. If the current time is 9 a.m., the robot replies by telling the user to go and have breakfast, accompanied by a cute expression. If instead the current time is 9 p.m., the robot replies that it is too late and the user should eat only a little, again with a cute expression.
In this embodiment, the voice signal and the image signal are generally sufficient to understand the user's meaning fairly accurately and thus to reply more accurately. Adding other signals, such as gesture signals or video signals, naturally makes the judgment more accurate still.
The invention further discloses a robot, comprising a system for generating robot interaction content as described in any of the examples above.
The above is a further detailed description of the invention in combination with specific preferred embodiments, and the concrete implementation of the invention is not to be regarded as limited to these descriptions. A person of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the concept of the invention, and all of these shall be regarded as falling within the scope of protection of the invention.
Claims (17)
- A method for generating robot interaction content, comprising: acquiring a multimodal signal; determining user intent according to the multimodal signal; and generating robot interaction content according to the multimodal signal and the user intent, in combination with a current robot life time axis.
- The generating method according to claim 1, wherein the method for generating the parameters of the robot life time axis comprises: expanding the robot's self-cognition; acquiring the parameters of the life time axis; and fitting the parameters of the robot's self-cognition to the parameters in the life time axis to generate the robot life time axis.
- The generating method according to claim 2, wherein the step of expanding the robot's self-cognition specifically comprises: combining life scenes with the robot's self-awareness to form a self-cognition curve based on the life time axis.
- The generating method according to claim 2, wherein the step of fitting the parameters of the robot's self-cognition to the parameters in the life time axis specifically comprises: using a probability algorithm to calculate, for the robot on the life time axis, the probability of each parameter changing after a time-axis scene parameter changes, forming a fitting curve.
- The generating method according to claim 2, wherein the life time axis refers to a time axis covering the 24 hours of a day, and the parameters in the life time axis at least include the daily life behaviors performed by the user on the life time axis and the parameter values representing those behaviors.
- The generating method according to claim 1, wherein the multimodal signal at least includes an image signal, and the step of generating robot interaction content according to the multimodal signal and the user intent, in combination with the current robot life time axis, specifically comprises: generating robot interaction content according to the image signal and the user intent, in combination with the current robot life time axis.
- The generating method according to claim 1, wherein the multimodal signal at least includes a voice signal, and the step of generating robot interaction content according to the multimodal signal and the user intent, in combination with the current robot life time axis, specifically comprises: generating robot interaction content according to the voice signal and the user intent, in combination with the current robot life time axis.
- The generating method according to claim 1, wherein the multimodal signal at least includes a gesture signal, and the step of generating robot interaction content according to the multimodal signal and the user intent, in combination with the current robot life time axis, specifically comprises: generating robot interaction content according to the gesture signal and the user intent, in combination with the current robot life time axis.
- A system for generating robot interaction content, comprising: an acquisition module, configured to acquire a multimodal signal; an intent recognition module, configured to determine user intent according to the multimodal signal; and a content generation module, configured to generate robot interaction content according to the multimodal signal and the user intent, in combination with a current robot life time axis.
- The generating system according to claim 9, wherein the system includes a time-axis-based artificial intelligence cloud processing module configured to: expand the robot's self-cognition; acquire the parameters of the life time axis; and fit the parameters of the robot's self-cognition to the parameters in the life time axis to generate the robot life time axis.
- The generating system according to claim 10, wherein the time-axis-based artificial intelligence cloud processing module is further configured to combine life scenes with the robot's self-awareness to form a self-cognition curve based on the life time axis.
- The generating system according to claim 10, wherein the time-axis-based artificial intelligence cloud processing module is further configured to use a probability algorithm to calculate, for the robot on the life time axis, the probability of each parameter changing after a time-axis scene parameter changes, forming a fitting curve.
- The generating system according to claim 10, wherein the life time axis refers to a time axis covering the 24 hours of a day, and the parameters in the life time axis at least include the daily life behaviors performed by the user on the life time axis and the parameter values representing those behaviors.
- The generating system according to claim 9, wherein the multimodal signal at least includes an image signal, and the content generation module is specifically configured to generate robot interaction content according to the image signal and the user intent, in combination with the current robot life time axis.
- The generating system according to claim 9, wherein the multimodal signal at least includes a voice signal, and the content generation module is specifically configured to generate robot interaction content according to the voice signal and the user intent, in combination with the current robot life time axis.
- The generating system according to claim 9, wherein the multimodal signal at least includes a gesture signal, and the content generation module is specifically configured to generate robot interaction content according to the gesture signal and the user intent, in combination with the current robot life time axis.
- A robot, comprising a system for generating robot interaction content according to any one of claims 9 to 16.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2016/087753 WO2018000268A1 (en) | 2016-06-29 | 2016-06-29 | Method and system for generating robot interaction content, and robot |
CN201680001744.2A CN106462254A (en) | 2016-06-29 | 2016-06-29 | Robot interaction content generation method, system and robot |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2016/087753 WO2018000268A1 (en) | 2016-06-29 | 2016-06-29 | Method and system for generating robot interaction content, and robot |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018000268A1 (en) | 2018-01-04 |
Family ID: 58215746
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2016/087753 WO2018000268A1 (en) | 2016-06-29 | 2016-06-29 | Method and system for generating robot interaction content, and robot |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN106462254A (en) |
WO (1) | WO2018000268A1 (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018200637A1 (en) * | 2017-04-28 | 2018-11-01 | Southie Autonomy Works, Llc | Automated personalized feedback for interactive learning applications |
CN109202921B (en) * | 2017-07-03 | 2020-10-20 | 北京光年无限科技有限公司 | Human-computer interaction method and device based on forgetting mechanism for robot |
CN107491511A (en) * | 2017-08-03 | 2017-12-19 | 深圳狗尾草智能科技有限公司 | The autognosis method and device of robot |
CN107563517A (en) * | 2017-08-25 | 2018-01-09 | 深圳狗尾草智能科技有限公司 | Robot autognosis real time updating method and system |
CN107992935A (en) * | 2017-12-14 | 2018-05-04 | 深圳狗尾草智能科技有限公司 | Method, equipment and the medium of life cycle is set for robot |
CN108297098A (en) * | 2018-01-23 | 2018-07-20 | 上海大学 | The robot control system and method for artificial intelligence driving |
CN108363492B (en) * | 2018-03-09 | 2021-06-25 | 南京阿凡达机器人科技有限公司 | Man-machine interaction method and interaction robot |
CN109376282A (en) * | 2018-09-26 | 2019-02-22 | 北京子歌人工智能科技有限公司 | A kind of method and apparatus of human-machine intelligence's chat based on artificial intelligence |
CN109976338A (en) * | 2019-03-14 | 2019-07-05 | 山东大学 | A kind of multi-modal quadruped robot man-machine interactive system and method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105093986A (en) * | 2015-07-23 | 2015-11-25 | 百度在线网络技术(北京)有限公司 | Humanoid robot control method based on artificial intelligence, system and the humanoid robot |
- 2016-06-29: CN application CN201680001744.2A filed (published as CN106462254A, status: pending)
- 2016-06-29: PCT application PCT/CN2016/087753 filed (published as WO2018000268A1, active application filing)
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7685518B2 (en) * | 1998-01-23 | 2010-03-23 | Sony Corporation | Information processing apparatus, method and medium using a virtual reality space |
CN1392826A (en) * | 2000-10-05 | 2003-01-22 | 索尼公司 | Robot apparatus and its control method |
CN104951077A (en) * | 2015-06-24 | 2015-09-30 | 百度在线网络技术(北京)有限公司 | Man-machine interaction method and device based on artificial intelligence and terminal equipment |
CN105058389A (en) * | 2015-07-15 | 2015-11-18 | 深圳乐行天下科技有限公司 | Robot system, robot control method, and robot |
CN105082150A (en) * | 2015-08-25 | 2015-11-25 | 国家康复辅具研究中心 | Robot man-machine interaction method based on user mood and intension recognition |
CN105490918A (en) * | 2015-11-20 | 2016-04-13 | 深圳狗尾草智能科技有限公司 | System and method for enabling robot to interact with master initiatively |
CN105701211A (en) * | 2016-01-13 | 2016-06-22 | 北京光年无限科技有限公司 | Question-answering system-oriented active interaction data processing method and system |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111970536A (en) * | 2020-07-24 | 2020-11-20 | 北京航空航天大学 | Method and device for generating video based on audio |
CN111970536B (en) * | 2020-07-24 | 2021-07-23 | 北京航空航天大学 | Method and device for generating video based on audio |
Also Published As
Publication number | Publication date |
---|---|
CN106462254A (en) | 2017-02-22 |
Similar Documents
Publication | Title |
---|---|
WO2018000268A1 (en) | Method and system for generating robot interaction content, and robot | |
WO2018000259A1 (en) | Method and system for generating robot interaction content, and robot | |
Salichs et al. | Mini: a new social robot for the elderly | |
WO2018153359A1 (en) | Emotion state prediction method and robot | |
CN107030691B (en) | Data processing method and device for nursing robot | |
US10628714B2 (en) | Entity-tracking computing system | |
US20210191506A1 (en) | Affective interaction systems, devices, and methods based on affective computing user interface | |
Tang et al. | A novel multimodal communication framework using robot partner for aging population | |
WO2018000267A1 (en) | Method for generating robot interaction content, system, and robot | |
CN107870994A (en) | Man-machine interaction method and system for intelligent robot | |
CN108847226A (en) | The agency managed in human-computer dialogue participates in | |
WO2018006374A1 (en) | Function recommending method, system, and robot based on automatic wake-up | |
CN109789550A (en) | Control based on the social robot that the previous role in novel or performance describes | |
WO2018006371A1 (en) | Method and system for synchronizing speech and virtual actions, and robot | |
WO2018006370A1 (en) | Interaction method and system for virtual 3d robot, and robot | |
WO2021217282A1 (en) | Method for implementing universal artificial intelligence | |
WO2018006372A1 (en) | Method and system for controlling household appliance on basis of intent recognition, and robot | |
WO2018000258A1 (en) | Method and system for generating robot interaction content, and robot | |
WO2018000261A1 (en) | Method and system for generating robot interaction content, and robot | |
Calvo et al. | Introduction to affective computing | |
WO2018000266A1 (en) | Method and system for generating robot interaction content, and robot | |
Khalid et al. | Determinants of trust in human-robot interaction: Modeling, measuring, and predicting | |
CN117668763B (en) | Digital human all-in-one machine based on multiple modes and multiple mode perception and identification method thereof | |
WO2018000260A1 (en) | Method for generating robot interaction content, system, and robot | |
CN111949773A (en) | Reading equipment, server and data processing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16906669; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 16906669; Country of ref document: EP; Kind code of ref document: A1 |