WO2024124481A1 - Human-computer interaction device and human-computer interaction method - Google Patents

Human-computer interaction device and human-computer interaction method

Info

Publication number
WO2024124481A1
Authority
WO
WIPO (PCT)
Prior art keywords
human
interaction
event
state
computer
Prior art date
Application number
PCT/CN2022/139291
Other languages
English (en)
French (fr)
Inventor
杨健勃
曹临杰
Original Assignee
北京可以科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京可以科技有限公司
Priority to PCT/CN2022/139291
Publication of WO2024124481A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor

Definitions

  • the embodiments of the present application relate to the field of artificial intelligence technology, and more specifically, to a human-computer interaction device and a human-computer interaction method.
  • interactive robots are a relatively important type of robot, which can increase the user's happiness and reduce the user's stress through interaction with the user.
  • the embodiments of the present application provide a human-computer interaction device and a human-computer interaction method.
  • the human-computer interaction device can flexibly process detected events according to the interaction mode and interaction state, thereby improving the interaction effect between the user and the human-computer interaction device and helping to improve the user experience.
  • a human-computer interaction device wherein the human-computer interaction device has multiple interaction modes, each of the interaction modes includes at least one interaction state, and the interaction state is used to indicate the state of the human-computer interaction device in the process of interacting with the outside world.
  • the human-computer interaction device includes: a detection unit, used to detect a first event; and a processing unit, used to determine a first processing method for the first event according to the first interaction mode of the human-computer interaction device and the first interaction state under the first interaction mode, wherein the first interaction mode is one of the multiple interaction modes.
  • the embodiments of the present application can flexibly process events based on the interaction mode and the interaction state under the interaction mode, thereby improving the accuracy of the human-computer interaction device's analysis of the user's behavioral intentions or external environment, improving the interaction effect between the user and the human-computer interaction device, and helping to improve the user experience.
  • the human-computer interaction device further includes: a first switching unit, configured to switch from the second interaction state in the first interaction mode to the first interaction state in the first interaction mode according to the first event.
  • the first event can be called an interactive state switching event.
  • the first event can be a pick-up event or a fall event.
  • after detecting the first event, the human-computer interaction device does not switch the current interaction mode but only switches the current interaction state based on its analysis of the user's behavioral intention or the external environment, and determines the processing method for the first event based on the switched interaction state and the current first interaction mode, which can further improve the interaction effect between the user and the human-computer interaction device and further help to improve the user experience.
  • the human-computer interaction device further includes: a second switching unit, configured to switch from a third interaction state in a second interaction mode to the first interaction state in the first interaction mode according to the first event, wherein the second interaction mode is one of the multiple interaction modes.
  • the first event may be referred to as an interactive mode switching event.
  • the first event may be a close-up face event, a long-distance face event, a human figure event, a voice input event, or an application program control event.
  • after detecting the first event, the human-computer interaction device not only switches the current interaction mode but also switches the current interaction state based on its analysis of the user's behavioral intention or the external environment, and determines the processing method for the first event based on the switched interaction state and the switched first interaction mode, which can further improve the interaction effect between the user and the human-computer interaction device and further help to improve the user experience.
  • a human-computer interaction method is provided, and the human-computer interaction method is applied to a human-computer interaction device, the human-computer interaction device has multiple interaction modes, each of the interaction modes includes at least one interaction state, and the interaction state is used to indicate the state of the human-computer interaction device in the process of interacting with the outside world.
  • the human-computer interaction method includes: detecting a first event; determining a first processing method for the first event according to the first interaction mode of the human-computer interaction device and the first interaction state under the first interaction mode, wherein the first interaction mode is one of the multiple interaction modes.
  • the method further includes: switching from the second interaction state in the first interaction mode to the first interaction state in the first interaction mode according to the first event.
  • the first event is a picking up event or a falling down event.
  • the method further includes: switching from a third interaction state in a second interaction mode to the first interaction state in the first interaction mode according to the first event, the second interaction mode being one of the multiple interaction modes.
  • the first event is a close-up face event, a long-distance face event, a human shape event, a voice input event, or an application manipulation event.
  • a human-computer interaction device comprising: one or more processors; one or more memories; and one or more computer programs, wherein the one or more computer programs are stored in the one or more memories, and the one or more computer programs include instructions, which, when executed by the one or more processors, enable the human-computer interaction device to perform the human-computer interaction method as described in the second aspect or any possible implementation of the second aspect.
  • a computer-readable storage medium comprising computer instructions.
  • the human-computer interaction device executes the human-computer interaction method as described in the second aspect or any possible implementation of the second aspect.
  • a chip comprising at least one processor and an interface circuit, wherein the interface circuit is used to provide program instructions or data to the at least one processor, and the at least one processor is used to execute the program instructions to implement the human-computer interaction method as described in the second aspect or any possible implementation method of the second aspect.
  • a computer program product which includes computer instructions; when part or all of the computer instructions are run on a computer, the human-computer interaction method described in the second aspect or any possible implementation of the second aspect is executed.
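  • To make the first and second aspects concrete, the selection of a processing method from the combination of interaction mode, interaction state and detected event can be pictured as a lookup table. The following Python sketch is purely illustrative and is not the patent's implementation: the mode, state and event names follow the examples in this description, and device methods such as flash_ear_light() and bypass_obstacle() are hypothetical placeholders.
```python
from enum import Enum, auto

class Mode(Enum):
    STANDBY = auto(); HUMANOID = auto(); FACE = auto()
    VOICE_CONTROL = auto(); APP_CONTROL = auto(); DESKTOP = auto()

class State(Enum):
    STANDBY = auto(); FOLLOWING = auto(); STROKING = auto(); SEARCHING = auto()
    GESTURE = auto(); PICKED_UP = auto(); FALLEN = auto(); OBSTACLE = auto()

class Event(Enum):
    TOUCH = auto(); OBSTACLE = auto(); PICK_UP = auto(); FALL = auto()
    CLOSE_FACE = auto(); FAR_FACE = auto(); HUMAN_SHAPE = auto(); VOICE_INPUT = auto()

# The first processing method is chosen by the triple (mode, state, event);
# the two entries mirror examples given later in the description.
HANDLERS = {
    (Mode.STANDBY, State.PICKED_UP, Event.TOUCH): lambda dev: dev.flash_ear_light(),
    (Mode.HUMANOID, State.FOLLOWING, Event.OBSTACLE): lambda dev: dev.bypass_obstacle(),
}

def first_processing_method(mode, state, event):
    """Return the handler for `event` given the current interaction mode/state."""
    # Unlisted combinations fall back to weak or no feedback, as described for
    # interference events in the following state.
    return HANDLERS.get((mode, state, event), lambda dev: None)
```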
  • FIG. 1 is a schematic diagram of the hardware structure of a human-computer interaction device provided in an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of an example of a human-computer interaction method provided in an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of another human-computer interaction method provided in an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of another human-computer interaction method provided in an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of an example of a human-computer interaction device provided in an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of another example of a human-computer interaction device provided in an embodiment of the present application.
  • the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features.
  • a feature defined as “first” or “second” may explicitly or implicitly include one or more of the features.
  • plural means two or more.
  • FIG. 1 shows a schematic diagram of a structure of a human-computer interaction device 100 provided in an embodiment of the present application.
  • the human-computer interaction device 100 may include a processor 110, an actuator 111, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna, a wireless communication module 150, a sensor module 160, an audio module 170, a speaker 170A, a microphone 170B, a camera 180, a display screen 190, etc.
  • the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the human-computer interaction device 100.
  • the human-computer interaction device 100 may include more or fewer components than shown in the figure, or combine certain components, or split certain components, or arrange the components differently.
  • the components shown in the figure may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
  • the processor 110 may include a graphics processing unit (GPU), a controller, a memory, etc.
  • Different processing units may be independent devices or integrated into one or more processors.
  • the controller may be the nerve center and command center of the human-machine interaction device 100.
  • the controller may generate an operation control signal according to the instruction operation code and the timing signal to complete the control of fetching and executing instructions.
  • the memory is used to store instructions and data.
  • the memory in the processor 110 is a cache memory.
  • the memory can store instructions or data that the processor 110 has just used or cyclically used. If the processor 110 needs to use the instruction or data again, it can be directly called from the memory. This avoids repeated access, reduces the waiting time of the processor 110, and thus improves the efficiency of the system.
  • the processor 110 may include one or more interfaces.
  • the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, and/or a USB interface.
  • the I2C interface is a bidirectional synchronous serial bus, including a serial data line (SDA) and a serial clock line (SCL).
  • the I2S interface can be used for audio communication.
  • the processor 110 may include multiple groups of I2S buses.
  • the processor 110 can be coupled to the audio module 170 through the I2S bus to achieve communication between the processor 110 and the audio module 170.
  • the PCM interface can also be used for audio communication, sampling, quantizing and encoding analog signals.
  • the audio module 170 and the wireless communication module 150 can be coupled via a PCM bus interface.
  • the UART interface is a universal serial data bus for asynchronous communication.
  • the bus can be a bidirectional communication bus, which converts the data to be transmitted between serial communication and parallel communication.
  • the UART interface is generally used to connect the processor 110 and the wireless communication module 150.
  • the MIPI interface can be used to connect the processor 110 with peripheral devices such as the display screen 190 and the camera 180.
  • the GPIO interface can be configured by software.
  • the GPIO interface can be configured as a control signal or as a data signal. In some embodiments, the GPIO interface can be used to connect the processor 110 with the camera 180, the display screen 190, the wireless communication module 150, the sensor module 160, the audio module 170, etc.
  • the interface connection relationship between the modules illustrated in the embodiment of the present application is only a schematic illustration and does not constitute a structural limitation on the human-computer interaction device 100.
  • the human-computer interaction device 100 may also adopt an interface connection method different from those in the above embodiments, or a combination of multiple interface connection methods.
  • the actuator 111 is used to control the human-computer interaction device 100 to move, rotate, jump, etc.
  • the actuator 111 is also used to control the trunk to rotate relative to the legs, the legs to rotate relative to the trunk, the trunk to shake, or the ears to rotate along the trunk, etc.
  • the actuator 111 may include at least one motor.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the human-computer interaction device 100.
  • the internal memory 121 can be used to store computer executable program codes, which include instructions.
  • the processor 110 executes various functional applications and data processing of the human-computer interaction device 100 by running the instructions stored in the internal memory 121.
  • the internal memory 121 may include a program storage area and a data storage area.
  • the program storage area may store an operating system, an application required for at least one function (such as a sound playback function, an image playback function, etc.), etc.
  • the data storage area may store data (such as audio data, etc.) created during the use of the human-computer interaction device 100, etc.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one disk storage device, a flash memory device, a universal flash storage (UFS), etc.
  • the USB interface 130 is an interface that complies with USB standard specifications, and may be a Mini USB interface, a Micro USB interface, a USB Type C interface, etc.
  • the USB interface 130 may be used to connect a charger to charge the human-machine interaction device 100, and may also be used to transmit data between the human-machine interaction device 100 and peripheral devices.
  • the charging management module 140 is used to receive charging input from a charger.
  • the charger can be a wireless charger or a wired charger.
  • the charging management module 140 can receive charging input from a wired charger through the USB interface 130.
  • the charging management module 140 can receive wireless charging input through a wireless charging coil of the human-computer interaction device 100. While the charging management module 140 is charging the battery 142, it can also power the electronic device through the power management module 141.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the wireless communication module 150 can provide wireless communication solutions including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), etc. for application in the human-computer interaction device 100.
  • the antenna of the human-machine interaction device 100 is coupled to the wireless communication module 150 , so that the human-machine interaction device 100 can communicate with the network and other devices through wireless communication technology.
  • the sensor module 160 may include at least one sensor.
  • the sensor module 160 includes a touch sensor, a distance sensor, a posture sensor, etc.
  • the touch sensor is a capacitive sensor, which can be set on the top of the head, neck, back, abdomen, etc. of the human-computer interaction device to sense the user's touch, tap, and other interactive actions.
  • the distance sensor is used to measure the distance between the human-computer interaction device and an object in the external environment or a user.
  • the posture sensor is a gyroscope, which is used to sense the posture change of the human-computer interaction device.
  • the audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signals.
  • the audio module 170 can also be used to encode and decode audio signals.
  • the audio module 170 can be arranged in the processor 110, or some functional modules of the audio module 170 can be arranged in the processor 110.
  • the speaker 170A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals.
  • the microphone 170B, also called a "mic", is used to convert sound signals into electrical signals.
  • the human-computer interaction device 100 can implement audio functions such as voice playback and recording through the audio module 170, the speaker 170A, the microphone 170B, and the processor 110.
  • the camera 180 is used to capture static images or videos, so that the processor 110 can detect events based on the images or videos acquired by the camera 180 , and can thus process the events, etc.
  • the display screen 190 is used to display images, videos, etc.
  • the display screen 190 can display expression animations to express the current emotional state of the human-computer interaction device.
  • FIG. 2 is a schematic flowchart of a human-computer interaction method 200 provided in an embodiment of the present application.
  • the human-computer interaction method 200 shown in FIG. 2 can be applied to the human-computer interaction device 100 shown in FIG. 1 .
  • method 200 includes S210 and S220, and S220 is performed after S210.
  • S210 and S220 are described in detail below.
  • the events involved in the embodiments of the present application can be understood as external environment change events of the human-computer interaction device.
  • events may include but are not limited to obstacle events, face events, close-range face events, long-range face events, touch events, gesture events, application manipulation events, voice input events, desktop edge events, human shape events, pick-up events, falling events, etc.
  • the human-computer interaction device may include a detection unit (such as a distance sensor, a camera, etc.), and the human-computer interaction device determines whether an obstacle event occurs through the detection unit.
  • the human-computer interaction device periodically sends a detection signal through a distance sensor. Based on the reflected signal received by the human-computer interaction device, and/or combined with the static images or videos captured by the camera, the human-computer interaction device can detect whether there is an obstacle in the external environment; at this time, the human-computer interaction device has detected an obstacle event.
  • the human-computer interaction device can also determine whether it is a sudden obstacle event or an approaching obstacle event based on the distance between the obstacle and the human-computer interaction device. Specifically, when the distance between the obstacle and the human-computer interaction device is less than or equal to a first distance threshold (e.g., 20 cm), it can be determined to be a sudden obstacle event; when the distance between the obstacle and the human-computer interaction device is greater than the first distance threshold, it can be determined to be an approaching obstacle event.
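  • As a small worked illustration of the distance rule above, a hedged sketch (the threshold value and function name are illustrative only):
```python
FIRST_DISTANCE_THRESHOLD_CM = 20.0  # example threshold from the description

def classify_obstacle_event(obstacle_distance_cm: float) -> str:
    """Split a detected obstacle into a sudden or an approaching obstacle event."""
    if obstacle_distance_cm <= FIRST_DISTANCE_THRESHOLD_CM:
        return "sudden_obstacle_event"       # obstacle already very close
    return "approaching_obstacle_event"      # obstacle detected farther away
```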
  • the human-computer interaction device may include a camera, which captures images or videos through the camera, and determines whether there is a face event based on the captured images or videos.
  • the human-computer interaction device may select target rectangular areas from left to right and from top to bottom on the image, and use the target rectangular areas as an observation window. Then, the features in the image area corresponding to each observation window are extracted, and it is determined whether the extracted features are features corresponding to a human face. If the extracted features match the features corresponding to a human face, it can be considered that there is a human face in the external environment of the human-computer interaction device, and the human-computer interaction device has detected a face event.
  • the size of the target rectangular area can be determined based on the sizes of multiple faces that are statistically analyzed.
  • the human-computer interaction device when the human-computer interaction device detects a human face, it can be determined whether it is a close-range face event or a long-range face event based on the distance between the human face and the human-computer interaction device.
  • when the human-computer interaction device detects a human face and the distance between the human face and the human-computer interaction device is less than or equal to a second distance threshold, it can be determined that a close-range face event exists.
  • the human-computer interaction device detects a human face, and the distance between the human face and the human-computer interaction device is greater than a second distance threshold, it can be determined that a long-distance face event exists.
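  • A minimal sketch of the two steps just described, using OpenCV's Haar-cascade detector as one possible concrete form of the sliding-window feature matching; the second distance threshold value here is a placeholder, since the description does not fix it:
```python
import cv2

# Haar cascade bundled with opencv-python; it slides observation windows over
# the image and matches face-like features, much like the procedure above.
FACE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

SECOND_DISTANCE_THRESHOLD_CM = 50.0  # placeholder value, not given in the text

def detect_face_event(bgr_image) -> bool:
    """Return True if at least one face is found in the camera image."""
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    faces = FACE_CASCADE.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces) > 0

def classify_face_event(face_distance_cm: float) -> str:
    """Split a detected face into a close-range or long-range face event."""
    if face_distance_cm <= SECOND_DISTANCE_THRESHOLD_CM:
        return "close_range_face_event"
    return "long_range_face_event"
```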
  • the human-computer interaction device may include a touch sensor, and the human-computer interaction device determines whether a touch event occurs through a signal detected by the touch sensor.
  • for example, the touch sensor is a capacitive sensor; if a change in the capacitance signal is detected at a certain moment and the capacitance signal is greater than a threshold value 1, it can be considered that the user has touched the human-computer interaction device, and the human-computer interaction device has detected a touch event.
  • the threshold value 1 can be determined based on multiple capacitance signals corresponding to multiple touch events.
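  • The capacitance rule can be sketched as follows; how threshold value 1 is actually derived from the recorded touch signals is not fixed by the description, so the statistic used here (a margin below the smallest recorded touch signal) is an assumption:
```python
def derive_touch_threshold(recorded_touch_signals, margin: float = 0.9) -> float:
    """Threshold value 1, derived from capacitance signals of known touch events.

    Taking a fraction of the smallest recorded touch signal is only one plausible
    choice; the description merely says the threshold comes from multiple signals.
    """
    return margin * min(recorded_touch_signals)

def is_touch_event(capacitance_signal: float, threshold_1: float) -> bool:
    """A capacitance signal larger than threshold value 1 is treated as a touch."""
    return capacitance_signal > threshold_1
```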
  • the human-computer interaction device may capture images or videos through a camera, and determine whether there is a gesture event based on the captured images or videos.
  • the human-computer interaction device first extracts features from the image, and then matches the features in the image with preset gesture features (which may be stored in the human-computer interaction device). If the features in the image match the preset gesture features, then it can be considered that the user is making a gesture to the human-computer interaction device, and the human-computer interaction device has detected the gesture event.
  • the human-computer interaction apparatus may determine whether there is an application manipulation event based on controls related to the application.
  • the human-computer interaction device when the human-computer interaction device receives an operation instruction from a user to manipulate a control related to an application, the human-computer interaction device detects an application manipulation event.
  • the human-computer interaction device may include a microphone, and the human-computer interaction device may determine whether there is a voice input event according to whether the microphone acquires sound in the external environment.
  • for example, when the microphone acquires sound in the external environment, the human-computer interaction device detects a voice input event.
  • a plurality of light sensors may be disposed at the bottom of the human-computer interaction device, so that it is possible to determine whether a desktop edge event occurs based on the light intensity collected by each light sensor.
  • threshold 3 can be determined based on the light intensity collected by the light sensor when the human-computer interaction device stands still on the desktop with the bottom parallel to the desktop, and the light intensity collected by the light sensor when the human-computer interaction device walks on the desktop.
  • threshold 4 can be determined based on the light intensity collected by the light sensor when the human-computer interaction device walks on the desktop.
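  • The description leaves the exact desktop-edge decision rule open, so the following sketch is only one plausible reading: a bottom light sensor that is hanging over the edge of the desktop is assumed to receive noticeably more light than the levels observed while standing still (threshold 3) or walking (threshold 4).
```python
def is_desktop_edge_event(light_intensities, threshold_3: float,
                          threshold_4: float) -> bool:
    """Return True if any bottom-mounted light sensor suggests the device is
    partly hanging over the desktop edge (assumed rule, see note above)."""
    edge_level = max(threshold_3, threshold_4)
    return any(intensity > edge_level for intensity in light_intensities)
```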
  • the human-computer interaction device may collect images or videos through a camera, and determine whether there is a human-shaped event based on the collected images or videos.
  • the human-computer interaction device first extracts features from the image, and then matches the features in the image with preset joint features of the human body and torso features of the human body (which may be stored in the human-computer interaction device). If the features in the image match the preset joint and torso features, it can be considered that there is a user in the external environment of the human-computer interaction device and that the user is not very close to the human-computer interaction device (no face is detected), and the human-computer interaction device has detected the human shape event.
  • the preset joint features and torso features can be determined based on the joint features and torso features of multiple human bodies.
  • the human-computer interaction device may be provided with a posture sensor, and determine whether there is a picking-up event or a falling-down event based on a signal detected by the posture sensor.
  • for example, based on the signal detected by the posture sensor, it can be determined whether the human-computer interaction device has been picked up; at this time, the human-computer interaction device has detected the pick-up event.
  • for another example, based on the attitude angle information detected by the posture sensor, combined with the external environment information collected by the camera, it can be determined whether the human-computer interaction device has fallen to the ground; at this time, the human-computer interaction device has detected a fall event.
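  • A hedged sketch of how the posture-sensor readings might be mapped to the two abnormal-posture events; the tilt and acceleration thresholds and the floor-visibility flag from the camera are assumptions, not values from the description:
```python
def classify_posture_event(roll_deg: float, pitch_deg: float,
                           vertical_acceleration_g: float,
                           floor_visible: bool):
    """Map posture-sensor (and camera) observations to a posture event, if any."""
    # Assumed rule: a large tilt plus the camera seeing the floor close-up
    # indicates the device has fallen over.
    if (abs(roll_deg) > 60 or abs(pitch_deg) > 60) and floor_visible:
        return "fall_event"
    # Assumed rule: a sustained upward acceleration indicates the device has
    # been lifted off the supporting surface.
    if vertical_acceleration_g > 1.2:
        return "pick_up_event"
    return None
```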
  • S220 Determine a first processing method for a first event according to a first interaction mode of the human-computer interaction device and a first interaction state in the first interaction mode.
  • the interaction mode of the human-computer interaction device may be set to multiple interaction modes.
  • the interaction modes of the human-computer interaction device may include, but are not limited to, a standby mode, a humanoid mode, a face mode, an application control mode, a voice control mode, a desktop mode, etc.
  • the human-computer interaction device has multiple interaction modes, the first interaction mode is one of the multiple interaction modes, and which specific interaction mode it is depends on the current external environment of the human-computer interaction device.
  • the humanoid mode may be the interaction mode of the human-computer interaction device when it detects the presence of a human shape in its external environment.
  • the face mode may be the interaction mode of the human-computer interaction device when it detects the presence of a close-up face in its external environment.
  • the application control mode may be the interaction mode of the human-computer interaction device when it detects that an application is being manipulated.
  • the voice control mode may be the interaction mode of the human-computer interaction device when it detects voice.
  • the desktop mode may be the interaction mode of the human-computer interaction device when it detects the presence of a desktop in its external environment.
  • the standby mode may be the default or regular interaction mode of the human-computer interaction device.
  • when the first interaction mode is the humanoid mode, the humanoid mode may include a standby state, a following state, a stroking state, a searching state, a gesture state, one or more attracting states, a picking up state, a falling state, and the like.
  • the standby state in the humanoid mode is the default or normal interactive state of the human-machine interaction device in the humanoid mode, and its standby purpose is to attract the user to approach and interact with it.
  • when an event related to its standby purpose is detected (such as a humanoid moving away event, a humanoid disappearing event, a stroking event, a gesture event, etc.), the human-machine interaction device will switch from the standby state to the corresponding interactive state in the same mode.
  • when the human-computer interaction device detects a human figure and the human figure is getting farther and farther away from it, the human-computer interaction device can enter the following state of the humanoid mode.
  • in the following state of the humanoid mode, the main purpose of the human-computer interaction device is to follow the user, and it gives weak feedback or no feedback to interference events (such as obstacle events).
  • when the human-machine interaction device detects the user's stroking action, the human-machine interaction device can be in the stroking state of the humanoid mode. In the stroking state of the humanoid mode, the main purpose of the human-machine interaction device is to interact with the user.
  • when the human-machine interaction device detects that the human figure disappears, the human-machine interaction device may be in the searching state of the humanoid mode.
  • in the searching state of the humanoid mode, the human-machine interaction device mainly aims to find the user, and it may preferentially process external human figure events or human face events.
  • when the human-machine interaction device detects a specific gesture of the user, the human-machine interaction device may be in the gesture state of the humanoid mode. In the gesture state of the humanoid mode, the main purpose of the human-machine interaction device is to recognize the user's interactive gesture and give corresponding feedback.
  • when the human-machine interaction device detects that it is picked up, the human-machine interaction device can be in the picked-up state of the humanoid mode.
  • in the picked-up state of the humanoid mode, the main purpose of the human-machine interaction device is to restore its posture to normal, and it can give priority to processing events that restore its posture to normal.
  • when the human-machine interaction device detects that it has fallen down, the human-machine interaction device may be in the falling state of the humanoid mode. In the falling state of the humanoid mode, the main purpose of the human-machine interaction device is to restore its posture to normal, and it may give priority to processing events that restore its posture to normal.
  • when the first interaction mode is the face mode, the face mode may include a standby state, one or more stroking states, a searching state, a gesture state, one or more attracting states, a picking up state, a falling state, etc.
  • the standby state in the face mode is the default or normal interaction state of the human-machine interaction device in the face mode, and its standby purpose is to interact with the user at close range.
  • when an event related to its standby purpose is detected (such as a touch event, a face disappearance event, a gesture event, etc.), the human-machine interaction device will switch from the standby state to the corresponding interaction state in the same mode.
  • when the human-machine interaction device detects the user's stroking action, the human-machine interaction device may be in the stroking state of the face mode. In the stroking state of the face mode, the main purpose of the human-machine interaction device is to interact with the user.
  • when the human-machine interaction device detects that the face disappears, the human-machine interaction device may be in the searching state of the face mode. In the searching state of the face mode, the main purpose of the human-machine interaction device is to find the user, and it may give priority to processing external face events.
  • when the human-machine interaction device detects a specific gesture of the user, the human-machine interaction device may be in the gesture state of the face mode. In the gesture state of the face mode, the main purpose of the human-machine interaction device is to recognize the user's interactive gesture and give corresponding feedback.
  • when the human-machine interaction device wants to attract the user to interact with it, the human-machine interaction device can be in the attracting state of the face mode. In the attracting state of the face mode, the main purpose of the human-machine interaction device is to attract the user to interact with it.
  • when the human-machine interaction device detects that it is picked up, the human-machine interaction device can be in the picked-up state of the face mode.
  • in the picked-up state of the face mode, the main purpose of the human-machine interaction device is to restore its posture to normal, and it can give priority to processing events that restore its posture to normal.
  • when the human-machine interaction device detects that it has fallen down, the human-machine interaction device may be in the falling state of the face mode. In the falling state of the face mode, the main purpose of the human-machine interaction device is to restore its posture to normal, and events that restore its posture to normal may be processed first.
  • the voice control mode may include a standby state, an incomprehensible state, a specific command feedback state, a person-finding state, a picking-up state, a falling state, and the like.
  • the standby state in the voice control mode is the default or normal interactive state of the human-machine interaction device in the voice control mode, and its standby purpose is to wait for the user to issue a voice command.
  • when the human-machine interaction device is in the standby state of the voice control mode, it can give priority to processing external voice input events.
  • when the human-computer interaction device detects the user's voice command but no corresponding voice analysis result is obtained after semantic analysis of the voice command, the human-computer interaction device may be in the incomprehensible state of the voice control mode.
  • in the incomprehensible state of the voice control mode, the main purpose of the human-computer interaction device is to feed back to the user that it does not understand the user's voice command, and it may give priority to processing external voice commands.
  • when the human-computer interaction device detects the user's voice command and obtains a corresponding voice analysis result after semantic analysis of the voice command, the human-computer interaction device can be in the specific command feedback state of the voice control mode.
  • in the specific command feedback state of the voice control mode, the main purpose of the human-computer interaction device is to execute the action corresponding to the user's voice command, and it can give priority to processing external voice commands.
  • the human-machine interaction device may also be in the person-finding state of the voice control mode.
  • in the person-finding state of the voice control mode, the main purpose of the human-machine interaction device is to find the user, and it may give priority to processing external face events and voice input events.
  • when the human-machine interaction device detects that it is picked up, the human-machine interaction device can be in the picked-up state of the voice control mode.
  • in the picked-up state of the voice control mode, the main purpose of the human-machine interaction device is to restore its posture to normal, and it can give priority to processing events that restore its posture to normal.
  • when the human-machine interaction device detects that it has fallen down, the human-machine interaction device may be in the falling state of the voice control mode. In the falling state of the voice control mode, the main purpose of the human-machine interaction device is to restore its posture to normal, and it may give priority to processing events that restore its posture to normal.
  • the standby mode may include a standby state, a stroking state, an obstacle state, a picking up state, a falling state, and the like.
  • the standby state in the standby mode is the default or normal interactive state of the human-machine interaction device in the standby mode, and the standby purpose is autonomous navigation and/or observation of the external environment.
  • when the human-machine interaction device is in the standby state of the standby mode, it can navigate autonomously and process events in the external environment.
  • when the human-machine interaction device detects the user's stroking action, the human-machine interaction device may be in the stroking state of the standby mode. In the stroking state of the standby mode, the main purpose of the human-machine interaction device is to interact with the user.
  • when the human-machine interaction device detects that it is picked up, the human-machine interaction device can be in the picked-up state of the standby mode.
  • in the picked-up state of the standby mode, the main purpose of the human-machine interaction device is to restore its posture to normal, and it can give priority to processing events that restore its posture to normal.
  • when the human-machine interaction device detects that it has fallen down, the human-machine interaction device may be in the falling state of the standby mode. In the falling state of the standby mode, the main purpose of the human-machine interaction device is to restore its posture to normal, and it may give priority to processing events that restore its posture to normal.
  • when the human-machine interaction device detects an obstacle event, the human-machine interaction device may be in the obstacle state of the standby mode. In the obstacle state of the standby mode, the main purpose of the human-machine interaction device is to determine whether to avoid the obstacle or interact with the obstacle according to the type of the obstacle.
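  • The mode/state structure described above can be summarized as a simple mapping (names follow this description; the data structure itself is only an illustration):
```python
INTERACTION_MODES = {
    "standby_mode":       ["standby", "stroking", "obstacle", "picked_up", "fallen"],
    "humanoid_mode":      ["standby", "following", "stroking", "searching",
                           "gesture", "attracting", "picked_up", "fallen"],
    "face_mode":          ["standby", "stroking", "searching", "gesture",
                           "attracting", "picked_up", "fallen"],
    "voice_control_mode": ["standby", "incomprehensible", "specific_command_feedback",
                           "person_finding", "picked_up", "fallen"],
}

def is_valid_state(mode: str, state: str) -> bool:
    """Check that a target interaction state exists in the given interaction mode."""
    return state in INTERACTION_MODES.get(mode, [])
```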
  • S220 is described in detail below by taking Case 1, Case 2, and Case 3 of the first event as examples.
  • Case 1: the first event is an interaction state switching event.
  • the method 200 further includes S230 , which is executed between S210 and S220 .
  • S230 is described in detail below.
  • S230 Switch from the second interaction state in the first interaction mode to the first interaction state in the first interaction mode according to the first event.
  • the first interaction state and the second interaction state are respectively two different interaction states in the first interaction mode.
  • the first event may be an abnormal posture event, such as a pick-up event or a fall event, etc.
  • when the human-computer interaction device detects an abnormal posture event, it may switch from any other interaction state (the second interaction state) to an abnormal posture state (the first interaction state) to prevent the abnormal posture event from causing harm to the human-computer interaction device itself or the user.
  • the human-computer interaction device may first switch from the standby state (an example of the second interaction state) to the pick-up state (an example of the first interaction state). Then, according to the standby mode and the pick-up state in the standby mode, determining the first processing method of the pick-up event may include: twisting the body slightly to signal the user to put it down.
  • the human-computer interaction device may first switch from the standby state (an example of the second interaction state) to the fall state (an example of the first interaction state). Then, according to the standby mode and the fall state in the standby mode, determining the first processing method for the fall event may include: prompting the user by voice that it has entered the fall state and, further, entering the dormant state after the voice prompt; or standing up again by itself.
  • the human-computer interaction device may first switch from the gesture state (an example of the second interaction state) to the pick-up state (an example of the first interaction state). Then, according to the humanoid mode (or face mode) and the pick-up state in the humanoid mode (or face mode), determining the first processing method for the pick-up event may include: twisting the body slightly to signal the user to put it down.
  • the human-computer interaction device may first switch from the gesture state (an example of the second interaction state) to the falling state (an example of the first interaction state). Then, according to the humanoid mode (or face mode) and the falling state in the humanoid mode (or face mode), determining the first processing method for the falling event may include: prompting the user by voice that it has entered the falling state and, further, entering the dormant state after the voice prompt; or standing up again by itself.
  • the human-computer interaction device may first switch from the specific command state (an example of the second interaction state) to the pick-up state (an example of the first interaction state). Then, according to the voice control mode and the pick-up state in the voice control mode, determining the first processing method for the pick-up event may include: twisting the body slightly to signal the user to put it down.
  • the human-computer interaction device may first switch from the specific command state (an example of the second interaction state) to the fall state (an example of the first interaction state). Then, according to the voice control mode and the fall state in the voice control mode, determining the first processing method for the fall event may include: giving the user a voice prompt that it has entered the fall state and, further, entering a dormant state after the voice prompt; or standing up again by itself.
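  • Case 1 can be sketched as a state switch followed by mode/state-specific processing; the bodies below mirror the pick-up and fall examples above, and the device helper names (twist_body_gently(), voice_prompt(), try_stand_up()) are hypothetical placeholders.
```python
def handle_state_switch_event(device, event: str) -> None:
    """Case 1: switch only the interaction state; the interaction mode is kept."""
    if event == "pick_up_event":
        device.state = "picked_up"          # second interaction state -> first
        device.twist_body_gently()          # signal the user to put the device down
    elif event == "fall_event":
        device.state = "fallen"
        device.voice_prompt("fall state")   # tell the user it has fallen over
        device.try_stand_up()               # or enter a dormant state instead
```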
  • the first event may also be an event not related to abnormal posture.
  • the first event may be an event related to the standby purpose in a specific interaction mode.
  • for example, the first interaction mode is the face mode, and the second interaction state is the standby state.
  • the purpose of the standby state is to interact with the user at close range, so the events related to the standby purpose may include touch events, gesture events, face disappearance events, etc.
  • when the first event is a gesture event, the human-computer interaction device can switch from the standby state (an example of the second interaction state) to the gesture state (an example of the first interaction state).
  • the human-computer interaction device will perform corresponding actions according to the user's specific gestures. For example, when the user's fist bump gesture is detected, the human-computer interaction device can perform a fist bump action.
  • the first interaction mode is the standby mode and the second interaction state is the standby state.
  • the standby purpose is autonomous navigation and/or observation of the external environment, so the events related to the standby purpose may include obstacle events, empty field events, touch events, etc.
  • when the first event is an obstacle event, the human-computer interaction device can decide whether to interact with the obstacle according to the type of obstacle. If the obstacle type meets the interaction condition, the human-computer interaction device can switch from the standby state (an example of the second interaction state) to the obstacle state (an example of the first interaction state). In the obstacle state, the human-computer interaction device can push the obstacle to play.
  • for another example, the first interaction mode is the humanoid mode, and the second interaction state is the standby state.
  • the purpose of the standby state is to attract the user to approach and interact with the user. Therefore, events related to the standby purpose may include humanoid moving away events, humanoid disappearing events, gesture events, etc.
  • when the first event is a humanoid moving away event, the human-computer interaction device may switch from the standby state (an example of the second interaction state) to the following state (an example of the first interaction state). In the following state, the human-computer interaction device will follow the user's movements to attract the user's attention.
  • for another example, the first interaction mode is the voice control mode, and the second interaction state is the standby state.
  • the purpose of the standby state is to wait to receive the user's voice command, so the events related to the standby purpose may include specific command events, command incomprehensible events, and voice-command-not-received events.
  • when the first event is a specific command event, the human-computer interaction device can switch from the standby state (an example of the second interaction state) to the specific command state (an example of the first interaction state).
  • in the specific command state, the human-computer interaction device can perform corresponding actions according to the specific content of the voice command issued by the user.
  • Case 2: the first event is an interaction mode switching event.
  • the method 200 further includes S240 , which is executed between S210 and S220 .
  • S240 is described in detail below.
  • S240 Switch from the third interaction state in the second interaction mode to the first interaction state in the first interaction mode according to the first event, wherein the second interaction mode is one of the plurality of interaction modes.
  • when the first event is an interaction mode switching event, the first event may be, but is not limited to, a close-range face event, a long-distance face event, a human shape event, a voice input event, or an application manipulation event.
  • for example, when the first event is a close-range face event, the human-computer interaction device may first switch from the stroking state or the obstacle state (examples of the third interaction state) of the standby mode (an example of the second interaction mode) to the standby state (an example of the first interaction state) of the face mode (an example of the first interaction mode). Then, according to the face mode and the standby state in the face mode, the processing method of the close-range face event is determined to include: comparing the detected face information with the recorded face information to confirm whether the face is recognized. If the face information is recognized, the human-computer interaction device can give feedback expressing happiness; if the face information is unknown, the human-computer interaction device can give feedback expressing doubt.
  • for another example, the human-computer interaction device may first switch from the stroking state (an example of the third interaction state) in the humanoid mode (an example of the second interaction mode) to the first-time-face-seen state (an example of the first interaction state) in the face mode (an example of the first interaction mode).
  • then, according to the face mode and the first-time-face-seen state in the face mode, the processing method for the close-range face event is determined to include: comparing the detected face information with the recorded face information to confirm whether the face is recognized; if the face information is recognized, the human-computer interaction device may give feedback expressing happiness, and if the face information is unknown, the human-computer interaction device may give feedback expressing doubt.
  • for another example, when the first event is a long-distance face event, the human-computer interaction device may first switch from the searching state (an example of the third interaction state) of the face mode (an example of the second interaction mode) to the standby state (an example of the first interaction state) of the humanoid mode (an example of the first interaction mode). Then, according to the humanoid mode and the standby state in the humanoid mode, determining the processing method of the long-distance face event may include: attracting the user to interact with the human-computer interaction device.
  • the human-computer interaction device may perform larger or more obvious actions, such as singing, dancing, turning in circles, raising its legs to say hello, etc., so as to more easily attract the attention of distant users and attract users to get closer.
  • the bionic effect of the human-computer interaction device is improved, which helps to improve the user experience.
  • the human-computer interaction device may first switch from the standby state (an example of the third interaction state) of the face mode (an example of the second interaction mode) to the standby state (an example of the first interaction state) of the humanoid mode (an example of the first interaction mode). Then, according to the humanoid mode and the standby state in the humanoid mode, determining the processing method of the humanoid event may include: attracting the user to interact with the human-computer interaction device.
  • the human-computer interaction device may perform larger or more obvious actions, such as singing, dancing, turning in circles, raising its legs to say hello, etc., so as to more easily attract the attention of distant users and attract users to get closer.
  • the bionic effect of the human-computer interaction device is improved, which helps to improve the user experience.
  • the human-computer interaction device may first switch from the search state (an example of the third interaction state) of the face mode (an example of the second interaction mode) to the standby state (an example of the first interaction state) of the voice control mode (an example of the first interaction mode). Then, according to the voice control mode and the standby state in the voice control mode, determining the processing method of the voice input event may include: executing an action corresponding to the voice input event. For example, if the voice input event is "hello, loona", the action corresponding to the voice input event may include: tilting the head and tilting the ears to make a listening action. Thereby improving the bionic effect of the human-computer interaction device and helping to improve the user experience.
  • the human-computer interaction device may first switch from the standby state (an example of the third interaction state) of the standby mode (an example of the second interaction mode) to the standby state (an example of the first interaction state) of the application control mode (an example of the first interaction mode). Then, according to the application control mode and the standby state in the application control mode, determining the processing method of the application control event may include: waiting with ears pricked up, or spinning in circles excitedly.
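  • Case 2 can be sketched as a combined mode-and-state switch followed by processing; the close-range face branch mirrors the recognition feedback described above, and the device helper names (recorded_faces, show_expression(), tilt_head_and_listen()) are hypothetical placeholders.
```python
def handle_mode_switch_event(device, event: str, face_info=None) -> None:
    """Case 2: switch both the interaction mode and the interaction state."""
    if event == "close_range_face_event":
        device.mode, device.state = "face_mode", "standby"
        if face_info in device.recorded_faces:    # recognized face -> happy feedback
            device.show_expression("happy")
        else:                                     # unknown face -> puzzled feedback
            device.show_expression("doubt")
    elif event == "voice_input_event":
        device.mode, device.state = "voice_control_mode", "standby"
        device.tilt_head_and_listen()             # e.g. tilt head and ears to listen
```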
  • Case 3: the first event is another event. In this case, the method 200 does not need to perform any steps between S210 and S220.
  • in this case, the first interaction mode is the current interaction mode of the human-machine interaction device, and the first interaction state is the current interaction state of the human-machine interaction device.
  • other events may include events other than the first event described in Case 1 and Case 2, that is, other events include events other than the interactive state switching event and the interactive mode switching event.
  • for example, when the first event is a touch event and the human-computer interaction device is in the picked-up state, the first processing method for the touch event may include: flashing the indicator light on the ear, or ignoring the touch event and giving no response.
  • for another example, when the first event is an obstacle event and the human-computer interaction device is in the following state, the first processing method for the obstacle event may include: responding weakly or not responding to the obstacle event, bypassing the obstacle, and continuing to follow the user.
  • when the first event is an interaction state switching event, the human-computer interaction device does not need to switch its current first interaction mode, but only needs to switch its current second interaction state to the first interaction state, and then determine the first processing method of the first event according to the first interaction state and the first interaction mode.
  • when the first event is an interaction mode switching event, it is not only necessary to switch the current second interaction mode of the human-computer interaction device to the first interaction mode, but also to switch the current third interaction state of the human-computer interaction device to the first interaction state, and then determine the first processing method of the first event according to the first interaction state and the first interaction mode.
  • when the first event is another event (neither an interaction state switching event nor an interaction mode switching event), the human-computer interaction device does not need to switch its current first interaction mode, nor does it need to switch its current first interaction state, and it directly determines the first processing method of the first event according to the first interaction state and the first interaction mode. In this way, compared with existing solutions that use the same feedback for the same event, in the above method 200 the human-computer interaction device can flexibly process the event based on the interaction mode and the interaction state under that interaction mode, thereby improving the accuracy of the human-computer interaction device's analysis of the user's behavioral intention or external environment, improving the interaction effect between the user and the human-computer interaction device, and helping to improve the user experience.
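  • Putting the three cases together, the overall flow of method 200 might look like the following sketch; the event names and the target mode/state mappings follow the examples above, while the handlers argument stands in for the (mode, state, event) dispatch illustrated earlier. This is an illustrative reading, not the patent's implementation.
```python
STATE_SWITCH_EVENTS = {"pick_up_event": "picked_up", "fall_event": "fallen"}
MODE_SWITCH_EVENTS = {
    "close_range_face_event":         ("face_mode", "standby"),
    "long_range_face_event":          ("humanoid_mode", "standby"),
    "human_shape_event":              ("humanoid_mode", "standby"),
    "voice_input_event":              ("voice_control_mode", "standby"),
    "application_manipulation_event": ("app_control_mode", "standby"),
}

def process_event(device, event: str, handlers: dict) -> None:
    """S210/S220: after an event is detected, switch state and/or mode if needed,
    then pick the processing method from the resulting (mode, state, event)."""
    if event in STATE_SWITCH_EVENTS:          # Case 1: switch interaction state only
        device.state = STATE_SWITCH_EVENTS[event]
    elif event in MODE_SWITCH_EVENTS:         # Case 2: switch mode and state
        device.mode, device.state = MODE_SWITCH_EVENTS[event]
    # Case 3 (other events): no switch; fall through to the common dispatch.
    handler = handlers.get((device.mode, device.state, event),
                           lambda dev: None)  # weak or no feedback by default
    handler(device)
```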
  • FIG. 5 is a schematic structural diagram of an example of a human-computer interaction device 300 provided in an embodiment of the present application.
  • the human-computer interaction device 300 has multiple interaction modes, each of which includes at least one interaction state, and the interaction state is used to indicate the state of the human-computer interaction device 300 in the process of interacting with the outside world.
  • the human-computer interaction device 300 includes a detection unit 310 and a processing unit 320.
  • the detection unit 310 is used to detect a first event;
  • the processing unit 320 is used to determine a first processing method for the first event according to the first interaction mode of the human-computer interaction device and the first interaction state in the first interaction mode, wherein the first interaction mode is one of the multiple interaction modes.
  • the human-computer interaction device 300 further includes a first switching unit, configured to switch from the second interaction state in the first interaction mode to the first interaction state in the first interaction mode according to the first event.
  • the first event may be a pick-up event or a fall-down event.
  • the human-computer interaction device 300 further includes a second switching unit for switching from a third interaction state in the second interaction mode to the first interaction state in the first interaction mode according to the first event, the second interaction mode being one of the multiple interaction modes.
  • in this example, the first event may be a close-up face event, a long-distance face event, a human shape event, a voice input event, or an application control event. A structural sketch of the units of device 300 follows.
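  • purely as a structural illustration, the units of device 300 could be organized as in the sketch below; the class names, the hard-coded event, and the placeholder decision policy are assumptions, since the application defines the units functionally rather than as concrete classes.

    class DetectionUnit:
        def detect(self) -> str:
            # A real device would read its sensors, camera, and microphone here.
            return "voice_input_event"

    class FirstSwitchingUnit:
        """Switches the interaction state within the current interaction mode."""
        def switch(self, device: "Device300", new_state: str) -> None:
            device.state = new_state

    class SecondSwitchingUnit:
        """Switches both the interaction mode and the interaction state."""
        def switch(self, device: "Device300", new_mode: str, new_state: str) -> None:
            device.mode, device.state = new_mode, new_state

    class ProcessingUnit:
        def decide(self, mode: str, state: str, event: str) -> str:
            # Placeholder policy; a real device would consult its (mode, state, event) rules.
            return f"handle {event} according to ({mode}, {state})"

    class Device300:
        def __init__(self) -> None:
            self.mode, self.state = "standby_mode", "standby_state"
            self.detection = DetectionUnit()
            self.processing = ProcessingUnit()
            self.first_switch = FirstSwitchingUnit()
            self.second_switch = SecondSwitchingUnit()

        def step(self) -> str:
            event = self.detection.detect()
            if event == "voice_input_event":  # treated here as a mode-switching event
                self.second_switch.switch(self, "voice_control_mode", "standby_state")
            return self.processing.decide(self.mode, self.state, event)

    print(Device300().step())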
  • FIG. 6 shows a schematic structural diagram of a human-computer interaction device 400 provided in an embodiment of the present application.
  • the human-computer interaction device 400 includes one or more processors 410 and one or more memories 420; the one or more memories 420 store one or more computer programs, and the one or more computer programs include instructions.
  • when the instructions are executed by the one or more processors 410, the human-computer interaction device 400 executes the above method 200.
  • the embodiment of the present application provides a computer program product, which, when executed in a human-computer interaction device, enables the human-computer interaction device to execute the above method 200. Its implementation principle and technical effect are similar to those of the above method 200, and will not be described in detail here.
  • the embodiment of the present application provides a computer-readable storage medium, which contains instructions.
  • when the instructions are executed on a device, the device executes the above method 200.
  • the implementation principle and technical effect are similar and will not be described here.
  • the embodiment of the present application provides a chip that is used to execute instructions; when the chip runs, the above method 200 is executed.
  • the implementation principle and technical effect are similar, and will not be repeated here.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of the units is only a division by logical function; in actual implementation there may be other ways of division, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • in addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or take other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, may be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions for a computer device (which can be a personal computer, server, or network device, etc.) to perform all or part of the steps of the methods described in each embodiment of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

一种人机交互装置(300)及人机交互方法,该人机交互装置(300)具有多个交互模式,每个所述交互模式包括至少一个交互状态,所述交互状态用于指示所述人机交互装置(300)与外界交互的过程中所处的状态,所述人机交互装置(300)包括:检测单元(310),用于检测第一事件;处理单元(320),用于根据所述人机交互装置(300)所处的第一交互模式和所述第一交互模式下的第一交互状态,确定对所述第一事件的第一处理方式,其中,所述第一交互模式为所述多个交互模式之一。这样,该人机交互装置(300)能够灵活地根据交互模式和交互状态对检测到的事件进行处理,提高了用户与人机交互装置(300)的互动效果,有助于提升用户体验。

Description

人机交互装置及人机交互方法 技术领域
本申请实施例涉及人工智能技术领域,并且更具体地,涉及一种人机交互装置及人机交互方法。
背景技术
随着人工智能技术的不断地发展,机器人的种类越来越多。其中,家庭机器人是比较重要的一种机器人,其可以通过与用户之间的交互,来增加用户的幸福感、减轻用户的压力。
然而,现有的家庭机器人与用户之间的交互均是预先写好且不再产生变化的,即,现有的家庭机器人只能机械化地进行人机交互,使得用户体验较差。
发明内容
本申请实施例提供了一种人机交互装置及人机交互方法,该人机交互装置能够灵活地根据交互模式和交互状态对检测到的事件进行处理,提高用户与人机交互装置的互动效果,有助于提升用户体验。
第一方面,提供了一种人机交互装置,所述人机交互装置具有多个交互模式,每个所述交互模式包括至少一个交互状态,所述交互状态用于指示所述人机交互装置与外界交互的过程中所处的状态,所述人机交互装置包括:检测单元,用于检测第一事件;处理单元,用于根据所述人机交互装置所处的第一交互模式和所述第一交互模式下的第一交互状态,确定对所述第一事件的第一处理方式,其中,所述第一交互模式为所述多个交互模式之一。
相比于对同一事件都采用相同反馈的现有方案而言,本申请实施例能够灵活地基于交互模式和该交互模式下的交互状态下,对事件进行处理,进而可以提升人机交互装置对用户的行为意图或外界环境分析的准确度,提高用户与人机交互装置的互动效果,有助于提升用户体验。
在一种可实现的方式中,所述人机交互装置还包括:第一切换单元,用于根据所述第一事件,从所述第一交互模式下的第二交互状态切换至所述第一交互模式下的所述第一交互状态。
此时,该第一事件可以称为交互状态切换事件。例如,该第一事件可以为拿起事件或倒地事件等。
人机交互装置在检测到第一事件后,根据人机交互装置对用户的行为意图或外界环境的分析,不切换当前的交互模式仅切换当前的交互状态,并根据切换后的交互状态以及当前的第一交互模式确定对第一事件的处理方式,可以进一步提高用户与人机交互装置的互动效果,进一步有助于提升用户体验。
在另一种可实现的方式中,所述人机交互装置还包括:第二切换单元,用于根据所述 第一事件,从第二交互模式下的第三交互状态切换至所述第一交互模式下的所述第一交互状态,所述第二交互模式为所述多个交互模式之一。
此时,该第一事件可以称为交互模式切换事件。例如,该第一事件为近距离人脸事件、远距离人脸事件、人形事件、语音输入事件、或操控应用程序事件。
人机交互装置在检测到第一事件后,根据人机交互装置对用户的行为意图或外界环境的分析,不仅换当前的交互模式同时也切换当前的交互状态,并根据切换后的交互状态以及切换后的第一交互模式确定对第一事件的处理方式,可以进一步提高用户与人机交互装置的互动效果,进一步有助于提升用户体验。
进一步提升人机交互装置对用户的行为意图或外界环境分析的准确度,提高用户与人机交互装置的互动效果,进一步有助于提升用户体验。
第二方面,提供了一种人机交互方法,所述人机交互方法应用于人机交互装置,所述人机交互装置具有多个交互模式,每个所述交互模式包括至少一个交互状态,所述交互状态用于指示所述人机交互装置与外界交互的过程中所处的状态,所述人机交互方法包括:检测第一事件;根据所述人机交互装置所处的第一交互模式和所述第一交互模式下的第一交互状态,确定对所述第一事件的第一处理方式,其中,所述第一交互模式为所述多个交互模式之一。
在一种可实现的方式中,所述方法还包括:根据所述第一事件,从所述第一交互模式下的第二交互状态切换至所述第一交互模式下的所述第一交互状态。
示例性地,所述第一事件为拿起事件或倒地事件。
在另一种可实现的方式中,所述方法还包括:根据所述第一事件,从第二交互模式下的第三交互状态切换至所述第一交互模式下的所述第一交互状态,所述第二交互模式为所述多个交互模式之一。
示例性地,所述第一事件为近距离人脸事件、远距离人脸事件、人形事件、语音输入事件、或操控应用程序事件。
第二方面中任一项可能的实现方式的技术效果可以参考相应第一方面的实现方式的技术效果,这里不再赘述。
第三方面,提供了一种人机交互装置,包括:一个或多个处理器;一个或多个存储器;以及一个或多个计算机程序,其中所述一个或多个计算机程序被存储在所述一个或多个存储器中,所述一个或多个计算机程序包括指令,当所述指令被所述一个或多个处理器执行时,使得所述人机交互装置执行如第二方面或第二方面中的任一可能实现方式中所述的人机交互方法。
第四方面,提供了一种计算机可读存储介质,包括计算机指令,当所述计算机指令在人机交互装置上运行时,使得所述人机交互装置执行如第二方面或第二方面中的任一可能实现方式中所述的人机交互方法。
第五方面,提供了一种芯片,包括至少一个处理器和接口电路,所述接口电路用于为所述至少一个处理器提供程序指令或者数据,所述至少一个处理器用于执行所述程序指令,以实现如第二方面或第二方面中的任一可能实现方式中所述的人机交互方法。
第六方面,提供了一种计算机程序产品,所述计算机程序产品包括计算机指令;当部分或全部所述计算机指令在计算机上运行时,使得第二方面或第二方面中的任一可能实现方式中所述的人机交互方法被执行。
附图说明
图1是本申请实施例提供的一例人机交互装置的硬件结构示意图。
图2是本申请实施例提供的一例人机交互方法的示意性流程图。
图3是本申请实施例提供的另一例人机交互方法的示意性流程图。
图4是本申请实施例提供的又一例人机交互方法的示意性流程图。
图5是本申请实施例提供的一例人机交互装置的示意性结构图。
图6是本申请实施例提供的另一例人机交互装置的示意性结构图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行描述。
在本申请实施例的描述中,除非另有说明,“/”表示或的意思,例如,A/B可以表示A或B;本文中的“和/或”仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,在本申请实施例的描述中,“复数个”或者“多个”是指两个或多于两个。
以下,术语“第一”、“第二”仅用于描述目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”的特征可以明示或者隐含地包括一个或者更多个该特征。在本实施例的描述中,除非另有说明,“多个”的含义是两个或两个以上。
示例性地,图1示出了本申请实施例提供的一例人机交互装置100的结构示意图
例如,如图1所示,人机交互装置100可以包括处理器110,执行器111,外部存储器接口120,内部存储器121,通用串行总线(universal serial bus,USB)接口130,充电管理模块140,电源管理模块141,电池142,天线,无线通信模块150,传感器模块160,音频模块170,扬声器170A,麦克风170B,摄像头180,显示屏190等。
可以理解的是,本申请实施例示意的结构并不构成对人机交互装置100的具体限定。在本申请另一些实施例中,人机交互装置100可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
处理器110可以包括一个或多个处理单元,例如:处理器110可以包括图形处理器(graphics processing unit,GPU),控制器,存储器,等。其中,不同的处理单元可以是独立的器件,也可以集成在一个或多个处理器中。
其中,控制器可以是人机交互装置100的神经中枢和指挥中心。控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。
存储器用于存储指令和数据。在一些实施例中,处理器110中的存储器为高速缓冲存储器。该存储器可以保存处理器110刚用过或循环使用的指令或数据。如果处理器110需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器110的等待时间,因而提高了系统的效率。
在一些实施例中,处理器110可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,集成电路内置音频(inter-integrated circuit sound,I2S)接口,脉冲编码调制(pulse code modulation,PCM)接口,通用异步收发传输器(universal  asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,和/或USB接口等。其中,I2C接口是一种双向同步串行总线,包括一根串行数据线(serial data line,SDA)和一根串行时钟线(derail clock line,SCL)。I2S接口可以用于音频通信。在一些实施例中,处理器110可以包含多组I2S总线。处理器110可以通过I2S总线与音频模块170耦合,实现处理器110与音频模块170之间的通信。PCM接口也可以用于音频通信,将模拟信号抽样,量化和编码。在一些实施例中,音频模块170与无线通信模块150可以通过PCM总线接口耦合。UART接口是一种通用串行数据总线,用于异步通信。该总线可以为双向通信总线。它将要传输的数据在串行通信与并行通信之间转换。在一些实施例中,UART接口通常被用于连接处理器110与无线通信模块150。MIPI接口可以被用于连接处理器110与显示屏190,摄像头180等外围器件。GPIO接口可以通过软件配置。GPIO接口可以被配置为控制信号,也可被配置为数据信号。在一些实施例中,GPIO接口可以用于连接处理器110与摄像头180,显示屏190,无线通信模块150,传感器模块160,音频模块170等。
可以理解的是,本申请实施例示意的各模块间的接口连接关系,只是示意性说明,并不构成对人机交互装置100的结构限定。在本申请另一些实施例中,人机交互装置100也可以采用上述实施例中不同的接口连接方式,或多种接口连接方式的组合。
执行器111用于控制人机交互装置100移动、旋转、跳跃等。可选地,在一些实施例中,若人机交互装置100包括耳朵、躯干和腿部,执行器111还用于控制躯干相对于腿部转动、腿部相对于躯干转动、躯干摇晃、或耳朵沿躯干旋转等。可选地,在一些实施例中,执行器111可以包括至少一个电机。
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展人机交互装置100的存储能力。
内部存储器121可以用于存储计算机可执行程序代码,所述可执行程序代码包括指令。处理器110通过运行存储在内部存储器121的指令,从而执行人机交互装置100的各种功能应用以及数据处理。内部存储器121可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统,至少一个功能所需的应用程序(比如声音播放功能,图像播放功能等)等。存储数据区可存储人机交互装置100使用过程中所创建的数据(比如音频数据等)等。此外,内部存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。
USB接口130是符合USB标准规范的接口,具体可以是Mini USB接口,Micro USB接口,USB Type C接口等。USB接口130可以用于连接充电器为人机交互装置100充电,也可以用于人机交互装置100与外围设备之间传输数据。
充电管理模块140用于从充电器接收充电输入。其中,充电器可以是无线充电器,也可以是有线充电器。在一些有线充电的实施例中,充电管理模块140可以通过USB接口130接收有线充电器的充电输入。在一些无线充电的实施例中,充电管理模块140可以通过人机交互装置100的无线充电线圈接收无线充电输入。充电管理模块140为电池142充电的同时,还可以通过电源管理模块141为电子设备供电。电源管理模块141用于连接电池142,充电管理模块140与处理器110。
无线通信模块150可以提供应用在人机交互装置100上的包括无线局域网(wireless  local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT)等无线通信的解决方案。
在一些实施例中,人机交互装置100的天线和无线通信模块150耦合,使得人机交互装置100可以通过无线通信技术与网络以及其他设备通信。
传感器模块160可以包括至少一个传感器。例如,传感器模块160包括触摸传感器、距离传感器、姿态传感器等。在一些实施例中,触摸传感器为电容传感器,可以设置于人机交互装置的头顶、颈部、背部、腹部等位置,用于感知用户的抚摸、轻拍等交互动作。距离传感器用于测量人机交互装置与外界环境物体或用户之间的距离。姿态传感器为陀螺仪,用于感知人机交互装置的姿态变化。
音频模块170用于将数字音频信息转换成模拟音频信号输出,也用于将模拟音频输入转换为数字音频信号。音频模块170还可以用于对音频信号编码和解码。在一些实施例中,音频模块170可以设置于处理器110中,或将音频模块170的部分功能模块设置于处理器110中。扬声器170A,也称“喇叭”,用于将音频电信号转换为声音信号。麦克风170B,也称“话筒”,“传声器”,用于将声音信号转换为电信号。
人机交互装置100可以通过音频模块170,扬声器170A,麦克风170B,以及处理器110等实现音频功能。例如语音播放,录音等。
摄像头180用于捕获静态图像或视频,以便处理器110可以根据摄像头180获取的图像或视频进行事件的检测,从而可以对事件进行处理等。
显示屏190用于显示图像,视频等。在一些实施例中,显示屏190可以显示表情动画,表现人机交互装置的当前情绪状态。
下面结合图2至图4,对本申请实施例提供的人机交互方法进行描述。
图2是本申请实施例提供的一例人机交互方法200的示意性流程图。
该图2所示的人机交互方法200可应用于如图1所示的人机交互装置100中。
例如,如图2所示,方法200包括S210和S220,S220在S210之后执行。下面详细介绍S210和S220。
S210,检测第一事件。
本申请实施例涉及的事件可以理解为人机交互装置的外界环境变化事件。
示例性地,事件可以包括但不限于障碍物事件、人脸事件、近距离人脸事件、远距离人脸事件、抚摸事件、手势事件、操控应用程序事件、语音输入事件、桌面边缘事件、人形事件、拿起事件、倒地事件等。
示例性地,人机交互装置可以包括探测单元(如距离传感器、摄像头等),人机交互装置通过探测单元,来确定是否存在障碍物事件。
例如,人机交互装置会通过距离传感器周期性地发送探测信号,根据人机交互装置接收到探测信号反射回来的信号,和/或结合摄像头采集的静态图像或视频,人机交互装置可以探测到外界环境中是否存在障碍物,此时人机交互装置便检测到了障碍物事件。
进一步地,人机交互装置在确定存在障碍物事件后,还可以根据障碍物与人机交互装置之间的距离,来确定是突然障碍物事件还是趋近型障碍物事件。具体的,当障碍物与人机交互装置之间的距离小于或等于第一距离阈值(例如20cm),可确定是突然障碍物事件;当障碍物与人机交互装置之间的距离大于第一距离阈值,可确定是趋近型障碍物事件。
示例性地,人机交互装置可以包括摄像头,通过摄像头采集图像或视频,并根据采集 的图像或视频,确定是否存在人脸事件。
例如,人机交互装置可以在图像上从左往右、从上往下依次选择目标矩形区域,将目标矩形区域作为一个观察窗口。然后,提取每个观察窗口对应的图像区域中的特征,并根据提取的特征判断是否为人脸对应的特征。若提取的特征与人脸对应的特征匹配,那么可认为人机交互装置的外界环境中存在人脸,此时人机交互装置便检测到了人脸事件。其中,目标矩形区域的大小可以是基于统计的多个人脸的大小而确定的。
示例性地，当人机交互装置检测到人脸，可以进一步根据人脸与人机交互装置之间的距离，确定是近距离人脸事件还是远距离人脸事件。
例如,当人机交互装置检测到人脸,且该人脸与人机交互装置之间的距离小于或等于第二距离阈值时,可确定存在近距离人脸事件。
又例如,当人机交互装置检测到人脸,且该人脸与人机交互装置之间的距离大于第二距离阈值时,可确定存在远距离人脸事件。
示例性地,人机交互装置可以包括触摸传感器,人机交互装置通过触摸传感器检测到的信号,来确定是否存在抚摸事件。
例如,若触摸传感器为电容传感器,在某个事件检测到了电容信号变化,且电容信号大于阈值1,那么可认为用户抚摸了人机交互装置,此时人机交互装置便检测到了该抚摸事件。其中,阈值1可以是基于多个抚摸事件对应的多个电容信号确定的。
示例性地,人机交互装置可以通过摄像头采集图像或视频,并根据采集的图像或视频,确定是否存在手势事件。
例如,人机交互装置先提取图像中的特征,然后将图像中的特征与预设的手势特征(可以是存储在人机交互装置中)进行匹配,若图像中的特征与预设的手势特征匹配,那么可认为用户在向人机交互装置做手势,此时人机交互装置便检测到了该手势事件。
示例性地,人机交互装置可以根据与应用程序相关的控件,来确定是否存在操控应用程序事件。
例如,人机交互装置在接收到用户操控与应用程序相关的控件的操作指令时,那么此时人机交互装置便检测到了操控应用程序事件。
示例性地,人机交互装置可以包括麦克风,进而人机交互装置根据麦克风是否获取到外界环境中的声音,来确定是否存在语音输入事件。
例如,当人机交互装置的麦克风获取到外界环境中的声音后,此时人机交互装置便检测到了语音输入事件。
示例性地,人机交互装置的底部可以设置多个光传感器,这样,可以根据每个光传感器采集的光强,确定是否存在桌面边缘事件。
例如,在人机交互装置的多个光传感器中,存在一个光传感器采集的光强小于阈值3、存在一个光传感器采集的光强大于阈值4,那么可认为人机交互装置处于桌面边缘,此时人机交互装置便检测到了桌面边缘事件。其中,阈值3可以是基于当人机交互装置静止站立在桌面且底部平行于桌面时,光传感器采集到的光强、以及当人机交互装置在桌面上行走时,光传感器采集到的光强确定的。阈值4可以是基于当人机交互装置在桌面上行走时,光传感器采集到的光强确定的。
示例性地,人机交互装置可以通过摄像头采集图像或视频,并根据采集的图像或视频,确定是否存在人形事件。
例如，人机交互装置先提取图像中的特征，然后将图像中的特征与人体的关节特征、人体的躯干特征（可以是存储在人机交互装置中）进行匹配，若图像中的特征与预设的关节特征、躯干特征匹配，那么可认为人机交互装置的外界环境中存在用户，且用户离该人机交互装置不是很近（未检测到人脸），此时人机交互装置便检测到了该人形事件。其中，人体的关节特征、人体的躯干特征可以是基于多个人体的关节特征、人体的躯干特征而确定的。
示例性地,人机交互装置可以设置姿态传感器,并根据姿态传感器检测的信号,确定是否存在拿起事件或倒地事件。
例如,根据姿态传感器检测到的姿态角信息,并结合摄像头采集到的外界环境信息,可以判断人机交互装置是否被用户拿起,此时人机交互装置便检测到了拿起事件。
例如,根据姿态传感器检测到的姿态角信息,并结合摄像头采集到的外界环境信息,可以判断人机交互装置是否倒在地面上,此时人机交互装置便检测到了倒地事件。
S220,根据人机交互装置所处的第一交互模式和第一交互模式下的第一交互状态,确定对第一事件的第一处理方式。
可选地,在一些实施例中,为了提高用户与人机交互装置的交互效率,可以将该人机交互装置的交互模式设置为多个交互模式。
示例性地,人机交互装置的交互模式可以包括但不限于待机模式、人形模式、人脸模式、应用程序控制模式、语音控制模式、桌面模式等。
人机交互装置具有多个交互模式,第一交互模式是人机交互装置的多个交互模式之一,其具体是多个交互模式中的哪个交互模式是基于人机交互装置当前的外界环境而定。例如,人形模式可以是人机交互装置在检测到其外界环境中存在人形的情况下的交互模式。人脸模式可以是人机交互装置在检测到其外界环境中存在近距离人脸的情况下的交互模式。应用程序控制模式可以是人机交互装置在检测到应用程序被操控的情况下的交互模式。语音控制模式可以是人机交互装置在检测到语音的情况下的交互模式。桌面模式可以是人机交互装置在检测到其外界环境中存在桌面的情况下的交互模式。待机模式可以是人机交互装置默认或常规的交互模式。
人机交互装置的每个交互模式包括至少一个交互状态,交互状态用于指示人机交互装置与外界交互的过程中所处的状态。第一交互状态是第一交互模式下的交互状态,当第一交互模式下有多个交互状态时,第一交互状态是第一交互模式中的哪个交互状态是基于人机交互装置当前的外界环境而定。
在一个示例中,当第一交互模式是人形模式,人形模式可以包括待机状态、跟随状态、抚摸状态、找人状态、手势状态、一个或多个勾人状态、拿起状态、倒地状态等。
其中,人形模式下的待机状态是人机交互装置在人形模式下默认或常规的交互状态,其待机目的是吸引用户靠近并与其进行交互。当人机交互装置处于人形模式的待机状态,若检测到与其待机目的相关的事件时(如人形远离事件、人形消失事件、抚摸事件、手势事件等),人机交互装置将由待机状态切换至同模式下的相应交互状态。
当人机交互装置检测到人形、且人形与其越来越远时,人机交互装置可进入人形模式的跟随状态。在人形模式的跟随状态下,人机交互装置主要的目的是跟随用户,对于干扰事件(如障碍物事件)将进行弱反馈或无反馈。
当人机交互装置检测到用户抚摸自己的动作时,人机交互装置可处于人形模式的抚摸 状态。在人形模式的抚摸状态下,人机交互装置主要的目的是与用户进行互动。
当人机交互装置检测到人形消失时,人机交互装置可处于人形模式的找人状态。在人形模式的找人状态下,人机交互装置主要的目的寻找用户,其可优先对外界的人形事件或人脸事件进行处理。
当人机交互装置检测到用户的特定手势时,人机交互装置可处于人形模式的手势状态。在人形模式的手势状态下,人机交互装置的主要目的是识别用户的互动手势,并做出相应反馈。
当人机交互装置想吸引与用户进行交互时,人机交互装置可处于人形模式的勾人状态。在人形模式的勾人状态下,人机交互装置主要的目的吸引用户与其进行交互。
当人机交互装置检测到其被拿起时,人机交互装置可处于人形模式的拿起状态。在人形模式的拿起状态下,人机交互装置主要的目的是让自己的姿态恢复正常,其可优先对其姿态恢复正常的事件进行处理。
当人机交互装置检测到其倒地时,人机交互装置可处于人形模式的倒地状态。在人形模式的倒地状态下,人机交互装置主要的目的是让自己的姿态恢复正常,其可优先对其姿态恢复正常的事件进行处理。
在另一个示例中,当第一交互模式是人脸模式,人脸模式可以包括待机状态、一个或多个抚摸状态、找人状态、手势状态、一个或多个勾人状态、拿起状态、倒地状态等。
其中,人脸模式下的待机状态是人机交互装置在人脸模式下默认或常规的交互状态,其待机目的是与用户进行近距离交互。当人机交互装置处于人脸模式的待机状态,若检测到与其待机目的相关的事件时(如抚摸事件、人脸消失事件、手势事件等),人机交互装置将由待机状态切换至同模式下的相应交互状态。
当人机交互装置检测到用户抚摸自己的动作时,人机交互装置可处于人脸模式的抚摸状态。在人脸模式的抚摸状态下,人机交互装置主要的目的是与用户进行互动。
当人机交互装置检测到人脸消失时,人机交互装置可处于人脸模式的找人状态。在人脸模式的找人状态下,人机交互装置主要的目的寻找用户,其可优先对外界的人脸事件进行处理。
当人机交互装置检测到用户的特定手势时,人机交互装置可处于人脸模式的手势状态。在人脸模式的手势状态下,人机交互装置的主要目的是识别用户的互动手势,并做出相应反馈。
当人机交互装置想吸引与用户进行交互时,人机交互装置可处于人脸模式的勾人状态。在人脸模式的勾人状态下,人机交互装置主要的目的吸引用户与其进行交互。
当人机交互装置检测到其被拿起时,人机交互装置可处于人脸模式的拿起状态。在人脸模式的拿起状态下,人机交互装置主要的目的是让自己的姿态恢复正常,其可优先对其姿态恢复正常的事件进行处理。
当人机交互装置检测到其倒地时,人机交互装置可处于人脸模式的倒地状态。在人脸模式的倒地状态下,人机交互装置主要的目的是让自己的姿态恢复正常,其可优先对其姿态恢复正常的事件进行处理。
在又一个示例中,当第一交互模式是语音控制模式,语音控制模式可以包括待机状态、听不懂状态、具体指令反馈状态、找人状态、拿起状态、倒地状态等。
其中,语音控制模式下的待机状态是人机交互装置在语音控制模式下默认或常规的交 互状态,其待机目的是等待用户发出语音指令。当人机交互装置处于语音控制模式的待机状态,其可优先对外界的语音输入事件进行处理。
当人机交互装置检测到用户的语音指令,但是,对该语音指令进行语义分析后,未得到相应的语音分析结果时,人机交互装置可处于语音控制模式的听不懂状态。在语音控制模式的听不懂状态下,人机交互装置主要的目的是向用户反馈其听不懂用户的语音指令,其可优先对外界的语音指令进行处理。
当人机交互装置检测到用户的语音指令,并对该语音指令进行语义分析后,得到相应的语音分析结果时,人机交互装置可处于语音控制模式的具体指令反馈状态。在语音控制模式的具体指令反馈状态下,人机交互装置主要的目的是执行与用户的语音指令对应的动作,其可优先对外界的语音指令进行处理。
当人机交互装置未在预设时间内接收到用户的语音指令,人机交互装置可处于语音控制模式的找人状态。在语音控制模式的找人状态下,人机交互装置主要的目的是寻找用户,其可优先对外界的人脸事件、语音输入事件进行处理。
当人机交互装置检测到其被拿起时,人机交互装置可处于语音控制模式的拿起状态。在语音控制模式的拿起状态下,人机交互装置主要的目的是让自己的姿态恢复正常,其可优先对其姿态恢复正常的事件进行处理。
当人机交互装置检测到其倒地时,人机交互装置可处于语音控制模式的倒地状态。在语音控制模式的倒地状态下,人机交互装置主要的目的是让自己的姿态恢复正常,其可优先对其姿态恢复正常的事件进行处理。
在又一个示例中,当第一交互模式是待机模式,待机模式可以包括待机状态、抚摸状态、障碍物状态、拿起状态、倒地状态等。
其中,待机模式下的待机状态是人机交互装置在待机模式下默认或常规的交互状态,其待机目的是自主导航和/或观察外界环境。当人机交互装置处于待机模式的待机状态,其可自主导航并对外界中的事件进行处理。
当人机交互装置检测到用户抚摸自己的动作时,人机交互装置可处于待机模式的抚摸状态。在待机模式的抚摸状态下,人机交互装置主要的目的是与用户进行互动。
当人机交互装置检测到其被拿起时,人机交互装置可处于待机模式的拿起状态。在待机模式的拿起状态下,人机交互装置主要的目的是让自己的姿态恢复正常,其可优先对其姿态恢复正常的事件进行处理。
当人机交互装置检测到其倒地时,人机交互装置可处于待机模式的倒地状态。在待机模式的倒地状态下,人机交互装置主要的目的是让自己的姿态恢复正常,其可优先对其姿态恢复正常的事件进行处理。
当人机交互装置检测到障碍物事件时,人机交互装置可处于待机模式的障碍物状态。在待机模式的障碍物状态下,人机交互装置主要的目的是根据障碍物的类型,确定是躲避障碍物还是与障碍物互动。
下面,以第一事件为情况1、情况2和情况3为例对S220进行详细描述。
情况1,第一事件为交互状态切换事件。
在情况1中,如图3所示,方法200还包括S230,该S230在S210和S220之间执行,下面详细介绍S230。
S230,根据第一事件,从第一交互模式下的第二交互状态切换至第一交互模式下的第 一交互状态。
此时,当第一事件为状态切换事件时,不用切换人机交互装置当前的第一交互模式,只需将人机交互装置当前所处的第二交互状态切换至第一交互状态,进而执行S230,即根据第一交互状态和第一交互模式,确定第一事件的第一处理方式。
第一交互状态和第二交互状态分别是第一交互模式下的两个不同的交互状态。
示例性地,该第一事件可以是姿态异常事件,如拿起事件或倒地事件等。当人机交互装置检测到姿态异常事件时,可以由其他任意交互状态(第二交互状态)切换至姿态异常状态(第一交互状态),避免姿态异常事件让人机交互装置自身或用户受到伤害。
在一个示例中,当第一事件为拿起事件、第一交互模式为待机模式、第二交互状态为待机状态时,人机交互装置可以先从待机状态(第二交互状态的一例)切换至拿起状态(第一交互状态的一例)。然后,根据待机模式和待机模式下的拿起状态,确定拿起事件的第一处理方式可以包括:小幅度扭动身体,示意用户将自己放下去。
在另一个示例中,当第一事件为倒地事件、第一交互模式为待机模式、第二交互状态为待机状态时,人机交互装置可以先从待机状态(第二交互状态的一例)切换至倒地状态(第一交互状态的一例)。然后,根据待机模式和待机模式下的倒地状态,确定倒地事件的第一处理方式可以包括:语音提示用户进入倒地状态,此外,还可以在语音提示后进入休眠状态;或者,自己重新站起。
在又一个示例中,当第一事件为拿起事件、第一交互模式为人形模式(或人脸模式)、第二交互状态为手势状态时,人机交互装置可以先从手势状态(第二交互状态的一例)切换至拿起状态(第一交互状态的一例)。然后,根据人形模式(或人脸模式)和人形模式(或人脸模式)下的拿起状态,确定拿起事件的第一处理方式可以包括:小幅度扭动身体,示意用户把自己放下去。
在又一个示例中,当第一事件为倒地事件、第一交互模式为人形模式(或人脸模式)、第二交互状态为手势状态时,人机交互装置可以先从手势状态(第二交互状态的一例)切换至倒地状态(第一交互状态的一例)。然后,根据人形模式(或人脸模式)和人形模式(或人脸模式)下的倒地状态,确定倒地事件的第一处理方式可以包括:语音提示用户进入倒地状态,此外,还可以在语音提示后进入休眠状态;或者,自己重新站起。
在又一个示例中,当第一事件为拿起事件、第一交互模式为语音控制模式、第二交互状态为具体指令状态时,人机交互装置可以先从具体指令状态(第二交互状态的一例)切换至拿起状态(第一交互状态的一例)。然后,根据人脸模式和人脸模式下的拿起状态,确定拿起事件的第一处理方式可以包括:小幅度扭动身体,示意用户放自己下去。
在又一个示例中,当第一事件为倒地事件、第一交互模式为语音控制模式、第二交互状态为具体指令状态时,人机交互装置可以先从具体指令状态(第二交互状态的一例)切换至倒地状态(第一交互状态的一例)。然后,根据语音控制模式和语音控制模式下的倒地状态,确定倒地事件的第一处理方式可以包括:语音提示用户进入倒地状态,此外,还可以在语音提示后进入休眠状态;或者,自己重新站起。
示例性地,该第一事件也可以是非姿态异常的事件。例如,该第一事件可以是具体交互模式下与待机目的相关的事件。
在一个示例中,第一交互模式为人脸模式、第二交互状态为待机状态,此时待机目的为与用户进行近距离交互,因此与待机目的相关的事件可以包括抚摸事件、手势事件、人 脸消失事件等。例如,第一事件为手势事件,人机交互装置可以由待机状态(第二交互状态的一例)切换至手势状态(第一交互状态的一例),在手势状态下,人机交互装置将根据用户特定的手势执行相应的动作,比如当检测到用户的碰拳手势时,人机交互装置可以执行碰拳动作。
在另一示例中,第一交互模式为待机模式、第二交互状态为待机状态,此时待机目的为自主导航和/或观察外界环境,因此与待机目的相关的事件可以包括障碍物事件、空场地事件、抚摸事件等。例如,第一事件为障碍物事件,人机交互装置可以根据障碍物的类型,决定是否与障碍物进行交互,若障碍物类型满足交互条件,人机交互装置可以由待机状态(第二交互状态的一例)切换至障碍物状态(第一交互状态的一例),在障碍物状态下,人机交互装置可以推着障碍物玩耍。
在又一示例中,第一交互模式为人形模式、第二交互状态为待机状态,此时待机目的为吸引用户靠近并与其进行交互,因此与待机目的相关的事件可以包括人形远离事件、人形消失事件、手势事件等。例如,第一事件为人形远离事件,人机交互装置可以由待机状态(第二交互状态的一例)切换至跟随状态(第一交互状态的一例),在跟随状态下,人机交互状态将跟随用户行动,吸引用户注意。
在又一示例中,第一交互模式为语音控制模式、第二交互状态为待机状态,此时待机目的为等待接收用户语音指令,因此与待机目的相关的事件可以包括具体指令事件、指令无法理解事件、未收到语音指令事件等。例如,第一事件为具体指令事件,人机交互装置可以由待机状态(第二交互状态的一例)切换至具体指令状态(第一交互状态的一例),在具体指令状态下,人机交互装置可以根据用户发出的语音指令具体内容,执行相应的动作。
情况2,第一事件为交互模式切换事件。
在情况2中,如图4所示,方法200还包括S240,该S240在S210和S220之间执行,下面详细介绍S240。
S240,根据第一事件,从第二交互模式下的第三交互状态切换至第一交互模式下的第一交互状态。其中,第二交互模式为多个交互模式之一。
此时,可认为当第一事件为交互模式切换事件时,不仅需要将人机交互装置当前的第二交互模式切换至第一交互模式,还需将人机交互装置当前所处的第三交互状态切换至第一交互状态,进而执行S230,即根据第一交互状态和第一交互模式,确定第一事件的第一处理方式。
示例性地,该第一事件可以是但不限于是近距离人脸事件、远距离人脸事件、人形事件、语音输入事件、或操控应用程序事件等。
在一个示例中,当第一事件为近距离人脸事件、第二交互模式为待机模式、第三交互状态为抚摸状态或障碍物状态时,人机交互装置可以先从待机模式(第二交互模式的一例)的抚摸状态或障碍物状态(第三交互状态的一例)切换至人脸模式(第一交互模式的一例)的待机状态(第一交互状态的一例)。然后,根据人脸模式和人脸模式下的待机状态,确定近距离人脸事件的处理方式包括:根据检测到的人脸信息,与已录入人脸信息进行比较,确认是否认识该人脸,若为认识的人脸信息,人机交互装置可以执行表现开心的反馈,若为不认识的人脸信息,人机交互装置可以执行表现疑惑的反馈。
在又一个示例中,当第一事件为近距离人脸事件、第二交互模式为人形模式、第三交 互状态为抚摸状态时,人机交互装置可以先从人形模式(第二交互模式的一例)的抚摸状态(第三交互状态的一例)切换至人脸模式(第一交互模式的一例)的首次看见人脸状态(第一交互状态的一例)。然后,根据人脸模式和人脸模式下的首次看见人脸状态,确定近距离人脸事件的处理方式包括:根据检测到的人脸信息,与已录入人脸信息进行比较,确认是否认识该人脸,若为认识的人脸信息,人机交互装置可以执行表现开心的反馈,若为不认识的人脸信息,人机交互装置可以执行表现疑惑的反馈。
在又一个示例中,当第一事件为远距离人脸事件、第二交互模式为人脸模式、第三交互状态为找人状态时,人机交互装置可以先从人脸模式(第二交互模式的一例)的找人状态(第三交互状态的一例)切换至人形模式(第一交互模式的一例)的待机状态(第一交互状态的一例)。然后,根据人形模式和人形模式下的待机状态,确定远距离人脸事件的处理方式可以包括:吸引用户与人机交互装置进行交互。例如,人机交互装置可以执行幅度较大或较明显的动作,如唱歌、跳舞、转圈、抬其腿部打招呼等,从而更容易引起远处用户的注意并吸引用户靠近。从而提高人机交互装置的仿生效果,有助于提升用户体验。
在又一个示例中,当第一事件为人形事件、第二交互模式为人脸模式、第三交互状态为待机状态时,人机交互装置可以先从人脸模式(第二交互模式的一例)的待机状态(第三交互状态的一例)切换至人形模式(第一交互模式的一例)的待机状态(第一交互状态的一例)。然后,根据人形模式和人形模式下的待机状态,确定人形事件的处理方式可以包括:吸引用户与人机交互装置进行交互。例如,人机交互装置可以执行幅度较大或较明显的动作,如唱歌、跳舞、转圈、抬其腿部打招呼等,从而更容易引起远处用户的注意并吸引用户靠近。从而提高人机交互装置的仿生效果,有助于提升用户体验。
在又一个示例中,当第一事件为语音输入事件、第二交互模式为人脸模式、第三交互状态为找人状态时,人机交互装置可以先从人脸模式(第二交互模式的一例)的找人状态(第三交互状态的一例)切换至语音控制模式(第一交互模式的一例)的待机状态(第一交互状态的一例)。然后,根据语音控制模式和语音控制模式下的待机状态,确定语音输入事件的处理方式可以包括:执行与语音输入事件对应的动作。例如,若语音输入事件为“hello,loona”,则语音输入事件对应的动作可以包括:歪头侧耳做出倾听的动作。从而提高人机交互装置的仿生效果,有助于提升用户体验。
在又一个示例中,当第一事件为操控应用程序事件、第二交互模式为待机模式、第三交互状态为待机状态时,人机交互装置可以先从待机模式(第二交互模式的一例)的待机(第三交互状态的一例)切换至应用程序控制模式(第一交互模式的一例)的待机状态(第一交互状态的一例)。然后,根据应用程序控制模式和应用程序控制模式下的待机状态,确定该操控应用程序事件的处理方式可以包括:竖起耳朵等待,或者兴奋地原地转圈。
情况3,第一事件是其他事件。
在情况3中,该方法200在S210和S220之间无需执行任何步骤。此时,第一交互模式即为人机交互装置当前的交互模式,第一交互状态即为人机交互装置的当前交互状态。
其中,其他事件可以包括除情况1和情况2中所述的第一事件外的事件,即其他事件包括除交互状态切换事件和除交互模式切换事件之外的事件。
在一个示例中,在第一交互模式为待机模式、第一交互状态为拿起状态的情况下,那么,当人机交互装置检测到抚摸事件(第一事件的一例)后,对抚摸事件的第一处理方式可以包括:在拿起状态下,耳朵上指示灯闪烁,或忽视该抚摸事件,不进行响应。
在另一示例中,在第一交互模式为人形模式、第一交互状态为跟随状态的情况下,那么,当人机交互装置检测到障碍物事件(第一事件的一例)后,对障碍物事件的第一处理方式可以包括:在跟随状态下,对障碍物事件进行弱响应或不响应,绕开障碍物继续跟随用户。
上文方法200中的一些示例,如事件的示例、交互模式的示例、交互状态的示例,仅是为了方便理解方案而进行的示例,其不应对本申请构成限制。
根据上述对方法200的描述可知,当第一事件为状态切换事件时,人机交互装置不用切换人机交互装置当前的第一交互模式,只需将人机交互装置当前所处的第二交互状态切换至第一交互状态,进而根据第一交互状态和第一交互模式,确定第一事件的第一处理方式。当第一事件为交互模式切换事件时,不仅需要将人机交互装置当前的第二交互模式切换至第一交互模式,还需将人机交互装置当前所处的第三交互状态切换至第一交互状态,进而根据第一交互状态和第一交互模式,确定第一事件的第一处理方式。当第一事件是其他事件(非状态切换事件或非交互模式切换事件)时,人机交互装置不用切换人机交互装置当前的第一交互模式,也不用切换当前所处的第一交互状态,直接根据第一交互状态和第一交互模式,确定第一事件的第一处理方式。这样,相比于对同一事件都采用相同反馈的现有方案而言,在上述方法200中,人机交互装置能够灵活地基于交互模式和该交互模式下的交互状态下,对事件进行处理,进而可以提升人机交互装置对用户的行为意图或外界环境分析的准确度,提高用户与人机交互装置的互动效果,有助于提升用户体验。
下面,结合图5和图6,详细描述本申请实施例提供的装置。
图5为本申请实施例提供的一例人机交互装置300的示意性结构图。
例如,如图5所示,该人机交互装置300具有多个交互模式,每个交互模式包括至少一个交互状态,交互状态用于指示人机交互装置300与外界交互的过程中所处的状态。该人机交互装置300包括检测单元310和处理单元320。其中,检测单元310,用于检测第一事件;处理单元320,用于根据人机交互装置所处的第一交互模式和第一交互模式下的第一交互状态,确定对第一事件的第一处理方式,其中,第一交互模式为多个交互模式之一。
可选地,在一个示例中,人机交互装置300还包括第一切换单元,用于根据第一事件,从第一交互模式下的第二交互状态切换至第一交互模式下的第一交互状态。
在该示例中,第一事件可以为拿起事件或倒地事件。
可选地,在另一个示例中,人机交互装置300还包括第二切换单元,用于根据第一事件,从第二交互模式下的第三交互状态切换至第一交互模式下的第一交互状态,第二交互模式为多个交互模式之一。
在该示例中,第一事件可以为近距离人脸事件、远距离人脸事件、人形事件、语音输入事件、或操控应用程序事件。
关于图5中其他未描述的部分可以参见方法200中的相关描述,这里不再赘述。
图6示出了本申请实施例提供的人机交互装置400的示意性结构图。
例如，如图6所示，该人机交互装置400包括：一个或多个处理器410，一个或多个存储器420，该一个或多个存储器420存储有一个或多个计算机程序，该一个或多个计算机程序包括指令。当该指令被所述一个或多个处理器410运行时，使得所述人机交互装置400执行上述方法200。
本申请实施例提供一种计算机程序产品,当所述计算机程序产品在人机交互装置运行时,使得人机交互装置执行上述方法200。其实现原理和技术效果与上述方法200类似,此处不再赘述。
本申请实施例提供一种计算机可读存储介质,所述计算机可读存储介质包含指令,当所述指令在设备运行时,使得所述设备执行上述方法200。其实现原理和技术效果类似,此处不再赘述。
本申请实施例提供一种芯片,所述芯片用于执行指令,当所述芯片运行时,执行上述方法200。其实现原理和技术效果类似,此处不再赘述。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (14)

  1. 一种人机交互装置,其特征在于,所述人机交互装置具有多个交互模式,每个所述交互模式包括至少一个交互状态,所述交互状态用于指示所述人机交互装置与外界交互的过程中所处的状态,所述人机交互装置包括:
    检测单元,用于检测第一事件;
    处理单元,用于根据所述人机交互装置所处的第一交互模式和所述第一交互模式下的第一交互状态,确定对所述第一事件的第一处理方式,其中,所述第一交互模式为所述多个交互模式之一。
  2. 如权利要求1所述的人机交互装置,其特征在于,所述人机交互装置还包括:
    第一切换单元,用于根据所述第一事件,从所述第一交互模式下的第二交互状态切换至所述第一交互模式下的所述第一交互状态。
  3. 如权利要求2所述的人机交互装置,其特征在于,所述第一事件为拿起事件或倒地事件。
  4. 如权利要求1所述的人机交互装置,其特征在于,所述人机交互装置还包括:
    第二切换单元,用于根据所述第一事件,从第二交互模式下的第三交互状态切换至所述第一交互模式下的所述第一交互状态,所述第二交互模式为所述多个交互模式之一。
  5. 如权利要求4所述的人机交互装置,其特征在于,所述第一事件为近距离人脸事件、远距离人脸事件、人形事件、语音输入事件、或操控应用程序事件。
  6. 一种人机交互方法,其特征在于,所述人机交互方法应用于人机交互装置,所述人机交互装置具有多个交互模式,每个所述交互模式包括至少一个交互状态,所述交互状态用于指示所述人机交互装置与外界交互的过程中所处的状态,所述人机交互方法包括:
    检测第一事件;
    根据所述人机交互装置所处的第一交互模式和所述第一交互模式下的第一交互状态,确定对所述第一事件的第一处理方式,其中,所述第一交互模式为所述多个交互模式之一。
  7. 如权利要求6所述的人机交互方法,其特征在于,所述方法还包括:
    根据所述第一事件,从所述第一交互模式下的第二交互状态切换至所述第一交互模式下的所述第一交互状态。
  8. 如权利要求7所述的人机交互方法,其特征在于,所述第一事件为拿起事件或倒地事件。
  9. 如权利要求6所述的人机交互方法,其特征在于,所述方法还包括:
    根据所述第一事件,从第二交互模式下的第三交互状态切换至所述第一交互模式下的所述第一交互状态,所述第二交互模式为所述多个交互模式之一。
  10. 如权利要求9所述的人机交互方法,其特征在于,所述第一事件为近距离人脸事件、远距离人脸事件、人形事件、语音输入事件、或操控应用程序事件。
  11. 一种人机交互装置,其特征在于,包括:
    一个或多个处理器;
    一个或多个存储器;
    以及一个或多个计算机程序,其中所述一个或多个计算机程序被存储在所述一个或多 个存储器中,所述一个或多个计算机程序包括指令,当所述指令被所述一个或多个处理器执行时,使得所述人机交互装置执行如权利要求6至10中任一项所述的人机交互方法。
  12. 一种计算机可读存储介质,其特征在于,包括计算机指令,当所述计算机指令在人机交互装置上运行时,使得所述人机交互装置执行如权利要求6至10中任一项所述的人机交互方法。
  13. 一种芯片,其特征在于,包括至少一个处理器和接口电路,所述接口电路用于为所述至少一个处理器提供程序指令或者数据,所述至少一个处理器用于执行所述程序指令,以实现如权利要求6至10中任一项所述的人机交互方法。
  14. 一种计算机程序产品,其特征在于,所述计算机程序产品包括计算机指令;当部分或全部所述计算机指令在计算机上运行时,使得如权利要求6至10中任一项所述的人机交互方法被执行。
PCT/CN2022/139291 2022-12-15 2022-12-15 人机交互装置及人机交互方法 WO2024124481A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/139291 WO2024124481A1 (zh) 2022-12-15 2022-12-15 人机交互装置及人机交互方法

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/139291 WO2024124481A1 (zh) 2022-12-15 2022-12-15 人机交互装置及人机交互方法

Publications (1)

Publication Number Publication Date
WO2024124481A1 true WO2024124481A1 (zh) 2024-06-20

Family

ID=91484295

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/139291 WO2024124481A1 (zh) 2022-12-15 2022-12-15 人机交互装置及人机交互方法

Country Status (1)

Country Link
WO (1) WO2024124481A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105868827A (zh) * 2016-03-25 2016-08-17 北京光年无限科技有限公司 一种智能机器人多模态交互方法和智能机器人
CN111061953A (zh) * 2019-12-18 2020-04-24 深圳市优必选科技股份有限公司 智能终端交互方法、装置、终端设备及存储介质
US20200130197A1 (en) * 2017-06-30 2020-04-30 Lg Electronics Inc. Moving robot
CN112180774A (zh) * 2019-07-03 2021-01-05 百度在线网络技术(北京)有限公司 一种智能设备的交互方法、装置、设备和介质
CN113495621A (zh) * 2020-04-03 2021-10-12 百度在线网络技术(北京)有限公司 一种交互模式的切换方法、装置、电子设备及存储介质

Similar Documents

Publication Publication Date Title
CN111645070B (zh) 机器人的安全防护方法、装置与机器人
CN106575150B (zh) 使用运动数据识别手势的方法和可穿戴计算设备
US11416080B2 (en) User intention-based gesture recognition method and apparatus
CN102789218A (zh) 一种基于多控制器的Zigbee智能家居系统
CN112860169B (zh) 交互方法及装置、计算机可读介质和电子设备
WO2000015396A1 (fr) Appareil robotique, procede de commande de l'appareil robotique, procede d'affichage et support
CN113325948B (zh) 隔空手势的调节方法及终端
CN114167984B (zh) 设备控制方法、装置、存储介质及电子设备
US20170316261A1 (en) Systems and metohds of gesture recognition
CN111580656A (zh) 可穿戴设备及其控制方法、装置
CN112634895A (zh) 语音交互免唤醒方法和装置
US10831273B2 (en) User action activated voice recognition
WO2024124481A1 (zh) 人机交互装置及人机交互方法
JP7091745B2 (ja) 表示端末、プログラム、情報処理システム及び方法
CN106873939A (zh) 电子设备及其使用方法
CN113766127A (zh) 移动终端的控制方法及装置、存储介质及电子设备
US20240028137A1 (en) System and method for remotely controlling extended reality by virtual mouse
WO2024124482A1 (zh) 人机交互装置及人机交互方法
US20150148923A1 (en) Wearable device that infers actionable events
US12093464B2 (en) Systems for interpreting a digit-to-digit gesture by a user differently based on roll values of a wrist-wearable device worn by the user, and methods of use thereof
CN114211486A (zh) 一种机器人的控制方法、机器人及存储介质
CN211484452U (zh) 一种自移动清洁机器人
CN109189285A (zh) 操作界面控制方法及装置、存储介质、电子设备
CN212628169U (zh) 一种远程控制的移动ip摄像头系统
CN115047966A (zh) 交互方法、电子设备与交互系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22968189

Country of ref document: EP

Kind code of ref document: A1