WO2023245316A1 - Human-computer interaction method and device, computer device and storage medium - Google Patents

Human-computer interaction method and device, computer device and storage medium

Info

Publication number
WO2023245316A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
coordinate system
head
head movement
human
Application number
PCT/CN2022/099701
Other languages
French (fr)
Chinese (zh)
Inventor
杜琳
Original Assignee
北京小米移动软件有限公司
Application filed by 北京小米移动软件有限公司
Priority to CN202280004336.8A (CN117616368A)
Priority to PCT/CN2022/099701
Publication of WO2023245316A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02Input arrangements using manually operated switches, e.g. using keyboards or dials

Definitions

  • the present disclosure relates to the field of computers, and in particular, to a human-computer interaction method, device, computer device and storage medium.
  • Eye-movement interaction technology can make up for some shortcomings of existing human-computer interaction methods: pointing, moving, and selecting can be performed conveniently through the gaze direction. However, the parameters that eye-movement interaction can provide are limited, making it hard to adapt to complex operation scenarios, and various movements of the user's body may cause deviations in the eye-movement data, leading to operation errors.
  • the present disclosure provides a human-computer interaction method, device, computer device and storage medium. By combining the two dimensions of gaze and head movement to control human-computer interaction, it suits complex, fine-grained application scenarios and achieves efficient, highly accurate human-computer interaction.
  • a human-computer interaction method including:
  • the step of detecting the user's gaze direction includes:
  • the step of initiating detection of head motion includes:
  • the IMU includes at least one accelerometer sensor that measures acceleration signals and at least one gyro sensor that measures angular signals.
  • the step of initiating detection of head motion includes:
  • the head image data is acquired through the image acquisition device, and the head movement is determined relative to a fixed coordinate system that determines the vector direction of the head movement using data on the X, Y, and Z axes.
  • the step of initiating detection of head motion includes:
  • the head movement is analyzed.
  • gaze direction and/or head movement is detected by the head mounted device.
  • the operation results are fed back through the head-mounted device and/or a second device connected to the head-mounted device, and the second device is connected to the head-mounted device through a wired or wireless connection.
  • the present invention discloses a human-computer interaction device, including:
  • a gaze detection module, used to detect the direction of the user's gaze;
  • a motion detection module, used to start detection of head movement when the gaze direction has not changed for longer than a preset time threshold;
  • the instruction generation module is used to generate corresponding operation instructions based on the head movement information obtained by detecting the head movement;
  • An execution module is used to execute the operation instructions and feed back the operation results to the user.
  • the gaze detection module is used to track the user's eyeballs and determine the gaze direction.
  • the gaze direction indicates the position of the user's gaze in the gaze coordinate system.
  • the gaze coordinate system uses X-, Y-, and Z-axis data to determine the vector direction of the position the user is looking at.
  • the motion detection module includes:
  • the IMU includes at least one accelerometer sensor that measures acceleration signals and at least one gyro sensor that measures angular signals; the detection of head movement is performed relative to a fixed coordinate system that determines the vector direction of the head movement from X-, Y-, and Z-axis data, and the fixed coordinate system is any one of the following coordinate systems: the user head coordinate system, the user body coordinate system, or the earth coordinate system;
  • the image detection submodule is used to obtain head image data through an image collection device, and determine head movement based on the head image data.
  • the sound detection submodule is used to detect the vibration characteristics of the sound wave of the user's voice, and analyze the head movement based on the vibration characteristics.
  • a computer device including:
  • Memory used to store instructions executable by the processor
  • the processor is configured as:
  • a non-transitory computer-readable storage medium which, when instructions in the storage medium are executed by a processor of a mobile terminal, enables the mobile terminal to execute a human-computer interaction method, the method including:
  • the technical solution provided by the embodiments of the present disclosure may include the following beneficial effects: the user's gaze direction is first detected; when the gaze direction has not changed for longer than a preset time threshold, detection of head movement is started; corresponding operation instructions are generated from the head movement information obtained by the detection; and the operation instructions are then executed and the operation results fed back to the user.
  • Figure 1 is a flow chart of a human-computer interaction method according to an exemplary embodiment.
  • Figure 2 is a flow chart of yet another human-computer interaction method according to an exemplary embodiment.
  • FIG. 3 is a schematic diagram of the coordinate system of the user's head according to an exemplary embodiment.
  • Figure 4 is a schematic diagram of a user's body coordinate system according to an exemplary embodiment.
  • Figure 5 is a schematic diagram of a geodetic coordinate system according to an exemplary embodiment.
  • Figure 6 is a schematic diagram showing the relative relationship between the user's head coordinate system, the user's body coordinate system and the earth coordinate system according to an exemplary embodiment.
  • FIG. 7 is a schematic diagram of the glasses coordinate system according to an exemplary embodiment.
  • Figure 8 is a block diagram of a human-computer interaction device according to an exemplary embodiment.
  • FIG. 9 is a schematic structural diagram of the motion detection module 802 according to an exemplary embodiment.
  • Figure 10 is a block diagram of a device according to an exemplary embodiment.
  • the present disclosure provides a human-computer interaction method and device. By combining the two dimensions of gaze and head movement to control human-computer interaction, it suits complex, fine-grained application scenarios and achieves efficient, highly accurate human-computer interaction.
  • An exemplary embodiment of the present disclosure provides a human-computer interaction method, which performs human-computer interaction through joint control of gaze and head movement: the user's gaze direction and head movement information are detected together, and when the gaze direction has not changed within a preset time threshold, operation instructions are generated from the head movement, executed, and feedback obtained.
  • The specific flow is shown in Figure 1 and includes:
  • Step 101: Detect the user's line of sight direction.
  • the user's eyeballs can be tracked to determine the gaze direction.
  • the gaze direction indicates the position where the user is looking in the gaze coordinate system.
  • the gaze coordinate system determines the vector direction of the user's gaze position using data on the X, Y, and Z axes.
  • Step 102: When the line of sight direction has not changed for longer than the preset time threshold, start the detection of head movement.
  • For example, the time threshold can be preset to 0.8 seconds; when the gaze stays locked on the same target for more than 0.8 seconds without changing, the gaze direction is judged to be stable.
  • the gaze direction can be determined by tracking the user's eyes.
  • the eye image data can be obtained, and then the gaze direction can be determined based on changes in the eye image data.
  • when the gaze direction is stable, the target it points to is predicted to be the target of the user's intended operation.
  • the detection of head movement can be started to determine the user's operation intention.
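  • As an illustration of the dwell logic above, the following is a minimal sketch in Python, assuming a gaze tracker that reports a 3-D gaze vector each frame; the tracker object, its read_gaze() method, and the 2-degree stability tolerance are illustrative assumptions, not from the patent.

```python
import time
import numpy as np

DWELL_THRESHOLD_S = 0.8   # preset time threshold from the example above
STABILITY_DEG = 2.0       # assumed tolerance for "unchanged" gaze direction

def angular_difference(v1, v2):
    """Angle in degrees between two gaze direction vectors."""
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def wait_for_stable_gaze(tracker):
    """Block until the gaze has stayed stable for the dwell threshold, then return it."""
    anchor = np.asarray(tracker.read_gaze())   # hypothetical per-frame (x, y, z) gaze vector
    anchor_time = time.monotonic()
    while True:
        gaze = np.asarray(tracker.read_gaze())
        if angular_difference(gaze, anchor) > STABILITY_DEG:
            anchor, anchor_time = gaze, time.monotonic()   # gaze moved: restart the dwell timer
        elif time.monotonic() - anchor_time >= DWELL_THRESHOLD_S:
            return anchor   # stable long enough: head-motion detection can start
```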
  • Step 103: Generate corresponding operation instructions based on the head motion information obtained by detecting the head motion.
  • the operation instructions are generated based on the detected head movement information and the preset instruction rules.
  • For example, the preset instruction rule for the "confirm" operation instruction is "nod twice within two seconds", and the rule for the "cancel" operation instruction is "shake the head within one second". Thus, after detecting a "nod twice within two seconds" head movement, the "confirm" operation instruction can be generated, and after detecting a "shake the head within one second" head movement, the "cancel" operation instruction can be generated.
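  • As a sketch of how such instruction rules might be encoded, the following Python fragment maps timed gesture events to operation instructions; the event format and the rule encoding are assumptions for illustration, not the patent's own scheme.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InstructionRule:
    gesture: str       # e.g. "nod" or "shake"
    count: int         # required number of repetitions
    window_s: float    # time window in which they must occur
    instruction: str   # operation instruction to emit

# Rules from the example above: nod twice within 2 s -> confirm; shake within 1 s -> cancel.
RULES = [
    InstructionRule("nod", 2, 2.0, "confirm"),
    InstructionRule("shake", 1, 1.0, "cancel"),
]

def match_instruction(events):
    """events: (timestamp, gesture) pairs from the head-motion detector, sorted by time."""
    for rule in RULES:
        times = [t for t, g in events if g == rule.gesture]
        for i in range(len(times) - rule.count + 1):
            if times[i + rule.count - 1] - times[i] <= rule.window_s:
                return rule.instruction
    return None   # no rule matched; keep listening
```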
  • Step 104: Execute the operation instruction and feed back the operation result to the user.
  • the operation instructions are executed, the operation results are obtained, and the operation results are fed back to the user.
  • The feedback can be a confirmation message, such as "operation canceled successfully", or a response interface to the operation, such as entering the page of the selected object when the operation instruction indicates "confirm viewing".
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method, in which the user's eyeballs are tracked to determine the line of sight direction.
  • the line of sight direction is used to represent the position indicated by the user's gaze in the line of sight coordinate system, where the line of sight coordinate system determines the vector direction of the position of the user's gaze with the numerical values of the X, Y, and Z axes.
  • As shown in Figure 2, it specifically includes the following steps:
  • Step 201: Perform eye tracking on the user to determine the user's line of sight direction.
  • the line of sight coordinate system can be the user's head coordinate system, with the head's center of gravity as the origin; the X-, Y-, and Z-axis values specify a point in space, and the vector from the origin to that point is the user's visual direction.
  • the line of sight coordinate system can also be the user's body coordinate system, with the body's center of gravity as the origin; the X-, Y-, and Z-axis values specify a point in space, and the vector from the origin to that point is the user's visual direction.
  • the line of sight coordinate system can also be a geodetic coordinate system, with a fixed position relative to the ground as the origin; the X-, Y-, and Z-axis values specify a point in space, and the vector from the origin to that point is the user's visual direction.
  • Head movement is detected relative to a fixed coordinate system, which is any one of the following coordinate systems: the user head coordinate system, the user body coordinate system, or the earth coordinate system.
  • Detecting the head's motion parameters can include detecting various movements relative to the fixed coordinate system, including nodding (reciprocating rotation about the X axis), shaking the head (reciprocating rotation about the Y and Z axes), and movements under different tilt postures.
  • Step 202: Obtain a preset time threshold, and when the line of sight direction has not changed for longer than the time threshold, start detection of head movement.
  • The user's gaze direction is detected in real time; when it has not changed for longer than a preset time threshold, such as 2 seconds, detection of head movement is started.
  • any of the following methods can be used for head motion detection:
  • the sensors include, but are not limited to, any one or more of the following sensors: accelerometer, gyroscope, and geomagnetometer.
  • head movement can be detected from the changes over time of the accelerometer, gyroscope, and geomagnetometer data in each dimension of the coordinate system.
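  • A minimal sketch of this idea, assuming gyroscope samples expressed as angular rates about the fixed frame's X, Y, and Z axes: per the description above, a nod is a reciprocating rotation about X and a shake about Y/Z, so the dominant axis plus a sign reversal is enough to classify; the 0.6 rad/s threshold is an illustrative assumption.

```python
import numpy as np

RATE_THRESHOLD = 0.6   # rad/s; assumed minimum angular rate for a deliberate gesture

def classify_head_gesture(gyro_samples):
    """gyro_samples: (N, 3) angular rates about the fixed frame's X, Y, and Z axes.

    A nod is a reciprocating rotation about X and a shake a reciprocating
    rotation about Y/Z, so we look for a sign reversal on the dominant axis.
    """
    gyro = np.asarray(gyro_samples, dtype=float)
    peak = np.abs(gyro).max(axis=0)
    axis = int(np.argmax(peak))
    if peak[axis] < RATE_THRESHOLD:
        return None                        # no deliberate head motion
    strong = gyro[np.abs(gyro[:, axis]) > RATE_THRESHOLD, axis]
    # Reciprocating motion shows at least one sign reversal above the threshold.
    if strong.size > 1 and np.any(np.sign(strong[:-1]) != np.sign(strong[1:])):
        return "nod" if axis == 0 else "shake"
    return None
```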
  • Head image data, such as photos of the user's face, can be captured with a camera or other imaging equipment, for example one or more cameras on a mobile phone or computer.
  • In this way, the parameters of the user's line of sight direction and head movement, that is, the eye image data and head image data, can be detected simultaneously by visual means.
  • When using head image data for detection, a three-dimensional model of the user's head can be established; the head images obtained by a collection device such as a camera are then matched against the model to estimate the current posture of the user's head, and the head-movement parameters are derived from its changes over time.
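  • A hedged sketch of this image-based approach, using OpenCV's solvePnP to match detected 2-D facial landmarks against a generic 3-D head model; the model points and the camera approximation are illustrative assumptions, and landmark detection is assumed to come from an external detector.

```python
import numpy as np
import cv2

# Generic 3-D head-model points in mm (nose tip, chin, eye corners, mouth corners).
# The values are an illustrative assumption, not taken from the patent.
MODEL_POINTS = np.array([
    [0.0, 0.0, 0.0],         # nose tip
    [0.0, -63.6, -12.5],     # chin
    [-43.3, 32.7, -26.0],    # left eye outer corner
    [43.3, 32.7, -26.0],     # right eye outer corner
    [-28.9, -28.9, -24.1],   # left mouth corner
    [28.9, -28.9, -24.1],    # right mouth corner
])

def estimate_head_pose(image_points, frame_size):
    """image_points: (6, 2) detected landmark pixels corresponding to MODEL_POINTS."""
    h, w = frame_size
    camera_matrix = np.array([[w, 0, w / 2],    # crude pinhole approximation: focal = width
                              [0, w, h / 2],
                              [0, 0, 1]], dtype=float)
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS,
                                  np.asarray(image_points, dtype=float),
                                  camera_matrix, None)
    # Tracking rvec over time gives the head's motion parameters.
    return (rvec, tvec) if ok else None
```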
  • the user's voice characteristics can be detected to determine head movement.
  • here, the head movement detected is the movement of the user's vocal cords.
  • A voice recognition system can also be used to recognize voice commands issued by the user and operate on the target locked by gaze.
  • Step 203: Generate corresponding operation instructions based on the head motion information obtained by detecting the head motion.
  • When sensors are used, the obtained head movement information includes the changes over time of the accelerometer, gyroscope, and geomagnetometer data in each dimension of the fixed coordinate system, from which the head movement is detected.
  • When images are used, the obtained head movement information includes the changes over time of the head's posture and position in each dimension of the fixed coordinate system; these changes describe the head's specific movement trajectory, from which the head movement can be determined.
  • When sound is used, the obtained head movement information includes information indicating whether the vocal cords vibrate. Further, a vibration trigger amplitude can be preset, with the head movement information including the amplitude of the user's voice; once the amplitude of the sound emitted by the user reaches the vibration trigger amplitude, it is determined that head movement has occurred.
  • This can also be combined with a speech recognition system, in which case the obtained head movement information includes information indicating whether the vocal cords vibrate together with the user's voice command information, and the user's intention can be determined from the voice commands.
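  • A minimal sketch of the vibration-trigger idea, assuming normalized audio frames from a microphone; the RMS measure and the 0.2 threshold are illustrative assumptions.

```python
import numpy as np

VIBRATION_TRIGGER = 0.2   # assumed normalized RMS amplitude threshold

def vocal_activity(frames):
    """frames: iterable of 1-D arrays of normalized microphone samples.

    Yields True for each frame whose RMS amplitude reaches the preset
    vibration-trigger amplitude, i.e. the user is judged to be vocalizing;
    a speech recognizer can then interpret the accompanying voice command.
    """
    for frame in frames:
        rms = float(np.sqrt(np.mean(np.square(np.asarray(frame, dtype=float)))))
        yield rms >= VIBRATION_TRIGGER
```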
  • Step 204: Execute the operation instruction and feed back the operation result to the user.
  • the operation results can be fed back in different ways according to the software and hardware configuration in the application scenario.
  • an operation result containing text and/or images can be formed and displayed on the display screen.
  • the operation results can also be fed back through device vibration, for example one short vibration on success or a continuous two-second vibration on failure.
  • the operation results can also be played back through voice.
  • The above feedback methods can be applied singly or in combination. Those skilled in the art will appreciate that the ways of outputting operation result information are not limited to those listed above.
  • The feedback of the above operation result information can be performed through a head-mounted device and/or a second device connected to the head-mounted device, where the second device is connected to the head-mounted device through a wired or wireless connection. For example, the result is displayed on the display screen of the head-mounted device, on at least one display screen external to the head-mounted device, or on both simultaneously.
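  • A sketch of dispatching an operation result over whichever feedback channels are configured (display, vibration, voice), per the modes listed above; the channel objects and their show/vibrate/speak methods are hypothetical.

```python
def feed_back(result, channels):
    """result: dict with an 'ok' flag and a 'message'; channels: configured outputs."""
    if "display" in channels:
        channels["display"].show(result["message"])   # text/image on a screen
    if "haptic" in channels:
        if result["ok"]:
            channels["haptic"].vibrate(ms=200)        # one short vibration on success
        else:
            channels["haptic"].vibrate(ms=2000)       # continuous vibration on failure
    if "voice" in channels:
        channels["voice"].speak(result["message"])    # spoken playback
```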
  • the above-mentioned line of sight coordinate system can be any one of the following coordinate systems: user head coordinate system, user body coordinate system, and earth coordinate system.
  • A coordinate system can also be established based on the head-mounted device. Figure 6 shows an example of the relative relationship between the user head coordinate system 601, the user body coordinate system 602, and the earth coordinate system 603.
  • the coordinate system constructed by the wearable device is used as the coordinate system of the user's head.
  • the user's gaze direction is detected through the eye tracking module integrated in the smart glasses device, and the user's head movement parameters are detected through the inertial detection unit (IMU) motion sensing module integrated in the glasses device.
  • the smart glasses device can be used as a reference object to construct the glasses coordinate system as the user head coordinate system.
  • the specific position of the user's head can be used as the origin, for example, the center of gravity of the head can be used as the origin; the X, Y, and Z axes are used to calibrate the vector directions therein.
  • the specific position of the user's body can be used as the origin, for example, the body center of gravity or the projection of the body center of gravity on the ground as the origin; the X, Y, and Z axes are used to calibrate the vector directions.
  • the head coordinate system and the user body coordinate system are local coordinate systems relative to the user and are moving coordinate systems.
  • a global coordinate system such as a geodetic coordinate system, which is a static coordinate system, can also be used.
  • the line of sight coordinate system or fixed coordinate system is mainly used as a reference system for motion detection.
  • the coordinate systems that can be used are not limited to the head, body and earth coordinate systems listed above.
  • the fixed coordinate system is a stationary coordinate system
  • the line of sight coordinate system is a moving coordinate system relative to the fixed coordinate system.
  • the fixed coordinate system is the earth coordinate system
  • the sight coordinate system is the head coordinate system.
  • both the fixed coordinate system and the line-of-sight coordinate system are moving coordinate systems.
  • the fixed coordinate system is the user body coordinate system
  • the sight coordinate system is the user body coordinate system or head coordinate system.
  • both the fixed coordinate system and the line-of-sight coordinate system are static coordinate systems.
  • both are geodetic coordinate systems.
  • the line of sight coordinate system and the fixed coordinate system are the same coordinate system.
  • the line of sight coordinate system can also adopt a different coordinate system from the fixed coordinate system.
  • the settings of the sight coordinate system and the fixed coordinate system can be customized according to the application environment and user needs, and can be set flexibly to adapt to the hardware configuration, saving costs and improving efficiency.
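  • To make these frame relationships concrete, the following sketch re-expresses a gaze vector from a moving head coordinate system in a fixed (for example, earth) frame, given the head frame's orientation; obtaining that orientation as a quaternion from an IMU orientation filter is an assumption for illustration.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def gaze_in_fixed_frame(gaze_head, head_orientation_quat):
    """Re-express a gaze vector from the head coordinate system in the fixed frame.

    gaze_head: (3,) gaze direction in the head frame.
    head_orientation_quat: (x, y, z, w) orientation of the head frame relative to
    the fixed frame, e.g. from an IMU orientation filter (an assumption here).
    """
    rotation = Rotation.from_quat(head_orientation_quat)
    return rotation.apply(np.asarray(gaze_head, dtype=float))

# Example: with the head pitched 30 degrees down, a straight-ahead gaze in the
# head frame points below the horizon in the earth frame.
quat = Rotation.from_euler("x", -30, degrees=True).as_quat()
print(gaze_in_fixed_frame([0.0, 0.0, 1.0], quat))
```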
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method that selects different coordinate systems as fixed coordinate systems according to different scenarios when detecting the movement of the user's head.
  • the fixed coordinate system is the user's body coordinate system.
  • the center of the user's feet as the origin
  • the sides, top of the head, and front are the X-axis, Y-axis, and Z-axis directions of the coordinate system respectively.
  • the scene where the direction of sight is basically stable relative to a fixed coordinate system can include the user looking at a stationary object in front of him, looking at a specified position on the screen, etc.
  • the line of sight coordinate system can be a head coordinate system, with the specific position of the user's head as the origin, for example, the center of gravity of the head as the origin; and the vector directions therein are calibrated with the X, Y, and Z axes.
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method that selects different coordinate systems as fixed coordinate systems according to different scenarios when detecting the movement of the user's head.
  • the fixed coordinate system is the user's body coordinate system.
  • the center of the user's feet as the origin
  • the sides, top of the head, and front are the X-axis, Y-axis, and Z-axis directions of the coordinate system respectively.
  • the scene where the direction of sight is basically stable relative to a fixed coordinate system can include the user looking at a stationary object in front of him, looking at a specified position on the screen, etc.
  • the line of sight coordinate system can be the user's body coordinate system, and the specific position of the user's body can be the origin, for example, the center of gravity of the body or the projection of the body's center of gravity on the ground as the origin; the X, Y, and Z axes are used to calibrate the vector directions therein.
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method that selects different coordinate systems as fixed coordinate systems according to different scenarios when detecting the movement of the user's head.
  • the fixed coordinate system is the user's body coordinate system.
  • the center of the user's feet as the origin
  • the sides, top of the head, and front are the X-axis, Y-axis, and Z-axis directions of the coordinate system respectively.
  • the scene where the direction of sight is basically stable relative to a fixed coordinate system can include the user looking at a stationary object in front of him, looking at a specified position on the screen, etc.
  • the line of sight coordinate system can be a geodetic coordinate system, with a fixed position relative to the ground as the origin; the X-, Y-, and Z-axis values specify a point in space, and the vector from the origin to that point is the user's visual direction.
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method that selects different coordinate systems as fixed coordinate systems according to different scenarios when detecting the movement of the user's head.
  • the fixed coordinate system may be a geodetic coordinate system.
  • the scene where the direction of sight is basically stable relative to a fixed coordinate system can include the user looking ahead on the road, looking at the instrument panel in the car, etc.
  • the line of sight coordinate system can be a head coordinate system, with the specific position of the user's head as the origin, for example, the center of gravity of the head as the origin; and the vector directions therein are calibrated with the X, Y, and Z axes.
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method that selects different coordinate systems as fixed coordinate systems according to different scenarios when detecting the movement of the user's head.
  • the fixed coordinate system may be a geodetic coordinate system.
  • the scene where the direction of sight is basically stable relative to a fixed coordinate system can include the user looking ahead on the road, looking at the instrument panel in the car, etc.
  • the line of sight coordinate system can be the user's body coordinate system, and the specific position of the user's body can be the origin, for example, the center of gravity of the body or the projection of the body's center of gravity on the ground as the origin; the X, Y, and Z axes are used to calibrate the vector directions therein.
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method that selects different coordinate systems as fixed coordinate systems according to different scenarios when detecting the movement of the user's head.
  • the fixed coordinate system may be a geodetic coordinate system.
  • the scene where the direction of sight is basically stable relative to a fixed coordinate system can include the user looking ahead on the road, looking at the instrument panel in the car, etc.
  • the line of sight coordinate system can be a geodetic coordinate system, with a fixed position relative to the ground as the origin; the X-, Y-, and Z-axis values specify a point in space, and the vector from the origin to that point is the user's visual direction.
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method that selects different coordinate systems as fixed coordinate systems according to different scenarios when detecting the movement of the user's head.
  • the fixed coordinate system may be the coordinate system of the user's head. Specifically, it may be a coordinate system established with the center of the smart glasses device as the origin.
  • the line of sight coordinate system can be a head coordinate system, with the specific position of the user's head as the origin, for example, the center of gravity of the head as the origin; and the vector directions therein are calibrated with the X, Y, and Z axes.
  • the head coordinate system may be a coordinate system established with the center of the smart glasses device as the origin.
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method that selects different coordinate systems as fixed coordinate systems according to different scenarios when detecting the movement of the user's head.
  • the fixed coordinate system may be the coordinate system of the user's head. Specifically, it may be a coordinate system established with the center of the smart glasses device as the origin.
  • the line of sight coordinate system can be the user's body coordinate system, and the specific position of the user's body can be the origin, for example, the center of gravity of the body or the projection of the body's center of gravity on the ground as the origin; the X, Y, and Z axes are used to calibrate the vector directions therein.
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method that selects different coordinate systems as fixed coordinate systems according to different scenarios when detecting the movement of the user's head.
  • the fixed coordinate system may be the coordinate system of the user's head. Specifically, it may be a coordinate system established with the center of the smart glasses device as the origin.
  • the line of sight coordinate system can be a geodetic coordinate system, with a fixed position relative to the ground as the origin; the X-, Y-, and Z-axis values specify a point in space, and the vector from the origin to that point is the user's visual direction.
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method that detects line of sight and head movement in different ways according to hardware configuration.
  • For line of sight detection, the gaze can be detected visually: a camera or other imaging equipment captures image data such as eye photos, which are analyzed to determine the line of sight.
  • an infrared lighting module can also be added to obtain a clearer image of the user's eye area under different ambient light conditions.
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction method configured for a wearable device such as a head-mounted device.
  • The wearable device detects the parameters of the user's gaze direction and head movement simultaneously. Because in actual use the user keeps the gaze direction fixed on the target while performing head movements such as nodding, the change of gaze direction relative to the wearable device is exactly opposite to the head movement in the head coordinate system; joint detection can therefore effectively avoid false detections while keeping the missed-detection rate low, and is more accurate than traditional single-modality detection.
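  • A sketch of the consistency check this implies: with the eyes locked on a world-fixed target during a nod, the gaze direction measured in the glasses frame rotates opposite to the head, so requiring strong anti-correlation between the two angular-rate traces filters false detections; the correlation threshold is an illustrative assumption.

```python
import numpy as np

def is_genuine_head_gesture(head_rates, gaze_rates, min_anticorrelation=0.6):
    """head_rates, gaze_rates: (N,) angular velocities about the same axis,
    the head motion from the IMU and the gaze motion measured relative to the
    glasses frame by the eye tracker.

    With the eyes locked on a world-fixed target, the gaze rotates opposite to
    the head, so a strongly negative correlation supports a genuine gesture.
    """
    head = np.asarray(head_rates, dtype=float)
    gaze = np.asarray(gaze_rates, dtype=float)
    if head.std() == 0.0 or gaze.std() == 0.0:
        return False   # no motion on one trace: nothing to corroborate
    corr = float(np.corrcoef(head, gaze)[0, 1])
    return corr <= -min_anticorrelation
```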
  • An exemplary embodiment of the present disclosure also provides a human-computer interaction device, the structure of which is shown in Figure 8, including:
  • Gaze detection module 801, used to detect the direction of the user's gaze;
  • the motion detection module 802 is used to start the detection of head movement when the direction of sight does not change beyond the preset time threshold;
  • the instruction generation module 803 is used to generate corresponding operation instructions based on the head movement information obtained by detecting the head movement;
  • Execution module 804 is used to execute the operation instructions and feed back the operation results to the user.
  • the line of sight detection module 801 is used to track the user's eyeballs and determine the direction of the line of sight.
  • the line of sight direction indicates the position of the user's gaze in the line of sight coordinate system.
  • the line of sight coordinate system determines the vector direction of the user's gaze position from the X-, Y-, and Z-axis data.
  • the structure of the motion detection module 802 is shown in Figure 9, including:
  • Sensor detection sub-module 901 is used to obtain the sensor signal of at least one sensor in the IMU, and determine the movement of the head relative to the fixed coordinate system based on the sensor signal.
  • the IMU includes at least one accelerometer sensor that measures acceleration signals and at least one gyro sensor that measures angular signals; the detection of head movement is performed relative to a fixed coordinate system that determines the vector direction of the head movement from X-, Y-, and Z-axis data, and the fixed coordinate system is any one of the following coordinate systems: the user head coordinate system, the user body coordinate system, or the earth coordinate system;
  • Image detection sub-module 902 is used to obtain head image data through an image acquisition device, and determine head movement based on the head image data relative to the fixed coordinate system;
  • the sound detection sub-module 903 is used to detect the vibration characteristics of the sound wave of the user's voice, and analyze the head movement based on the vibration characteristics.
  • FIG. 10 is a block diagram of a device 1000 for human-computer interaction according to an exemplary embodiment.
  • the device 1000 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
  • the device 1000 may include one or more of the following components: a processing component 1002, a memory 1004, a power component 1006, a multimedia component 1008, an audio component 1010, an input/output (I/O) interface 1012, a sensor component 1014, and communications component 1016.
  • Processing component 1002 generally controls the overall operations of device 1000, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
  • the processing component 1002 may include one or more processors 1020 to execute instructions to complete all or part of the steps of the above method.
  • processing component 1002 may include one or more modules that facilitate interaction between processing component 1002 and other components.
  • processing component 1002 may include a multimedia module to facilitate interaction between multimedia component 1008 and processing component 1002.
  • Memory 1004 is configured to store various types of data to support operations at device 1000. Examples of such data include instructions for any application or method operating on device 1000, contact data, phonebook data, messages, pictures, videos, etc.
  • Memory 1004 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or magnetic or optical disk.
  • Power supply component 1006 provides power to various components of device 1000.
  • Power supply components 1006 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to device 1000.
  • Multimedia component 1008 includes a screen that provides an output interface between the device 1000 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide action.
  • multimedia component 1008 includes a front-facing camera and/or a rear-facing camera.
  • the front camera and/or the rear camera may receive external multimedia data.
  • Each front-facing camera and rear-facing camera can be a fixed optical lens system or have focusing and optical zoom capability.
  • Audio component 1010 is configured to output and/or input audio signals.
  • audio component 1010 includes a microphone (MIC) configured to receive external audio signals when device 1000 is in operating modes such as call mode, recording mode, and speech recognition mode. The received audio signals may be further stored in memory 1004 or sent via communications component 1016.
  • audio component 1010 also includes a speaker for outputting audio signals.
  • the I/O interface 1012 provides an interface between the processing component 1002 and a peripheral interface module.
  • the peripheral interface module may be a keyboard, a click wheel, a button, etc. These buttons may include, but are not limited to: Home button, Volume buttons, Start button, and Lock button.
  • Sensor component 1014 includes one or more sensors for providing various aspects of status assessment for device 1000.
  • For example, the sensor component 1014 can detect the open/closed state of the device 1000 and the relative positioning of components, such as the display and keypad of the device 1000; it can also detect a change in position of the device 1000 or one of its components, the presence or absence of user contact with the device 1000, the orientation or acceleration/deceleration of the device 1000, and temperature changes of the device 1000.
  • Sensor assembly 1014 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor assembly 1014 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor component 1014 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 1016 is configured to facilitate wired or wireless communication between apparatus 1000 and other devices.
  • Device 1000 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
  • the communication component 1016 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communications component 1016 also includes a near field communications (NFC) module to facilitate short-range communications.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • apparatus 1000 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for executing the above method.
  • In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 1004 including instructions, which can be executed by the processor 1020 of the device 1000 to complete the above method.
  • the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
  • a computer device including:
  • Memory used to store instructions executable by the processor
  • the processor is configured as:
  • a non-transitory computer-readable storage medium, where instructions in the storage medium, when executed by a processor of a mobile terminal, enable the mobile terminal to perform a human-computer interaction method, the method including:
  • the user's gaze direction is first detected; when the gaze direction has not changed for longer than the preset time threshold, head movement detection is started; corresponding operation instructions are generated based on the head movement information obtained from the detection; and the operation instructions are then executed and the operation results fed back to the user.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)
  • Position Input By Displaying (AREA)

Abstract

The present disclosure relates to the field of computers, and relates to a human-computer interaction method and device for solving problems of human-computer interaction efficiency and accuracy. The method comprises: detecting a line-of-sight direction of a user; when the line-of-sight direction has not changed within a preset time threshold, starting detection of head motion; generating a corresponding operation instruction according to head motion information obtained by detecting the head motion; and executing the operation instruction and feeding back an operation result to the user. The technical solution provided by the present disclosure is suitable for non-manual interaction scenarios and realizes efficient, high-accuracy human-computer interaction.

Description

A human-computer interaction method, device, computer device and storage medium
Technical Field
The present disclosure relates to the field of computers, and in particular to a human-computer interaction method, device, computer device and storage medium.
Background
In the related art, human-computer interaction usually relies on buttons, mouse, keyboard, touch screen, voice, and similar means. In many complex scenarios, these common interaction methods perform poorly in terms of convenience and accuracy. For example, virtual reality (VR)/augmented reality (AR) applications and remote control of other devices place high demands on both the efficiency and the precision of human-computer interaction.
Although voice interaction requires no manual operation, the speaking and recognition process is time-consuming, resulting in low efficiency. Eye-movement interaction technology can make up for some shortcomings of existing human-computer interaction methods: pointing, moving, and selecting can be performed conveniently through the gaze direction. However, the parameters that eye-movement interaction can provide are limited, making it hard to adapt to complex operation scenarios, and various movements of the user's body may cause deviations in the eye-movement data, leading to operation errors.
In summary, there is a lack of an efficient, highly accurate human-computer interaction method that can meet the needs of complex scenarios.
Summary of the Invention
To overcome the problems in the related art, the present disclosure provides a human-computer interaction method, device, computer device and storage medium. By combining the two dimensions of gaze and head movement to control human-computer interaction, it suits complex, fine-grained application scenarios and achieves efficient, highly accurate human-computer interaction.
According to a first aspect of the embodiments of the present disclosure, a human-computer interaction method is provided, including:
detecting the user's gaze direction;
when the gaze direction has not changed within a preset time threshold, starting detection of head movement;
generating corresponding operation instructions based on the head movement information obtained from the head movement detection;
executing the operation instructions and feeding back the operation results to the user.
In some embodiments, the step of detecting the user's gaze direction includes:
performing eye tracking on the user to determine the gaze direction, where the gaze direction indicates the position the user is looking at in a line-of-sight coordinate system, and the line-of-sight coordinate system determines the vector direction of that position from data on the X, Y, and Z axes.
In some embodiments, the step of starting detection of head movement includes:
acquiring the sensor signal of at least one sensor in an inertial detection unit (IMU), and determining the head movement relative to a fixed coordinate system based on the sensor signal, where the fixed coordinate system determines the vector direction of the head movement from data on the X, Y, and Z axes.
In some embodiments, the IMU includes at least one accelerometer sensor that measures acceleration signals and at least one gyro sensor that measures angular signals.
In some embodiments, the step of starting detection of head movement includes:
acquiring head image data through an image acquisition device and determining the head movement relative to a fixed coordinate system, where the fixed coordinate system determines the vector direction of the head movement from data on the X, Y, and Z axes.
In some embodiments, the step of starting detection of head movement includes:
detecting the vibration characteristics of the sound waves of the user's voice;
analyzing the head movement based on the vibration characteristics.
In some embodiments, the gaze direction and/or head movement is detected by a head-mounted device.
In some embodiments, the operation results are fed back through a head-mounted device and/or a second device connected to the head-mounted device, where the second device is connected to the head-mounted device through a wired or wireless connection.
According to a second aspect of the embodiments of the present disclosure, a human-computer interaction device is provided, including:
a gaze detection module, used to detect the direction of the user's gaze;
a motion detection module, used to start detection of head movement when the gaze direction has not changed for longer than a preset time threshold;
an instruction generation module, used to generate corresponding operation instructions based on the head movement information obtained from the head movement detection;
an execution module, used to execute the operation instructions and feed back the operation results to the user.
In some embodiments, the gaze detection module is used to perform eye tracking on the user and determine the gaze direction, where the gaze direction indicates the position the user is looking at in a line-of-sight coordinate system, and the line-of-sight coordinate system determines the vector direction of that position from data on the X, Y, and Z axes.
In some embodiments, the motion detection module includes:
a sensor detection submodule, used to acquire the sensor signal of at least one sensor in the IMU and determine the movement of the head relative to the fixed coordinate system based on the sensor signal, where the IMU includes at least one accelerometer sensor that measures acceleration signals and at least one gyro sensor that measures angular signals, the detection of head movement is performed relative to a fixed coordinate system that determines the vector direction of the head movement from data on the X, Y, and Z axes, and the fixed coordinate system is any one of the following coordinate systems: the user head coordinate system, the user body coordinate system, or the earth coordinate system;
an image detection submodule, used to acquire head image data through an image acquisition device and determine the head movement based on the head image data;
a sound detection submodule, used to detect the vibration characteristics of the sound waves of the user's voice and analyze the head movement based on the vibration characteristics.
According to a third aspect of the embodiments of the present disclosure, a computer device is provided, including:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
detect the user's gaze direction;
start detection of head movement when the gaze direction has not changed for longer than a preset time threshold;
generate corresponding operation instructions based on the head movement information obtained from the head movement detection;
execute the operation instructions and feed back the operation results to the user.
According to a fourth aspect of the embodiments of the present disclosure, a non-transitory computer-readable storage medium is provided; when the instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform a human-computer interaction method, the method including:
detecting the user's gaze direction;
starting detection of head movement when the gaze direction has not changed for longer than a preset time threshold;
generating corresponding operation instructions based on the head movement information obtained from the head movement detection;
executing the operation instructions and feeding back the operation results to the user.
The technical solution provided by the embodiments of the present disclosure may include the following beneficial effects: the user's gaze direction is first detected; when the gaze direction has not changed for longer than a preset time threshold, detection of head movement is started; corresponding operation instructions are generated from the head movement information obtained by the detection; and the operation instructions are then executed and the operation results fed back to the user. By combining the two dimensions of gaze and head movement to control human-computer interaction, the solution suits complex, fine-grained application scenarios and achieves efficient, highly accurate human-computer interaction.
It should be understood that the above general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention.
Figure 1 is a flow chart of a human-computer interaction method according to an exemplary embodiment.
Figure 2 is a flow chart of another human-computer interaction method according to an exemplary embodiment.
Figure 3 is a schematic diagram of the user head coordinate system according to an exemplary embodiment.
Figure 4 is a schematic diagram of the user body coordinate system according to an exemplary embodiment.
Figure 5 is a schematic diagram of the earth coordinate system according to an exemplary embodiment.
Figure 6 is a schematic diagram of the relative relationship between the user head coordinate system, the user body coordinate system, and the earth coordinate system according to an exemplary embodiment.
Figure 7 is a schematic diagram of the glasses coordinate system according to an exemplary embodiment.
Figure 8 is a block diagram of a human-computer interaction device according to an exemplary embodiment.
Figure 9 is a schematic structural diagram of the motion detection module 802 according to an exemplary embodiment.
Figure 10 is a block diagram of a device according to an exemplary embodiment.
具体实施方式Detailed ways
这里将详细地对示例性实施例进行说明,其示例表示在附图中。下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本发明相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本发明的一些方面相一致的装置和方法的例子。Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the appended claims.
这里将详细地对示例性实施例进行说明,其示例表示在附图中。下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本发明相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本发明的一些方面相一致的装置和方法的例子。Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the appended claims.
常用的人机交互方式中,语音交互虽然可以无需动手操作,但是由于说话和识别过程耗 时较长,导致效率较低。眼动交互技术则可以弥补现有人机交互方式的一些不足,通过视线方向可以很便捷的实现指向、移动、选择等人机交互操作,但眼动交互技术能够提供的参数有限,很难适应复杂操作场景,且用户身体的各种活动均可能造成眼动数据的偏差,导致操作失误。Among the commonly used human-computer interaction methods, although voice interaction can require no hands-on operation, the speaking and recognition process takes a long time, resulting in low efficiency. Eye movement interaction technology can make up for some shortcomings of existing human-computer interaction methods. Human-computer interaction operations such as pointing, movement, and selection can be easily realized through the direction of gaze. However, the parameters that eye movement interaction technology can provide are limited, and it is difficult to adapt to complex Operation scenarios and various activities of the user's body may cause deviations in eye movement data, leading to operational errors.
为了解决上述问题,本公开提供了一种人机交互方法和装置。通过视线和头部运动两个维度,结合控制人机交互,适用于复杂、精细的应用场景,实现了高效、高准确性的人机交互。In order to solve the above problems, the present disclosure provides a human-computer interaction method and device. Through the two dimensions of sight and head movement, combined with the control of human-computer interaction, it is suitable for complex and delicate application scenarios, achieving efficient and high-accuracy human-computer interaction.
An exemplary embodiment of the present disclosure provides a human-computer interaction method that performs human-computer interaction through joint control of gaze and head movement: the user's gaze direction and head movement information are detected, and if the gaze direction has not changed within a preset time threshold, an operation instruction is generated from the head movement, executed, and its result fed back. The specific flow, shown in Figure 1, includes:
Step 101: detect the user's gaze direction.
In this step, for example, the user's eyes can be tracked to determine the gaze direction. The gaze direction indicates the position the user is looking at in a sight coordinate system, which uses data on the X, Y, and Z axes to determine the vector direction of the gazed position.
Step 102: if the gaze direction has not changed for longer than a preset time threshold, start detecting head movement.
In this step, when the gaze direction has not changed for longer than the preset time threshold, the gaze direction is judged to be stable. For example, the time threshold may be preset to 0.8 seconds: once the gaze has been locked on the same target for more than 0.8 seconds without changing, the gaze direction is judged stable.
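By way of illustration only, this dwell check could be sketched as follows in Python; the 0.8 second threshold and the small angular tolerance used to decide that the gaze "has not changed" are assumed tuning values, not values fixed by the present disclosure:

```python
import math
import time

DWELL_THRESHOLD_S = 0.8      # preset time threshold (example value from the text)
ANGLE_TOLERANCE_DEG = 2.0    # assumed tolerance for "gaze direction unchanged"

def angle_between(v1, v2):
    """Angle in degrees between two gaze direction vectors (X, Y, Z)."""
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    cos = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos))

class DwellDetector:
    """Reports gaze stability once the gaze has stayed within tolerance long enough."""

    def __init__(self):
        self.anchor = None        # gaze vector at the start of the current dwell
        self.anchor_time = None   # timestamp of that vector

    def update(self, gaze_vector, now=None):
        """Feed one gaze sample; returns True once the dwell threshold is met."""
        now = time.monotonic() if now is None else now
        if self.anchor is None or angle_between(self.anchor, gaze_vector) > ANGLE_TOLERANCE_DEG:
            self.anchor, self.anchor_time = gaze_vector, now  # gaze moved: restart timing
            return False
        return now - self.anchor_time >= DWELL_THRESHOLD_S
```

Once update() returns True, the head movement detection of this step would be armed.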
In this step, the gaze direction can be determined by eye tracking on the user; for example, eye image data can be acquired and the gaze direction determined from changes in that data.
When the gaze direction is stable, the target pointed to by the gaze direction is predicted to be the target the user intends to operate on. At this point, detection of head movement can be started to determine the user's operation intention.
Step 103: generate a corresponding operation instruction based on the head movement information obtained from the head movement detection.
In this step, the operation instruction is generated from the detected head movement information combined with preset instruction rules. For example, the rule for the "confirm" instruction may be preset as "nod twice within two seconds", and the rule for the "cancel" instruction as "shake the head within one second". Accordingly, a "confirm" instruction is generated after detecting the head movement "nod twice within two seconds", and a "cancel" instruction after detecting the head movement "shake the head within one second".
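A minimal sketch of this rule matching, assuming head movement events arrive as (kind, timestamp) pairs (a format introduced here purely for illustration):

```python
def classify_head_gesture(events, now):
    """events: list of ("nod" | "shake", timestamp_s) pairs; returns an instruction or None."""
    nods = [t for kind, t in events if kind == "nod" and now - t <= 2.0]
    shakes = [t for kind, t in events if kind == "shake" and now - t <= 1.0]
    if len(nods) >= 2:        # "nod twice within two seconds"
        return "CONFIRM"
    if len(shakes) >= 1:      # "shake the head within one second"
        return "CANCEL"
    return None
```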
Step 104: execute the operation instruction and feed back the operation result to the user.
In this step, the operation instruction is executed, the operation result is obtained, and the result is fed back to the user. The feedback may be a confirmation message, such as "cancel operation succeeded", or a response interface to the operation, such as entering the page of the selected object when the operation instruction indicates "confirm viewing".
An exemplary embodiment of the present disclosure further provides a human-computer interaction method in which the user's eyes are tracked to determine the gaze direction. The gaze direction represents the position the user is looking at in a sight coordinate system, which uses values on the X, Y, and Z axes to determine the vector direction of the gazed position. As shown in Figure 2, the method specifically includes the following steps:
Step 201: perform eye tracking on the user to determine the user's gaze direction.
As shown in Figure 3, the sight coordinate system may be the user head coordinate system, with the center of gravity of the head as the origin; values on the X, Y, and Z axes characterize a space point 1, and the vector from the origin to space point 1 is the user's visual direction.
As shown in Figure 4, the sight coordinate system may also be the user body coordinate system, with the body's center of gravity as the origin; values on the X, Y, and Z axes characterize a space point 1, and the vector from the origin to space point 1 is the user's visual direction.
As shown in Figure 5, the sight coordinate system may also be the earth coordinate system, with a position fixed relative to the ground as the origin; values on the X, Y, and Z axes characterize a space point 1, and the vector from the origin to space point 1 is the user's visual direction.
Head movement is detected relative to a fixed coordinate system, which is any of the following: the user head coordinate system, the user body coordinate system, or the earth coordinate system. Detecting the motion parameters of the head can include detecting various movements of the head relative to the fixed coordinate system, including nodding (reciprocating rotation about the X axis), head shaking (reciprocating rotation about the Y and Z axes), and movements under different tilt postures.
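As a rough illustration of how these two rotation patterns could be separated from a single gyroscope sample in the fixed coordinate system (the 30 deg/s rate threshold is an assumption; the reciprocating character over time is checked separately, see the sketch after the IMU discussion below):

```python
RATE_THRESHOLD_DEG_S = 30.0  # assumed trigger rate

def dominant_rotation_axis(gx, gy, gz):
    """gx, gy, gz: angular rates (deg/s) about the X, Y, Z axes of the fixed frame."""
    if abs(gx) >= RATE_THRESHOLD_DEG_S and abs(gx) >= max(abs(gy), abs(gz)):
        return "x"      # rotation dominated by the X axis: candidate nod component
    if max(abs(gy), abs(gz)) >= RATE_THRESHOLD_DEG_S:
        return "y/z"    # rotation dominated by the Y/Z axes: candidate shake component
    return None
```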
Step 202: obtain the preset time threshold, and when the gaze direction has not changed for longer than that threshold, start detecting head movement.
The user's gaze direction is detected in real time; when it has remained fixed and unchanged for longer than the preset time threshold, for example 2 seconds, the gaze direction can be judged stable.
Depending on the hardware configuration, any of the following approaches can be used for head movement detection:
1) When a device such as a wearable equipped with an IMU is used to detect head movement, a sensor signal of at least one sensor in the IMU is acquired, and the movement of the head is determined from that signal. The sensors include, but are not limited to, any one or more of: an accelerometer, a gyroscope, and a geomagnetometer.
When detecting with an IMU, the movement of the head can be detected from the changes over time of the accelerometer, gyroscope, and geomagnetometer data in each dimension of the coordinate system.
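Building on that, one possible way to recognize the reciprocating character of a nod or shake from a time series of gyroscope samples is to count sign reversals of the angular rate within a short trailing window; the window length, reversal count, and rate threshold below are illustrative assumptions:

```python
def is_reciprocating(samples, window_s=1.0, min_reversals=2, min_rate=20.0):
    """samples: chronologically ordered (timestamp_s, angular_rate_deg_s) pairs for one axis.

    Returns True if the rate reversed sign often enough within the trailing
    window, i.e. a back-and-forth rotation such as a nod or a head shake.
    """
    if not samples:
        return False
    t_end = samples[-1][0]
    signs = []
    for t, rate in samples:
        if t_end - t > window_s:
            continue                # outside the trailing window
        if rate > min_rate:
            signs.append(1)
        elif rate < -min_rate:
            signs.append(-1)        # rates near zero are ignored
    reversals = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return reversals >= min_reversals
```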
2) A camera or other capture device may also be used: head image data is acquired through an image acquisition device, and head movement is judged from that data. For example, one or more cameras of a mobile phone or computer capture head image data such as photographs of the user's face. According to one implementation, the user's gaze direction and head movement parameters, that is, the eye image data and head image data, can be detected visually at the same time. When detecting from head image data, a three-dimensional model of the user's head can be built and then matched against head images acquired by the camera or other acquisition device, so as to estimate the current pose of the user's head and, combined with its change over time, derive the head movement parameters. Multiple cameras, or sensors such as ToF or LiDAR, can also be used to acquire three-dimensional head information, from which the current pose and head movement parameters are estimated together with the three-dimensional head model.
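A hedged sketch of the pose-from-image step using a generic 3D head model and one camera frame; the landmark detector is assumed to exist separately, and the model points and camera intrinsics below are commonly used approximations rather than values from this disclosure:

```python
import numpy as np
import cv2

MODEL_POINTS = np.array([            # rough 3D landmark positions on a generic head (mm)
    (0.0, 0.0, 0.0),                 # nose tip
    (0.0, -63.6, -12.5),             # chin
    (-43.3, 32.7, -26.0),            # left eye outer corner
    (43.3, 32.7, -26.0),             # right eye outer corner
    (-28.9, -28.9, -24.1),           # left mouth corner
    (28.9, -28.9, -24.1),            # right mouth corner
], dtype=np.float64)

def estimate_head_pose(image_points, frame_w, frame_h):
    """image_points: 6x2 pixel coordinates of the landmarks matching MODEL_POINTS."""
    image_points = np.asarray(image_points, dtype=np.float64)
    focal = frame_w                  # common approximation when intrinsics are unknown
    camera_matrix = np.array([[focal, 0, frame_w / 2],
                              [0, focal, frame_h / 2],
                              [0, 0, 1]], dtype=np.float64)
    dist_coeffs = np.zeros((4, 1))   # assume negligible lens distortion
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS, image_points, camera_matrix, dist_coeffs)
    return rvec, tvec                # head rotation/translation in the camera frame
```

Tracking the returned rotation over successive frames then gives the change of head pose over time from which the movement parameters are derived.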
3) The user's voice characteristics can be detected to determine head movement; here, the head movement is the movement of the user's vocal cords. The vibration characteristics of the sound waves are detected, and the head movement is analyzed from those characteristics. A speech recognition system can also be used to recognize voice commands issued by the user and operate on the target locked by the gaze.
Step 203: generate a corresponding operation instruction based on the head movement information obtained from the head movement detection.
When a device such as a wearable equipped with an IMU detects head movement, the resulting head movement information contains the changes over time of the accelerometer, gyroscope, and geomagnetometer data in each dimension of the fixed coordinate system, from which the head movement is detected.
When a camera or other capture device acquires head image data through an image acquisition device, the resulting head movement information contains the changes over time of the head pose and position in each dimension of the fixed coordinate system; this change indicates the specific movement trajectory of the head, from which the head movement can be determined.
When the user's voice characteristics are detected to determine head movement, the resulting head movement information contains an indication of whether the vocal cords have vibrated. Further, a vibration trigger amplitude can be preset and the amplitude of the user's voice included in the head movement information; once the amplitude of the sound emitted by the user reaches the trigger amplitude, it is determined that head movement has occurred. A speech recognition system can also be combined, in which case the head movement information contains both the vocal-cord vibration indication and the user's voice command information, from which the user's intention can be determined.
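The vibration-trigger check described here could be sketched as a simple amplitude threshold; the RMS computation and the trigger value are assumptions for illustration:

```python
TRIGGER_AMPLITUDE = 0.02  # assumed trigger amplitude, normalized audio scale [-1, 1]

def vocal_cord_movement(samples):
    """samples: audio samples in [-1, 1]; True once the amplitude reaches the trigger."""
    if not samples:
        return False
    rms = (sum(s * s for s in samples) / len(samples)) ** 0.5
    return rms >= TRIGGER_AMPLITUDE
```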
Step 204: execute the operation instruction and feed back the operation result to the user.
In this step, the operation result can be fed back in different ways depending on the software and hardware configuration of the application scenario. According to one implementation, an operation result containing text and/or images can be composed and shown on a display screen. The result can also be fed back through device vibration, for example one short vibration on success and a continuous two-second vibration on failure, or played back by voice. These feedback methods can be combined or used alone; those skilled in the art will appreciate that the ways of outputting operation result information are not limited to those listed above.
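One way to combine the feedback channels listed above is a small dispatcher; the three backend callables are placeholders for whatever the host platform actually provides:

```python
def give_feedback(result, display=None, vibrate=None, speak=None):
    """result: dict with "ok" (bool) and "text" (str); each channel is optional."""
    text = result.get("text", "")
    if display is not None:
        display(text)                              # e.g. "cancel operation succeeded"
    if vibrate is not None:
        vibrate(0.1 if result.get("ok") else 2.0)  # short pulse on success, long buzz on failure
    if speak is not None:
        speak(text)                                # voice playback of the result
```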
According to one implementation, the above operation result information can be fed back through a head-mounted device and/or a second device connected to the head-mounted device, the second device being connected to the head-mounted device by a wired or wireless connection. For example, the result is shown on the display screen of the head-mounted device, on at least one external display connected to it, or on both simultaneously.
The sight coordinate system above can be any of the following: the user head coordinate system, the user body coordinate system, or the earth coordinate system. When a head-mounted device such as smart glasses or a helmet is used to detect the gaze, a coordinate system can also be established based on the head-mounted device. Figure 6 shows an example of the relative relationship between the user head coordinate system 601, the user body coordinate system 602, and the earth coordinate system 603.
When a wearable device such as smart glasses detects the gaze direction and/or head movement, a coordinate system built on the wearable device serves as the user head coordinate system. The user's gaze direction is detected by an eye tracking module integrated in the smart glasses, and the user's head movement parameters by an inertial measurement unit (IMU) motion sensing module integrated in the glasses. As shown in Figure 7, with the smart glasses as the reference object, a glasses coordinate system can be constructed as the user head coordinate system.
For the user head coordinate system, a specific position of the user's head can serve as the origin, for example the center of gravity of the head, with the X, Y, and Z axes marking the vector directions within it.
For the user body coordinate system, a specific position of the user's body can serve as the origin, for example the body's center of gravity or its projection on the ground, with the X, Y, and Z axes marking the vector directions within it.
The head coordinate system and the user body coordinate system are local coordinate systems relative to the user and are moving coordinate systems. A global coordinate system, such as the earth coordinate system, can also be used; it is a static coordinate system.
It should be noted that the sight coordinate system and the fixed coordinate system mainly serve as reference frames for motion detection; the usable coordinate systems are not limited to the head, body, and earth coordinate systems listed above.
According to one implementation, the fixed coordinate system is a static coordinate system and the sight coordinate system is a moving coordinate system relative to it; for example, the fixed coordinate system is the earth coordinate system and the sight coordinate system is the head coordinate system.
According to one implementation, both the fixed coordinate system and the sight coordinate system are moving coordinate systems; for example, the fixed coordinate system is the user body coordinate system and the sight coordinate system is the user body coordinate system or the head coordinate system.
According to one implementation, both the fixed coordinate system and the sight coordinate system are static coordinate systems, for example both are the earth coordinate system.
According to one implementation, the sight coordinate system and the fixed coordinate system are the same coordinate system; of course, the sight coordinate system may also differ from the fixed coordinate system. The choice of sight and fixed coordinate systems can be customized to the application environment and user needs, set flexibly, and adapted to the hardware configuration, saving cost and improving efficiency.
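When the sight coordinate system moves relative to the fixed coordinate system, measurements can be related between the two using the current orientation of the moving frame; a minimal sketch, assuming a rotation matrix supplied by sensor fusion (e.g. from the IMU):

```python
import numpy as np

def to_fixed_frame(v_sight, r_fixed_from_sight):
    """Map a direction from the (moving) sight frame into the fixed frame.

    v_sight: length-3 direction in the sight frame;
    r_fixed_from_sight: 3x3 rotation matrix of the sight frame in the fixed frame.
    """
    return np.asarray(r_fixed_from_sight) @ np.asarray(v_sight)

# Example: with the identity rotation the two frames coincide.
gaze_fixed = to_fixed_frame([0.0, 0.0, 1.0], np.eye(3))
```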
An exemplary embodiment of the present disclosure further provides a human-computer interaction method in which, when detecting the user's head movement, different coordinate systems are chosen as the fixed coordinate system according to the scenario.
When the user's body is stationary, the fixed coordinate system is the user body coordinate system, for example with the center between the user's feet as the origin and the lateral, overhead, and forward directions as the X, Y, and Z axes respectively. Scenarios in which the gaze direction is essentially stable relative to a fixed coordinate system here include the user gazing at a stationary object in front of them, gazing at a specified position on a screen, and so on.
Correspondingly, the sight coordinate system may be the head coordinate system, with a specific position of the user's head, such as the center of gravity of the head, as the origin and the X, Y, and Z axes marking the vector directions within it.
An exemplary embodiment of the present disclosure further provides a human-computer interaction method in which, when detecting the user's head movement, different coordinate systems are chosen as the fixed coordinate system according to the scenario.
When the user's body is stationary, the fixed coordinate system is the user body coordinate system, for example with the center between the user's feet as the origin and the lateral, overhead, and forward directions as the X, Y, and Z axes respectively. Scenarios in which the gaze direction is essentially stable relative to a fixed coordinate system here include the user gazing at a stationary object in front of them, gazing at a specified position on a screen, and so on.
Correspondingly, the sight coordinate system may be the user body coordinate system, with a specific position of the user's body, such as the body's center of gravity or its projection on the ground, as the origin and the X, Y, and Z axes marking the vector directions within it.
An exemplary embodiment of the present disclosure further provides a human-computer interaction method in which, when detecting the user's head movement, different coordinate systems are chosen as the fixed coordinate system according to the scenario.
When the user's body is stationary, the fixed coordinate system is the user body coordinate system, for example with the center between the user's feet as the origin and the lateral, overhead, and forward directions as the X, Y, and Z axes respectively. Scenarios in which the gaze direction is essentially stable relative to a fixed coordinate system here include the user gazing at a stationary object in front of them, gazing at a specified position on a screen, and so on.
Correspondingly, the sight coordinate system may be the earth coordinate system, with a position fixed relative to the ground as the origin; values on the X, Y, and Z axes characterize a space point 1, and the vector from the origin to space point 1 is the user's visual direction.
An exemplary embodiment of the present disclosure further provides a human-computer interaction method in which, when detecting the user's head movement, different coordinate systems are chosen as the fixed coordinate system according to the scenario.
When the user's body is moving, for example while driving a moving vehicle, the fixed coordinate system can be the earth coordinate system. Scenarios in which the gaze direction is essentially stable relative to a fixed coordinate system here include the user gazing ahead at the road, gazing at the in-vehicle instrument panel, and so on.
Correspondingly, the sight coordinate system may be the head coordinate system, with a specific position of the user's head, such as the center of gravity of the head, as the origin and the X, Y, and Z axes marking the vector directions within it.
An exemplary embodiment of the present disclosure further provides a human-computer interaction method in which, when detecting the user's head movement, different coordinate systems are chosen as the fixed coordinate system according to the scenario.
When the user's body is moving, for example while driving a moving vehicle, the fixed coordinate system can be the earth coordinate system. Scenarios in which the gaze direction is essentially stable relative to a fixed coordinate system here include the user gazing ahead at the road, gazing at the in-vehicle instrument panel, and so on.
Correspondingly, the sight coordinate system may be the user body coordinate system, with a specific position of the user's body, such as the body's center of gravity or its projection on the ground, as the origin and the X, Y, and Z axes marking the vector directions within it.
An exemplary embodiment of the present disclosure further provides a human-computer interaction method in which, when detecting the user's head movement, different coordinate systems are chosen as the fixed coordinate system according to the scenario.
When the user's body is moving, for example while driving a moving vehicle, the fixed coordinate system can be the earth coordinate system. Scenarios in which the gaze direction is essentially stable relative to a fixed coordinate system here include the user gazing ahead at the road, gazing at the in-vehicle instrument panel, and so on.
Correspondingly, the sight coordinate system may be the earth coordinate system, with a position fixed relative to the ground as the origin; values on the X, Y, and Z axes characterize a space point 1, and the vector from the origin to space point 1 is the user's visual direction.
An exemplary embodiment of the present disclosure further provides a human-computer interaction method in which, when detecting the user's head movement, different coordinate systems are chosen as the fixed coordinate system according to the scenario.
When the user uses a wearable device such as smart glasses, the fixed coordinate system can be the user head coordinate system, specifically a coordinate system established with the center of the smart glasses as the origin.
Correspondingly, the sight coordinate system may be the head coordinate system, with a specific position of the user's head, such as the center of gravity of the head, as the origin and the X, Y, and Z axes marking the vector directions within it.
Preferably, the head coordinate system may be the coordinate system established with the center of the smart glasses as the origin.
An exemplary embodiment of the present disclosure further provides a human-computer interaction method in which, when detecting the user's head movement, different coordinate systems are chosen as the fixed coordinate system according to the scenario.
When the user uses a wearable device such as smart glasses, the fixed coordinate system can be the user head coordinate system, specifically a coordinate system established with the center of the smart glasses as the origin.
Correspondingly, the sight coordinate system may be the user body coordinate system, with a specific position of the user's body, such as the body's center of gravity or its projection on the ground, as the origin and the X, Y, and Z axes marking the vector directions within it.
An exemplary embodiment of the present disclosure further provides a human-computer interaction method in which, when detecting the user's head movement, different coordinate systems are chosen as the fixed coordinate system according to the scenario.
When the user uses a wearable device such as smart glasses, the fixed coordinate system can be the user head coordinate system, specifically a coordinate system established with the center of the smart glasses as the origin.
Correspondingly, the sight coordinate system may be the earth coordinate system, with a position fixed relative to the ground as the origin; values on the X, Y, and Z axes characterize a space point 1, and the vector from the origin to space point 1 is the user's visual direction.
An exemplary embodiment of the present disclosure further provides a human-computer interaction method in which gaze and head movement are detected in different ways depending on the hardware configuration.
Gaze can be detected visually: a camera or other capture device takes image data such as eye photographs, which are analyzed to determine the gaze. Preferably, an infrared illumination module can be added when detecting the gaze direction, so that images of the user's eye region can be acquired more clearly under different ambient lighting conditions.
An exemplary embodiment of the present disclosure further provides a human-computer interaction method configured for a wearable device such as a head-mounted device. The wearable device detects the user's gaze direction and head movement parameters simultaneously. Because in actual use the user performs head movements such as nodding while keeping the gaze direction fixed, the change in gaze direction relative to the wearable device is exactly opposite to the head movement direction relative to the head coordinate system. Joint detection can therefore effectively avoid false detections while keeping the missed-detection rate low, and is more accurate than the traditional single-signal detection method.
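A minimal sketch of that joint check for a nod, assuming both angular rates are measured in the wearable's (head) frame and expressed in deg/s; the rate threshold is an assumed value:

```python
def joint_nod_check(head_pitch_rate, gaze_pitch_rate, min_rate=10.0):
    """True only when the head rotates clearly and the gaze, measured relative to
    the wearable, rotates the opposite way, i.e. the user keeps looking at the
    same target while nodding."""
    strong_head_motion = abs(head_pitch_rate) > min_rate
    opposite_directions = head_pitch_rate * gaze_pitch_rate < 0
    return strong_head_motion and opposite_directions
```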
An exemplary embodiment of the present disclosure further provides a human-computer interaction device, whose structure, shown in Figure 8, includes:
a gaze detection module 801, configured to detect the user's gaze direction;
a motion detection module 802, configured to start detection of head movement when the gaze direction has not changed for longer than a preset time threshold;
an instruction generation module 803, configured to generate a corresponding operation instruction based on the head movement information obtained from the head movement detection; and
an execution module 804, configured to execute the operation instruction and feed back the operation result to the user.
The gaze detection module 801 is configured to perform eye tracking on the user and determine the gaze direction, the gaze direction indicating the position the user is looking at in a sight coordinate system, which uses data on the X, Y, and Z axes to determine the vector direction of the gazed position.
The structure of the motion detection module 802, shown in Figure 9, includes:
a sensor detection submodule 901, configured to acquire a sensor signal of at least one sensor in the IMU and determine from that signal the movement of the head relative to the fixed coordinate system, the IMU including at least one accelerometer sensor measuring acceleration signals and at least one gyro sensor measuring angular signals; head movement is detected relative to a fixed coordinate system that uses data on the X, Y, and Z axes to determine the vector direction of the head movement, the fixed coordinate system being any of the following: the user head coordinate system, the user body coordinate system, or the earth coordinate system;
an image detection submodule 902, configured to acquire head image data through an image acquisition device and judge head movement from that data relative to the fixed coordinate system; and
a sound detection submodule 903, configured to detect the vibration characteristics of the sound waves of the user's voice and analyze head movement from those characteristics.
Regarding the devices in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and will not be elaborated here.
Figure 10 is a block diagram of a device 1000 for human-computer interaction according to an exemplary embodiment. For example, the device 1000 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Figure 10, the device 1000 can include one or more of the following components: a processing component 1002, a memory 1004, a power component 1006, a multimedia component 1008, an audio component 1010, an input/output (I/O) interface 1012, a sensor component 1014, and a communication component 1016.
The processing component 1002 generally controls the overall operations of the device 1000, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 1002 can include one or more processors 1020 to execute instructions so as to complete all or part of the steps of the above method. In addition, the processing component 1002 can include one or more modules that facilitate interaction between it and the other components, for example a multimedia module to facilitate interaction between the multimedia component 1008 and the processing component 1002.
The memory 1004 is configured to store various types of data to support operation of the device 1000. Examples of such data include instructions for any application or method operated on the device 1000, contact data, phonebook data, messages, pictures, video, and so on. The memory 1004 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random-access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
The power component 1006 provides power for the various components of the device 1000. It can include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 1000.
The multimedia component 1008 includes a screen providing an output interface between the device 1000 and the user. In some embodiments the screen can include a liquid crystal display (LCD) and a touch panel (TP); if it includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel; the touch sensors can sense not only the boundary of a touch or swipe action but also the duration and pressure associated with it. In some embodiments the multimedia component 1008 includes a front camera and/or a rear camera, which can receive external multimedia data when the device 1000 is in an operating mode such as shooting mode or video mode. Each front and rear camera can be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 1010 is configured to output and/or input audio signals. For example, it includes a microphone (MIC) configured to receive external audio signals when the device 1000 is in an operating mode such as call mode, recording mode, or speech recognition mode. The received audio signals can be further stored in the memory 1004 or sent via the communication component 1016. In some embodiments the audio component 1010 also includes a loudspeaker for outputting audio signals.
The I/O interface 1012 provides an interface between the processing component 1002 and peripheral interface modules, which can be a keyboard, a click wheel, buttons, and the like. The buttons can include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 1014 includes one or more sensors for providing the device 1000 with status assessments of various aspects. For example, it can detect the on/off state of the device 1000 and the relative positioning of components, such as the display and keypad of the device 1000; it can also detect a change in the position of the device 1000 or of one of its components, the presence or absence of user contact with the device 1000, the orientation or acceleration/deceleration of the device 1000, and changes in its temperature. The sensor component 1014 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact, and a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments it can also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1016 is configured to facilitate wired or wireless communication between the device 1000 and other devices. The device 1000 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 1016 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, it also includes a near-field communication (NFC) module to facilitate short-range communication; the NFC module can be implemented based on radio-frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 1000 can be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
In an exemplary embodiment, there is also provided a non-transitory computer-readable storage medium including instructions, such as the memory 1004 including instructions executable by the processor 1020 of the device 1000 to complete the above method. For example, the non-transitory computer-readable storage medium can be a ROM, a random-access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A computer device includes:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
detect the user's gaze direction;
start detection of head movement when the gaze direction has not changed for longer than a preset time threshold;
generate a corresponding operation instruction based on the head movement information obtained from the head movement detection; and
execute the operation instruction and feed back the operation result to the user.
A non-transitory computer-readable storage medium: when the instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform a human-computer interaction method including:
detecting the user's gaze direction;
starting detection of head movement when the gaze direction has not changed for longer than a preset time threshold;
generating a corresponding operation instruction based on the head movement information obtained from the head movement detection; and
executing the operation instruction and feeding back the operation result to the user.
Other embodiments of the invention will readily occur to those skilled in the art from consideration of the specification and practice of the invention disclosed here. The present application is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed in the present disclosure. The specification and examples are to be regarded as exemplary only, with the true scope and spirit of the invention indicated by the following claims.
It should be understood that the present invention is not limited to the precise structure described above and shown in the accompanying drawings, and that various modifications and changes can be made without departing from its scope. The scope of the invention is limited only by the appended claims.
Industrial applicability
Here, the user's gaze direction is first detected, and when it has not changed for longer than a preset time threshold, detection of head movement is started; a corresponding operation instruction is then generated from the head movement information obtained by that detection, executed, and its result fed back to the user. By combining the two dimensions of gaze and head movement to control human-computer interaction, the approach suits complex, fine-grained application scenarios and achieves efficient, highly accurate human-computer interaction.

Claims (13)

  1. A human-computer interaction method, characterized by comprising:
    detecting a user's gaze direction;
    starting detection of head movement when the gaze direction has not changed for longer than a preset time threshold;
    generating a corresponding operation instruction based on head movement information obtained from the head movement detection; and
    executing the operation instruction and feeding back an operation result to the user.
  2. The human-computer interaction method according to claim 1, characterized in that the step of detecting the user's gaze direction comprises:
    performing eye tracking on the user to determine the gaze direction, the gaze direction indicating, in a sight coordinate system, the position at which the user is gazing, the sight coordinate system using data on the X, Y, and Z axes to determine the vector direction of the gazed position.
  3. The human-computer interaction method according to claim 1, characterized in that the step of starting detection of head movement comprises:
    acquiring a sensor signal of at least one sensor in an inertial measurement unit (IMU), and determining from the sensor signal the head movement relative to a fixed coordinate system, the fixed coordinate system using data on the X, Y, and Z axes to determine the vector direction of the head movement.
  4. The human-computer interaction method according to claim 3, characterized in that the IMU comprises at least one accelerometer sensor measuring acceleration signals and at least one gyro sensor measuring angular signals.
  5. The human-computer interaction method according to claim 1, characterized in that the step of starting detection of head movement comprises:
    acquiring head image data through an image acquisition device, and judging the head movement from the head image data relative to a fixed coordinate system, the fixed coordinate system using data on the X, Y, and Z axes to determine the vector direction of the head movement.
  6. The human-computer interaction method according to claim 1, characterized in that the step of starting detection of head movement comprises:
    detecting vibration characteristics of sound waves of the user's voice; and
    analyzing the head movement according to the vibration characteristics.
  7. The human-computer interaction method according to claim 1, characterized in that the gaze direction and/or the head movement is detected by a head-mounted device.
  8. The human-computer interaction method according to claim 1, characterized in that the operation result is fed back through a head-mounted device and/or a second device connected to the head-mounted device, the second device being connected to the head-mounted device by a wired or wireless connection.
  9. A human-computer interaction device, characterized by comprising:
    a gaze detection module, configured to detect a user's gaze direction;
    a motion detection module, configured to start detection of head movement when the gaze direction has not changed for longer than a preset time threshold;
    an instruction generation module, configured to generate a corresponding operation instruction based on head movement information obtained from the head movement detection; and
    an execution module, configured to execute the operation instruction and feed back an operation result to the user.
  10. The human-computer interaction device according to claim 9, characterized in that the gaze detection module is configured to perform eye tracking on the user and determine the gaze direction, the gaze direction indicating, in a sight coordinate system, the position at which the user is gazing, the sight coordinate system using data on the X, Y, and Z axes to determine the vector direction of the gazed position.
  11. The human-computer interaction device according to claim 9, characterized in that the motion detection module comprises:
    a sensor detection submodule, configured to acquire a sensor signal of at least one sensor in an inertial measurement unit (IMU) and determine from the sensor signal the movement of the head relative to a fixed coordinate system, the IMU comprising at least one accelerometer sensor measuring acceleration signals and at least one gyro sensor measuring angular signals, the head movement being detected relative to the fixed coordinate system, which uses data on the X, Y, and Z axes to determine the vector direction of the head movement and is any of the following coordinate systems: a user head coordinate system, a user body coordinate system, or an earth coordinate system;
    an image detection submodule, configured to acquire head image data through an image acquisition device and judge the head movement from the head image data relative to the fixed coordinate system; and
    a sound detection submodule, configured to detect vibration characteristics of sound waves of the user's voice and analyze the head movement according to the vibration characteristics.
  12. A computer device, characterized by comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to:
    detect a user's gaze direction;
    start detection of head movement when the gaze direction has not changed for longer than a preset time threshold;
    generate a corresponding operation instruction based on head movement information obtained from the head movement detection; and
    execute the operation instruction and feed back an operation result to the user.
  13. A non-transitory computer-readable storage medium, wherein when instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform a human-computer interaction method comprising:
    detecting a user's gaze direction;
    starting detection of head movement when the gaze direction has not changed for longer than a preset time threshold;
    generating a corresponding operation instruction based on head movement information obtained from the head movement detection; and
    executing the operation instruction and feeding back an operation result to the user.
PCT/CN2022/099701 2022-06-20 2022-06-20 Human-computer interaction method and device, computer device and storage medium WO2023245316A1 (en)

Priority Applications (2)

CN202280004336.8A (priority 2022-06-20, filed 2022-06-20): Man-machine interaction method and device, computer device and storage medium
PCT/CN2022/099701 (priority 2022-06-20, filed 2022-06-20): Human-computer interaction method and device, computer device and storage medium


Publications (1)

Publication Number: WO2023245316A1 (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101308400A (en) * 2007-05-18 2008-11-19 肖斌 Novel human-machine interaction device based on eye-motion and head motion detection
CN103294180A (en) * 2012-03-01 2013-09-11 联想(北京)有限公司 Man-machine interaction control method and electronic terminal
CN106325517A (en) * 2016-08-29 2017-01-11 袁超 Target object trigger method and system and wearable equipment based on virtual reality
US10921882B1 (en) * 2019-12-26 2021-02-16 Jie Li Human-machine interaction method, system and apparatus for controlling an electronic device
CN113160260A (en) * 2021-05-08 2021-07-23 哈尔滨理工大学 Head-eye double-channel intelligent man-machine interaction system and operation method

Also Published As

Publication number Publication date
CN117616368A (en) 2024-02-27

