CN113160260B - Head-eye double-channel intelligent man-machine interaction system and operation method - Google Patents

Head-eye double-channel intelligent man-machine interaction system and operation method Download PDF

Info

Publication number
CN113160260B
CN113160260B (application CN202110499945.2A)
Authority
CN
China
Prior art keywords
data
module
head
eye
gesture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110499945.2A
Other languages
Chinese (zh)
Other versions
CN113160260A (en)
Inventor
于天河
温宏韬
王鹏
王世龙
庞广龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Penglu Intelligent Technology Co ltd
Harbin University of Science and Technology
Original Assignee
Harbin Penglu Intelligent Technology Co ltd
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Penglu Intelligent Technology Co ltd, Harbin University of Science and Technology filed Critical Harbin Penglu Intelligent Technology Co ltd
Priority to CN202110499945.2A priority Critical patent/CN113160260B/en
Publication of CN113160260A publication Critical patent/CN113160260A/en
Application granted granted Critical
Publication of CN113160260B publication Critical patent/CN113160260B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012Head tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/62Analysis of geometric attributes of area, perimeter, diameter or volume
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/193Preprocessing; Feature extraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20024Filtering details
    • G06T2207/20032Median filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Geometry (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Ophthalmology & Optometry (AREA)
  • Multimedia (AREA)
  • Position Input By Displaying (AREA)

Abstract

The invention discloses a head-eye double-channel intelligent human-computer interaction system and an operation method, and belongs to the technical field of human-computer interaction. The head-eye double-channel intelligent man-machine interaction system comprises an eye image acquisition module, a head gesture detection module, a system core module, a wireless data transmission module and a power module; the eye image acquisition module and the head gesture detection module are connected with the system core module, the system core module is connected with the wireless data transmission module, and the power module simultaneously supplies power to the eye image acquisition module, the head gesture detection module, the system core module and the wireless data transmission module. By using the head gesture to determine the gazing area and the line of sight to determine the specific gazing point, the invention realizes a man-machine interaction mode controlled by head-eye dual input channels, thereby overcoming the shortcomings of single-input-channel man-machine interaction systems.

Description

Head-eye double-channel intelligent man-machine interaction system and operation method
Technical Field
The invention relates to a head-eye double-channel intelligent human-computer interaction system and an operation method, and belongs to the technical field of human-computer interaction.
Background
Man-machine interaction technology, as an interface for information exchange between people and computers, is widely applied in fields such as the military, industry, education, medical treatment and the home. With the rapid development of electronic and computer technology, the mode of man-machine interaction has also changed greatly: it is no longer limited to traditional mouse-and-keyboard input and display-screen output, but is gradually developing into a multi-channel interaction mode that integrates the traditional modes with natural interaction modes such as eye movement, gesture and voice, greatly improving the intelligence of man-machine interaction.
Because the line of sight reflects what the human eye is interested in, it is regarded as an important information input source in man-machine interaction, and sight tracking technology has therefore attracted the attention of researchers in the field. However, current sight-control technology based on sight tracking has inherent defects: it is difficult to complete accurate operations over a large range, and head movement introduces deviation into system precision, so the practicability of man-machine interaction relying on the line of sight alone remains low.
The gaze point of the human eye is determined jointly by the eye and the head orientation: the head orientation determines the range at which the human eye can gaze, while the eye determines the precise gaze direction. If the head gesture is used as a complementary input channel combined with the sight input channel, with the head gesture determining the gazing area to complete coarse positioning and the line of sight determining the specific gazing point, a man-machine interaction mode controlled by head-eye dual input channels can be realized, the defects of a single input channel can be overcome, the practicability of sight interaction can be greatly improved, and considerable market and economic benefits can be obtained.
Disclosure of Invention
The invention aims to solve the problem of low practicability of the conventional single-channel visual interaction system in practical application, and discloses a head-eye double-channel intelligent human-computer interaction system and method based on a sight tracking technology and a head gesture detection technology.
A head-eye double-channel intelligent human-computer interaction system comprises an eye image acquisition module, a head gesture detection module, a system core module, a wireless data transmission module and a power supply module; the eye image acquisition module and the head gesture detection module are connected with the system core module, the system core module is connected with the wireless data transmission module, and the power supply module simultaneously energizes the eye image acquisition module, the head gesture detection module, the system core module and the wireless data transmission module;
Furthermore, the head-eye double-channel intelligent man-machine interaction system also comprises an electronic terminal, wherein,
The eye image acquisition module is used for acquiring human eye gray image data and transmitting the human eye gray image data to the system core module;
the head gesture detection module is used for acquiring three-axis gesture data, carrying out gesture calculation on the three-axis gesture data, and sending a gesture calculation result to the system core module;
The system core module is used for processing the gray image data and the gesture resolving result, converting the gray image data and the gesture resolving result into interactive data suitable for a computer and transmitting the interactive data to the electronic terminal;
The wireless data transmission module is used for enabling the system core module to be in wireless connection with the electronic terminal;
The electronic terminal is used for converting the interaction data into an operation instruction suitable for the electronic terminal;
The head-eye double-channel intelligent human-computer interaction system further comprises a head-wearing frame, wherein the head-wearing frame comprises a mirror frame, a left supporting leg, a right supporting leg, a charging interface, an LED (light-emitting diode) state indicator lamp and a lens expanding frame, the left supporting leg and the right supporting leg are respectively connected to the left end and the right end of the mirror frame in a rotating mode, the charging interface is arranged on the left supporting leg, the lens expanding frame is arranged in the mirror frame, and the LED state indicator lamp is arranged on the outer side face of the left supporting leg;
the LED state indicator lamps are RGB indicator lamps and are used for indicating different states of the system;
the lens expanding frame is internally provided with a lens which is a plane lens or a myopia lens;
The left supporting leg and the right supporting leg are of cavity structures;
Further, the system core module is embedded in the cavity of the right supporting leg and comprises an eye image caching module and an FPGA control and data processing module, and the FPGA control and data processing module is respectively connected with the eye image acquisition module, the head gesture detection module and the eye image caching module;
The eye image acquisition module is arranged at the lower right of the mirror frame and comprises an angle-adjustable miniature infrared camera with a single integrated infrared LED, a camera left-right angle adjusting roller and a camera up-down angle adjusting roller, wherein the left-right angle and the up-down angle of the miniature infrared camera are adjusted by the camera left-right angle adjusting roller and the camera up-down angle adjusting roller respectively;
The eye image data caching module is used for caching the video image data output by the miniature infrared camera in units of frames;
The FPGA control and data processing module is used for reading the data in the eye image data caching module by taking the frame as a unit, and processing the image data by utilizing a sight tracking algorithm in the FPGA control and data processing module to obtain sight landing point data;
The FPGA control and data processing module is also used for acquiring the gesture data output by the head gesture detection module and processing the gesture data to obtain head gesture angle data;
The FPGA control and data processing module is also used for carrying out fusion processing on the sight falling point data and the head attitude angle data through data fusion, converting the sight falling point data and the head attitude angle data into interactive data suitable for a computer,
The head gesture detection module is embedded in the middle upper part of the mirror frame; the head gesture detection module comprises a three-axis gyroscope, a three-axis accelerometer and a Digital Motion Processor (DMP);
the core of the wireless data transmission module is a low-power Bluetooth chip, which is connected with the FPGA control and data processing module.
Further, the power module comprises a power management module, a polymer lithium battery and a power switch button, wherein the power management module is connected with the charging interface, the polymer lithium battery is connected with the power management module, and the power switch button is connected with the power management module;
Further, while the power supply is charging, the LED status indicator lamp emits red light, and it emits green light when fully charged; when the power switch button is pressed, the power supply of the head-eye double-channel intelligent man-machine interaction system is switched on and the LED status indicator lamp emits blue light, indicating that the system has started to work normally; when the remaining charge falls below a preset low-battery warning line, the LED status indicator lamp flashes red.
The head-eye double-channel intelligent human-computer interaction method is based on the head-eye double-channel intelligent human-computer interaction system, and comprises the following steps of:
s100, extracting sight line information;
s200, detecting the head posture;
S300, data fusion processing;
s400, transmitting interactive data.
Further, in S100, the method specifically includes the following steps:
s110, acquiring an original image by using a miniature infrared camera, and converting the original image into human eye gray image data;
s120, caching the acquired human eye gray image data to an eye image data caching module by taking a frame as a unit;
S130, the FPGA control and data processing module reads the gray image data of the human eyes from the eye image data caching module, and the sight line tracking algorithm is used for extracting the sight line information.
Further, in S130, the extracting of the line-of-sight information includes the steps of:
s131, preprocessing an eye image: the image preprocessing comprises median filtering, and salt and pepper noise in the eye images is filtered;
S132, pupil edge feature extraction: pupil edge feature extraction comprises three substeps of image binarization, morphological processing and Sobel edge detection:
binarization: binarizing and segmenting the image f (x, y) by using a pre-determined gray threshold T and the following formula to obtain a binary image g (x, y):
Wherein t=35;
Morphological treatment: for eyelashes or pupil edge noise points still existing in an image, the image is processed by utilizing mathematical morphological operation, specifically, a disk-shaped structural element B with the radius of 6 is selected by the system according to the characteristics of noise in the image, firstly, the image A is subjected to closed operation by utilizing a formula (2), and then the image A is subjected to open operation by utilizing a formula (3):
Sobel edge detection: carrying out convolution calculation on the image by adopting a3 multiplied by 3 Sobel horizontal operator Gx and a vertical operator Gy, and extracting the edge contour of the pupil area;
S133, extracting pupil centers by utilizing the near-circle characteristic of the pupil: specifically, calculating the minimum circumscribed rectangle of the pupil area outline, and using a rectangle centroid (x 0,y0) to approximately replace the pupil center;
S134, line-of-sight falling point estimation: a two-dimensional polynomial mapping method based on nine-point calibration is adopted to obtain the mapping relation between the pupil center coordinates and the calibration points; the length and the width of the interactive interface are each divided into three equal parts, yielding nine gazing areas of equal size, and one of them is selected as the calibration interface.
The calibration method is as follows: the user adjusts the body position so that the line of sight falls on the middle calibration point of the calibration interface when looking straight ahead; the head is then kept still while the user gazes at the nine points of the calibration interface in turn, and the system records the pupil center coordinates together with the corresponding calibration point coordinates and establishes the following two-dimensional mapping equations:
XG = a0 + a1·Xe + a2·Ye + a3·Xe·Ye + a4·Xe² + a5·Ye² (4)
YG = b0 + b1·Xe + b2·Ye + b3·Xe·Ye + b4·Xe² + b5·Ye² (5)
where XG and YG are the coordinates of a calibration point on the calibration interface, Xe and Ye are the coordinates of the corresponding pupil center, and a0~a5 and b0~b5 are the undetermined parameters of the mapping functions; substituting the 9 sets of calibration data into the equation set solves the undetermined parameters and yields the mapping equations, through which a pupil center coordinate can be mapped to a coordinate within the gazing region.
Further, in S200, head gesture detection is performed as follows: first, three-axis attitude data are obtained from the three-axis gyroscope and the three-axis accelerometer inside the head gesture detection module; the digital motion processor DMP then performs attitude calculation on the three-axis data, the FPGA control and data processing module reads the resulting quaternion, and finally the quaternion is converted by formula into attitude angle data. The specific conversion process is as follows:
Quaternion definition: a quaternion is denoted q = w + xi + yj + zk and written in vector form as:
q = (w, x, y, z)^T (6)
|q|² = w² + x² + y² + z² = 1 (7)
Quaternion-to-Euler-angle conversion: the quaternion is converted into attitude angles using formula (8):
φ = arctan(2(wx + yz) / (1 − 2(x² + y²)))
θ = arcsin(2(wy − xz))
ψ = arctan(2(wz + xy) / (1 − 2(y² + z²))) (8)
where φ, θ and ψ represent the Roll angle Roll, the Pitch angle Pitch and the Yaw angle Yaw, respectively.
Further, in step S300, data fusion processing is performed as follows: first, the attitude angle data obtained by the attitude calculation in S200 are acquired through the FPGA control and data processing module, and a threshold is used to judge whether the head movement constitutes an effective motion; then the original acceleration data obtained in S200 are acquired and integrated twice to obtain displacement data, and a threshold is used to judge whether the head movement is a valid interaction action; the displacement of a valid interaction action is mapped into interactive-interface coordinate data through the mapping equation, while the displacement produced by a non-interaction action is superimposed on the line-of-sight falling point to compensate the deviation that head movement introduces into the falling-point estimation.
Further, in step S400 the interaction data are sent to the computer through the FPGA control and data processing module and the wireless data transmission module, and the computer converts the interaction data into mouse operation instructions, completing the human-computer interaction.
The invention has the following advantages: the invention provides a head-eye double-channel intelligent human-computer interaction system and interaction method in which the head gesture determines the gazing area to complete coarse positioning and the line of sight determines the specific gazing point, realizing a human-computer interaction mode controlled by head-eye dual input channels and overcoming the shortcomings of single-input-channel human-computer interaction systems. In the head-eye double-channel intelligent human-computer interaction method, head movements are divided into actions with interaction intention and actions without interaction intention; intentional interaction actions are mapped into interaction data, while non-interaction actions are used to compensate the line-of-sight falling point, reducing the deviation caused by unconscious head movement in sight tracking and improving the practicability of sight-interaction technology. The head-eye double-channel intelligent human-computer interaction system provided by the invention is based on an embedded hardware design, places low demands on the hardware performance of the host computer, is convenient to use, and has strong applicability.
Drawings
FIG. 1 is a system architecture block diagram of a head-eye dual-channel intelligent human-computer interaction system of the present invention;
FIG. 2 is a schematic diagram of a system architecture of a head-eye dual-channel intelligent human-computer interaction system according to the present invention;
FIG. 3 is a step diagram of a method for extracting sight line information according to the present invention;
FIG. 4 is a schematic diagram of a Sobel operator of the present invention;
FIG. 5 is a pupil center extraction schematic;
FIG. 6 is a nine-point calibration interface schematic;
FIG. 7 is a flow chart of head motion recognition;
FIG. 8 is a schematic view of head movements;
FIG. 9 is a schematic view of the components of the head motion trajectory in the X-Y plane;
FIG. 10 is a nine-point calibration interface schematic.
The device comprises a head-mounted frame 1, a mirror frame 1-1, a left supporting leg 1-2, a right supporting leg 1-3, a charging interface 1-4, an LED status indicator lamp 1-5, a lens expansion frame 1-6, an eye image acquisition module 2, a miniature infrared camera 2-1, a camera left and right angle adjusting roller 2-2, a camera up and down angle adjusting roller 2-3, a head gesture detection module 3, a system core module 4, an eye image data cache module 4-1, an FPGA control and data processing module 4-2, a wireless data transmission module 5, a power supply module 6, a power management module 6-1, a polymer lithium battery 6-2 and a power switch button 6-3.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, the invention provides a head-eye dual-channel intelligent human-computer interaction system, which comprises: the eye image acquisition module 2, the head gesture detection module 3, the system core module 4, the wireless data transmission module 5 and the power module 6, the eye image acquisition module 2 and the head gesture detection module 3 are connected with the system core module 4, the system core module 4 is connected with the wireless data transmission module 5, and the power module 6 is used for powering on the eye image acquisition module 2, the head gesture detection module 3, the system core module 4 and the wireless data transmission module 5.
Furthermore, the head-eye double-channel intelligent man-machine interaction system also comprises an electronic terminal, wherein,
The eye image acquisition module 2 is used for acquiring the gray image data of the human eyes and transmitting the gray image data of the human eyes to the system core module 4;
The head gesture detection module 3 is used for acquiring three-axis gesture data, performing gesture calculation on the three-axis gesture data, and sending a gesture calculation result to the system core module 4;
the system core module 4 is used for processing the gray image data and the gesture resolving result, converting the gray image data and the gesture resolving result into interactive data suitable for a computer, and transmitting the interactive data to the electronic terminal;
a wireless data transmission module 5, configured to wirelessly connect the system core module 4 and the electronic terminal;
And the electronic terminal is used for converting the interaction data into an operation instruction suitable for the electronic terminal.
Further, referring to fig. 2, the head-eye dual-channel intelligent man-machine interaction system further comprises a head-mounted frame 1, wherein the head-mounted frame 1 comprises a mirror frame 1-1, a left supporting leg 1-2, a right supporting leg 1-3, a charging interface 1-4, an LED status indicator lamp 1-5 and a lens expanding frame 1-6, the left supporting leg 1-2 and the right supporting leg 1-3 are respectively connected to the left end and the right end of the mirror frame 1-1 in a rotating mode, the charging interface 1-4 is arranged on the left supporting leg 1-2, the lens expanding frame 1-6 is arranged in the mirror frame 1-1, and the LED status indicator lamp 1-5 is arranged on the outer side face of the left supporting leg 1-2.
Further, the LED status indicator lamps 1-5 are RGB indicator lamps, which can emit red, green and blue lights for indicating different status of the system.
Furthermore, lenses are arranged in the lens expansion frames 1-6, and the lenses are plane lenses or myopia lenses, so that the expansibility and the practicability of the system can be improved.
Further, the left supporting leg 1-2 and the right supporting leg 1-3 are of a cavity structure.
Further, the system core module 4 is embedded in the cavity of the right supporting leg 1-3, the system core module 4 comprises an eye image caching module 4-1 and an FPGA control and data processing module 4-2, and the FPGA control and data processing module 4-2 is respectively connected with the eye image acquisition module 2, the head gesture detection module 3 and the eye image caching module 4-1.
Further, the eye image acquisition module 2 is arranged at the lower right of the mirror frame 1-1; the eye image acquisition module 2 comprises an angle-adjustable miniature infrared camera 2-1 with a single integrated infrared LED, a camera left-right angle adjusting roller 2-2 and a camera up-down angle adjusting roller 2-3, and the left-right angle and the up-down angle of the miniature infrared camera 2-1 are adjusted by the camera left-right angle adjusting roller 2-2 and the camera up-down angle adjusting roller 2-3 respectively.
Further, in this case, among others,
The eye image data caching module 4-1 is used for caching video image data output by the miniature infrared camera 2-1 by taking a frame as a unit;
The FPGA control and data processing module 4-2 is used for reading the data in the eye image data caching module 4-1 by taking a frame as a unit, and processing the image data by utilizing a sight tracking algorithm in the FPGA control and data processing module 4-2 to obtain sight drop point data;
The FPGA control and data processing module 4-2 is also used for acquiring the posture data output by the head posture detection module 3 and processing the posture data to obtain head posture angle data;
the FPGA control and data processing module 4-2 is also used for carrying out fusion processing on the sight falling point data and the head attitude angle data through data fusion and converting the sight falling point data and the head attitude angle data into interactive data suitable for a computer.
Further, the head posture detection module 3 is embedded in the middle upper part of the mirror frame 1-1; the head pose detection module 3 includes a three-axis gyroscope, a three-axis accelerometer, and a digital motion processor DMP.
Further, the wireless data transmission module 5 is a low-power consumption bluetooth chip, and is connected with the FPGA control and data processing module 4-2, and is used for transmitting the interactive data to the mobile phone or the computer and other terminals in a wireless mode.
Further, the power module 6 comprises a power management module 6-1, a polymer lithium battery 6-2 and a power switch button 6-3, wherein the power management module 6-1 is connected with the charging interface 1-4, the polymer lithium battery 6-2 is connected with the power management module 6-1, and the power switch button 6-3 is connected with the power management module 6-1.
Further, in the power supply charging state, the LED status indicator lamps 1-5 emit red light, and emit green light when full; when the power switch button 6-3 is pressed, the power supply of the head-eye double-channel intelligent human-computer interaction system is connected, and the LED status indicator lamp 1-5 emits blue light to indicate that the system starts to work normally; when the electric quantity is lower than a preset low-electric quantity warning line, the LED status indicator lamps 1-5 flash and emit red light.
The head-eye double-channel intelligent human-computer interaction method is based on the head-eye double-channel intelligent human-computer interaction system, and comprises the following steps of:
s100, extracting sight line information;
s200, detecting the head posture;
S300, data fusion processing;
s400, transmitting interactive data.
Further, in S100, the method specifically includes the following steps:
S110, acquiring an original image by using a miniature infrared camera 2-1, and converting the original image into human eye gray image data;
s120, caching the acquired human eye gray image data to an eye image data caching module 4-1 by taking a frame as a unit;
S130, the FPGA control and data processing module 4-2 reads the gray image data of the human eyes from the eye image data caching module 4-1, and the sight line information is extracted by using a sight line tracking algorithm.
Further, in S130, referring to fig. 3, the extracting of the sight line information includes the following steps:
S131, preprocessing an eye image: the image preprocessing comprises median filtering, and salt and pepper noise in the eye images is filtered;
S132, pupil edge feature extraction: pupil edge feature extraction comprises three substeps of image binarization, morphological processing and Sobel edge detection:
binarization: binarizing and segmenting the image f (x, y) by using a pre-determined gray threshold T and the following formula to obtain a binary image g (x, y):
Wherein t=35;
Morphological treatment: for eyelashes or pupil edge noise points still existing in an image, the image is processed by utilizing mathematical morphological operation, specifically, a disk-shaped structural element B with the radius of 6 is selected by the system according to the characteristics of noise in the image, firstly, the image A is subjected to closed operation by utilizing a formula (2), and then the image A is subjected to open operation by utilizing a formula (3):
sobel edge detection: referring to fig. 4, a3×3 Sobel horizontal operator Gx and a vertical operator Gy are adopted to perform convolution calculation on an image, and an edge contour of a pupil area is extracted;
S133, extracting pupil centers by utilizing the near-circle characteristic of the pupil: specifically, calculating the minimum circumscribed rectangle of the pupil area outline, and using a rectangle centroid (x 0,y0) to approximately replace the pupil center;
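The pupil-center extraction steps S131–S133 can be illustrated with a short sketch. The following Python/OpenCV code is a minimal, non-authoritative illustration of the described pipeline (the patent implements it on an FPGA); the threshold T = 35, the radius-6 disk element and the 3×3 Sobel operators follow the text, while the function name, the median-filter kernel size and the choice of the largest contour as the pupil are illustrative assumptions.

```python
import cv2
import numpy as np

def extract_pupil_center(gray, T=35):
    """Sketch of S131-S133: median filter, binarize, close/open, Sobel edges,
    then take the centroid of the pupil contour's minimum circumscribed rectangle."""
    # S131: median filtering to suppress salt-and-pepper noise
    filtered = cv2.medianBlur(gray, 5)                        # kernel size is an assumption

    # S132a: binarization with gray threshold T (the pupil is dark under IR illumination)
    _, binary = cv2.threshold(filtered, T, 255, cv2.THRESH_BINARY_INV)

    # S132b: closing then opening with a disk-shaped structuring element B of radius 6
    B = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (13, 13))   # diameter = 2*6 + 1
    cleaned = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, B)
    cleaned = cv2.morphologyEx(cleaned, cv2.MORPH_OPEN, B)

    # S132c: 3x3 Sobel horizontal (Gx) and vertical (Gy) operators, combined edge map
    gx = cv2.Sobel(cleaned, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(cleaned, cv2.CV_32F, 0, 1, ksize=3)
    edges = cv2.convertScaleAbs(cv2.magnitude(gx, gy))

    # S133: centroid (x0, y0) of the pupil contour's minimum circumscribed rectangle
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    pupil = max(contours, key=cv2.contourArea)    # assume the largest contour is the pupil
    (x0, y0), _, _ = cv2.minAreaRect(pupil)
    return (x0, y0)
```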
S134, line-of-sight falling point estimation: referring to fig. 5 and fig. 6, a two-dimensional polynomial mapping method based on nine-point calibration is adopted to obtain the mapping relation between the pupil center coordinates and the calibration points; in order to ensure the naturalness of line-of-sight interaction and the accuracy of line-of-sight control within a small area, the system divides the length and the width of the interactive interface into three equal parts each, yielding nine gazing areas of equal size, and selects one of them as the calibration interface, referring to fig. 6.
The calibration method is as follows: the user adjusts the body position so that the line of sight falls on the middle calibration point of the calibration interface when looking straight ahead; the head is then kept still while the user gazes at the nine points of the calibration interface in turn, and the system records the pupil center coordinates together with the corresponding calibration point coordinates and establishes the following two-dimensional mapping equations:
XG = a0 + a1·Xe + a2·Ye + a3·Xe·Ye + a4·Xe² + a5·Ye² (4)
YG = b0 + b1·Xe + b2·Ye + b3·Xe·Ye + b4·Xe² + b5·Ye² (5)
where XG and YG are the coordinates of a calibration point on the calibration interface, Xe and Ye are the coordinates of the corresponding pupil center, and a0~a5 and b0~b5 are the undetermined parameters of the mapping functions; substituting the 9 sets of calibration data into the equation set solves the undetermined parameters and yields the mapping equations, through which a pupil center coordinate can be mapped to a coordinate within the gazing region.
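As a sketch of how the nine calibration pairs could be used to determine the parameters a0~a5 and b0~b5 of the quadratic mapping in equations (4) and (5), the following Python/NumPy fragment fits both polynomials; the least-squares solver, the array shapes and the function names are assumptions for illustration, not details taken from the patent.

```python
import numpy as np

def fit_gaze_mapping(pupil_pts, target_pts):
    """pupil_pts, target_pts: 9x2 arrays of pupil-center and calibration-point coordinates.
    Returns the coefficient vectors a0..a5 and b0..b5 of equations (4) and (5)."""
    Xe, Ye = pupil_pts[:, 0], pupil_pts[:, 1]
    # One row [1, Xe, Ye, Xe*Ye, Xe^2, Ye^2] per calibration point
    M = np.column_stack([np.ones_like(Xe), Xe, Ye, Xe * Ye, Xe**2, Ye**2])
    a, *_ = np.linalg.lstsq(M, target_pts[:, 0], rcond=None)   # parameters of equation (4)
    b, *_ = np.linalg.lstsq(M, target_pts[:, 1], rcond=None)   # parameters of equation (5)
    return a, b

def map_gaze(a, b, xe, ye):
    """Map a pupil center (xe, ye) to a coordinate inside the gazing region."""
    feats = np.array([1.0, xe, ye, xe * ye, xe**2, ye**2])
    return float(feats @ a), float(feats @ b)
```

With nine calibration points and six unknowns per polynomial the system of equations is over-determined, so solving it in the least-squares sense is one natural reading of "substituting the 9 sets of calibration data into the equation set to solve the undetermined parameters".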
Further, in S200 the head gesture detection is performed as follows: first, three-axis attitude data are obtained from the three-axis gyroscope and the three-axis accelerometer inside the head gesture detection module 3; the digital motion processor DMP then performs attitude calculation on the angular-velocity and acceleration data, the FPGA control and data processing module 4-2 reads the resulting quaternion, and finally the quaternion is converted by formula into attitude angle data. The specific conversion process is as follows:
Quaternion definition: a quaternion is denoted q = w + xi + yj + zk and written in vector form as:
q = (w, x, y, z)^T (6)
|q|² = w² + x² + y² + z² = 1 (7)
Quaternion-to-Euler-angle conversion: the quaternion is converted into attitude angles using formula (8):
φ = arctan(2(wx + yz) / (1 − 2(x² + y²)))
θ = arcsin(2(wy − xz))
ψ = arctan(2(wz + xy) / (1 − 2(y² + z²))) (8)
where φ, θ and ψ represent the Roll angle Roll, the Pitch angle Pitch and the Yaw angle Yaw, respectively.
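A small Python sketch of the quaternion-to-attitude-angle conversion of formula (8) follows; the arctan/arcsin form shown is the standard Z-Y-X (yaw-pitch-roll) convention and is offered as an illustration under that assumption rather than as the patent's exact implementation.

```python
import math

def quaternion_to_euler(w, x, y, z):
    """Convert a unit quaternion q = w + xi + yj + zk into (roll, pitch, yaw) in degrees."""
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    pitch = math.asin(max(-1.0, min(1.0, 2.0 * (w * y - x * z))))   # clamped for numerical safety
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return tuple(math.degrees(angle) for angle in (roll, pitch, yaw))
```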
Further, in S300, referring to fig. 7, the data fusion processing is performed as follows: first, the attitude angle data obtained by the attitude calculation in S200 are acquired through the FPGA control and data processing module 4-2, and a threshold is used to judge whether the head movement constitutes an effective motion; then the original acceleration data obtained in S200 are acquired and integrated twice to obtain displacement data, and a threshold is used to judge whether the head movement is a valid interaction action; the displacement of a valid interaction action is mapped into interactive-interface coordinate data through the mapping equation, while the displacement produced by a non-interaction action is superimposed on the line-of-sight falling point to compensate the deviation that head movement introduces into the falling-point estimation.
Specifically, (1) effective interaction actions:
When the head movement displacement is larger than the set threshold, the movement is recognized as an action with interaction intention. In this case the two-dimensional polynomial mapping method based on nine-point calibration is again adopted, with the displacement vector generated by projecting the head movement track onto the X-Y plane of the spatial three-dimensional coordinate system used as the input of the equation, to obtain the relation between the displacement vector and the calibration points of the interactive interface. The motion of the head represented in the spatial three-dimensional coordinate system is shown in fig. 8; the components of the head motion trajectory in the X-Y plane are shown in fig. 9: motion about the Pitch angle Pitch can be resolved onto the Z and Y axes, motion about the Yaw angle Yaw onto the Z and X axes, and motion about the Roll angle Roll onto the X and Y axes. For this mapping the calibration interface is the entire interactive interface, as shown in fig. 10.
The calibration method is as follows: the user keeps the body still and, by moving the head only, keeps the line of sight on each calibration point of the directly viewed calibration interface in turn until the calibration of the nine points is completed; the nine resulting displacement vectors are used as the input of the equation set and substituted into it to solve the undetermined parameters, so that head actions can be mapped into changes of the cursor position on the computer interaction interface. When the head movement of the user produces an interaction action, the system masks the data of the visual input channel and takes the head movement data as the only data source controlling the cursor of the interactive interface; the user determines the gazing area through head movement alone, completing coarse positioning. When the head movement stops, the system re-enables the visual input channel and takes the line-of-sight falling point data as the data source for cursor movement; the user controls cursor movement within the gazing area through eye movement, completing accurate positioning. By setting a gaze-time threshold, the system interprets a sustained gaze by the user as a double-click operation at the interactive-interface cursor;
(2) Invalid interaction actions:
When the head movement displacement is smaller than a set threshold value, the head movement displacement is identified as unintentional head movement and is used as non-interactive movement; at the moment, the head movement is used as compensation of the line-of-sight falling point, the compensation mode is that displacement data of the head movement relative to an X-Y plane are superimposed on pupil center coordinates, and then the established two-dimensional mapping equation is utilized to estimate the line-of-sight falling point;
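The fusion logic of S300 — double integration of acceleration into displacement, a threshold test separating interaction actions from unintentional head motion, coarse positioning from the head channel and compensation of the gaze channel — can be sketched as follows. This Python fragment is a simplified illustration: the threshold value, the sampling period, and the callables head_map and compensate (standing in for the head-motion mapping equation and the gaze-point compensation described above) are all assumptions.

```python
import numpy as np

DISPLACEMENT_THRESHOLD = 0.05   # assumed value; the patent only states that a threshold is used

def integrate_displacement(accel_xy, dt):
    """Integrate X-Y acceleration samples (shape N x 2) twice to get the net displacement vector."""
    velocity = np.cumsum(accel_xy * dt, axis=0)
    displacement = np.cumsum(velocity * dt, axis=0)
    return displacement[-1]

def fuse(gaze_point, accel_xy, dt, head_map, compensate):
    """Return the interactive-interface coordinate for the current frame."""
    d = integrate_displacement(accel_xy, dt)
    if np.linalg.norm(d) > DISPLACEMENT_THRESHOLD:
        # Effective interaction action: the gaze channel is masked and the head
        # displacement alone selects the gazing area (coarse positioning).
        return head_map(d)
    # Non-interaction action: the small head displacement is superimposed on the
    # gaze data to compensate the line-of-sight falling point (fine positioning).
    return compensate(gaze_point, d)
```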
Further, in step S400 the interaction data are sent to the computer through the FPGA control and data processing module 4-2 and the wireless data transmission module 5, and the computer converts the interaction data into mouse operation instructions, completing the human-computer interaction.
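On the receiving side, the patent leaves the host software unspecified; purely as a hypothetical illustration, a small script using the third-party pyautogui package could turn the received coordinates and dwell events into mouse operations.

```python
import pyautogui  # example host-side library; the patent does not prescribe one

def apply_interaction(cursor_x, cursor_y, dwell_exceeded):
    """Move the mouse cursor to the fused head-eye coordinate; a gaze held longer than
    the configured dwell threshold is treated as a double-click, as described above."""
    pyautogui.moveTo(cursor_x, cursor_y)
    if dwell_exceeded:
        pyautogui.doubleClick()
```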
The above embodiments are only for aiding in understanding the method of the present invention and its core idea, and those skilled in the art can make several improvements and modifications in the specific embodiments and application scope according to the idea of the present invention, and these improvements and modifications should also be considered as the protection scope of the present invention.

Claims (9)

1. The head-eye double-channel intelligent human-computer interaction system is characterized by comprising: the eye image acquisition system comprises an eye image acquisition module (2), a head gesture detection module (3), a system core module (4), a wireless data transmission module (5) and a power module (6), wherein the eye image acquisition module (2) and the head gesture detection module (3) are connected with the system core module (4), the system core module (4) is connected with the wireless data transmission module (5), and the power module (6) is used for powering on the eye image acquisition module (2), the head gesture detection module (3), the system core module (4) and the wireless data transmission module (5) at the same time;
The head-eye double-channel intelligent man-machine interaction system also comprises an electronic terminal, wherein,
The eye image acquisition module (2) is used for acquiring human eye gray image data and transmitting the human eye gray image data to the system core module (4);
the head gesture detection module (3) is used for acquiring three-axis gesture data, carrying out gesture calculation on the three-axis gesture data, and sending a gesture calculation result to the system core module (4);
the system core module (4) is used for processing the gray image data and the gesture resolving result, converting the gray image data and the gesture resolving result into interactive data suitable for a computer, and sending the interactive data to the electronic terminal;
the wireless data transmission module (5) is used for enabling the system core module (4) to be in wireless connection with the electronic terminal;
The electronic terminal is used for converting the interaction data into an operation instruction suitable for the electronic terminal;
the head-eye double-channel intelligent human-computer interaction system further comprises a head-mounted frame (1), the head-mounted frame (1) comprises a mirror frame (1-1), left supporting legs (1-2), right supporting legs (1-3), a charging interface (1-4), LED status indicator lamps (1-5) and a lens expansion frame (1-6), the left supporting legs (1-2) and the right supporting legs (1-3) are respectively connected to the left end and the right end of the mirror frame (1-1) in a rotating mode, the charging interface (1-4) is arranged on the left supporting legs (1-2), the lens expansion frame (1-6) is arranged in the mirror frame (1-1), and the LED status indicator lamps (1-5) are arranged on the outer side face of the left supporting legs (1-2);
The LED state indicator lamps (1-5) are RGB indicator lamps and are used for indicating different states of the system;
lenses are arranged in the lens expansion frames (1-6), and the lenses are plane lenses or myopia lenses;
The left supporting leg (1-2) and the right supporting leg (1-3) are of a cavity structure;
Further, the system core module (4) is embedded in the cavity of the right supporting leg (1-3), the system core module (4) comprises an eye image cache module (4-1) and an FPGA control and data processing module (4-2), and the FPGA control and data processing module (4-2) is respectively connected with the eye image acquisition module (2), the head gesture detection module (3) and the eye image cache module (4-1);
the eye image acquisition module (2) is arranged at the lower right of the mirror frame (1-1), the eye image acquisition module (2) comprises an angle-adjustable miniature infrared camera (2-1) with a single integrated infrared LED, a camera left-right angle adjusting roller (2-2) and a camera up-down angle adjusting roller (2-3), and the left-right angle and the up-down angle of the miniature infrared camera (2-1) are adjusted by the camera left-right angle adjusting roller (2-2) and the camera up-down angle adjusting roller (2-3) respectively;
the eye image data caching module (4-1) is used for caching video image data output by the miniature infrared camera (2-1) in a frame unit;
The FPGA control and data processing module (4-2) is used for reading the data in the eye image data caching module (4-1) by taking a frame as a unit and processing the image data by utilizing a sight tracking algorithm in the FPGA control and data processing module (4-2) to obtain sight drop point data;
the FPGA control and data processing module (4-2) is also used for acquiring the posture data output by the head posture detection module (3) and processing the posture data to obtain head posture angle data;
the FPGA control and data processing module (4-2) is also used for carrying out fusion processing on the sight falling point data and the head attitude angle data through data fusion and converting the sight falling point data and the head attitude angle data into interactive data suitable for a computer;
The head posture detection module (3) is embedded in the middle upper part of the mirror frame (1-1); the head gesture detection module (3) comprises a three-axis gyroscope, a three-axis accelerometer and a Digital Motion Processor (DMP);
The wireless data transmission module (5) is characterized in that the core of the wireless data transmission module is a low-power consumption Bluetooth chip, and the wireless data transmission module is connected with the FPGA control and data processing module (4-2).
2. The head-eye dual-channel intelligent human-computer interaction system according to claim 1, wherein the power module (6) comprises a power management module (6-1), a polymer lithium battery (6-2) and a power switch button (6-3), wherein the power management module (6-1) is connected with the charging interface (1-4), the polymer lithium battery (6-2) is connected with the power management module (6-1), and the power switch button (6-3) is connected with the power management module (6-1).
3. The head-eye dual-channel intelligent human-computer interaction system according to claim 1, wherein the LED status indicator lamps (1-5) emit red light in a power supply charging state and emit green light when full; when a power switch button (6-3) is pressed, the power supply of the head-eye double-channel intelligent human-computer interaction system is connected, and the LED status indicator lamp (1-5) emits blue light to indicate that the system starts to work normally; when the electric quantity is lower than a preset low-electric quantity warning line, the LED status indicator lamps (1-5) flash and emit red light.
4. A head-eye dual-channel intelligent human-computer interaction method based on the head-eye dual-channel intelligent human-computer interaction system of any one of claims 1-3, characterized in that the head-eye dual-channel intelligent human-computer interaction method comprises the following steps:
s100, extracting sight line information;
s200, detecting the head posture;
S300, data fusion processing;
s400, transmitting interactive data.
5. The head-eye dual-channel intelligent human-computer interaction method according to claim 4, wherein in S100, the method specifically comprises the following steps:
S110, acquiring an original image by using the miniature infrared camera (2-1), and converting the original image into human eye gray scale image data;
S120, caching the acquired human eye gray image data to the eye image data caching module (4-1) in a frame unit;
s130, the FPGA control and data processing module (4-2) reads the human eye gray image data from the eye image data caching module (4-1) and extracts the sight line information by using a sight line tracking algorithm.
6. The head-eye dual-channel intelligent human-computer interaction method according to claim 5, wherein in S130, the extracting of the sight line information comprises the following steps:
S131, preprocessing an eye image: the image preprocessing comprises median filtering, and salt and pepper noise in the eye images is filtered;
S132, pupil edge feature extraction: pupil edge feature extraction comprises three substeps of image binarization, morphological processing and Sobel edge detection:
binarization: binarizing and segmenting the image f (x, y) by using a pre-determined gray threshold T and the following formula to obtain a binary image g (x, y):
Wherein t=35;
Morphological treatment: for eyelashes or pupil edge noise points still existing in an image, the image is processed by utilizing mathematical morphological operation, specifically, a disk-shaped structural element B with the radius of 6 is selected by the system according to the characteristics of noise in the image, firstly, the image A is subjected to closed operation by utilizing a formula (2), and then the image A is subjected to open operation by utilizing a formula (3):
Sobel edge detection: carrying out convolution calculation on the image by adopting a3 multiplied by 3 Sobel horizontal operator Gx and a vertical operator Gy, and extracting the edge contour of the pupil area;
S133, extracting pupil centers by utilizing the near-circle characteristic of the pupil: specifically, calculating the minimum circumscribed rectangle of the pupil area outline, and using a rectangle centroid (x 0,y0) to approximately replace the pupil center;
S134, line-of-sight falling point estimation: a two-dimensional polynomial mapping method based on nine-point calibration is adopted to obtain the mapping relation between the pupil center coordinates and the calibration points; the length and the width of the interactive interface are each divided into three equal parts, yielding nine gazing areas of equal size, and one of them is selected as the calibration interface,
the calibration method being as follows: the user adjusts the body position so that the line of sight falls on the middle calibration point of the calibration interface when looking straight ahead; the head is then kept still while the user gazes at the nine points of the calibration interface in turn, and the system records the pupil center coordinates together with the corresponding calibration point coordinates and establishes the following two-dimensional mapping equations:
XG = a0 + a1·Xe + a2·Ye + a3·Xe·Ye + a4·Xe² + a5·Ye² (4)
YG = b0 + b1·Xe + b2·Ye + b3·Xe·Ye + b4·Xe² + b5·Ye² (5)
where XG and YG are the coordinates of a calibration point on the calibration interface, Xe and Ye are the coordinates of the corresponding pupil center, and a0~a5 and b0~b5 are the undetermined parameters of the mapping functions; substituting the 9 sets of calibration data into the equation set solves the undetermined parameters and yields the mapping equations, through which a pupil center coordinate can be mapped to a coordinate within the gazing region.
7. The head-eye dual-channel intelligent human-computer interaction method according to claim 4, wherein in S200, the head gesture detection firstly obtains three-axis gesture data through a three-axis gyroscope and a three-axis accelerometer inside a head gesture detection module (3), then uses a Digital Motion Processor (DMP) to perform gesture calculation on the three-axis gesture data, an FPGA control and data processing module (4-2) reads quaternion calculated by using the three-axis gesture data, and finally performs formula conversion on the quaternion to obtain gesture angle data, and the specific conversion process is as follows:
Quaternion definition: a quaternion is denoted q = w + xi + yj + zk and written in vector form as:
q = (w, x, y, z)^T (6)
|q|² = w² + x² + y² + z² = 1 (7)
Quaternion-to-Euler-angle conversion: the quaternion is converted into attitude angles using formula (8):
φ = arctan(2(wx + yz) / (1 − 2(x² + y²)))
θ = arcsin(2(wy − xz))
ψ = arctan(2(wz + xy) / (1 − 2(y² + z²))) (8)
where φ, θ and ψ represent the Roll angle Roll, the Pitch angle Pitch and the Yaw angle Yaw, respectively.
8. The head-eye dual-channel intelligent human-computer interaction method according to claim 4, wherein in S300 the data fusion processing first acquires, through the FPGA control and data processing module (4-2), the attitude angle data obtained by the attitude calculation in S200, and uses a threshold to judge whether the head movement constitutes an effective motion; the original acceleration data obtained in S200 are then acquired and integrated twice to obtain displacement data, a threshold is used to judge whether the head movement is a valid interaction action, the displacement of a valid interaction action is mapped into interactive-interface coordinate data through the mapping equation, and the displacement produced by a non-interaction action is superimposed on the line-of-sight falling point to compensate the deviation that head movement introduces into the falling-point estimation.
9. The head-eye dual-channel intelligent human-computer interaction method according to claim 4, wherein the sending of the interaction data in S400 means that the FPGA control and data processing module (4-2) sends the interaction data to the computer through the wireless data transmission module (5), and the computer converts the interaction data into mouse operation instructions to complete the human-computer interaction.
CN202110499945.2A 2021-05-08 2021-05-08 Head-eye double-channel intelligent man-machine interaction system and operation method Active CN113160260B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110499945.2A CN113160260B (en) 2021-05-08 2021-05-08 Head-eye double-channel intelligent man-machine interaction system and operation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110499945.2A CN113160260B (en) 2021-05-08 2021-05-08 Head-eye double-channel intelligent man-machine interaction system and operation method

Publications (2)

Publication Number Publication Date
CN113160260A CN113160260A (en) 2021-07-23
CN113160260B true CN113160260B (en) 2024-05-07

Family

ID=76874130

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110499945.2A Active CN113160260B (en) 2021-05-08 2021-05-08 Head-eye double-channel intelligent man-machine interaction system and operation method

Country Status (1)

Country Link
CN (1) CN113160260B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117597656A (en) * 2022-06-14 2024-02-23 北京小米移动软件有限公司 Method, device, equipment and storage medium for detecting head action
WO2023245316A1 (en) * 2022-06-20 2023-12-28 北京小米移动软件有限公司 Human-computer interaction method and device, computer device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107193383A (en) * 2017-06-13 2017-09-22 华南师范大学 A kind of two grades of Eye-controlling focus methods constrained based on facial orientation
CN112578682A (en) * 2021-01-05 2021-03-30 陕西科技大学 Intelligent obstacle-assisting home system based on electro-oculogram control
CN215814080U (en) * 2021-05-08 2022-02-11 哈尔滨鹏路智能科技有限公司 Head-eye double-channel intelligent man-machine interaction system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106056092B (en) * 2016-06-08 2019-08-20 华南理工大学 The gaze estimation method for headset equipment based on iris and pupil

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107193383A (en) * 2017-06-13 2017-09-22 华南师范大学 A kind of two grades of Eye-controlling focus methods constrained based on facial orientation
CN112578682A (en) * 2021-01-05 2021-03-30 陕西科技大学 Intelligent obstacle-assisting home system based on electro-oculogram control
CN215814080U (en) * 2021-05-08 2022-02-11 哈尔滨鹏路智能科技有限公司 Head-eye double-channel intelligent man-machine interaction system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Smart home controller based on eye-movement tracking; 王鹏; 陈园园; 邵明磊; 刘博; 张伟超; 电机与控制学报 (Electric Machines and Control); 2020-05-15 (No. 05); full text *

Also Published As

Publication number Publication date
CN113160260A (en) 2021-07-23

Similar Documents

Publication Publication Date Title
EP3571673B1 (en) Method for displaying virtual image, storage medium and electronic device therefor
US10078377B2 (en) Six DOF mixed reality input by fusing inertial handheld controller with hand tracking
CN107004275B (en) Method and system for determining spatial coordinates of a 3D reconstruction of at least a part of a physical object
US9779512B2 (en) Automatic generation of virtual materials from real-world materials
KR20230164185A (en) Bimanual interactions between mapped hand regions for controlling virtual and graphical elements
US11715231B2 (en) Head pose estimation from local eye region
JP5016175B2 (en) Face image processing system
CN116324677A (en) Non-contact photo capture in response to detected gestures
CN113160260B (en) Head-eye double-channel intelligent man-machine interaction system and operation method
CN107357428A (en) Man-machine interaction method and device based on gesture identification, system
US20220027621A1 (en) Sensor Fusion Eye Tracking
CN114078278A (en) Method and device for positioning fixation point, electronic equipment and storage medium
CN109144250A (en) A kind of method, apparatus, equipment and storage medium that position is adjusted
CN115715177A (en) Blind person auxiliary glasses with geometrical hazard detection function
CN215814080U (en) Head-eye double-channel intelligent man-machine interaction system
Rahmaniar et al. Touchless head-control (thc): Head gesture recognition for cursor and orientation control
JP2020077271A (en) Display unit, learning device, and method for controlling display unit
US11726320B2 (en) Information processing apparatus, information processing method, and program
CN114661152B (en) AR display control system and method for reducing visual fatigue
US11982814B2 (en) Segmented illumination display
CN104375631A (en) Non-contact interaction method based on mobile terminal
CN104866112A (en) Non-contact interaction method based on mobile terminal
Sowmya et al. Eye gaze controlled wheelchair
KR102473669B1 (en) Visibility improvement method based on eye tracking, machine-readable storage medium and electronic device
Zhu et al. A novel target tracking method of unmanned drones by gaze prediction combined with YOLO algorithm

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant