CN113160260A - Head-eye double-channel intelligent man-machine interaction system and operation method - Google Patents

Head-eye double-channel intelligent man-machine interaction system and operation method Download PDF

Info

Publication number
CN113160260A
CN113160260A
Authority
CN
China
Prior art keywords
module
data
head
eye
attitude
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110499945.2A
Other languages
Chinese (zh)
Inventor
于天河
温宏韬
王鹏
王世龙
庞广龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Penglu Intelligent Technology Co ltd
Harbin University of Science and Technology
Original Assignee
Harbin Penglu Intelligent Technology Co ltd
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Penglu Intelligent Technology Co ltd, Harbin University of Science and Technology filed Critical Harbin Penglu Intelligent Technology Co ltd
Priority to CN202110499945.2A priority Critical patent/CN113160260A/en
Publication of CN113160260A publication Critical patent/CN113160260A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012Head tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • G06T5/70
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/62Analysis of geometric attributes of area, perimeter, diameter or volume
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/193Preprocessing; Feature extraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20024Filtering details
    • G06T2207/20032Median filtering

Abstract

The invention discloses a head-eye double-channel intelligent human-computer interaction system and an operation method, belonging to the technical field of human-computer interaction. The head-eye double-channel intelligent man-machine interaction system comprises an eye image acquisition module, a head posture detection module, a system core module, a wireless data transmission module and a power supply module; the eye image acquisition module and the head posture detection module are connected with the system core module, the system core module is connected with the wireless data transmission module, and the power supply module powers the eye image acquisition module, the head posture detection module, the system core module and the wireless data transmission module. According to the invention, the head posture determines the gazing area to complete coarse positioning and the line of sight determines the specific gazing point, realizing a man-machine interaction mode controlled by head-eye dual input channels and overcoming the defects of single-input-channel man-machine interaction systems.

Description

Head-eye double-channel intelligent man-machine interaction system and operation method
Technical Field
The invention relates to a head-eye double-channel intelligent human-computer interaction system and an operation method, and belongs to the technical field of human-computer interaction.
Background
Man-machine interaction technology serves as the interface for information exchange between people and computers and is widely applied in fields such as the military, industry, education, medical care and the home. With the rapid development of electronic and computer technology, the mode of man-machine interaction has also changed greatly: it is no longer limited to traditional mouse and keyboard input and display output, but has gradually developed into a multi-channel interaction mode that integrates the traditional modes with natural interaction modes such as eye movement, posture and voice, greatly improving the intelligence of man-machine interaction.
Because the line of sight reflects the attention of the human eye, it serves as an important information input source in human-computer interaction, and gaze tracking technology has therefore attracted the attention of researchers in the field. However, existing gaze control technology based on gaze tracking has inherent defects, such as the difficulty of completing precise operations over a large range and the loss of system accuracy caused by head movement, so the practicability of human-computer interaction that relies on the line of sight alone remains low.
The gaze point of the human eye is determined jointly by the orientation of the eye and of the head: the head orientation determines the range the eye can gaze at, and the eye determines the exact gaze direction. If the head posture is taken as a complementary input channel and combined with the gaze input channel, with the head posture determining the gazing area to complete coarse positioning and the line of sight determining the specific gaze point, a man-machine interaction mode controlled by head-eye dual input channels is realized; the defects of a single input channel are overcome, the practicability of gaze interaction can be greatly improved, and large market and economic benefits can be obtained.
Disclosure of Invention
The invention aims to solve the problem that existing single-channel gaze interaction systems have low practicability in real applications, and discloses a head-eye double-channel intelligent man-machine interaction system and operation method based on gaze tracking technology and head posture detection technology.
A head-eye double-channel intelligent man-machine interaction system comprises: an eye image acquisition module, a head posture detection module, a system core module, a wireless data transmission module and a power supply module; the eye image acquisition module and the head posture detection module are connected with the system core module, the system core module is connected with the wireless data transmission module, and the power supply module simultaneously powers the eye image acquisition module, the head posture detection module, the system core module and the wireless data transmission module;
furthermore, the head-eye double-channel intelligent man-machine interaction system also comprises an electronic terminal, wherein,
the eye image acquisition module is used for acquiring human eye gray level image data and transmitting the human eye gray level image data to the system core module;
the head attitude detection module is used for acquiring three-axis attitude data, then performing attitude calculation on the three-axis attitude data and sending an attitude calculation result to the system core module;
the system core module is used for processing the gray image data and the attitude calculation result, converting the gray image data and the attitude calculation result into interactive data suitable for a computer, and sending the interactive data to the electronic terminal;
the wireless data transmission module is used for enabling the system core module to be wirelessly connected with the electronic terminal;
the electronic terminal is used for converting the interactive data into an operation instruction suitable for the electronic terminal;
the head-eye dual-channel intelligent man-machine interaction system further comprises a head-mounted frame; the head-mounted frame comprises a spectacle frame, a left leg, a right leg, a charging interface, an LED status indicator lamp and a lens expansion frame; the left leg and the right leg are rotatably connected to the left and right ends of the spectacle frame respectively, the charging interface is arranged on the left leg, the lens expansion frame is arranged in the spectacle frame, and the LED status indicator lamp is arranged on the outer side surface of the left leg;
the LED status indicator lamp is an RGB indicator lamp and is used for indicating different states of the system;
the lens expansion frame is internally provided with a lens which is a plane lens or a near vision lens;
the left supporting leg and the right supporting leg are both of a cavity structure;
furthermore, a system core module is embedded in the cavity of the right supporting leg, the system core module comprises an eye image cache module and an FPGA control and data processing module, and the FPGA control and data processing module is respectively connected with the eye image acquisition module, the head posture detection module and the eye image cache module;
the eye image acquisition module is arranged at the lower right of the spectacle frame and comprises a single angle-adjustable miniature infrared camera with an integrated infrared LED, a camera left-right angle adjusting roller and a camera up-down angle adjusting roller; the left-right angle and the up-down angle of the miniature infrared camera are adjusted by the camera left-right angle adjusting roller and the camera up-down angle adjusting roller respectively;
the eye image data caching module is used for caching video image data output by the miniature infrared camera by taking a frame as a unit;
the FPGA control and data processing module is used for reading data in the eye image data caching module by taking a frame as a unit and processing the image data by utilizing a sight tracking algorithm in the FPGA control and data processing module to obtain sight falling point data;
the FPGA control and data processing module is also used for acquiring the attitude data output by the head attitude detection module and processing the attitude data to obtain head attitude angle data;
the FPGA control and data processing module is also used for fusing the sight line landing point data and the head attitude angle data through data fusion and converting the data into interactive data suitable for a computer,
the head posture detection module is embedded in the middle upper part of the spectacle frame; the head posture detection module comprises a three-axis gyroscope, a three-axis accelerometer and a digital motion processor DMP;
the core of the wireless data transmission module is a low-power Bluetooth chip which is connected with the FPGA control and data processing module.
Further, the power supply module comprises a power supply management module, a polymer lithium battery and a power supply switch button, wherein the power supply management module is connected with the charging interface, the polymer lithium battery is connected with the power supply management module, and the power supply switch button is connected with the power supply management module;
further, in the charging state, the LED status indicator lamp emits red light, and emits green light when fully charged; when the power switch button is pressed, the head-eye double-channel intelligent man-machine interaction system is powered on and the LED status indicator lamp emits blue light, indicating that the system has started to work normally; when the battery level falls below the preset low-battery warning line, the LED status indicator lamp flashes red.
A head-eye double-channel intelligent man-machine interaction method is based on the head-eye double-channel intelligent man-machine interaction system, and comprises the following steps:
s100, extracting sight line information;
s200, detecting the head posture;
s300, data fusion processing;
and S400, interactive data transmission.
Further, in S100, the method specifically includes the following steps:
s110, acquiring an original image by using a miniature infrared camera, and converting the original image into human eye gray image data;
s120, caching the acquired human eye gray level image data to an eye image data caching module by taking a frame as a unit;
and S130, reading the human eye gray level image data from the eye image data caching module by the FPGA control and data processing module, and extracting sight line information by using a sight line tracking algorithm.
Further, in S130, the extracting of the line of sight information includes the following steps:
s131, preprocessing an eye image: the image preprocessing comprises median filtering, which is used to filter out the salt-and-pepper noise in the eye image;
s132, pupil edge feature extraction: the pupil edge feature extraction comprises three substeps of image binarization, morphological processing and Sobel edge detection:
binarization: using a predetermined gray threshold value T and the following formula to carry out binary segmentation on the image f (x, y) to obtain a binary image g (x, y):
g(x, y) = 1 when f(x, y) ≤ T, and g(x, y) = 0 when f(x, y) > T    (1)
where T = 35;
morphological treatment: for eyelash or pupil edge noise still existing in the image, the image is processed by using mathematical morphology operation, specifically, the system selects a disc-shaped structural element B with the radius of 6 according to the characteristics of the noise in the image, firstly uses a formula (2) to perform closed operation on the image A, and then uses a formula (3) to perform open operation processing on the image A:
A • B = (A ⊕ B) ⊖ B    (2)
A ∘ B = (A ⊖ B) ⊕ B    (3)
sobel edge detection: performing convolution calculation on the image by adopting a 3 x 3 Sobel horizontal operator Gx and a vertical operator Gy, and extracting an edge profile of the pupil area;
s133, extracting the pupil center by utilizing the near-circular characteristic of the pupil: specifically, the minimum circumscribed rectangle of the pupil region contour is calculated, and the centroid (x0, y0) of the rectangle approximately replaces the pupil center (an illustrative software sketch of steps S131 to S133 is given after the calibration procedure below);
s134, sight line drop point estimation: the gaze point estimation adopts a two-dimensional polynomial mapping method based on nine-point calibration to obtain the mapping relation between the pupil center coordinates and the calibration points; the length and width of the interactive interface are each divided into three equal parts to obtain nine gazing areas, and one area is selected as the calibration interface;
the calibration method is as follows: the user adjusts the body position so that, when looking straight ahead, the line of sight falls on the central calibration point of the calibration interface; the head is then kept still and the user gazes at the nine points of the calibration interface in turn, while the system records the pupil center coordinates and the corresponding calibration point coordinates and establishes the following two-dimensional mapping equation:
XG = a0 + a1·Xe + a2·Ye + a3·Xe·Ye + a4·Xe² + a5·Ye²    (4)
YG = b0 + b1·Xe + b2·Ye + b3·Xe·Ye + b4·Xe² + b5·Ye²    (5)
where XG and YG are the abscissa and ordinate of a calibration point on the calibration interface, Xe and Ye are the corresponding abscissa and ordinate of the pupil center, and a0~a5 and b0~b5 are the undetermined parameters of the mapping function; the nine sets of calibration data are substituted into the equation set to solve for the undetermined parameters, yielding the mapping equation through which the pupil center coordinates are mapped into coordinates within the gazing area.
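For illustration only, the following Python sketch mirrors steps S131 to S133 in software; in the disclosed system these steps run on the FPGA control and data processing module. The use of OpenCV, the function name, the median-filter kernel size and the OpenCV 4.x return signatures are assumptions, and contour retrieval stands in for the Sobel edge detection, while the gray threshold T = 35, the disc-shaped structuring element of radius 6, the close-then-open order and the bounding-rectangle centroid follow the description above.

import cv2


def extract_pupil_center(gray_frame):
    """Illustrative software sketch of steps S131-S133; returns (x0, y0) or None."""
    # S131: median filtering to suppress salt-and-pepper noise (kernel size assumed)
    filtered = cv2.medianBlur(gray_frame, 5)

    # S132, binarization: fixed gray threshold T = 35; the dark pupil becomes foreground
    _, binary = cv2.threshold(filtered, 35, 255, cv2.THRESH_BINARY_INV)

    # S132, morphology: closing then opening with a disc-shaped structuring element of radius 6
    disc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (13, 13))
    cleaned = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, disc)
    cleaned = cv2.morphologyEx(cleaned, cv2.MORPH_OPEN, disc)

    # S132, edge extraction: contour retrieval is used here in place of Sobel edge detection,
    # since only the outline of the pupil region is needed for the centroid computation below
    contours, _ = cv2.findContours(cleaned, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    pupil = max(contours, key=cv2.contourArea)

    # S133: centroid of the minimum circumscribed rectangle approximates the pupil center
    x, y, w, h = cv2.boundingRect(pupil)
    return (x + w / 2.0, y + h / 2.0)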
Further, in S200, the head posture detection first acquires three-axis attitude data through the three-axis gyroscope and three-axis accelerometer inside the head posture detection module; the digital motion processor DMP then performs attitude calculation on the three-axis attitude data, the FPGA control and data processing module reads the quaternion obtained from the attitude calculation, and the quaternion is finally converted into attitude angle data as follows:
quaternion definition: a quaternion is denoted q = w + xi + yj + zk and, written in vector form:
q = (w, x, y, z)^T    (6)
|q|² = w² + x² + y² + z² = 1    (7)
quaternion-to-Euler-angle conversion: the quaternion is converted by formula (8) to obtain the attitude angles:
φ = arctan(2(wx + yz) / (1 − 2(x² + y²)))
θ = arcsin(2(wy − zx))    (8)
ψ = arctan(2(wz + xy) / (1 − 2(y² + z²)))
where φ, θ and ψ denote the roll angle Roll, pitch angle Pitch and yaw angle Yaw, respectively.
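As a reference only, a minimal Python sketch of the quaternion-to-Euler conversion of formula (8) is given below; the function name and the use of the standard math library are assumptions, and the clamping of the pitch argument is a common numerical safeguard rather than part of the described method.

import math


def quaternion_to_euler(w, x, y, z):
    """Convert a unit quaternion q = w + xi + yj + zk to (roll, pitch, yaw) in radians."""
    # Roll (phi): rotation about the x axis
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    # Pitch (theta): rotation about the y axis; the argument is clamped to [-1, 1]
    pitch = math.asin(max(-1.0, min(1.0, 2.0 * (w * y - z * x))))
    # Yaw (psi): rotation about the z axis
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return roll, pitch, yaw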
Further, in the data fusion processing of S300, the FPGA control and data processing module first obtains the attitude angle data produced by the attitude calculation in S200 and uses a threshold to judge whether the head movement constitutes an effective movement; the original acceleration data obtained in S200 is then acquired and integrated twice to obtain displacement data, a threshold is used to judge whether the head movement is an effective interactive action, the displacement of an effective interactive action is mapped into interactive interface coordinate data through the mapping equation, and the displacement produced by a non-interactive action is superimposed on the gaze point to compensate for the deviation that head movement introduces into the gaze point estimation.
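The following Python sketch is a simplified illustration of how the acceleration data could be integrated twice into displacement and how a threshold could separate interactive from non-interactive head movement as described above; the function names, threshold value and sampling period are assumptions, not part of the disclosed system.

def integrate_displacement(accel_samples, dt):
    """Integrate acceleration samples twice over sampling period dt to obtain displacement."""
    velocity = 0.0
    displacement = 0.0
    for a in accel_samples:
        velocity += a * dt
        displacement += velocity * dt
    return displacement


def classify_head_motion(displacement, motion_threshold=0.05):
    """Return True for an effective interactive action, False for non-interactive motion.

    An effective action is mapped into interactive-interface coordinates; non-interactive
    motion is superimposed on the gaze point to compensate the gaze-point estimate.
    """
    return abs(displacement) > motion_threshold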
Further, in S400, the FPGA control and data processing module sends the interactive data to the computer terminal through the wireless data transmission module, and the computer converts the interactive data into mouse operation instructions, thereby completing the human-computer interaction.
The invention has the following advantages. The head-eye double-channel intelligent man-machine interaction method divides head movement into movement with interaction intent and movement without interaction intent: intentional movement is mapped into interaction data, while unintentional movement is used to compensate the gaze point, reducing the deviation that unconscious head movement causes in gaze tracking and improving the practicability of gaze interaction technology. In the head-eye double-channel intelligent man-machine interaction system, data acquisition, processing and transmission are based on an embedded hardware design, so the requirement on the hardware performance of the host computer is low, the system is convenient to use and its applicability is strong.
Drawings
FIG. 1 is a system structure block diagram of a head-eye two-channel intelligent human-computer interaction system according to the present invention;
FIG. 2 is a schematic diagram of a system structure of a head-eye dual-channel intelligent human-computer interaction system according to the present invention;
FIG. 3 is a diagram of the steps of a method for extracting gaze information according to the present invention;
FIG. 4 is a schematic diagram of the Sobel operator of the present invention;
FIG. 5 is a schematic diagram of pupil center extraction;
FIG. 6 is a schematic diagram of the nine-point calibration interface;
FIG. 7 is a flow chart of head motion recognition;
FIG. 8 is a schematic diagram of head movement in the spatial three-dimensional coordinate system;
FIG. 9 is a schematic diagram of the components of the head movement track on the X-Y plane;
FIG. 10 is a schematic diagram of the nine-point calibration interface covering the whole interactive interface.
The system comprises a head-mounted frame 1, a spectacle frame 1-1, a left leg 1-2, a right leg 1-3, a charging interface 1-4, an LED status indicator lamp 1-5, a lens expansion frame 1-6, an eye image acquisition module 2, a miniature infrared camera 2-1, a camera left-right angle adjusting roller 2-2, a camera up-down angle adjusting roller 2-3, a head posture detection module 3, a system core module 4, an eye image data cache module 4-1, an FPGA control and data processing module 4-2, a wireless data transmission module 5, a power supply module 6, a power management module 6-1, a polymer lithium battery 6-2 and a power switch button 6-3.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, the invention provides a head-eye dual-channel intelligent man-machine interaction system, which comprises: an eye image acquisition module 2, a head posture detection module 3, a system core module 4, a wireless data transmission module 5 and a power supply module 6; the eye image acquisition module 2 and the head posture detection module 3 are both connected with the system core module 4, the system core module 4 is connected with the wireless data transmission module 5, and the power supply module 6 simultaneously powers the eye image acquisition module 2, the head posture detection module 3, the system core module 4 and the wireless data transmission module 5.
Furthermore, the head-eye double-channel intelligent man-machine interaction system also comprises an electronic terminal, wherein,
the eye image acquisition module 2 is used for acquiring eye gray level image data and transmitting the eye gray level image data to the system core module 4;
the head attitude detection module 3 is used for acquiring three-axis attitude data, then performing attitude calculation on the three-axis attitude data, and sending an attitude calculation result to the system core module 4;
the system core module 4 is used for processing the gray image data and the attitude calculation result, converting the gray image data and the attitude calculation result into interactive data suitable for a computer, and sending the interactive data to the electronic terminal;
the wireless data transmission module 5 is used for enabling the system core module 4 to be wirelessly connected with the electronic terminal;
and the electronic terminal is used for converting the interactive data into an operation instruction suitable for the electronic terminal.
Further, as shown in fig. 2, the head-eye dual-channel intelligent man-machine interaction system further comprises a head-wearing frame 1, wherein the head-wearing frame 1 comprises a frame 1-1, a left leg 1-2, a right leg 1-3, a charging interface 1-4, an LED status indicator lamp 1-5 and a lens expansion frame 1-6, the left leg 1-2 and the right leg 1-3 are respectively rotatably connected to the left end and the right end of the frame 1-1, the charging interface 1-4 is arranged on the left leg 1-2, the lens expansion frame 1-6 is arranged in the frame 1-1, and the LED status indicator lamp 1-5 is arranged on the outer side face of the left leg 1-2.
Further, the LED status indicator lamps 1-5 are RGB indicator lamps, and can emit light of three colors, red, green, and blue, for indicating different statuses of the system.
Furthermore, the lenses are arranged in the lens expansion frames 1-6 and are plane lenses or near-sighted lenses, so that the expansibility and the practicability of the system can be improved.
Furthermore, the left leg 1-2 and the right leg 1-3 are both of a cavity structure.
Further, a system core module 4 is embedded in the cavity of the right leg 1-3, the system core module 4 comprises an eye image cache module 4-1 and an FPGA control and data processing module 4-2, and the FPGA control and data processing module 4-2 is respectively connected with the eye image acquisition module 2, the head posture detection module 3 and the eye image cache module 4-1.
Further, the eye image acquisition module 2 is installed at the lower right portion of the frame 1-1, the eye image acquisition module 2 comprises a single miniature infrared camera 2-1 which integrates an infrared LED and is adjustable in angle, a camera left and right angle adjusting roller 2-2 and a camera upper and lower angle adjusting roller 2-3, and the left and right angle and the upper and lower angle of the miniature infrared camera 2-1 are adjusted through the camera left and right angle adjusting roller 2-2 and the camera upper and lower angle adjusting roller 2-3 respectively.
Further, wherein,
the eye image data caching module 4-1 is used for caching video image data output by the micro infrared camera 2-1 by taking a frame as a unit;
the FPGA control and data processing module 4-2 is used for reading data in the eye image data cache module 4-1 by taking a frame as a unit and processing the image data by utilizing a sight tracking algorithm in the FPGA control and data processing module 4-2 to obtain sight line landing point data;
the FPGA control and data processing module 4-2 is also used for acquiring the attitude data output by the head attitude detection module 3 and processing the attitude data to obtain head attitude angle data;
the FPGA control and data processing module 4-2 is also used for carrying out fusion processing on the sight line landing point data and the head attitude angle data through data fusion and converting the sight line landing point data and the head attitude angle data into interactive data suitable for a computer.
Further, the head posture detection module 3 is embedded in the middle upper part of the spectacle frame 1-1; the head posture detection module 3 comprises a three-axis gyroscope, a three-axis accelerometer and a digital motion processor DMP.
Furthermore, the core of the wireless data transmission module 5 is a low-power consumption bluetooth chip, which is connected with the FPGA control and data processing module 4-2 and is used for sending the interactive data to the mobile phone or computer and other terminals in a wireless manner.
Further, the power module 6 comprises a power management module 6-1, a polymer lithium battery 6-2 and a power switch button 6-3, wherein the power management module 6-1 is connected with the charging interface 1-4, the polymer lithium battery 6-2 is connected with the power management module 6-1, and the power switch button 6-3 is connected with the power management module 6-1.
Further, in the charging state, the LED status indicator lamps 1-5 emit red light, and emit green light when fully charged; when the power switch button 6-3 is pressed, the head-eye dual-channel intelligent man-machine interaction system is powered on and the LED status indicator lamps 1-5 emit blue light, indicating that the system has started to work normally; when the battery level falls below the preset low-battery warning line, the LED status indicator lamps 1-5 flash red.
A head-eye double-channel intelligent man-machine interaction method is based on the head-eye double-channel intelligent man-machine interaction system, and comprises the following steps:
s100, extracting sight line information;
s200, detecting the head posture;
s300, data fusion processing;
and S400, interactive data transmission.
Further, in S100, the method specifically includes the following steps:
s110, acquiring an original image by using the miniature infrared camera 2-1, and converting the original image into human eye gray image data;
s120, caching the acquired human eye gray level image data to an eye image data caching module 4-1 by taking a frame as a unit;
s130, reading the human eye gray level image data from the eye image data caching module 4-1 by the FPGA control and data processing module 4-2, and extracting the sight line information by using a sight line tracking algorithm.
Further, in S130, referring to fig. 3, the extracting of the gaze information includes the following steps:
s131, preprocessing an eye image: the image preprocessing comprises median filtering and is used for filtering salt and pepper noise in the eye image;
s132, pupil edge feature extraction: the pupil edge feature extraction comprises three substeps of image binarization, morphological processing and Sobel edge detection:
binarization: using a predetermined gray threshold value T and the following formula to carry out binary segmentation on the image f (x, y) to obtain a binary image g (x, y):
g(x, y) = 1 when f(x, y) ≤ T, and g(x, y) = 0 when f(x, y) > T    (1)
where T = 35;
morphological treatment: for eyelash or pupil edge noise still existing in the image, the image is processed by using mathematical morphology operation, specifically, the system selects a disc-shaped structural element B with the radius of 6 according to the characteristics of the noise in the image, firstly uses a formula (2) to perform closed operation on the image A, and then uses a formula (3) to perform open operation processing on the image A:
A • B = (A ⊕ B) ⊖ B    (2)
A ∘ B = (A ⊖ B) ⊕ B    (3)
sobel edge detection: referring to fig. 4, performing convolution calculation on the image by using a 3 × 3 Sobel horizontal operator Gx and a vertical operator Gy to extract an edge profile of the pupil region;
s133, extracting the pupil center by utilizing the near-circular characteristic of the pupil: specifically, the minimum circumscribed rectangle of the pupil region contour is calculated, and the centroid (x0, y0) of the rectangle approximately replaces the pupil center;
s134, sight line drop point estimation: referring to fig. 5 and 6, the gaze point estimation adopts a two-dimensional polynomial mapping method based on nine-point calibration to obtain the mapping relation between the pupil center coordinates and the calibration points; in order to ensure the naturalness of gaze interaction and the accuracy of gaze control within a small area, the length and width of the interactive interface are each divided into three equal parts to obtain nine gazing areas of equal area, and one area is selected as the calibration interface, as shown in fig. 6.
The calibration method is as follows: the user adjusts the body position so that, when looking straight ahead, the line of sight falls on the central calibration point of the calibration interface; the head is then kept still and the user gazes at the nine points of the calibration interface in turn, while the system records the pupil center coordinates and the corresponding calibration point coordinates and establishes the following two-dimensional mapping equation:
XG = a0 + a1·Xe + a2·Ye + a3·Xe·Ye + a4·Xe² + a5·Ye²    (4)
YG = b0 + b1·Xe + b2·Ye + b3·Xe·Ye + b4·Xe² + b5·Ye²    (5)
where XG and YG are the abscissa and ordinate of a calibration point on the calibration interface, Xe and Ye are the corresponding abscissa and ordinate of the pupil center, and a0~a5 and b0~b5 are the undetermined parameters of the mapping function; the nine sets of calibration data are substituted into the equation set to solve for the undetermined parameters, yielding the mapping equation through which the pupil center coordinates are mapped into coordinates within the gazing area.
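As one possible software illustration of the nine-point calibration, the NumPy sketch below fits the undetermined parameters a0~a5 and b0~b5 of equations (4) and (5) from the nine recorded sample pairs (here in the least-squares sense) and then maps a pupil-center coordinate into the gazing area; the library choice and function names are assumptions.

import numpy as np


def fit_mapping(pupil_xy, target_xy):
    """Fit the parameters a0~a5 and b0~b5 of equations (4) and (5).

    pupil_xy, target_xy: arrays of shape (9, 2) holding the recorded pupil centers
    and the corresponding calibration-point coordinates.
    """
    xe, ye = pupil_xy[:, 0], pupil_xy[:, 1]
    # Design matrix of the second-order polynomial terms 1, Xe, Ye, Xe*Ye, Xe^2, Ye^2
    M = np.column_stack([np.ones_like(xe), xe, ye, xe * ye, xe ** 2, ye ** 2])
    a, *_ = np.linalg.lstsq(M, target_xy[:, 0], rcond=None)
    b, *_ = np.linalg.lstsq(M, target_xy[:, 1], rcond=None)
    return a, b


def map_gaze(a, b, xe, ye):
    """Map a pupil-center coordinate into the gazing-area coordinate via equations (4)-(5)."""
    terms = np.array([1.0, xe, ye, xe * ye, xe ** 2, ye ** 2])
    return float(terms @ a), float(terms @ b)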
Further, in S200, the head posture detection first acquires three-axis attitude data through the three-axis gyroscope and three-axis accelerometer inside the head posture detection module 3; the digital motion processor DMP then performs attitude calculation on the three-axis attitude data, the FPGA control and data processing module 4-2 reads the quaternion obtained from the attitude calculation of the angular velocity and acceleration data, and the quaternion is finally converted into attitude angle data as follows:
quaternion definition: a quaternion is denoted q = w + xi + yj + zk and, written in vector form:
q = (w, x, y, z)^T    (6)
|q|² = w² + x² + y² + z² = 1    (7)
quaternion-to-Euler-angle conversion: the quaternion is converted by formula (8) to obtain the attitude angles:
φ = arctan(2(wx + yz) / (1 − 2(x² + y²)))
θ = arcsin(2(wy − zx))    (8)
ψ = arctan(2(wz + xy) / (1 − 2(y² + z²)))
where φ, θ and ψ denote the roll angle Roll, pitch angle Pitch and yaw angle Yaw, respectively.
Further, in S300, referring to fig. 7, the data fusion processing first obtains, through the FPGA control and data processing module 4-2, the attitude angle data produced by the attitude calculation in S200 and uses a threshold to judge whether the head movement constitutes an effective movement; the original acceleration data obtained in S200 is then acquired and integrated twice to obtain displacement data, a threshold is used to judge whether the head movement is an effective interactive action, the displacement of an effective interactive action is mapped into interactive interface coordinate data through the mapping equation, and the displacement produced by a non-interactive action is superimposed on the gaze point to compensate for the deviation that head movement introduces into the gaze point estimation.
Specifically, (1) effective interaction action:
When the head movement displacement is larger than the set threshold, it is recognized as an action with interactive intent. In this case the displacement vector produced by projecting the head movement track in the spatial three-dimensional coordinate system onto the X-Y plane is likewise used as the input of the two-dimensional polynomial mapping equation based on nine-point calibration, giving the relation between the displacement vector and the calibration points of the interactive interface. The head movement is represented in the spatial three-dimensional coordinate system shown in fig. 8; the components of the head movement track on the X-Y plane are shown in fig. 9: movement about the pitch angle Pitch can be decomposed onto the Z and Y axes, movement about the yaw angle Yaw onto the Z and X axes, and movement about the roll angle Roll onto the X and Y axes. The calibration interface here is the whole interactive interface, as shown in fig. 10.
The calibration method is as follows: the body is kept still, and the line of sight is kept on a given point of the calibration interface through head movement alone; the calibration of the nine points is completed in turn to obtain nine sets of displacement vectors as the input of the equation set, the displacement vectors are substituted into the equation set to solve for the undetermined parameters, and the head action is thus mapped to the change of the cursor position on the computer interaction interface. When the head movement of the user constitutes an interactive action, the system masks the data of the gaze input channel, the head movement data become the only data source controlling the interactive interface cursor, and the user determines the gazing area through head movement alone to complete coarse positioning. When the head movement stops, the system enables the gaze input channel, the gaze point data become the data source for moving the interactive interface cursor, and the user controls the cursor to move within the gazing area through eye movement to complete accurate positioning. By setting a gaze dwell time threshold, the system interprets the user's fixation as a double-click operation at the interactive interface cursor.
(2) Invalid interaction action:
When the head movement displacement is smaller than the set threshold, the head movement is recognized as unconscious head movement and treated as a non-interactive action. This head movement is used as compensation for the gaze point: the displacement of the head movement on the X-Y plane is superimposed on the pupil center coordinates, and the gaze point is then estimated with the established two-dimensional mapping equation.
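The channel-switching behavior described above can be summarized by the following illustrative Python sketch; the class name, threshold value and dwell time are assumptions, while the logic (head displacement above the threshold drives coarse positioning and masks the gaze channel, otherwise the gaze point drives fine positioning with the residual head displacement used as compensation, and a fixation longer than the dwell threshold is interpreted as a double click) follows the description.

class HeadEyeCursorController:
    """Illustrative coarse/fine cursor control combining the head and gaze channels."""

    def __init__(self, motion_threshold=0.05, dwell_time_s=1.0):
        self.motion_threshold = motion_threshold  # minimum head displacement for an interactive action
        self.dwell_time_s = dwell_time_s          # fixation duration interpreted as a double click

    def update(self, head_displacement_xy, gaze_point_xy, fixation_s):
        dx, dy = head_displacement_xy
        if (dx * dx + dy * dy) ** 0.5 > self.motion_threshold:
            # Effective interaction: the gaze channel is masked and head motion alone
            # selects the gazing area (coarse positioning)
            return {"cursor": self._map_head_to_interface(dx, dy), "double_click": False}
        # Non-interactive head motion: the gaze point drives the cursor (fine positioning),
        # with the small head displacement folded in as gaze-point compensation
        cursor = (gaze_point_xy[0] + dx, gaze_point_xy[1] + dy)
        return {"cursor": cursor, "double_click": fixation_s >= self.dwell_time_s}

    def _map_head_to_interface(self, dx, dy):
        # Placeholder for the nine-point-calibrated mapping of head displacement
        return (dx, dy)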
further, in S400, the FPGA control and data processing module 4-2 sends the interactive data to the computer through the wireless data transmission module 5, and the computer converts the interactive data into mouse operation instructions, thereby completing the human-computer interaction.
The above embodiments are only used to help understanding the method of the present invention and the core idea thereof, and a person skilled in the art can also make several modifications and decorations on the specific embodiments and application scope according to the idea of the present invention, and these modifications and decorations should also be regarded as the protection scope of the present invention.

Claims (10)

1. A head-eye double-channel intelligent man-machine interaction system, characterized in that the head-eye double-channel intelligent man-machine interaction system comprises: an eye image acquisition module (2), a head posture detection module (3), a system core module (4), a wireless data transmission module (5) and a power supply module (6); the eye image acquisition module (2) and the head posture detection module (3) are both connected with the system core module (4), the system core module (4) is connected with the wireless data transmission module (5), and the power supply module (6) simultaneously powers the eye image acquisition module (2), the head posture detection module (3), the system core module (4) and the wireless data transmission module (5).
2. The head-eye dual channel intelligent human-computer interaction system of claim 1, further comprising an electronic terminal, wherein,
the eye image acquisition module (2) is used for acquiring human eye gray level image data and transmitting the human eye gray level image data to the system core module (4);
the head attitude detection module (3) is used for acquiring three-axis attitude data, then performing attitude calculation on the three-axis attitude data, and sending an attitude calculation result to the system core module (4);
the system core module (4) is used for processing the gray image data and the attitude calculation result, converting the gray image data and the attitude calculation result into interactive data suitable for a computer, and sending the interactive data to the electronic terminal;
the wireless data transmission module (5) is used for enabling the system core module (4) to be in wireless connection with an electronic terminal;
the electronic terminal is used for converting the interactive data into an operation instruction suitable for the electronic terminal;
the head-eye dual-channel intelligent man-machine interaction system further comprises a head-mounted frame (1); the head-mounted frame (1) comprises a spectacle frame (1-1), a left leg (1-2), a right leg (1-3), a charging interface (1-4), an LED status indicator lamp (1-5) and a lens expansion frame (1-6); the left leg (1-2) and the right leg (1-3) are rotatably connected to the left and right ends of the spectacle frame (1-1) respectively, the charging interface (1-4) is arranged on the left leg (1-2), the lens expansion frame (1-6) is arranged in the spectacle frame (1-1), and the LED status indicator lamp (1-5) is arranged on the outer side face of the left leg (1-2);
the LED status indicator lamps (1-5) are RGB indicator lamps and are used for indicating different states of the system;
lenses are arranged in the lens expansion frames (1-6), and the lenses are plane lenses or near vision lenses;
the left supporting leg (1-2) and the right supporting leg (1-3) are both of a cavity structure;
further, the system core module (4) is embedded in the cavity of the right support leg (1-3), the system core module (4) comprises an eye image cache module (4-1) and an FPGA control and data processing module (4-2), and the FPGA control and data processing module (4-2) is respectively connected with the eye image acquisition module (2), the head posture detection module (3) and the eye image cache module (4-1);
the eye image acquisition module (2) is installed at the lower right part of the mirror frame (1-1), the eye image acquisition module (2) comprises a single miniature infrared camera (2-1) which integrates an infrared LED and can adjust the angle, a left and right camera angle adjusting roller (2-2) and a top and bottom camera angle adjusting roller (2-3), and the left and right angle and the top and bottom angle of the miniature infrared camera (2-1) are respectively adjusted through the left and right camera angle adjusting roller (2-2) and the top and bottom camera angle adjusting roller (2-3);
the eye image data caching module (4-1) is used for caching video image data output by the micro infrared camera (2-1) by taking a frame as a unit;
the FPGA control and data processing module (4-2) is used for reading data in the eye image data caching module (4-1) by taking a frame as a unit, and processing the image data by utilizing a sight tracking algorithm in the FPGA control and data processing module (4-2) to obtain sight point data;
the FPGA control and data processing module (4-2) is also used for acquiring the attitude data output by the head attitude detection module (3) and processing the attitude data to obtain head attitude angle data;
the FPGA control and data processing module (4-2) is also used for carrying out fusion processing on the sight line landing point data and the head attitude angle data through data fusion and converting the data into interactive data suitable for a computer,
the head posture detection module (3) is embedded in the middle upper part of the spectacle frame (1-1); the head posture detection module (3) comprises a three-axis gyroscope, a three-axis accelerometer and a Digital Motion Processor (DMP);
the core of the wireless data transmission module (5) is a low-power Bluetooth chip and is connected with the FPGA control and data processing module (4-2).
3. A head-eye dual-channel intelligent man-machine interaction system as claimed in claim 1, wherein the power module (6) comprises a power management module (6-1), a polymer lithium battery (6-2) and a power switch button (6-3), wherein the power management module (6-1) is connected with the charging interface (1-4), the polymer lithium battery (6-2) is connected with the power management module (6-1), and the power switch button (6-3) is connected with the power management module (6-1).
4. A head-eye two-channel intelligent man-machine interaction system as claimed in claim 1, wherein in the power supply charging state, the LED status indicator lamps (1-5) emit red light, and when full, emit green light; when a power switch button (6-3) is pressed, the head-eye dual-channel intelligent man-machine interaction system is powered on, and the LED state indicator lamps (1-5) emit blue light to indicate that the system starts to work normally; and when the electric quantity is lower than a preset low-electric-quantity warning line, the LED state indicating lamps (1-5) flicker and emit red light.
5. A head-eye dual-channel intelligent man-machine interaction method based on any one of claims 1-4, wherein the head-eye dual-channel intelligent man-machine interaction method comprises the following steps:
s100, extracting sight line information;
s200, detecting the head posture;
s300, data fusion processing;
and S400, interactive data transmission.
6. The head-eye dual-channel intelligent human-computer interaction method as claimed in claim 5, wherein in S100, the method specifically comprises the following steps:
s110, acquiring an original image by using the miniature infrared camera (2-1), and converting the original image into human eye gray image data;
s120, caching the acquired human eye gray level image data to the eye image data caching module (4-1) by taking a frame as a unit;
s130, the FPGA control and data processing module (4-2) reads the human eye gray image data from the eye image data caching module (4-1), and sight line information is extracted by using a sight line tracking algorithm.
7. The head-eye dual-channel intelligent human-computer interaction method as claimed in claim 6, wherein in S130, the extracting of the sight line information comprises the following steps:
s131, preprocessing an eye image: the image preprocessing comprises median filtering and is used for filtering salt and pepper noise in the eye image;
s132, pupil edge feature extraction: the pupil edge feature extraction comprises three substeps of image binarization, morphological processing and Sobel edge detection:
binarization: using a predetermined gray threshold value T and the following formula to carry out binary segmentation on the image f (x, y) to obtain a binary image g (x, y):
g(x, y) = 1 when f(x, y) ≤ T, and g(x, y) = 0 when f(x, y) > T    (1)
where T = 35;
morphological treatment: for eyelash or pupil edge noise still existing in the image, the image is processed by using mathematical morphology operation, specifically, the system selects a disc-shaped structural element B with the radius of 6 according to the characteristics of the noise in the image, firstly uses a formula (2) to perform closed operation on the image A, and then uses a formula (3) to perform open operation processing on the image A:
A • B = (A ⊕ B) ⊖ B    (2)
A ∘ B = (A ⊖ B) ⊕ B    (3)
sobel edge detection: performing convolution calculation on the image by adopting a 3 x 3 Sobel horizontal operator Gx and a vertical operator Gy, and extracting an edge profile of the pupil area;
s133, extracting the pupil center by utilizing the near-circular characteristic of the pupil: specifically, the minimum circumscribed rectangle of the pupil region contour is calculated, and the centroid (x0, y0) of the rectangle approximately replaces the pupil center;
s134, sight line drop point estimation: the gaze point estimation adopts a two-dimensional polynomial mapping method based on nine-point calibration to obtain the mapping relation between the pupil center coordinates and the calibration points; the length and width of the interactive interface are each divided into three equal parts to obtain nine gazing areas, and one area is selected as the calibration interface;
the calibration method is as follows: the user adjusts the body position so that, when looking straight ahead, the line of sight falls on the central calibration point of the calibration interface; the head is then kept still and the user gazes at the nine points of the calibration interface in turn, while the system records the pupil center coordinates and the corresponding calibration point coordinates and establishes the following two-dimensional mapping equation:
XG = a0 + a1·Xe + a2·Ye + a3·Xe·Ye + a4·Xe² + a5·Ye²    (4)
YG = b0 + b1·Xe + b2·Ye + b3·Xe·Ye + b4·Xe² + b5·Ye²    (5)
where XG and YG are the abscissa and ordinate of a calibration point on the calibration interface, Xe and Ye are the corresponding abscissa and ordinate of the pupil center, and a0~a5 and b0~b5 are the undetermined parameters of the mapping function; the nine sets of calibration data are substituted into the equation set to solve for the undetermined parameters, yielding the mapping equation through which the pupil center coordinates are mapped into coordinates within the gazing area.
8. The head-eye dual-channel intelligent man-machine interaction method as claimed in claim 5, wherein in S200, the head attitude detection first obtains three-axis attitude data through the three-axis gyroscope and three-axis accelerometer inside the head attitude detection module (3); the digital motion processor DMP then performs attitude calculation on the three-axis attitude data, the FPGA control and data processing module (4-2) reads the quaternion obtained from the attitude calculation of the three-axis attitude data, and the quaternion is finally converted into attitude angle data as follows:
quaternion definition: a quaternion is denoted q = w + xi + yj + zk and, written in vector form:
q = (w, x, y, z)^T    (6)
|q|² = w² + x² + y² + z² = 1    (7)
quaternion-to-Euler-angle conversion: the quaternion is converted by formula (8) to obtain the attitude angles:
φ = arctan(2(wx + yz) / (1 − 2(x² + y²)))
θ = arcsin(2(wy − zx))    (8)
ψ = arctan(2(wz + xy) / (1 − 2(y² + z²)))
where φ, θ and ψ denote the roll angle Roll, pitch angle Pitch and yaw angle Yaw, respectively.
9. The head-eye dual-channel intelligent human-computer interaction method as claimed in claim 5, wherein in S300, the data fusion processing first obtains, through the FPGA control and data processing module (4-2), the attitude angle data produced by the attitude calculation in S200 and uses a threshold to judge whether the head movement constitutes an effective movement; the original acceleration data obtained in S200 is then acquired and integrated twice to obtain displacement data, a threshold is used to judge whether the head movement is an effective interactive action, the displacement of an effective interactive action is mapped into interactive interface coordinate data through the mapping equation, and the displacement produced by a non-interactive action is superimposed on the gaze point to compensate for the deviation that head movement introduces into the gaze point estimation.
10. The head-eye dual-channel intelligent man-machine interaction method as claimed in claim 5, wherein in S400, the FPGA control and data processing module (4-2) sends the interactive data to the computer terminal through the wireless data transmission module (5), and the computer converts the interactive data into mouse operation instructions, thereby completing the human-computer interaction.
CN202110499945.2A 2021-05-08 2021-05-08 Head-eye double-channel intelligent man-machine interaction system and operation method Pending CN113160260A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110499945.2A CN113160260A (en) 2021-05-08 2021-05-08 Head-eye double-channel intelligent man-machine interaction system and operation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110499945.2A CN113160260A (en) 2021-05-08 2021-05-08 Head-eye double-channel intelligent man-machine interaction system and operation method

Publications (1)

Publication Number Publication Date
CN113160260A true CN113160260A (en) 2021-07-23

Family

ID=76874130

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110499945.2A Pending CN113160260A (en) 2021-05-08 2021-05-08 Head-eye double-channel intelligent man-machine interaction system and operation method

Country Status (1)

Country Link
CN (1) CN113160260A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023240447A1 (en) * 2022-06-14 2023-12-21 北京小米移动软件有限公司 Head movement detection method, apparatus, device, and storage medium
WO2023245316A1 (en) * 2022-06-20 2023-12-28 北京小米移动软件有限公司 Human-computer interaction method and device, computer device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107193383A (en) * 2017-06-13 2017-09-22 华南师范大学 A kind of two grades of Eye-controlling focus methods constrained based on facial orientation
US20190121427A1 (en) * 2016-06-08 2019-04-25 South China University Of Technology Iris and pupil-based gaze estimation method for head-mounted device
CN112578682A (en) * 2021-01-05 2021-03-30 陕西科技大学 Intelligent obstacle-assisting home system based on electro-oculogram control
CN215814080U (en) * 2021-05-08 2022-02-11 哈尔滨鹏路智能科技有限公司 Head-eye double-channel intelligent man-machine interaction system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190121427A1 (en) * 2016-06-08 2019-04-25 South China University Of Technology Iris and pupil-based gaze estimation method for head-mounted device
CN107193383A (en) * 2017-06-13 2017-09-22 华南师范大学 A kind of two grades of Eye-controlling focus methods constrained based on facial orientation
CN112578682A (en) * 2021-01-05 2021-03-30 陕西科技大学 Intelligent obstacle-assisting home system based on electro-oculogram control
CN215814080U (en) * 2021-05-08 2022-02-11 哈尔滨鹏路智能科技有限公司 Head-eye double-channel intelligent man-machine interaction system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王鹏; 陈园园; 邵明磊; 刘博; 张伟超: "Smart home controller based on eye tracking" (基于眼动跟踪的智能家居控制器), 电机与控制学报 (Electric Machines and Control), no. 05, 15 May 2020 (2020-05-15) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023240447A1 (en) * 2022-06-14 2023-12-21 北京小米移动软件有限公司 Head movement detection method, apparatus, device, and storage medium
WO2023245316A1 (en) * 2022-06-20 2023-12-28 北京小米移动软件有限公司 Human-computer interaction method and device, computer device and storage medium

Similar Documents

Publication Publication Date Title
US11068050B2 (en) Method for controlling display of virtual image based on eye area size, storage medium and electronic device therefor
Kar et al. A review and analysis of eye-gaze estimation systems, algorithms and performance evaluation methods in consumer platforms
US10078377B2 (en) Six DOF mixed reality input by fusing inertial handheld controller with hand tracking
KR20230164185A (en) Bimanual interactions between mapped hand regions for controlling virtual and graphical elements
CN116324677A (en) Non-contact photo capture in response to detected gestures
US11715231B2 (en) Head pose estimation from local eye region
WO2019154539A1 (en) Devices, systems and methods for predicting gaze-related parameters
CN103885589A (en) Eye movement tracking method and device
WO2020147948A1 (en) Methods for generating calibration data for head-wearable devices and eye tracking system
CN113160260A (en) Head-eye double-channel intelligent man-machine interaction system and operation method
CN204442580U (en) A kind of wear-type virtual reality device and comprise the virtual reality system of this equipment
US20230116638A1 (en) Method for eye gaze tracking
CN110341617B (en) Eyeball tracking method, device, vehicle and storage medium
CN111938608A (en) AR (augmented reality) glasses, monitoring system and monitoring method for intelligent monitoring of old people
WO2019031005A1 (en) Information processing device, information processing method, and program
CN108052901B (en) Binocular-based gesture recognition intelligent unmanned aerial vehicle remote control method
CN215814080U (en) Head-eye double-channel intelligent man-machine interaction system
US11656471B2 (en) Eyewear including a push-pull lens set
CN110825216A (en) Method and system for man-machine interaction of driver during driving
US20200150758A1 (en) Display device, learning device, and control method of display device
CN116503794A (en) Fatigue detection method for cockpit unit
CN115185365A (en) Wireless control eye control system and control method thereof
CN114661152B (en) AR display control system and method for reducing visual fatigue
CN104375631A (en) Non-contact interaction method based on mobile terminal
WO2020253949A1 (en) Systems and methods for determining one or more parameters of a user's eye

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination