CN114201054A - Method for realizing non-contact human-computer interaction based on head posture - Google Patents

Method for realizing non-contact human-computer interaction based on head posture

Info

Publication number
CN114201054A
CN114201054A
Authority
CN
China
Prior art keywords
head
angle
computer interaction
coordinate system
contact human
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210150603.4A
Other languages
Chinese (zh)
Inventor
袁宏宇
刘国清
杨广
王启程
徐涵
全丹辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Minieye Innovation Technology Co Ltd
Original Assignee
Shenzhen Minieye Innovation Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Minieye Innovation Technology Co Ltd filed Critical Shenzhen Minieye Innovation Technology Co Ltd
Priority to CN202210150603.4A
Publication of CN114201054A
Legal status: Pending (Current)

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012Head tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a method for realizing non-contact human-computer interaction based on head posture, which comprises the following steps: establishing a 3D face model and a camera coordinate system; collecting a video data stream of the user's face in real time; automatically positioning the facial feature points through a feature point coordinate determination unit and outputting the feature point coordinates of the current frame; according to this output, a head posture determination unit selects feature points corresponding to points in the 3D face model and calculates the pitch angle and yaw angle in the camera coordinate system; the yaw angle and pitch angle output for the first time are recorded as the initial position and made to correspond to the center point of a display, and subsequent angle differences are multiplied by a coefficient to obtain the corresponding screen pixels. The invention can complete non-contact, continuous human-computer interaction even under low image quality, requires no wearable equipment, and offers low cost and high recognition accuracy.

Description

Method for realizing non-contact human-computer interaction based on head posture
Technical Field
The invention relates to the technical field of computers, and in particular to a method for realizing non-contact human-computer interaction based on head posture.
Background
Existing human-computer interaction schemes are basically realized through direct contact between the human and the machine, for example touching the device by hand or clicking and sliding with a keyboard and mouse. With the rapid development of artificial intelligence technology, such direct-contact schemes can no longer satisfy every application scenario. Gaze-tracking schemes suffer from low precision and high cost when the distance is too great, and impose high requirements on image quality. Gesture-based schemes have difficulty issuing instructions continuously: the user must make the corresponding gesture many times to achieve the goal.
The Chinese invention patent with publication number CN104123002B discloses a wireless somatosensory mouse based on head movement, which comprises a motion acquisition module, a data processing module, a wireless transceiver module and a power module. The motion acquisition module acquires head movement information, records the head movement signals and transmits them to the data processing module; the data processing module receives the head movement data and processes it into the data required to control a computer cursor; the wireless transceiver module realizes wireless data transmission between the device and the computer; and the power module supplies working power to the motion acquisition, data processing and wireless transceiver modules. However, this wireless somatosensory mouse cannot satisfy every application scenario, since it must still be worn on the head and moved with it to move the cursor.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides a method for realizing non-contact human-computer interaction based on head posture, which can realize human-computer interaction without wearing equipment.
The purpose of the invention is realized by the following technical scheme:
a method for realizing non-contact human-computer interaction based on head gestures comprises the following steps:
step one, establishing a 3D face model, ensuring that the number of feature points of the 3D face model matches the output of a feature point coordinate determination unit, and establishing a head coordinate system from the 3D face model;
step two, establishing a camera coordinate system, acquiring a video data stream of the user's face in real time, and establishing an image coordinate system for each single-frame image;
step three, automatically positioning the facial feature points by the feature point coordinate determination unit, and outputting the feature point coordinates of the current frame, which lie in the image coordinate system;
step four, according to the output of the feature point coordinate determination unit, the head posture determination unit selects feature points corresponding to points in the 3D face model and calculates the pitch angle and yaw angle in the camera coordinate system;
step five, according to the yaw angle and pitch angle output by the head posture determination unit, using an angle-coordinate conversion unit to record the yaw angle and pitch angle output for the first time as the initial position, making this angle correspond to the center point of the display, and multiplying subsequent angle differences by a coefficient to obtain the corresponding screen pixels;
and step six, moving the head to realize non-contact human-computer interaction.
Furthermore, the single-frame image in step two is cropped, rotated and scaled to meet the input requirements of the feature point coordinate determination unit.
Further, the feature points comprise the inner canthi of the left and right eyes, the nose bridge, the nose tip and the chin point.
Furthermore, the pitch angle and the yaw angle each range from −90° to 90°.
Further, when the absolute value of the pitch angle is greater than 25° or the absolute value of the yaw angle is greater than 40°, angle-coordinate conversion is not performed. The conversion formula is as follows:

$$s\begin{bmatrix}x\\ y\\ 1\end{bmatrix}=\begin{bmatrix}f_x & 0 & c_x\\ 0 & f_y & c_y\\ 0 & 0 & 1\end{bmatrix}\left[\,R\mid T\,\right]\begin{bmatrix}U\\ V\\ W\\ 1\end{bmatrix}$$

wherein x and y are the coordinates of a feature point in the image coordinate system; U, V and W are its coordinates in the head coordinate system; R is the rotation matrix; T is the translation vector; s is a projective scale factor; and f_x, f_y, c_x and c_y are the camera intrinsic parameters.
Further, the calculation formula for the screen pixel coordinates corresponding to the current head posture is as follows:

$$x = x_0 + k\,(\mathrm{yaw}_n - \mathrm{yaw}_s), \qquad y = y_0 + k\,(\mathrm{pitch}_n - \mathrm{pitch}_s)$$

wherein k is a coefficient, yaw_s and pitch_s are the yaw and pitch angles of the initial position, yaw_n and pitch_n are the yaw and pitch angles of the current position, and (x_0, y_0) is the center point of the display.
Further, the size of the k value is positively correlated with the size of the display.
Further, the method further comprises: calculating, from the change of the feature points in the user's facial image, whether the user's head is pitched excessively; if the pitch is determined to be excessive, locking the movement of the cursor.
Compared with the prior art, the invention has the following advantages and beneficial effects:
according to the invention, the non-contact and continuous human-computer interaction between the user and the equipment can be realized by capturing the head posture and the characteristic points, the quality requirement and the cost of the picture are lower than those of a sight tracking scheme, the human-computer interaction can be accurately performed when the distance is long, the user does not need to continuously perform the same or different gestures for multiple times to realize the human-computer interaction, the non-contact and continuous human-computer interaction can be also completed under the condition of low image quality, the cursor movement is controlled by moving the head, the cost is low, the recognition accuracy is high, the speed is high, and the method is suitable for market popularization.
Drawings
FIG. 1 is a flow chart of a method for implementing non-contact human-computer interaction based on head pose;
FIG. 2 is a schematic side view of the product.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is to be understood that the terms "comprises", "comprising" and "having" and any variations thereof in the description, claims and drawings of the invention are intended to cover non-exclusive inclusion. For example, a process, method, system, article or apparatus that comprises a list of steps or elements is not limited to the steps or elements listed, but may include other steps or elements not listed or inherent to such a process, method, system, article or apparatus.
As shown in FIG. 1, a method for realizing non-contact human-computer interaction based on head posture comprises the following steps:
step one, establishing a 3D face model, ensuring that the number of feature points of the 3D face model matches the output of the feature point coordinate determination unit, and establishing a head coordinate system from the 3D face model; the 3D face model is a point set describing the three-dimensional coordinates of the facial feature points; the head coordinate system takes the head as the origin in the real world, with the facing direction of the face as the positive Z axis, the direction of the top of the head as the positive Y axis, and the direction of the left ear as the positive X axis;
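As a concrete illustration of step one, the sketch below defines such a point set in Python. The coordinate values are generic placeholders chosen only to respect the axis conventions just described; the patent does not disclose its model data.

```python
import numpy as np

# Illustrative rigid 3D face model in the head coordinate system described
# above: X toward the left ear, Y toward the top of the head, Z out of the
# face, origin at the head center, units in millimetres. All values are
# placeholders, not the patent's model.
MODEL_POINTS = np.array([
    [ 30.0,  35.0,  90.0],   # inner canthus of the left eye
    [-30.0,  35.0,  90.0],   # inner canthus of the right eye
    [  0.0,  40.0, 110.0],   # nose bridge
    [  0.0,   0.0, 120.0],   # nose tip
    [  0.0, -65.0,  90.0],   # chin point
])
```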
step two, establishing a camera coordinate system, acquiring a video data stream of the user's face in real time, and establishing an image coordinate system for each single-frame image; the coordinate system established with the camera 1 as the origin in the real world is called the camera coordinate system, with the facing direction of the camera 1 as the positive Z axis, upward as the positive Y axis, and leftward as the positive X axis; the image coordinate system is established on the image presented on the display 2, with rightward as the positive X axis and upward as the positive Y axis;
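Mapping between these coordinate systems requires the camera intrinsics. A minimal sketch, assuming f_x, f_y, c_x and c_y have been obtained from a prior calibration; the numbers below are placeholders for a 640x480 camera:

```python
import numpy as np

# Placeholder intrinsics; in practice fx, fy, cx and cy come from
# calibrating camera 1, e.g. with a checkerboard pattern.
fx, fy = 600.0, 600.0        # focal lengths in pixels (assumed)
cx, cy = 320.0, 240.0        # principal point, roughly the image center
CAMERA_MATRIX = np.array([[fx, 0.0, cx],
                          [0.0, fy, cy],
                          [0.0, 0.0, 1.0]])
DIST_COEFFS = np.zeros(5)    # assume negligible lens distortion
```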
step three, automatically positioning the facial feature points by the feature point coordinate determination unit according to a trained 68-point facial key-point regression neural network, and outputting the feature point coordinates of the current frame, which lie in the image coordinate system;
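The patent does not name a specific landmark network; as a stand-in, the sketch below uses dlib's off-the-shelf 68-point predictor. The model file name is dlib's published one, and treating the first detected face as the user is an assumption.

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def landmarks_of_frame(frame_bgr):
    """Return the 68 (x, y) feature-point coordinates of the current frame,
    in the image coordinate system, or None if no face is detected."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 0)
    if not faces:
        return None
    shape = predictor(gray, faces[0])
    return [(shape.part(i).x, shape.part(i).y) for i in range(68)]
```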
step four, according to the output of the feature point coordinate determination unit, the head posture determination unit selects feature points corresponding to points in the 3D face model, computes the affine transformation matrix from the feature points in the head coordinate system of the 3D face model to the feature points in the image coordinate system, and calculates the pitch angle and yaw angle in the camera coordinate system from the rotation information in that matrix; in geometry, an affine transformation is a linear transformation of a vector space followed by a translation into another vector space; the angle of rotation about the X axis is called the pitch angle, and the angle of rotation about the Y axis is called the yaw angle;
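Continuing the sketches above, the pose can be recovered with OpenCV's solvePnP from the model points and intrinsics already defined. The 68-point indices and the Euler-angle convention below are common choices, not the patent's specification:

```python
import cv2
import numpy as np

# Assumed 68-point indices for the five model points: 39/42 inner eye
# corners, 27 nose bridge, 30 nose tip, 8 chin (dlib numbering).
FEATURE_IDS = [39, 42, 27, 30, 8]

def head_pose(landmarks):
    """Estimate (pitch, yaw) in degrees in the camera coordinate system."""
    image_points = np.array([landmarks[i] for i in FEATURE_IDS],
                            dtype=np.float64)
    ok, rvec, tvec = cv2.solvePnP(MODEL_POINTS, image_points,
                                  CAMERA_MATRIX, DIST_COEFFS,
                                  flags=cv2.SOLVEPNP_EPNP)  # EPnP accepts 5 points
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)   # rotation part of the transformation
    # Euler angles for R = Rz * Ry * Rx: rotation about X is the pitch
    # angle, rotation about Y is the yaw angle.
    pitch = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
    yaw = np.degrees(np.arctan2(-R[2, 0], np.hypot(R[2, 1], R[2, 2])))
    return pitch, yaw
```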
step five, according to the yaw angle and pitch angle output by the head posture determination unit, an angle-coordinate conversion unit records the yaw angle and pitch angle output for the first time as the initial position and makes this angle correspond to the center point of the display 2; the subsequently obtained pitch and yaw angles are subtracted from those of the initial position, and the angle differences are multiplied by a coefficient to obtain the corresponding screen pixels; the head posture comprises the pitch, yaw and roll angles of the head, and because the pitch angle and yaw angle vary continuously, non-contact continuous human-computer interaction can be realized;
step six, moving the head to realize non-contact human-computer interaction;
the cursor is controlled to move by moving the head, an interaction result is displayed on the display 2, the result is fed back to the camera 1, head portrait data of the user is continuously obtained, and the user can continuously send a cursor moving instruction and cannot be stuck.
In addition, the single-frame image in step two is cropped, rotated and scaled until it meets the input requirements of the feature point coordinate determination unit.
The feature points comprise the inner canthi of the left and right eyes, the nose bridge, the nose tip and the chin point. If a selected feature point is occluded, the selection can be changed according to the actual situation; for example, the outer canthi of the left and right eyes, the left and right mouth corners and so on can also be selected from the 68 facial key points.
The pitch angle and the yaw angle each range from −90° to 90°. When the absolute value of the pitch angle is greater than 25° or the absolute value of the yaw angle is greater than 40°, angle-coordinate conversion is not performed; these maximum limits on the pitch and yaw angles can be changed according to the actual situation. The conversion formula is as follows:
$$s\begin{bmatrix}x\\ y\\ 1\end{bmatrix}=\begin{bmatrix}f_x & 0 & c_x\\ 0 & f_y & c_y\\ 0 & 0 & 1\end{bmatrix}\left[\,R\mid T\,\right]\begin{bmatrix}U\\ V\\ W\\ 1\end{bmatrix} \qquad (1)$$

wherein x and y are the coordinates of a feature point in the image coordinate system; U, V and W are its coordinates in the head coordinate system; R is the rotation matrix; T is the translation vector; s is a projective scale factor; and f_x, f_y, c_x and c_y are the camera intrinsic parameters.
The calculation formula for the screen pixel coordinates corresponding to the current head posture is as follows:
$$x = x_0 + k\,(\mathrm{yaw}_n - \mathrm{yaw}_s), \qquad y = y_0 + k\,(\mathrm{pitch}_n - \mathrm{pitch}_s) \qquad (2)$$

wherein k is a coefficient, yaw_s and pitch_s are the yaw and pitch angles of the initial position, yaw_n and pitch_n are the yaw and pitch angles of the current position, and (x_0, y_0) is the center point of the display.
The size of the k value is positively correlated with the size of the display 2; the head posture is solved from formula (1), and the screen coordinates of the cursor are obtained from formula (2).
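One simple way to realize this positive correlation, under the assumption that the usable yaw range (here ±40°) should sweep the full screen width, is:

```python
def k_for_display(screen_width_px, usable_yaw_deg=80.0):
    # A wider display yields a proportionally larger k, so the same
    # head rotation always spans the whole screen.
    return screen_width_px / usable_yaw_deg

k = k_for_display(1920)   # 24 pixels per degree on a 1920-pixel-wide display
```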
The method for realizing non-contact human-computer interaction based on head posture further comprises: calculating, from the change of the feature points in the user's facial image, whether the user's head is pitched excessively; if the pitch is determined to be excessive, the movement of the cursor is locked.
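A sketch of the pitch-lock check; treating a pitch deviation beyond the 25° conversion limit as "excessive" is an assumption, since the patent leaves the threshold open:

```python
PITCH_LOCK_DEG = 25.0   # assumed threshold for "excessive" pitching

def cursor_locked(pitch_n, pitch_s):
    """Lock cursor movement when the head pitches too far from the
    initial position, e.g. while the user glances down."""
    return abs(pitch_n - pitch_s) > PITCH_LOCK_DEG
```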
As shown in fig. 2, the working principle of the present invention is as follows:
the method comprises the steps that a user moves the head, a camera 1 captures feature points set on a human face, a feature point coordinate determination unit automatically positions facial feature points, a head posture determination unit selects the feature points to correspond to points in a 3D human face model, a yaw angle and a pitch angle are calculated, an angle-coordinate conversion unit is used for recording the initial positions of the pitch angle and the yaw angle, the subsequently acquired pitch angle and yaw angle are subtracted from the pitch angle and the yaw angle of the initial positions to obtain a difference value, the difference value of the calculated angles is multiplied by a coefficient and corresponds to pixel points of a display 2, when the user rotates the head, the points in the display 2 can move along with the rotation of the user's head, the user can conveniently determine the position of the current head posture in the display 2, and non-contact human-computer interaction without carrying equipment is achieved.
It should be understood that the above-described embodiments are merely preferred embodiments of the present invention and the technical principles applied thereto, and that any changes, modifications, substitutions, combinations and simplifications made by those skilled in the art without departing from the spirit and principle of the present invention shall be regarded as equivalent substitutions and shall be covered by the protection scope of the present invention.

Claims (8)

1. A method for realizing non-contact human-computer interaction based on head posture, characterized by comprising the following steps:
step one, establishing a 3D face model, ensuring that the number of feature points of the 3D face model matches the output of a feature point coordinate determination unit, and establishing a head coordinate system from the 3D face model;
step two, establishing a camera coordinate system, acquiring a video data stream of the user's face in real time, and establishing an image coordinate system for each single-frame image;
step three, automatically positioning the facial feature points by the feature point coordinate determination unit, and outputting the feature point coordinates of the current frame, which lie in the image coordinate system;
step four, according to the output of the feature point coordinate determination unit, the head posture determination unit selects feature points corresponding to points in the 3D face model and calculates the pitch angle and yaw angle in the camera coordinate system;
step five, according to the yaw angle and pitch angle output by the head posture determination unit, using an angle-coordinate conversion unit to record the yaw angle and pitch angle output for the first time as the initial position, making this angle correspond to the center point of the display, and multiplying subsequent angle differences by a coefficient to obtain the corresponding screen pixels;
and step six, moving the head to realize non-contact human-computer interaction.
2. The method for realizing non-contact human-computer interaction based on head posture according to claim 1, characterized in that: the single-frame image in step two is cropped, rotated and scaled until it meets the input requirements of the feature point coordinate determination unit.
3. The method for realizing non-contact human-computer interaction based on head posture according to claim 2, characterized in that: the feature points comprise the inner canthi of the left and right eyes, the nose bridge, the nose tip and the chin point.
4. The method for realizing non-contact human-computer interaction based on head posture according to claim 3, characterized in that: the pitch angle and the yaw angle each range from −90° to 90°.
5. The method for realizing non-contact human-computer interaction based on head posture according to claim 4, characterized in that: when the absolute value of the pitch angle is greater than 25° or the absolute value of the yaw angle is greater than 40°, angle-coordinate conversion is not performed, the conversion formula being:

$$s\begin{bmatrix}x\\ y\\ 1\end{bmatrix}=\begin{bmatrix}f_x & 0 & c_x\\ 0 & f_y & c_y\\ 0 & 0 & 1\end{bmatrix}\left[\,R\mid T\,\right]\begin{bmatrix}U\\ V\\ W\\ 1\end{bmatrix}$$

wherein x and y are the coordinates of a feature point in the image coordinate system; U, V and W are its coordinates in the head coordinate system; R is the rotation matrix; T is the translation vector; s is a projective scale factor; and f_x, f_y, c_x and c_y are the camera intrinsic parameters.
6. The method for realizing non-contact human-computer interaction based on head posture according to claim 5, characterized in that: the calculation formula for the screen pixel coordinates corresponding to the current head posture is:

$$x = x_0 + k\,(\mathrm{yaw}_n - \mathrm{yaw}_s), \qquad y = y_0 + k\,(\mathrm{pitch}_n - \mathrm{pitch}_s)$$

wherein k is a coefficient, yaw_s and pitch_s are the yaw and pitch angles of the initial position, yaw_n and pitch_n are the yaw and pitch angles of the current position, and (x_0, y_0) is the center point of the display.
7. The method for realizing non-contact human-computer interaction based on head posture according to claim 6, characterized in that: the magnitude of the k value is positively correlated with the size of the display.
8. The method for realizing non-contact human-computer interaction based on head posture according to claim 7, characterized in that: the method further comprises: calculating, from the change of the feature points in the user's facial image, whether the user's head is pitched excessively; and, if the pitch is determined to be excessive, locking the movement of the cursor.
CN202210150603.4A 2022-02-18 2022-02-18 Method for realizing non-contact human-computer interaction based on head posture Pending CN114201054A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210150603.4A CN114201054A (en) 2022-02-18 2022-02-18 Method for realizing non-contact human-computer interaction based on head posture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210150603.4A CN114201054A (en) 2022-02-18 2022-02-18 Method for realizing non-contact human-computer interaction based on head posture

Publications (1)

Publication Number Publication Date
CN114201054A (en) 2022-03-18

Family

ID=80645534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210150603.4A Pending CN114201054A (en) 2022-02-18 2022-02-18 Method for realizing non-contact human-computer interaction based on head posture

Country Status (1)

Country Link
CN (1) CN114201054A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070290998A1 (en) * 2006-06-08 2007-12-20 Samsung Electronics Co., Ltd. Input device comprising geomagnetic sensor and acceleration sensor, display device for displaying cursor corresponding to motion of input device, and cursor display method thereof
CN102156537A (en) * 2010-02-11 2011-08-17 三星电子株式会社 Equipment and method for detecting head posture
CN110717467A (en) * 2019-10-15 2020-01-21 北京字节跳动网络技术有限公司 Head pose estimation method, device, equipment and storage medium
CN111178152A (en) * 2019-12-09 2020-05-19 上海理工大学 Attention detection reminding device based on three-dimensional head modeling
CN111813689A (en) * 2020-07-22 2020-10-23 腾讯科技(深圳)有限公司 Game testing method, apparatus and medium
CN112088348A (en) * 2018-05-21 2020-12-15 韦斯特尔电子工业和贸易有限责任公司 Method, system and computer program for remote control of a display device via head gestures
CN112162627A (en) * 2020-08-28 2021-01-01 深圳市修远文化创意有限公司 Eyeball tracking method combined with head movement detection and related device

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070290998A1 (en) * 2006-06-08 2007-12-20 Samsung Electronics Co., Ltd. Input device comprising geomagnetic sensor and acceleration sensor, display device for displaying cursor corresponding to motion of input device, and cursor display method thereof
CN102156537A (en) * 2010-02-11 2011-08-17 三星电子株式会社 Equipment and method for detecting head posture
CN112088348A (en) * 2018-05-21 2020-12-15 韦斯特尔电子工业和贸易有限责任公司 Method, system and computer program for remote control of a display device via head gestures
CN110717467A (en) * 2019-10-15 2020-01-21 北京字节跳动网络技术有限公司 Head pose estimation method, device, equipment and storage medium
CN111178152A (en) * 2019-12-09 2020-05-19 上海理工大学 Attention detection reminding device based on three-dimensional head modeling
CN111813689A (en) * 2020-07-22 2020-10-23 腾讯科技(深圳)有限公司 Game testing method, apparatus and medium
CN112162627A (en) * 2020-08-28 2021-01-01 深圳市修远文化创意有限公司 Eyeball tracking method combined with head movement detection and related device

Similar Documents

Publication Publication Date Title
US11600013B2 (en) Facial features tracker with advanced training for natural rendering of human faces in real-time
US10394334B2 (en) Gesture-based control system
US6204852B1 (en) Video hand image three-dimensional computer interface
US6147678A (en) Video hand image-three-dimensional computer interface with multiple degrees of freedom
Reale et al. A multi-gesture interaction system using a 3-D iris disk model for gaze estimation and an active appearance model for 3-D hand pointing
CN110083202B (en) Multimode interaction with near-eye display
KR101171660B1 (en) Pointing device of augmented reality
CN109145802B (en) Kinect-based multi-person gesture man-machine interaction method and device
CN109993073B (en) Leap Motion-based complex dynamic gesture recognition method
CN110865704B (en) Gesture interaction device and method for 360-degree suspended light field three-dimensional display system
CN107632699A (en) Natural human-machine interaction system based on the fusion of more perception datas
CN112198962A (en) Method for interacting with virtual reality equipment and virtual reality equipment
CN111639531A (en) Medical model interaction visualization method and system based on gesture recognition
CN112667078A (en) Method and system for quickly controlling mouse in multi-screen scene based on sight estimation and computer readable medium
CN108052901B (en) Binocular-based gesture recognition intelligent unmanned aerial vehicle remote control method
Vasisht et al. Human computer interaction based eye controlled mouse
Yousefi et al. 3D gesture-based interaction for immersive experience in mobile VR
WO2024055957A1 (en) Photographing parameter adjustment method and apparatus, electronic device and readable storage medium
Mayol et al. Interaction between hand and wearable camera in 2D and 3D environments
Appenrodt et al. Multi stereo camera data fusion for fingertip detection in gesture recognition systems
Liu et al. A robust hand tracking for gesture-based interaction of wearable computers
CN114201054A (en) Method for realizing non-contact human-computer interaction based on head posture
Thomas et al. A comprehensive review on vision based hand gesture recognition technology
CN115951783A (en) Computer man-machine interaction method based on gesture recognition
Jain et al. Human computer interaction–Hand gesture recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20220318)