CN115454236A - Gesture cursor mapping method based on machine vision, network equipment and storage medium - Google Patents


Info

Publication number
CN115454236A
Authority
CN (China)
Prior art keywords
coordinates, cursor, virtual frame, offset, mapping method
Legal status
Pending (the legal status is an assumption and is not a legal conclusion; no legal analysis has been performed)
Application number
CN202211015270.0A
Other languages
Chinese (zh)
Inventors
贺垟瑒, 贺欣
Current Assignee (the listed assignees may be inaccurate)
Xinhuasan Intelligent Terminal Co ltd
Original Assignee
Xinhuasan Intelligent Terminal Co ltd
Application filed by Xinhuasan Intelligent Terminal Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048 Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481 Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G06F3/04812 Interaction techniques based on cursor appearance or behaviour, e.g. being affected by the presence of displayed objects

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Position Input By Displaying (AREA)

Abstract

The application provides a machine-vision-based gesture cursor mapping method, a network device, and a storage medium. The method comprises: acquiring a key point set of the body, the set including the two shoulder endpoint coordinates a and b, from which the shoulder width D is calculated; acquiring the hand key point coordinate m; calculating the hand center point coordinate c from the hand key point coordinates; calculating the two diagonal point coordinates of a virtual frame from the shoulder width D; and obtaining the cursor mapping coordinate M of the hand key point on the projection screen from the virtual frame diagonal coordinates p and q, the hand key point coordinate m, and the width and height of the projection screen. Because the size and position of the virtual frame are generated by an algorithm, the cursor moving speed adapts to different distances, a smaller operation range reduces user fatigue, and the cursor moving speed is adjustable.

Description

Gesture cursor mapping method based on machine vision, network equipment and storage medium
Technical Field
The present application relates to the field of communications devices, and in particular, to a gesture cursor mapping method based on machine vision, a network device, and a storage medium.
Background
With the development of computer performance and the AI field, more interaction modes have appeared in human-computer interaction. Compared with traditional interaction modes such as the mouse, keyboard, touch screen, and remote controller, gesture control based on machine vision brings a brand-new operation experience. In the field of large-screen terminals, gesture control has become an indispensable function of every manufacturer's flagship models.
In vision-based gesture interaction, the user's gesture control is simulated as mouse operation. This reduces the user's learning cost and adapts better to the current software ecosystem, since third-party application software does not need to be re-adapted to a new gesture interaction mode. The most important part of simulating mouse control with gestures is cursor mapping, and the accuracy and stability of cursor mapping are key to user experience. At present, cursor mapping is usually realized by direct mapping. The biggest problem of this mode is that the mapped cursor moves at inconsistent speeds at different distances: the farther the user stands, the slower the cursor moves. Moreover, cursor movement under direct mapping requires a larger gesture operation range, which increases the fatigue of the user's gesture operation.
Disclosure of Invention
In order to overcome the problems in the related art, the application provides a gesture cursor mapping method based on machine vision, a network device and a storage medium.
According to a first aspect of embodiments of the present application, there is provided a machine-vision-based gesture cursor mapping method, comprising:
acquiring a key point set of a body;
the key point set comprising the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b);
calculating the shoulder width D from the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b);
obtaining the hand key point coordinate m(x_m, y_m);
calculating the hand center point coordinate c(x_c, y_c) from the hand key point coordinates;
calculating the two diagonal point coordinates of the virtual frame from the shoulder width D: p(x_p, y_p) = (x_c - D/2, y_c - D/2), q(x_q, y_q) = (x_c + D/2, y_c + D/2);
obtaining the cursor mapping coordinate M(x_M, y_M) of the hand key point on the projection screen from the virtual frame diagonal coordinates p and q, the hand key point coordinate m, and the projection screen width and height: (x_M, y_M) = (width * (x_m - x_p)/(x_q - x_p), height * (y_m - y_p)/(y_q - y_p)).
Preferably, the method further comprises obtaining a cursor speed gain, including:
calculating the speed from the current speed level: V = V_min + (K - 1)/(N - 1) * (V_max - V_min);
gain = V / V_max;
where V_min denotes the minimum cursor moving speed supported by the system, V_max denotes the maximum cursor moving speed supported by the system, N is the number of speed levels (N ≥ 1), and the current speed level K ∈ [1, N];
the two diagonal point coordinates of the virtual frame then become p(x_p, y_p) = (x_c - gain*D/2, y_c - gain*D/2), q(x_q, y_q) = (x_c + gain*D/2, y_c + gain*D/2).
Preferably, the method further comprises:
judging whether the hand key point coordinate m(x_m, y_m) is within the virtual frame;
if not, updating the virtual frame;
if it is within the virtual frame, calculating the cursor mapping coordinate.
Preferably, updating the virtual frame includes:
Case 1: x_m < x_p: offset = x_p - x_m; x_p = x_m; x_q = x_q - offset;
Case 2: x_m > x_q: offset = x_m - x_q; x_q = x_m; x_p = x_p + offset;
Case 3: y_m < y_p: offset = y_p - y_m; y_p = y_m; y_q = y_q - offset;
Case 4: y_m > y_q: offset = y_m - y_q; y_q = y_m; y_p = y_p + offset.
Preferably, before the shoulder width is calculated, the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b) are subjected to filtering processing.
The network device provided by the second aspect of the present application includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor performs the above machine-vision-based gesture cursor mapping method.
A third aspect of the present application provides a storage medium having stored thereon computer program instructions for implementing the above-described machine vision based gesture cursor mapping method when executed by a processor.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
in the embodiments of the present application, scale conversion is applied to the traditional direct mapping mode by adding a virtual frame. Unlike direct mapping, in which the size and position of the mapped picture are fixed, the size and position of the virtual frame are generated by an algorithm, so the cursor moving speed adapts to different distances, a smaller operation range reduces user fatigue, and the cursor moving speed is adjustable.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments consistent with the present application and together with the application, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of the gesture interaction flow;
FIG. 2 is a schematic diagram of obtaining whole-body key point coordinates with the MediaPipe Pose model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of obtaining hand key point coordinates with the MediaPipe Hands model according to an embodiment of the present application;
FIG. 4 is a flowchart of the cursor mapping method according to an embodiment of the present application;
FIG. 5 is a diagram of the transformation relationship between the virtual frame and the projection screen according to an embodiment of the present application;
FIG. 6 is a schematic diagram of the hardware framework of a network device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
First, a complete gesture interaction flow is described. It comprises the steps of data acquisition, hand key point identification, data preprocessing and control right judgment, key point to gesture conversion, instruction matching, smooth filtering, cursor mapping, and system response, as shown in FIG. 1.
Specifically, data acquisition captures pictures through a monocular camera; the sampling rate is determined by the camera frame rate. Key point collection can be realized with the MediaPipe framework, a multimedia machine learning model application framework developed and open-sourced by Google. Hand key point identification passes the acquired picture into a hand key point detection model; if a hand is identified, the key point coordinate information of the hand is output. This information is usually a set of coordinates; 21 key points are common, and the number can be increased or decreased according to data processing capacity and actual demand. Data preprocessing and control right judgment preprocess the identified key point coordinates, output the key point coordinate set of the hand that holds control, and feed those coordinates into the key-point-to-gesture model. Key point to gesture conversion passes the output of the previous step into a trained gesture classification model for inference and outputs the name of the current gesture, such as FIVE, FIST, or POINTER. Instruction matching outputs the operation instruction matched to the current gesture, such as MOVE or CLICK_DOWN/UP, based on an instruction matching algorithm. Smooth filtering (cursor anti-shake) filters the key point coordinate information. Cursor mapping (cursor positioning) converts the key point coordinates into cursor coordinates on the projection screen. The system response performs instruction processing and cursor positioning.
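The per-frame flow above can be sketched as a single pass. The following Python sketch is illustrative only: the stage callables (`detect_hand`, `classify_gesture`, and so on) are hypothetical placeholders for the models and algorithms the text describes, not APIs from the patent or from MediaPipe.

```python
def process_frame(frame, detect_hand, classify_gesture, match_instruction,
                  smooth, map_cursor):
    """One pass of the gesture interaction flow: acquisition has already
    produced `frame`; each stage callable stands in for a model/algorithm."""
    keypoints = detect_hand(frame)            # hand key point identification
    if keypoints is None:                     # no hand in the picture
        return None
    gesture = classify_gesture(keypoints)     # e.g. "FIVE", "FIST", "POINTER"
    instruction = match_instruction(gesture)  # e.g. "MOVE", "CLICK_DOWN"
    smoothed = smooth(keypoints)              # smooth filtering (anti-shake)
    cursor = map_cursor(smoothed)             # cursor mapping (positioning)
    return instruction, cursor                # handed to the system response
```

Each stage is injected, so the pipeline itself stays independent of the concrete detection and classification models.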
In order to solve the problems in the background art, an embodiment of the present application provides a gesture cursor mapping method based on machine vision, as shown in fig. 4, including:
A key point coordinate set of the body is obtained. In the embodiment of the application it is obtained through the MediaPipe Pose model under the MediaPipe framework. The MediaPipe Pose model is a model for high-fidelity body posture tracking; the key point coordinates of the whole body, here 33 key point coordinates, can be inferred from a single frame picture. Among them are the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b), from which the shoulder width is calculated as
D = sqrt((x_a - x_b)^2 + (y_a - y_b)^2),
as shown in FIG. 2.
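As a minimal sketch of this step (the function name is ours, not the patent's), the shoulder width is the Euclidean distance between the two shoulder endpoints:

```python
import math

def shoulder_width(a, b):
    """Shoulder width D: Euclidean distance between the shoulder endpoints
    a = (x_a, y_a) and b = (x_b, y_b)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])
```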
As shown in FIG. 3, the hand key point coordinate m(x_m, y_m) is obtained: a hand key point coordinate set is acquired through the MediaPipe Hands model under the MediaPipe framework. The MediaPipe Hands model is a high-fidelity hand and finger tracking model; the hand key point coordinates, here 21 key points, are inferred from a single frame picture using machine learning. Each key point output by the MediaPipe Hands model consists of x, y, and z. x and y are normalized to [0.0, 1.0] by the image width and height, respectively. z represents the depth of the coordinate, with the depth at the wrist as the origin; the smaller the value, the closer the coordinate is to the camera, and z uses roughly the same scale as x.
The hand center point coordinate c(x_c, y_c) is calculated from the hand key point coordinates, using a previously determined algorithm such as weighting or linear accumulation of the hand key point coordinates.
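The text leaves the center-point algorithm open ("weighting or linear accumulation"); one simple choice, shown here as an assumption rather than the patent's method, is the unweighted mean of the keypoints:

```python
def hand_center(keypoints):
    """Hand center c = (x_c, y_c) as the arithmetic mean of the hand
    keypoint coordinates (one possible 'previously determined algorithm')."""
    n = len(keypoints)
    x_c = sum(x for x, _ in keypoints) / n
    y_c = sum(y for _, y in keypoints) / n
    return (x_c, y_c)
```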
From the shoulder width D and the hand center point coordinate c(x_c, y_c), the two diagonal point coordinates of the virtual frame are calculated: p(x_p, y_p) = (x_c - D/2, y_c - D/2), q(x_q, y_q) = (x_c + D/2, y_c + D/2), as shown in FIG. 5.
The cursor mapping coordinate M(x_M, y_M) of the hand key point on the projection screen is obtained from the virtual frame diagonal coordinates p and q, the hand key point coordinate m, and the projection screen width and height: (x_M, y_M) = (width * (x_m - x_p)/(x_q - x_p), height * (y_m - y_p)/(y_q - y_p)), as shown in FIG. 5. Unlike the direct mapping mode, where (x_M, y_M) = (width * x_m, height * y_m), cursor mapping with a virtual frame first calculates the position of the key point relative to the virtual frame and then multiplies by the screen width and height to obtain the final on-screen coordinates. Obviously, for the same key point movement, the larger the virtual frame, the slower the cursor moves; the smaller the virtual frame, the faster the cursor moves. The virtual frame is generated from the key points of the user's body and hand: when the user is close to the camera, the user occupies a large proportion of the camera picture and the generated virtual frame is large; when the user is far from the camera, the proportion is small and the virtual frame is small. Virtual-frame-based cursor mapping therefore keeps the cursor moving speed consistent at different distances. Meanwhile, because the virtual frame is generated from the shoulder width and the hand center position, the user can reach any corner of the screen with a sufficiently small movement amplitude, which reduces the fatigue of gesture interaction.
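The two formulas above — building the virtual frame from c and D, then mapping m into screen coordinates — can be sketched as follows. The function names are ours; the arithmetic is taken directly from the text.

```python
def virtual_frame(c, d, gain=1.0):
    """Diagonal points p, q of the virtual frame centered on hand center c,
    with side length gain * D (gain defaults to 1, i.e. no speed gain)."""
    half = gain * d / 2
    return (c[0] - half, c[1] - half), (c[0] + half, c[1] + half)

def map_cursor(m, p, q, width, height):
    """Cursor mapping coordinate M of hand keypoint m on a width x height
    screen: position of m relative to the frame, scaled to the screen."""
    x = width * (m[0] - p[0]) / (q[0] - p[0])
    y = height * (m[1] - p[1]) / (q[1] - p[1])
    return (x, y)
```

With a frame of side D, the same hand displacement sweeps the same fraction of the screen regardless of how far the user stands, which is the distance-adaptive behaviour described above.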
After calculating the shoulder width D, the embodiment of the application further obtains a cursor speed gain. The speed is calculated from the current speed level as V = V_min + (K - 1)/(N - 1) * (V_max - V_min); gain = V / V_max; where V_min denotes the minimum cursor moving speed supported by the system, V_max denotes the maximum cursor moving speed supported by the system, N is the number of speed levels (N ≥ 1), and the current speed level K ∈ [1, N]. The two diagonal point coordinates of the virtual frame then become p(x_p, y_p) = (x_c - gain*D/2, y_c - gain*D/2), q(x_q, y_q) = (x_c + gain*D/2, y_c + gain*D/2). The speed gain makes the cursor moving speed adjustable.
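A sketch of the speed-gain computation follows; the N == 1 branch is our assumption (the formula divides by N - 1, which the text does not define for a single speed level):

```python
def cursor_gain(k, n, v_min, v_max):
    """Gain from speed level k in [1, n]; v_min and v_max are the slowest
    and fastest cursor speeds the system supports."""
    if n == 1:
        return 1.0  # single level: assume full speed to avoid division by zero
    v = v_min + (k - 1) / (n - 1) * (v_max - v_min)
    return v / v_max
```

The resulting gain scales the side of the virtual frame to gain * D, which in turn adjusts how fast the cursor moves for a given hand displacement.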
The embodiment of the application further comprises judging whether the hand key point coordinate m(x_m, y_m) is within the virtual frame: if not, the virtual frame is updated; if it is, the cursor mapping coordinate is calculated. Updating the virtual frame comprises the following cases:
Case 1: x_m < x_p: offset = x_p - x_m; x_p = x_m; x_q = x_q - offset;
Case 2: x_m > x_q: offset = x_m - x_q; x_q = x_m; x_p = x_p + offset;
Case 3: y_m < y_p: offset = y_p - y_m; y_p = y_m; y_q = y_q - offset;
Case 4: y_m > y_q: offset = y_m - y_q; y_q = y_m; y_p = y_p + offset.
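The four cases translate directly into code. A sketch (function name ours; case 3 compares against y_p, consistent with its offset = y_p - y_m):

```python
def update_virtual_frame(m, p, q):
    """Shift the virtual frame so that keypoint m = (x_m, y_m) lies inside
    it, preserving the frame's size."""
    (x_p, y_p), (x_q, y_q) = p, q
    x_m, y_m = m
    if x_m < x_p:                      # case 1: m is left of the frame
        offset = x_p - x_m
        x_p, x_q = x_m, x_q - offset
    elif x_m > x_q:                    # case 2: m is right of the frame
        offset = x_m - x_q
        x_q, x_p = x_m, x_p + offset
    if y_m < y_p:                      # case 3: m is above the frame
        offset = y_p - y_m
        y_p, y_q = y_m, y_q - offset
    elif y_m > y_q:                    # case 4: m is below the frame
        offset = y_m - y_q
        y_q, y_p = y_m, y_p + offset
    return (x_p, y_p), (x_q, y_q)
```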
When the cursor moves to the screen boundary, the virtual frame adapts itself, so no "moved out of the virtual frame" error occurs. Compared with simply clamping the cursor to the screen boundary once it moves out of the virtual frame, this adaptive adjustment effectively eliminates the damping phenomenon when the cursor returns from the boundary.
Before calculating the shoulder width D, the embodiment of the application filters the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b). This may use a data filtering method such as smoothing filtering or wavelet filtering.
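As one of the filtering options mentioned (smoothing), a minimal sliding-window mean over a stream of coordinates could look like this; the class and the default window size are our assumptions:

```python
from collections import deque

class SmoothingFilter:
    """Sliding-window mean over recent 2-D coordinates (simple smoothing;
    the text also allows wavelet or other data filtering methods)."""

    def __init__(self, window=5):
        self._buf = deque(maxlen=window)  # keeps only the last `window` points

    def update(self, point):
        """Add a new (x, y) sample and return the windowed mean."""
        self._buf.append(point)
        n = len(self._buf)
        return (sum(x for x, _ in self._buf) / n,
                sum(y for _, y in self._buf) / n)
```

One filter instance per shoulder endpoint keeps the width estimate D stable between frames.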
The second aspect of the embodiments of the present application further provides a network device, as shown in FIG. 6, including a memory, a processor, and a computer program stored on the memory and executable on the processor; when executing the program, the processor performs the above machine-vision-based gesture cursor mapping method. The network device acquires the pictures shot by the camera and may be, for example, a computer or an iPad.
The third aspect of the embodiments of the present application further provides a storage medium, on which computer program instructions are stored, and the program instructions, when executed by a processor, are configured to implement the above-mentioned machine vision-based gesture cursor mapping method.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
The present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed.

Claims (7)

1. A machine-vision-based gesture cursor mapping method, characterized by comprising:
acquiring a key point set of a body;
the key point set comprising the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b);
calculating the shoulder width D from the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b);
obtaining the hand key point coordinate m(x_m, y_m);
calculating the hand center point coordinate c(x_c, y_c) from the hand key point coordinates;
calculating the two diagonal point coordinates of the virtual frame from the shoulder width D: p(x_p, y_p) = (x_c - D/2, y_c - D/2), q(x_q, y_q) = (x_c + D/2, y_c + D/2);
obtaining the cursor mapping coordinate M(x_M, y_M) of the hand key point on the projection screen from the virtual frame diagonal coordinates p and q, the hand key point coordinate m, and the projection screen width and height: (x_M, y_M) = (width * (x_m - x_p)/(x_q - x_p), height * (y_m - y_p)/(y_q - y_p)).
2. The machine-vision-based gesture cursor mapping method according to claim 1, characterized by further comprising obtaining a cursor speed gain, including:
calculating the speed from the current speed level: V = V_min + (K - 1)/(N - 1) * (V_max - V_min);
gain = V / V_max;
where V_min denotes the minimum cursor moving speed supported by the system, V_max denotes the maximum cursor moving speed supported by the system, N is the number of speed levels (N ≥ 1), and the current speed level K ∈ [1, N];
the two diagonal point coordinates of the virtual frame then become p(x_p, y_p) = (x_c - gain*D/2, y_c - gain*D/2), q(x_q, y_q) = (x_c + gain*D/2, y_c + gain*D/2).
3. The machine-vision-based gesture cursor mapping method according to claim 1 or 2, characterized by further comprising:
judging whether the hand key point coordinate m(x_m, y_m) is within the virtual frame;
if not, updating the virtual frame;
if it is within the virtual frame, calculating the cursor mapping coordinate.
4. The machine-vision-based gesture cursor mapping method of claim 3, characterized in that updating the virtual frame comprises:
Case 1: x_m < x_p: offset = x_p - x_m; x_p = x_m; x_q = x_q - offset;
Case 2: x_m > x_q: offset = x_m - x_q; x_q = x_m; x_p = x_p + offset;
Case 3: y_m < y_p: offset = y_p - y_m; y_p = y_m; y_q = y_q - offset;
Case 4: y_m > y_q: offset = y_m - y_q; y_q = y_m; y_p = y_p + offset.
5. The machine-vision-based gesture cursor mapping method of claim 1, characterized in that, before the shoulder width is calculated, the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b) are subjected to filtering processing.
6. A network device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, performs the machine-vision-based gesture cursor mapping method of any one of claims 1 to 5.
7. A storage medium having computer program instructions stored thereon, characterized in that the program instructions, when executed by a processor, implement the machine-vision-based gesture cursor mapping method of any one of claims 1 to 5.
CN202211015270.0A 2022-08-23 2022-08-23 Gesture cursor mapping method based on machine vision, network equipment and storage medium Pending CN115454236A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211015270.0A CN115454236A (en) 2022-08-23 2022-08-23 Gesture cursor mapping method based on machine vision, network equipment and storage medium


Publications (1)

Publication Number Publication Date
CN115454236A true CN115454236A (en) 2022-12-09

Family

ID=84298479


Country Status (1)

Country Link
CN (1) CN115454236A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination