CN115454236A - Gesture cursor mapping method based on machine vision, network equipment and storage medium - Google Patents
Gesture cursor mapping method based on machine vision, network equipment and storage medium
- Publication number
- CN115454236A (application CN202211015270.0A)
- Authority
- CN
- China
- Prior art keywords
- coordinates
- cursor
- virtual frame
- offset
- mapping method
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
- G06F3/04812—Interaction techniques based on cursor appearance or behaviour, e.g. being affected by the presence of displayed objects
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Position Input By Displaying (AREA)
Abstract
The application provides a machine-vision-based gesture cursor mapping method, a network device, and a storage medium. The method comprises: acquiring a key point set of the body, the set including the two shoulder endpoint coordinates a and b, from which the shoulder width D is calculated; acquiring a hand key point coordinate m; calculating the hand center point coordinate c from the hand key point coordinates; calculating the two diagonal corner coordinates of a virtual frame from the shoulder width D; and obtaining the cursor mapping coordinate M of the hand key point on the projection screen from the diagonal corners p and q of the virtual frame, the hand key point coordinate m, and the width and height of the projection screen. Because the size and position of the virtual frame are generated by an algorithm, the cursor movement speed adapts to different distances, a smaller operation range reduces user fatigue, and the cursor movement speed is adjustable.
Description
Technical Field
The present application relates to the field of communications devices, and in particular, to a gesture cursor mapping method based on machine vision, a network device, and a storage medium.
Background
With advances in computer performance and AI, more interaction modes have appeared in the human-computer interaction field. Compared with traditional interaction modes such as the mouse, keyboard, touch screen, and remote controller, gesture control based on machine vision brings a brand-new operating experience. In the large-screen terminal field, gesture control has become an indispensable function of flagship models from the various manufacturers.
In vision-based gesture interaction, the user's gesture control is simulated as mouse operation, which reduces the user's learning cost and fits the current software ecosystem better, since third-party application software need not be re-adapted to a new gesture interaction mode. The most important part of simulating mouse control with gestures is cursor mapping, and its accuracy and stability are key to the user experience. At present, cursor mapping is usually implemented by direct mapping, whose biggest problem is that the mapped cursor moves at inconsistent speeds at different distances: the farther the distance, the slower the cursor. Direct mapping also requires a larger gesture operation range from the user, which increases the fatigue of the user's gesture operation.
Disclosure of Invention
In order to overcome the problems in the related art, the application provides a gesture cursor mapping method based on machine vision, a network device and a storage medium.
According to a first aspect of embodiments of the present application, there is provided a machine-vision-based gesture cursor mapping method, comprising:
acquiring a key point set of a body;
the key point set comprises the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b);
calculating the shoulder width D from the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b);
obtaining the hand key point coordinate m(x_m, y_m);
calculating the hand center point coordinate c(x_c, y_c) from the hand key point coordinates;
calculating the two diagonal corner coordinates of the virtual frame from the shoulder width D: p(x_p, y_p) = (x_c - D/2, y_c - D/2), q(x_q, y_q) = (x_c + D/2, y_c + D/2);
obtaining the cursor mapping coordinate M(x_M, y_M) of the hand key point on the projection screen from the diagonal corners p and q of the virtual frame, the hand key point coordinate m, and the width and height of the projection screen: (x_M, y_M) = (width*(x_m - x_p)/(x_q - x_p), height*(y_m - y_p)/(y_q - y_p)).
Preferably, the method further comprises obtaining a cursor speed gain, including:
calculating the speed of the current speed level, V = V_min + (K-1)/(N-1)*(V_max - V_min);
gain = V/V_max;
where V_min denotes the minimum cursor movement speed supported by the system, V_max the maximum cursor movement speed supported by the system, N ≥ 1 the number of speed levels, and K ∈ [1, N] the current speed level;
the two diagonal corner coordinates of the virtual frame then become p(x_p, y_p) = (x_c - gain*D/2, y_c - gain*D/2), q(x_q, y_q) = (x_c + gain*D/2, y_c + gain*D/2).
Preferably, the method further comprises:
judging whether the hand key point coordinate m(x_m, y_m) is within the virtual frame;
if not, updating the virtual frame;
if so, calculating the cursor mapping coordinate.
Preferably, updating the virtual frame includes:
Case 1: x_m < x_p: offset = x_p - x_m; x_p = x_m; x_q = x_q - offset;
Case 2: x_m > x_q: offset = x_m - x_q; x_q = x_m; x_p = x_p + offset;
Case 3: y_m < y_p: offset = y_p - y_m; y_p = y_m; y_q = y_q - offset;
Case 4: y_m > y_q: offset = y_m - y_q; y_q = y_m; y_p = y_p + offset.
Preferably, before the shoulder width is calculated, the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b) are filtered.
The network device provided by the second aspect of the present application comprises a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the program, the processor performs the above machine-vision-based gesture cursor mapping method.
A third aspect of the present application provides a storage medium having stored thereon computer program instructions for implementing the above-described machine vision based gesture cursor mapping method when executed by a processor.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
In the embodiments of the present application, a scale conversion is applied to the traditional direct mapping mode by adding a virtual frame. Unlike direct mapping, in which the size and position of the mapped region are fixed, the size and position of the virtual frame are generated by an algorithm, so that the cursor movement speed adapts to different distances, the smaller operation range reduces user fatigue, and the cursor movement speed is adjustable.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments consistent with the present application and together with the application, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of a gesture interaction flow;
fig. 2 is a schematic diagram of obtaining whole-body key point coordinates with the MediaPipe Pose model according to an embodiment of the present application;
FIG. 3 is a schematic diagram of obtaining hand key point coordinates with the MediaPipe Hands model according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating a cursor mapping method according to an embodiment of the present application;
FIG. 5 is a diagram illustrating a transformation relationship between a virtual frame and a projection screen according to an embodiment of the present application;
fig. 6 is a schematic diagram of a hardware framework of a network device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
First, a complete gesture interaction flow is described. It comprises data acquisition, hand key point identification, data preprocessing and control-right judgment, key-point-to-gesture conversion, instruction matching, smoothing filtering, cursor mapping, and system response, as shown in figure 1.
Specifically, data acquisition captures pictures through a monocular camera, with the sampling rate determined by the camera frame rate. Key point collection can be implemented with the MediaPipe framework, a multimedia machine learning model application framework developed and open-sourced by Google. Hand key point identification passes the acquired picture into a hand key point detection model; if a hand is recognized, its key point coordinate information is output. The hand key point information is usually a set of coordinates; 21 key points are common, and the amount of data processed can be increased or decreased according to processing capacity and actual demand. Data preprocessing and control-right judgment preprocess the recognized key point coordinates, decide which hand holds control, output the key point coordinate set of the controlling hand, and feed it into the key-point-to-gesture model. Key-point-to-gesture conversion passes the output of the previous step into a trained gesture classification model for inference and outputs the name of the current gesture, such as FIVE, FIST, or POINTER. Instruction matching outputs the operation instruction matched to the current gesture, such as MOVE or CLICK_DOWN/UP, based on an instruction matching algorithm. Smoothing filtering (cursor anti-shake) filters the key point coordinate information. Cursor mapping (cursor positioning) converts the key point coordinates to cursor coordinates on the projection screen. Finally, the system responds by processing the instruction and positioning the cursor.
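The instruction-matching stage described above can be sketched as a small lookup table. The gesture names (FIVE, FIST, POINTER) and instruction names (MOVE, CLICK_DOWN/UP) follow the examples in the text, but the particular gesture-to-instruction pairings and the `match_command` helper are illustrative assumptions, not the patent's implementation:

```python
# Sketch of the instruction-matching stage: classified gesture name -> operation
# instruction. The pairings below are assumptions; only the names themselves
# appear in the text as examples.
GESTURE_TO_COMMAND = {
    "FIVE": "MOVE",        # open palm moves the cursor (assumed pairing)
    "POINTER": "MOVE",     # extended index finger also moves (assumed pairing)
    "FIST": "CLICK_DOWN",  # closed fist presses the button (assumed pairing)
}

def match_command(gesture_name: str) -> str:
    """Return the operation instruction matched to a gesture, or NONE."""
    return GESTURE_TO_COMMAND.get(gesture_name, "NONE")
```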
In order to solve the problems in the background art, an embodiment of the present application provides a gesture cursor mapping method based on machine vision, as shown in fig. 4, including:
A set of body key point coordinates is obtained. In the embodiment of the present application this is done with the MediaPipe Pose model of the MediaPipe framework, a model for high-fidelity body posture tracking. The key point coordinates of the whole body, here 33 key point coordinates, can be inferred from a single frame picture. These include the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b), from which the shoulder width D = sqrt((x_a - x_b)^2 + (y_a - y_b)^2) is calculated, as shown in fig. 2.
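With the two shoulder endpoints in hand, the shoulder width is their Euclidean distance. A minimal sketch (the function name is an assumption):

```python
import math

def shoulder_width(a, b):
    """Shoulder width D: Euclidean distance between the shoulder endpoints
    a = (x_a, y_a) and b = (x_b, y_b)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])
```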
As shown in FIG. 3, the hand key point coordinates m(x_m, y_m) are obtained: a hand key point coordinate set is acquired through the MediaPipe Hands model of the MediaPipe framework, a high-fidelity hand and finger tracking model. The hand key point coordinates, here 21 key points, are inferred from a single frame picture using machine learning. Each key point output by the MediaPipe Hands model consists of x, y, and z. x and y are normalized to [0.0, 1.0] by the image width and height, respectively. z represents the depth of the coordinate with the wrist depth as origin: the smaller the value, the closer the coordinate is to the camera, and z uses roughly the same scale as x.
The hand center point coordinate c(x_c, y_c) is calculated from the hand key point coordinates by a predetermined algorithm, for example a weighted sum or a linear accumulation (average) of the hand key point coordinates.
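One concrete choice for the predetermined algorithm is the plain average of the hand key points. The equal weighting here is an assumption; the patent's wording allows any weighted variant equally well:

```python
def hand_center(keypoints):
    """Hand center c = (x_c, y_c) as the unweighted mean of the hand key point
    coordinates (one possible instance of the patent's "predetermined algorithm")."""
    n = len(keypoints)
    x_c = sum(x for x, _ in keypoints) / n
    y_c = sum(y for _, y in keypoints) / n
    return x_c, y_c
```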
From the shoulder width D and the hand center point coordinate c(x_c, y_c), the two diagonal corner coordinates of the virtual frame are calculated: p(x_p, y_p) = (x_c - D/2, y_c - D/2), q(x_q, y_q) = (x_c + D/2, y_c + D/2), as shown in FIG. 5.
The cursor mapping coordinate M(x_M, y_M) of the hand key point on the projection screen is obtained from the diagonal corners p and q of the virtual frame, the hand key point coordinate m, and the width and height of the projection screen: (x_M, y_M) = (width*(x_m - x_p)/(x_q - x_p), height*(y_m - y_p)/(y_q - y_p)), as shown in fig. 5. Unlike the direct mapping mode, (x_M, y_M) = (width*x_m, height*y_m), cursor mapping with a virtual frame first computes the position of the key point relative to the virtual frame and then multiplies by the screen width and height to obtain the final on-screen coordinate. Clearly, for the same hand movement, the larger the virtual frame, the slower the cursor moves; the smaller the virtual frame, the faster it moves. The virtual frame is generated from the key points of the user's body and hand: when the user is close to the camera, the user occupies a large proportion of the camera picture and the generated virtual frame is large; when the user is far from the camera, the proportion is small and the generated virtual frame is small. Cursor mapping based on the virtual frame therefore yields a consistent cursor movement speed at different distances. Moreover, since the virtual frame is generated from the shoulder width and the hand center position, the user can move the cursor to any corner of the screen with a sufficiently small movement amplitude, which reduces the fatigue of gesture interaction.
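The frame construction and mapping formulas above can be sketched directly (function names assumed; coordinates normalized to [0, 1] as MediaPipe outputs them):

```python
def virtual_frame(c, d):
    """Diagonal corners p, q of the square virtual frame of side D centered on
    the hand center c, per p = c - D/2, q = c + D/2 componentwise."""
    x_c, y_c = c
    return (x_c - d / 2, y_c - d / 2), (x_c + d / 2, y_c + d / 2)

def map_cursor(m, p, q, width, height):
    """Map the hand key point m to screen coordinates: position relative to the
    virtual frame, scaled by the screen width and height."""
    x = width * (m[0] - p[0]) / (q[0] - p[0])
    y = height * (m[1] - p[1]) / (q[1] - p[1])
    return x, y
```

A hand at the frame center maps to the screen center regardless of how large the frame is, which is exactly why the cursor speed becomes distance-independent.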
After calculating the shoulder width D, the embodiment of the present application further obtains a cursor speed gain: the speed of the current speed level is V = V_min + (K-1)/(N-1)*(V_max - V_min); gain = V/V_max; where V_min denotes the minimum cursor movement speed supported by the system, V_max the maximum cursor movement speed supported by the system, N ≥ 1 the number of speed levels, and K ∈ [1, N] the current speed level. The two diagonal corner coordinates of the virtual frame then become p(x_p, y_p) = (x_c - gain*D/2, y_c - gain*D/2), q(x_q, y_q) = (x_c + gain*D/2, y_c + gain*D/2). The speed gain makes the cursor movement speed adjustable.
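The speed-gain formulas can be sketched as follows. Function names are assumed, and the N = 1 branch is an assumption, since the (N-1) denominator in the formula is undefined there:

```python
def speed_gain(k, n, v_min, v_max):
    """gain = V / V_max with V = V_min + (K-1)/(N-1) * (V_max - V_min)."""
    if n == 1:
        return 1.0  # single speed level: assume full speed (unspecified in the text)
    v = v_min + (k - 1) / (n - 1) * (v_max - v_min)
    return v / v_max

def gained_frame(c, d, gain):
    """Virtual-frame corners with the gain applied to the frame size."""
    x_c, y_c = c
    half = gain * d / 2
    return (x_c - half, y_c - half), (x_c + half, y_c + half)
```

At the top level K = N the gain is 1 and the frame keeps its full side length D; lower levels shrink the frame proportionally.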
The embodiment of the present application also comprises judging whether the hand key point coordinate m(x_m, y_m) is within the virtual frame; if not, updating the virtual frame; if so, calculating the cursor mapping coordinate. The virtual frame is updated as follows:
Case 1: x_m < x_p: offset = x_p - x_m; x_p = x_m; x_q = x_q - offset;
Case 2: x_m > x_q: offset = x_m - x_q; x_q = x_m; x_p = x_p + offset;
Case 3: y_m < y_p: offset = y_p - y_m; y_p = y_m; y_q = y_q - offset;
Case 4: y_m > y_q: offset = y_m - y_q; y_q = y_m; y_p = y_p + offset.
When the cursor moves to the screen boundary, the virtual frame adapts itself, so no "moved out of the virtual frame" error occurs. Compared with simply clamping the cursor to the screen boundary once the hand has moved out of the virtual frame, this adaptive adjustment effectively eliminates the damping (sticking) felt when the cursor returns from the boundary.
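The four update cases can be sketched as one function (name assumed). The x and y cases are independent, so a diagonal exit shifts the frame on both axes:

```python
def update_frame(m, p, q):
    """Slide the virtual frame the minimal distance needed so that the hand key
    point m = (x_m, y_m) lies inside it again (cases 1-4 above)."""
    (x_m, y_m), (x_p, y_p), (x_q, y_q) = m, p, q
    if x_m < x_p:    # case 1: exited through the left edge
        off = x_p - x_m
        x_p, x_q = x_m, x_q - off
    elif x_m > x_q:  # case 2: exited through the right edge
        off = x_m - x_q
        x_q, x_p = x_m, x_p + off
    if y_m < y_p:    # case 3: exited through the top edge
        off = y_p - y_m
        y_p, y_q = y_m, y_q - off
    elif y_m > y_q:  # case 4: exited through the bottom edge
        off = y_m - y_q
        y_q, y_p = y_m, y_p + off
    return (x_p, y_p), (x_q, y_q)
```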
Before calculating the shoulder width D, the embodiment of the present application filters the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b), using a data filtering method such as smoothing filtering or wavelet filtering.
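A minimal filter of the kind the text mentions is an exponential moving average over the coordinate stream. The choice of EMA and the alpha value are assumptions; the patent only requires some smoothing-type filtering:

```python
def ema_filter(points, alpha=0.5):
    """Exponentially smooth a stream of (x, y) coordinates; a higher alpha
    tracks the raw data more closely, a lower alpha smooths more."""
    smoothed, prev = [], None
    for x, y in points:
        if prev is None:
            prev = (x, y)  # seed the filter with the first sample
        else:
            prev = (alpha * x + (1 - alpha) * prev[0],
                    alpha * y + (1 - alpha) * prev[1])
        smoothed.append(prev)
    return smoothed
```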
The second aspect of the embodiments of the present application further provides a network device, as shown in fig. 6, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor; when executing the program, the processor performs the above machine-vision-based gesture cursor mapping method. The device acquires the pictures shot by the camera and may be, for example, a computer or an iPad.
The third aspect of the embodiments of the present application further provides a storage medium, on which computer program instructions are stored, and the program instructions, when executed by a processor, are configured to implement the above-mentioned machine vision-based gesture cursor mapping method.
It will be understood that the present application is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the application is limited only by the appended claims.
The present invention is not intended to be limited to the particular embodiments shown and described, but is to be accorded the widest scope consistent with the principles and novel features herein disclosed.
Claims (7)
1. A machine-vision-based gesture cursor mapping method, characterized by comprising:
acquiring a key point set of a body;
the key point set comprises the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b);
calculating the shoulder width D from the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b);
obtaining the hand key point coordinate m(x_m, y_m);
calculating the hand center point coordinate c(x_c, y_c) from the hand key point coordinates;
calculating the two diagonal corner coordinates of the virtual frame from the shoulder width D: p(x_p, y_p) = (x_c - D/2, y_c - D/2), q(x_q, y_q) = (x_c + D/2, y_c + D/2);
obtaining the cursor mapping coordinate M(x_M, y_M) of the hand key point on the projection screen from the diagonal corners p and q of the virtual frame, the hand key point coordinate m, and the width and height of the projection screen: (x_M, y_M) = (width*(x_m - x_p)/(x_q - x_p), height*(y_m - y_p)/(y_q - y_p)).
2. The machine-vision-based gesture cursor mapping method according to claim 1, further comprising obtaining a cursor speed gain, including:
calculating the speed of the current speed level, V = V_min + (K-1)/(N-1)*(V_max - V_min);
gain = V/V_max;
where V_min denotes the minimum cursor movement speed supported by the system, V_max the maximum cursor movement speed supported by the system, N ≥ 1 the number of speed levels, and K ∈ [1, N] the current speed level;
the two diagonal corner coordinates of the virtual frame then become p(x_p, y_p) = (x_c - gain*D/2, y_c - gain*D/2), q(x_q, y_q) = (x_c + gain*D/2, y_c + gain*D/2).
3. The machine-vision-based gesture cursor mapping method according to claim 1 or 2, further comprising:
judging whether the hand key point coordinate m(x_m, y_m) is within the virtual frame;
if not, updating the virtual frame;
and if so, calculating the cursor mapping coordinate.
4. The machine-vision-based gesture cursor mapping method according to claim 3, wherein updating the virtual frame comprises:
Case 1: x_m < x_p: offset = x_p - x_m; x_p = x_m; x_q = x_q - offset;
Case 2: x_m > x_q: offset = x_m - x_q; x_q = x_m; x_p = x_p + offset;
Case 3: y_m < y_p: offset = y_p - y_m; y_p = y_m; y_q = y_q - offset;
Case 4: y_m > y_q: offset = y_m - y_q; y_q = y_m; y_p = y_p + offset.
5. The machine-vision-based gesture cursor mapping method according to claim 1, wherein before the shoulder width is calculated, the two shoulder endpoint coordinates a(x_a, y_a) and b(x_b, y_b) are filtered.
6. A network device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein when executing the program the processor performs the machine-vision-based gesture cursor mapping method according to any one of claims 1 to 5.
7. A storage medium having stored thereon computer program instructions, wherein the program instructions, when executed by a processor, implement the machine-vision-based gesture cursor mapping method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211015270.0A CN115454236A (en) | 2022-08-23 | 2022-08-23 | Gesture cursor mapping method based on machine vision, network equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211015270.0A CN115454236A (en) | 2022-08-23 | 2022-08-23 | Gesture cursor mapping method based on machine vision, network equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115454236A true CN115454236A (en) | 2022-12-09 |
Family
ID=84298479
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211015270.0A Pending CN115454236A (en) | 2022-08-23 | 2022-08-23 | Gesture cursor mapping method based on machine vision, network equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115454236A (en) |
- 2022-08-23: CN202211015270.0A patent/CN115454236A/en, active, Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021103648A1 (en) | Hand key point detection method, gesture recognition method, and related devices | |
TWI683259B (en) | Method and related device of determining camera posture information | |
CN110188598B (en) | Real-time hand posture estimation method based on MobileNet-v2 | |
Arvo et al. | Fluid sketches: continuous recognition and morphing of simple hand-drawn shapes | |
KR102186864B1 (en) | Network topology adaptive data visualization method, device, equipment and storage medium | |
WO2021120834A1 (en) | Biometrics-based gesture recognition method and apparatus, computer device, and medium | |
Heap et al. | Towards 3D hand tracking using a deformable model | |
CN111768477B (en) | Three-dimensional facial expression base establishment method and device, storage medium and electronic equipment | |
US6788809B1 (en) | System and method for gesture recognition in three dimensions using stereo imaging and color vision | |
CN112329740B (en) | Image processing method, image processing apparatus, storage medium, and electronic device | |
US20230230305A1 (en) | Online streamer avatar generation method and apparatus | |
Huang et al. | Gesture-based system for next generation natural and intuitive interfaces | |
CN112766027A (en) | Image processing method, device, equipment and storage medium | |
TW201329791A (en) | Display method and system with adjustment function | |
CN113289327A (en) | Display control method and device of mobile terminal, storage medium and electronic equipment | |
CN116225256A (en) | Circuit board movement control method and system based on touch screen | |
CN112613384A (en) | Gesture recognition method, gesture recognition device and control method of interactive display equipment | |
CN115147265A (en) | Virtual image generation method and device, electronic equipment and storage medium | |
CN114494046A (en) | Touch trajectory processing method, device, terminal, storage medium and program product | |
CN109857322B (en) | Android-based painting brush width control method and device | |
US20240153184A1 (en) | Real-time hand-held markerless human motion recording and avatar rendering in a mobile platform | |
CN113822097B (en) | Single-view human body posture recognition method and device, electronic equipment and storage medium | |
CN117372604A (en) | 3D face model generation method, device, equipment and readable storage medium | |
CN112891954A (en) | Virtual object simulation method and device, storage medium and computer equipment | |
CN115454236A (en) | Gesture cursor mapping method based on machine vision, network equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||