CN113448485A - Large-screen window control method and device, storage medium and equipment - Google Patents

Large-screen window control method and device, storage medium and equipment

Info

Publication number
CN113448485A
CN113448485A (application CN202110785561.7A)
Authority
CN
China
Prior art keywords
window
event
operating system
gesture
controlling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110785561.7A
Other languages
Chinese (zh)
Inventor
喻纯
翁跃庭
张磊
周诚驰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Interactive Future Beijing Technology Co ltd
Original Assignee
Interactive Future Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Interactive Future Beijing Technology Co ltd filed Critical Interactive Future Beijing Technology Co ltd
Priority to CN202110785561.7A priority Critical patent/CN113448485A/en
Publication of CN113448485A publication Critical patent/CN113448485A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range

Abstract

The application discloses a large-screen window control method and device, a storage medium and equipment. An image frame collected by camera equipment preset on a large screen is acquired, and gesture recognition is performed on the human hand it shows to obtain the plane coordinates of the hand key points. The plane coordinates of the hand key points are input into a classification model as the feature variables of the image frame to obtain a classification result, and the gesture type with the highest probability is taken as the gesture type of the image frame. When it is detected that the distance between the hand key points and the large screen is smaller than a first preset threshold and the gesture type of the image frame is a preset double-finger gesture, the operating system is controlled to generate a transparent covering layer, which captures the user's trigger operation on the large screen. The trigger operation is analyzed to obtain touch information, the interaction event between the user and a window is determined based on the touch information, and the operating system is controlled to perform the control operation corresponding to the interaction event on the window. The scheme can therefore effectively improve the convenience of human-computer interaction with large-screen windows.

Description

Large-screen window control method and device, storage medium and equipment
Technical Field
The application relates to the field of human-computer interaction, and in particular to a large-screen window control method and device, and a corresponding storage medium and equipment.
Background
In order to control (for example, close, maximize, minimize or move) a window on a large screen (usually a touch screen) running an operating system, a user has to interact with a menu or button at the top of the window. This is inconvenient to operate, and a short user may be unable to reach the top of the large screen at all.
Therefore, how to improve the convenience of the man-machine interaction of the large-screen window becomes a problem which needs to be solved urgently in the field.
Disclosure of Invention
The application provides a large-screen window control method, a large-screen window control device, a storage medium and equipment, and aims to improve the convenience of man-machine interaction of the large-screen window.
In order to achieve the above object, the present application provides the following technical solutions:
a large screen window control method comprises the following steps:
acquiring an image frame collected by a camera device preset on a large screen; the image frame is used for indicating a human hand;
performing gesture recognition on the human hand to obtain a plane coordinate of a key point of the hand;
taking the plane coordinates of the key points of the hand as the characteristic variables of the image frames, and inputting the characteristic variables into a classification model to obtain a classification result output by the classification model; the classification model is obtained by training based on taking the characteristic variables of the sample image frames as input and taking the gesture types of the sample image frames marked in advance as training targets; the classification result comprises probabilities of various gesture types;
taking the gesture type with the highest probability as the gesture type of the image frame;
controlling the operating system to generate a transparent covering layer under the condition that the distance between the key point of the hand and the large screen is smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture; the covering layer is used for capturing the triggering operation of a user on the large screen;
analyzing the trigger operation to obtain touch information;
determining an interaction event of the user with a window based on the touch information;
and controlling the operating system to perform control operation corresponding to the interaction event on the window.
Optionally, the controlling the operating system to generate a masking layer with a transparent color when it is detected that the distance between the key point of the hand and the large screen is smaller than a first preset threshold and the gesture type of the image frame is a preset double-finger gesture includes:
calculating the distance between each hand key point in the image frame and a preset rectangle to obtain each first numerical value; the preset rectangle is an imaging area of the large screen in the image frame;
taking the first value with the minimum value as a target value;
and under the condition that the target value is smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture, controlling the operating system to generate a globally-topped covering layer with a transparent color.
Optionally, the controlling the operating system to generate a globally top-located cover layer with a transparent color includes:
controlling the operating system to create a window for representing a covering layer, and setting the window style of the window to be null;
controlling the operating system to adjust the size of the window so that the window covers a preset area;
controlling the operating system to set the background color of the window to be transparent;
controlling the operating system to set the Topmost attribute of the window to true;
controlling the operating system to bind a touch callback event on the window; and the touch callback event is used for intercepting the triggering operation of the user on the large screen.
Optionally, the touch information includes a touch position, a motion track, and a touch duration of the two fingers;
the determining, based on the touch information, an interaction event of the user with a window includes:
judging whether the touch duration is greater than a second preset threshold value or not;
under the condition that the touch duration is greater than the second preset threshold, calculating a change value of the relative distance between the two fingers based on the touch position and the motion track;
judging whether the change value is larger than a third preset threshold value or not;
determining that the interaction event of the user and the target window is a zooming event under the condition that the change value is larger than the third preset threshold value; the target window is an original window which is closest to the touch position in each original window; the zoom event is used to characterize zooming the target window.
Optionally, the method further includes:
determining that the interaction event of the user and the target window is a dragging event under the condition that the variation value is not larger than the third preset threshold; wherein the drag event is used for representing the movement of the target window.
Optionally, the method further includes:
under the condition that the touch duration is not greater than the second preset threshold, calculating the sliding direction of the double fingers based on the touch position and the motion track;
under the condition that the sliding direction indicates upward sliding, determining that the interaction event of the user and the target window is a sliding-up event; the upsliding event is used to characterize maximizing the target window;
under the condition that the sliding direction indicates downward sliding, determining that the interaction event of the user and the target window is a sliding event; the glide-down event is used to characterize minimizing the target window;
determining that the interaction event of the user and the target window is a left-sliding event under the condition that the sliding direction indicates to slide left; the left-sliding event is used for representing the closing of the target window;
under the condition that the sliding direction indicates to slide rightwards, determining that the interaction event of the user and the target window is a right-sliding event; and the right slide event is used for representing and restoring the target window.
Optionally, after the controlling the operating system to perform the control operation corresponding to the interaction event on the window, the method further includes:
and controlling the operating system to close the cover layer.
A large screen window control apparatus comprising:
the acquisition unit is used for acquiring image frames acquired by the camera equipment preset on the large screen; the image frame is used for indicating a human hand;
the recognition unit is used for performing gesture recognition on the human hand to obtain a plane coordinate of a key point of the hand;
the classification unit is used for inputting the plane coordinates of the key points of the hand as the characteristic variables of the image frames into a classification model to obtain a classification result output by the classification model; the classification model is obtained by training based on taking the characteristic variables of the sample image frames as input and taking the gesture types of the sample image frames marked in advance as training targets; the classification result comprises probabilities of various gesture types;
a first determining unit, configured to use the gesture type with the highest probability as the gesture type of the image frame;
the first control unit is used for controlling the operating system to generate a covering layer with a transparent color under the condition that the distance between the hand key point and the large screen is smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture; the covering layer is used for capturing the triggering operation of a user on the large screen;
the analysis unit is used for analyzing the trigger operation to obtain touch information;
a second determining unit, configured to determine an interaction event of the user with a window based on the touch information;
and the second control unit is used for controlling the operating system to perform control operation corresponding to the interactive event on the window.
A computer-readable storage medium including a stored program, wherein the program executes the large-screen window control method.
A large screen window control device comprising: a processor, a memory, and a bus; the processor and the memory are connected through the bus;
the memory is used for storing programs, and the processor is used for running the programs, wherein the large screen window control method is executed when the programs run.
According to the technical scheme, the image frames collected by the camera shooting equipment preset on the large screen are acquired, and the image frames are used for indicating the hands of the human body. And performing gesture recognition on the human hand to obtain the plane coordinates of the key points of the hand. And inputting the plane coordinates of the key points of the hand as characteristic variables of the image frame into the classification model to obtain a classification result output by the classification model. The classification model is obtained by training based on taking the characteristic variables of the sample image frames as input and taking the gesture types of the pre-labeled sample image frames as training targets. The classification result includes probabilities of the respective gesture types. And taking the gesture type with the highest probability as the gesture type of the image frame. And under the condition that the distance between the key point of the hand and the large screen is detected to be smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture, controlling an operating system to generate a covering layer with a transparent color, wherein the covering layer is used for capturing the triggering operation of the user on the large screen. And analyzing the trigger operation to obtain the touch information. Based on the touch information, an interaction event of the user with the window is determined. And controlling the operating system to perform control operation corresponding to the interactive event on the window. Compared with the prior art, the control operation of the window can be realized only by making the corresponding gesture (namely the preset double-finger gesture) by the user to trigger the large screen without the interaction of the user and the menu or the button at the top of the window. In addition, the color of the covering layer is transparent, so that the display of the original window cannot be influenced. Therefore, by means of the scheme, convenience of man-machine interaction of the large-screen window can be effectively improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1a is a schematic diagram of a large screen window control method provided in an embodiment of the present application;
fig. 1b is a schematic diagram of a large screen provided in an embodiment of the present application;
fig. 1c is a schematic view of another large screen provided in the embodiment of the present application;
FIG. 1d is a schematic diagram of a sample image frame provided by an embodiment of the present application;
FIG. 1e is a schematic diagram of another sample image frame provided by an embodiment of the present application;
FIG. 1f is a schematic diagram of an image frame provided in an embodiment of the present application;
FIG. 1g is a schematic diagram of a two-finger gesture provided in the present application;
FIG. 1h is a schematic diagram of an interaction event according to an embodiment of the present application;
fig. 2 is a schematic diagram of another large-screen window control method provided in the embodiment of the present application;
fig. 3 is a schematic structural diagram of a large-screen window control device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
As shown in fig. 1a, a schematic diagram of a method for controlling a large screen window provided in an embodiment of the present application includes the following steps:
s101: and acquiring a sample image frame collected by a camera device preset on a large screen.
The image capturing device may be, but is not limited to being, disposed directly above the large screen, from where it can capture the environment in front of the large screen. A specific installation position and the environment that can be captured are shown in fig. 1b and fig. 1c (in fig. 1b and fig. 1c, "fisheye lens camera" represents the image capturing device, "large screen" represents the large screen, and "interactive space" represents the environment). For example, the camera can record the entire process from the user raising a hand to touching the large screen; its angle of view is 180 degrees, and it captures images at a frame rate of 30 frames per second.
It should be noted that the image capturing device includes, but is not limited to, an RGB fisheye camera. In an embodiment of the application, the sample image frames are used to indicate a human hand.
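Purely as an illustration (the application does not prescribe any particular capture API), acquiring such image frames could look like the following minimal OpenCV sketch; the camera index and the explicit frame-rate setting are assumptions:

```python
import cv2

# Minimal frame-acquisition sketch; the camera index (0) and the use of OpenCV are
# assumptions, not specified by the application.
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FPS, 30)   # capture at 30 frames per second, as described above

ok, frame = cap.read()          # one BGR image frame of the space in front of the screen
if ok:
    print("captured frame of size", frame.shape)
cap.release()
```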
S102: and marking a type identifier for the sample image frame.
Wherein the type identification is used for indicating the gesture type of the sample image frame.
S103: and inputting the sample image frame into a preset gesture recognition model to obtain a characteristic variable output by the gesture recognition model.
The preset gesture recognition model includes, but is not limited to, the gesture recognition neural network model in the MediaPipe framework (this model is pre-trained on a large hand-image data set, so the officially provided model parameters can be used directly without retraining). The gesture recognition neural network model performs gesture recognition on the human hand in the sample image frame and marks each hand key point in the sample image frame, that is, it outputs the plane coordinates (abscissa and ordinate) of each hand key point.
In the present embodiment, the number of feature variables may be forty-two, i.e., representing the planar coordinates of twenty-one hand keypoints. In other words, the plane coordinates of the hand key points shown in the sample image frame may be regarded as the feature variables of the sample image frame.
Specifically, the sample image frame can be referred to as fig. 1d, and accordingly, gesture recognition is performed on the human hand in the sample image frame, and the recognition result is shown in fig. 1 e.
It should be noted that the above specific implementation process is only for illustration.
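As a concrete sketch of S103, the MediaPipe Hands solution mentioned above can be used to obtain the 42 feature variables (the x and y coordinates of 21 hand key points); the image path and the single-hand assumption below are illustrative:

```python
import cv2
import mediapipe as mp

# Sketch of S103: obtain the 42 feature variables (x, y of 21 hand key points)
# with the pre-trained MediaPipe Hands model; "sample.jpg" is a placeholder path.
image = cv2.cvtColor(cv2.imread("sample.jpg"), cv2.COLOR_BGR2RGB)

with mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
    result = hands.process(image)

features = []
if result.multi_hand_landmarks:
    for lm in result.multi_hand_landmarks[0].landmark:   # 21 hand key points
        features.extend([lm.x, lm.y])                     # normalized plane coordinates

print(len(features))  # 42 when one hand is detected
```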
S104: and taking the characteristic variable of the sample image frame as input, taking the gesture type of the sample image frame as a training target, and training to obtain a classification model.
The network structure of the classification model includes, but is not limited to: VGG19 (typically used as the backbone), with 16 convolutional layers and 3 fully-connected layers. When training the classification model, an Adam optimizer can be used to minimize the cross-entropy loss function, which improves the training efficiency of the classification model.
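A minimal training sketch of S104 is shown below. Note that it replaces the VGG19-style network named above with a small fully-connected stand-in over the 42 key-point coordinates; the layer sizes and the number of gesture classes are assumptions, while the Adam optimizer and cross-entropy loss follow the text:

```python
import torch
import torch.nn as nn

# Simplified stand-in for S104: classify a 42-dimensional key-point feature vector
# into gesture types. Layer sizes and NUM_CLASSES are assumed for illustration.
NUM_CLASSES = 5
model = nn.Sequential(
    nn.Linear(42, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, NUM_CLASSES),          # logits; softmax gives per-type probabilities
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # Adam optimizer, as in the text
loss_fn = nn.CrossEntropyLoss()                             # cross-entropy loss, as in the text

def train_step(features, labels):
    """features: (batch, 42) float tensor; labels: (batch,) gesture-type indices."""
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Example with random placeholder data:
print(train_step(torch.randn(8, 42), torch.randint(0, NUM_CLASSES, (8,))))
```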
S105: and under the condition that the operating system is detected to be started, acquiring image frames acquired by the camera equipment according to a preset time interval.
When the operating system is started, the large screen is determined to enter a human-computer interaction state. The preset time interval may also be understood as a preset frame rate, and in the field of image capturing, the time interval is usually represented by a frame rate.
In an embodiment of the application, the image frames are used to indicate a human hand.
S106: and inputting the image frame into the gesture recognition model to obtain the characteristic variable output by the gesture recognition model.
S107: and inputting the characteristic variables of the image frames into the classification model to obtain a classification result output by the classification model.
Wherein the classification result comprises probabilities of the respective gesture types.
In the embodiment of the present application, the classification result output by the classification model may be represented as a vector whose number of elements equals the total number of gesture types, where each element represents the probability that one gesture type appears in the image frame.
S108: and taking the gesture type with the highest probability as the gesture type of the image frame.
The gesture types of the image frames can be stored in a preset queue according to the sequence of the timestamps of the image frames from early to late.
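A small sketch of S107–S108: the highest-probability gesture type is taken as the frame's gesture type and appended to a timestamp-ordered queue (the names below are illustrative):

```python
from collections import deque

# Sketch of S107-S108: the classification result is a probability vector over gesture
# types; the type with the highest probability becomes the frame's gesture type and is
# appended to a queue ordered by timestamp.
gesture_queue = deque()

def record_gesture(timestamp, probabilities):
    gesture_type = max(range(len(probabilities)), key=probabilities.__getitem__)  # argmax
    gesture_queue.append((timestamp, gesture_type))
    return gesture_type

print(record_gesture(0.033, [0.1, 0.7, 0.2]))  # -> 1
```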
S109: and calculating the distance between each hand key point in the image frame and a preset rectangle to obtain each first numerical value.
The preset rectangle is an imaging area of the large screen in the image frame. Since the camera device is arranged right above the large screen, the large screen can be regarded as a fixed rectangle (i.e. a preset rectangle) in the image frame. In the embodiment of the present application, the width of the preset rectangle may be the width of the entire image frame, and the height of the preset rectangle may be 8% of the height of the entire image frame, and the preset rectangle may be located in the middle of the image frame, and a specific preset rectangle may be shown in fig. 1 f.
It should be noted that, the process of calculating the distance between the key point of the hand and the preset rectangle, that is, the process of calculating the first numerical value, is shown in formula (1).
d = √(dx² + dy²)  (1)
In formula (1), dx = max(rect_min_x - x, 0, x - rect_max_x) and dy = max(rect_min_y - y, 0, y - rect_max_y), where rect_min_x, rect_max_x, rect_min_y and rect_max_y are the minimum and maximum x and y coordinates of the preset rectangle (i.e., they determine its four vertices), x is the abscissa of the hand key point, and y is its ordinate.
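Formula (1) can be transcribed directly; the following sketch assumes the rectangle is given by its minimum and maximum coordinates as in the definitions above:

```python
import math

def point_to_rect_distance(x, y, rect_min_x, rect_min_y, rect_max_x, rect_max_y):
    """Formula (1): distance from a hand key point (x, y) to the preset rectangle.

    The distance is 0 when the point lies inside the rectangle.
    """
    dx = max(rect_min_x - x, 0, x - rect_max_x)
    dy = max(rect_min_y - y, 0, y - rect_max_y)
    return math.sqrt(dx * dx + dy * dy)

# Example: one "first numerical value" against a rectangle spanning the full image width
# and 8% of its height, centered vertically (normalized coordinates).
print(point_to_rect_distance(0.50, 0.02, 0.0, 0.46, 1.0, 0.54))
```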
S110: and taking the first value with the minimum value as a target value.
S111: and under the condition that the target value is smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture, controlling the operating system to generate a globally-topped covering layer with transparent color.
The covering layer is used for capturing the user's trigger operation on the large screen. The color of the covering layer is transparent, so it does not affect the display of the content in the original window. A so-called globally top-most covering layer is one that covers the original window, preventing the user from operating the original window directly.
Furthermore, besides the human hand, the image frames captured by the image capturing apparatus may also indicate the people (i.e., users) in front of the large screen. When it is detected that the image frame indicates multiple users and the original windows they trigger are all the same window, the operating system is controlled to allow only the user who triggered the window first to perform the trigger operation (i.e., the covering layer captures only that user's trigger operations on the window), and the trigger operations of the other users on the window are blocked until the first user finishes, after which the block on the other users is removed.
It should be noted that the specific steps for having the operating system (e.g., Windows, Android) generate the globally top-most covering layer with a transparent color are as follows (see the example after this list):
1. The operating system is controlled to create a window for representing the covering layer, and the window style of the window is set to null.
2. And controlling the operating system to adjust the size of the window so that the window covers the preset area.
In the embodiment of the present application, the preset area includes, but is not limited to, the full screen, or a rectangular area centered on the average of the plane coordinates of all of the user's hand key points and whose area exceeds a preset range.
3. And controlling the operating system to set the background color of the window to be transparent.
4. The operating system is controlled to set the Topmost attribute of the window to true (i.e., to make the window globally top-most).
5. And controlling the operating system to bind the touch callback event on the window.
The touch callback event is used for intercepting the triggering operation of a user on the large screen.
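The five steps above map onto most windowing toolkits. The sketch below uses tkinter only for illustration; the application describes the overlay in terms of an operating-system window with a Topmost attribute, and the near-zero alpha value (a fully transparent window may not receive input on every platform) and the mouse-event bindings standing in for touch events are assumptions:

```python
import tkinter as tk

# Sketch of steps 1-5: a borderless, top-most, (nearly) transparent full-screen window
# that intercepts trigger events. tkinter is used only for illustration.
overlay = tk.Tk()
overlay.overrideredirect(True)                                   # step 1: no window style
w, h = overlay.winfo_screenwidth(), overlay.winfo_screenheight()
overlay.geometry(f"{w}x{h}+0+0")                                 # step 2: cover the preset area (full screen)
overlay.attributes("-alpha", 0.01)                               # step 3: visually transparent
overlay.attributes("-topmost", True)                             # step 4: globally top-most

def on_touch(event):                                             # step 5: bound touch callback
    print("intercepted trigger at", event.x, event.y)

overlay.bind("<ButtonPress-1>", on_touch)
overlay.bind("<B1-Motion>", on_touch)
overlay.bind("<ButtonRelease-1>", on_touch)
overlay.mainloop()
```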
It should be emphasized that the two-finger gesture has a plurality of expressions, and for example, the two-finger gesture shown in fig. 1g can be expressed by using the index finger and the middle finger of one hand, or by using the index fingers of two hands. In the embodiment of the present application, one of the expressions of the two-finger gesture may be used as the preset two-finger gesture.
S112: and analyzing the triggering operation of the user on the large screen to obtain touch information.
Wherein the touch information includes but is not limited to: touch positions, motion tracks and touch duration of the double fingers.
S113: and judging whether the touch duration of the double fingers is greater than a second preset threshold value.
If the touch duration of the two fingers is greater than the second preset threshold, executing S114, otherwise executing S117.
S114: and calculating a change value of the relative distance between the two fingers based on the touch positions and the motion tracks of the two fingers, and judging whether the change value is greater than a third preset threshold value.
If the variation value is greater than the third preset threshold value, determining that the interaction event between the user and the target window is a zooming event, and executing S115, otherwise, determining that the interaction event between the user and the target window is a dragging event, and executing S116.
The specific implementation manner of calculating the change value of the relative distance between the two fingers based on the touch positions and the motion trajectories of the two fingers is common knowledge familiar to those skilled in the art, and is not described herein again.
It should be noted that the zoom event is used to characterize zooming the target window, and the drag event is used to characterize moving the target window. In the embodiment of the present application, the target window includes, but is not limited to: the original window closest to the two-finger touch position among the original windows, the currently most active original window in the operating system, or the original window that is in use.
S115: and calculating the product of the change value and a preset coefficient, and controlling the operating system to scale the target window according to the product.
After execution of S115, execution continues with S122.
The distance between an original window and the two-finger touch position is calculated on the same principle as formula (1): the two-dimensional plane coordinates of the touch position and the center coordinates, length and width of the original window are known (that is, the original window is regarded as a rectangular area whose four vertices are known), so formula (1) can be used to calculate the distance between the original window and the two-finger touch position.
S116: and calculating the moving distance and moving direction of the double fingers based on the touch positions and the motion tracks of the double fingers, and controlling an operating system to move the target window according to the moving distance and the moving direction of the double fingers.
After execution of S116, execution continues with S122.
The specific implementation manner of calculating the moving distance and the moving direction of the two fingers based on the touch positions and the motion tracks of the two fingers is common knowledge familiar to those skilled in the art, and is not described herein again.
S117: and calculating the sliding directions of the double fingers based on the touch positions and the motion tracks of the double fingers.
The specific implementation manner of calculating the sliding directions of the two fingers based on the touch positions and the motion tracks of the two fingers is common general knowledge familiar to those skilled in the art, and is not described herein again.
S118: and under the condition that the sliding direction indicates upward sliding, determining that the interaction event of the user and the target window is a sliding-up event, and controlling the operating system to maximize the target window.
After execution of S118, execution continues with S122.
Wherein the upsliding event is used to characterize the maximized target window.
S119: and under the condition that the sliding direction indicates downward sliding, determining that the interaction event of the user and the target window is a sliding-down event, and controlling the operating system to minimize the target window.
After execution of S119, execution continues with S122.
Wherein the glide down event is used to characterize a minimization of the target window.
S120: and under the condition that the sliding direction indicates to slide leftwards, determining that the interaction event of the user and the target window is a left-sliding event, and controlling the operating system to close the target window.
After performing S120, execution continues with S122.
Wherein the left-sliding event is used to characterize closing the target window.
S121: and under the condition that the sliding direction indicates to slide to the right, determining that the interaction event of the user and the target window is a right sliding event, and controlling the operating system to restore the target window.
After execution of S121, execution continues with S122.
Wherein the right slip event is used to characterize the restore target window.
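The decision logic of S113–S121 can be summarized as follows; the touch-information layout and the threshold values in this sketch are illustrative placeholders, not values given by the application:

```python
import math

def classify_interaction(p1_start, p1_end, p2_start, p2_end, duration,
                         long_press_s=0.5, zoom_delta=40.0):
    """Sketch of S113-S121: map two-finger touch information to an interaction event.

    p*_start / p*_end are (x, y) touch positions of the two fingers; duration is the
    touch duration in seconds. The threshold values are illustrative assumptions.
    """
    if duration > long_press_s:                                   # S113/S114: long touch
        d_start = math.dist(p1_start, p2_start)
        d_end = math.dist(p1_end, p2_end)
        return "zoom" if abs(d_end - d_start) > zoom_delta else "drag"
    # S117-S121: short touch -> swipe direction of the two fingers
    dx = (p1_end[0] - p1_start[0] + p2_end[0] - p2_start[0]) / 2
    dy = (p1_end[1] - p1_start[1] + p2_end[1] - p2_start[1]) / 2
    if abs(dy) >= abs(dx):
        return "slide_up" if dy < 0 else "slide_down"             # screen y grows downward
    return "slide_right" if dx > 0 else "slide_left"

print(classify_interaction((100, 400), (100, 100), (140, 400), (140, 100), 0.2))  # slide_up
```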
It should be noted that the zoom event, the drag event, the slide-up event, the slide-down event, the left slide event, and the right slide event are collectively referred to as an interaction event, and a corresponding relationship between the interaction event and a control instruction (i.e., a control operation performed on a window by an operating system) is shown in table 1.
TABLE 1
Interaction event      Control instruction
Slide-up event         Maximize window
Slide-down event       Minimize window
Left-slide event       Close window
Right-slide event      Restore window
Zoom event             Zoom window
Drag event             Move window
Specifically, the actual operations of the slide-up event, the slide-down event, the left-slide event, and the drag event are shown in sequence in fig. 1h.
It should be noted that the contents shown in table 1 and fig. 1h are only for illustration, and the correspondence between the interaction event and the control command can be set by a technician according to actual situations.
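For illustration, the mapping in Table 1 could be dispatched to an operating-system window roughly as follows; the use of the third-party pygetwindow library, and the way the drag offset and zoom factor are applied, are assumptions:

```python
import pygetwindow as gw  # third-party option, assumed here for illustration

def apply_control(event, window, move_dx=0, move_dy=0, scale=1.0):
    """Dispatch Table 1: interaction event -> control operation on a target window."""
    if event == "slide_up":
        window.maximize()
    elif event == "slide_down":
        window.minimize()
    elif event == "slide_left":
        window.close()
    elif event == "slide_right":
        window.restore()
    elif event == "drag":
        window.moveTo(window.left + move_dx, window.top + move_dy)
    elif event == "zoom":
        window.resizeTo(int(window.width * scale), int(window.height * scale))

# Example: maximize a target window (here simply the currently active window).
apply_control("slide_up", gw.getActiveWindow())
```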
S122: and controlling the operating system to close the overlay.
After the control operation system closes the covering layer, normal interaction between the user and the original window can be recovered.
In summary, image frames are acquired by the camera device preset on the large screen, gesture recognition is performed on the image frames, the gesture type of each image frame is recognized with the classification model, a transparent covering layer is used to capture the user's trigger operation on the large screen, touch information is obtained by analyzing the trigger operation, the interaction event between the user and the window is determined, and the operating system is controlled to perform the control operation corresponding to the interaction event on the window. Compared with the prior art, the user only needs to make the corresponding gesture (namely the preset double-finger gesture) and trigger the large screen to control the window, without interacting with the menu or buttons at the top of the window. In addition, because the covering layer is transparent, it does not affect the display of the original window. Therefore, this scheme can effectively improve the convenience of human-computer interaction with the large-screen window.
It should be noted that, in the foregoing embodiment, the step S101 is an optional specific implementation manner of the large-screen window control method described in this application. In addition, S102 mentioned in the above embodiment is also an optional specific implementation manner of the large-screen window control method described in this application. For this reason, the flow mentioned in the above embodiment can be summarized as the method shown in fig. 2.
As shown in fig. 2, a schematic diagram of another method for controlling a large screen window provided in the embodiment of the present application includes the following steps:
s201: and acquiring an image frame collected by a camera device preset on a large screen.
Wherein the image frames are used to indicate a human hand.
S202: and performing gesture recognition on the human hand to obtain the plane coordinates of the key points of the hand.
In the above embodiment, the process of performing gesture recognition on the image frame by using the preset gesture recognition model is an optional specific implementation manner of the gesture recognition.
S203: and inputting the plane coordinates of the key points of the hand as characteristic variables of the image frame into the classification model to obtain a classification result output by the classification model.
The classification model is obtained by training based on taking the characteristic variables of the sample image frames as input and taking the gesture types of the pre-labeled sample image frames as training targets. The classification result includes probabilities of the respective gesture types.
S204: and taking the gesture type with the highest probability as the gesture type of the image frame.
S205: and under the condition that the distance between the key point of the hand and the large screen is detected to be smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture, controlling the operating system to generate a masking layer with a transparent color.
The cover layer is used for capturing the triggering operation of a user on the large screen.
S206: and analyzing the trigger operation to obtain the touch information.
S207: based on the touch information, an interaction event of the user with the window is determined.
S208: and controlling the operating system to perform control operation corresponding to the interactive event on the window.
In summary, compared with the prior art, the control operation of the window can be realized only by making a corresponding gesture (i.e. a preset double-finger gesture) by the user to trigger the large screen without interaction between the user and the menu or button at the top of the window. In addition, the color of the covering layer is transparent, so that the display of the original window cannot be influenced. Therefore, by the scheme, convenience of man-machine interaction of the large-screen window can be effectively improved.
Corresponding to the large screen window control method provided by the embodiment of the application, the embodiment of the application also provides a large screen window control device.
As shown in fig. 3, a schematic structural diagram of a large screen window control device provided in an embodiment of the present application includes:
an acquiring unit 100, configured to acquire an image frame acquired by a camera device preset on a large screen; the image frames are used to indicate a human hand.
The recognition unit 200 is configured to perform gesture recognition on a human hand to obtain a plane coordinate of a key point of the hand.
The classification unit 300 is configured to input the plane coordinates of the hand key points as feature variables of the image frames into a classification model to obtain a classification result output by the classification model; the classification model is obtained by training based on taking the characteristic variables of the sample image frames as input and taking the gesture types of the pre-labeled sample image frames as training targets; the classification result includes probabilities of the respective gesture types.
A first determining unit 400, configured to use the gesture type with the highest probability as the gesture type of the image frame.
The first control unit 500 is configured to control the operating system to generate a masking layer with a transparent color when it is detected that the distance between the key point of the hand and the large screen is smaller than a first preset threshold and the gesture type of the image frame is a preset double-finger gesture; the cover layer is used for capturing the triggering operation of the user on the large screen.
The first control unit 500 is specifically configured to: calculating the distance between each hand key point and a preset rectangle in the image frame to obtain each first numerical value; the preset rectangle is an imaging area of a large screen in an image frame; taking the first value with the minimum value as a target value; and under the condition that the target value is smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture, controlling the operating system to generate a globally-topped covering layer with transparent color.
The first control unit 500 is configured to control the operating system to generate a globally top-most covering layer with a transparent color through a specific process including: controlling the operating system to create a window for representing the covering layer, and setting the window style of the window to null; controlling the operating system to adjust the size of the window so that the window covers a preset area; controlling the operating system to set the background color of the window to transparent; controlling the operating system to set the Topmost attribute of the window to true; and controlling the operating system to bind a touch callback event on the window, wherein the touch callback event is used for intercepting the trigger operation of the user on the large screen.
And an analyzing unit 600 configured to analyze the trigger operation to obtain the touch information. The touch information comprises the touch positions, the motion tracks and the touch duration of the double fingers.
A second determining unit 700 for determining an interaction event of the user with the window based on the touch information.
The second determining unit 700 is specifically configured to: judging whether the touch duration is greater than a second preset threshold value or not; under the condition that the touch duration is greater than a second preset threshold, calculating a change value of the relative distance between the two fingers based on the touch position and the motion track; judging whether the change value is larger than a third preset threshold value or not; determining the interaction event of the user and the target window as a zooming event under the condition that the variation value is larger than a third preset threshold value; the target window is an original window which is closest to the touch position in each original window; the zoom event is used to characterize the zoom target window.
The second determining unit 700 is further configured to: determining that the interaction event of the user and the target window is a dragging event under the condition that the variation value is not larger than a third preset threshold value; wherein the drag event is used to characterize the moving target window.
The second determining unit 700 is further configured to: under the condition that the touch duration is not greater than a second preset threshold, calculating the sliding direction of the two fingers based on the touch position and the motion track; under the condition that the sliding direction indication slides upwards, determining that an interaction event of a user and a target window is a slide-up event; the upsliding event is used to characterize a maximized target window; under the condition that the sliding direction indicates downward sliding, determining that an interaction event of a user and a target window is a sliding event; a glide-down event is used to characterize a minimization target window; under the condition that the sliding direction indicates to slide leftwards, determining that the interaction event of the user and the target window is a left-sliding event; the left slide event is used for representing the closing of the target window; under the condition that the sliding direction indicates to slide rightwards, determining that the interaction event of the user and the target window is a right-sliding event; the right slip event is used to characterize the restore target window.
And a second control unit 800, configured to control the operating system to perform a control operation corresponding to the interactive event on the window.
A third control unit 900, configured to control the operating system to close the overlay.
In summary, compared with the prior art, the control operation of the window can be realized only by making a corresponding gesture (i.e. a preset double-finger gesture) by the user to trigger the large screen without interaction between the user and the menu or button at the top of the window. In addition, the color of the covering layer is transparent, so that the display of the original window cannot be influenced. Therefore, by the scheme, convenience of man-machine interaction of the large-screen window can be effectively improved.
The application also provides a computer-readable storage medium, which comprises a stored program, wherein the program executes the large-screen window control method provided by the application.
The application also provides a large screen window controlgear, includes: a processor, a memory, and a bus. The processor is connected with the memory through a bus, the memory is used for storing programs, and the processor is used for running the programs, wherein when the programs run, the large screen window control method provided by the application is executed, and the method comprises the following steps:
acquiring an image frame collected by a camera device preset on a large screen; the image frame is used for indicating a human hand;
performing gesture recognition on the human hand to obtain a plane coordinate of a key point of the hand;
taking the plane coordinates of the key points of the hand as the characteristic variables of the image frames, and inputting the characteristic variables into a classification model to obtain a classification result output by the classification model; the classification model is obtained by training based on taking the characteristic variables of the sample image frames as input and taking the gesture types of the sample image frames marked in advance as training targets; the classification result comprises probabilities of various gesture types;
taking the gesture type with the highest probability as the gesture type of the image frame;
controlling the operating system to generate a transparent covering layer under the condition that the distance between the key point of the hand and the large screen is smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture; the covering layer is used for capturing the triggering operation of a user on the large screen;
analyzing the trigger operation to obtain touch information;
determining an interaction event of the user with a window based on the touch information;
and controlling the operating system to perform control operation corresponding to the interaction event on the window.
Optionally, the controlling the operating system to generate a masking layer with a transparent color when it is detected that the distance between the key point of the hand and the large screen is smaller than a first preset threshold and the gesture type of the image frame is a preset double-finger gesture includes:
calculating the distance between each hand key point in the image frame and a preset rectangle to obtain each first numerical value; the preset rectangle is an imaging area of the large screen in the image frame;
taking the first value with the minimum value as a target value;
and under the condition that the target value is smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture, controlling the operating system to generate a globally-topped covering layer with a transparent color.
Optionally, the controlling the operating system to generate a globally top-located cover layer with a transparent color includes:
controlling the operating system to create a window for representing a covering layer, and setting the window style of the window to be null;
controlling the operating system to adjust the size of the window so that the window covers a preset area;
controlling the operating system to set the background color of the window to be transparent;
controlling the operating system to set the Topmost attribute of the window to true;
controlling the operating system to bind a touch callback event on the window; and the touch callback event is used for intercepting the triggering operation of the user on the large screen.
Optionally, the touch information includes a touch position, a motion track, and a touch duration of the two fingers;
the determining, based on the touch information, an interaction event of the user with a window includes:
judging whether the touch duration is greater than a second preset threshold value or not;
under the condition that the touch duration is greater than the second preset threshold, calculating a change value of the relative distance between the two fingers based on the touch position and the motion track;
judging whether the change value is larger than a third preset threshold value or not;
determining that the interaction event of the user and the target window is a zooming event under the condition that the change value is larger than the third preset threshold value; the target window is an original window which is closest to the touch position in each original window; the zoom event is used to characterize zooming the target window.
Optionally, the method further includes:
determining that the interaction event of the user and the target window is a dragging event under the condition that the variation value is not larger than the third preset threshold; wherein the drag event is used for representing the movement of the target window.
Optionally, the method further includes:
under the condition that the touch duration is not greater than the second preset threshold, calculating the sliding direction of the double fingers based on the touch position and the motion track;
under the condition that the sliding direction indicates upward sliding, determining that the interaction event of the user and the target window is a sliding-up event; the upsliding event is used to characterize maximizing the target window;
under the condition that the sliding direction indicates downward sliding, determining that the interaction event of the user and the target window is a sliding event; the glide-down event is used to characterize minimizing the target window;
determining that the interaction event of the user and the target window is a left-sliding event under the condition that the sliding direction indicates to slide left; the left-sliding event is used for representing the closing of the target window;
under the condition that the sliding direction indicates to slide rightwards, determining that the interaction event of the user and the target window is a right-sliding event; and the right slide event is used for representing and restoring the target window.
Optionally, after the controlling the operating system to perform the control operation corresponding to the interaction event on the window, the method further includes:
and controlling the operating system to close the cover layer.
The functions described in the method of the embodiment of the present application, if implemented in the form of software functional units and sold or used as independent products, may be stored in a storage medium readable by a computing device. Based on such understanding, part of the contribution to the prior art of the embodiments of the present application or part of the technical solution may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computing device (which may be a personal computer, a server, a mobile computing device or a network device) to execute all or part of the steps of the method described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A large screen window control method is characterized by comprising the following steps:
acquiring an image frame collected by a camera device preset on a large screen; the image frame is used for indicating a human hand;
performing gesture recognition on the human hand to obtain a plane coordinate of a key point of the hand;
taking the plane coordinates of the key points of the hand as the characteristic variables of the image frames, and inputting the characteristic variables into a classification model to obtain a classification result output by the classification model; the classification model is obtained by training based on taking the characteristic variables of the sample image frames as input and taking the gesture types of the sample image frames marked in advance as training targets; the classification result comprises probabilities of various gesture types;
taking the gesture type with the highest probability as the gesture type of the image frame;
controlling the operating system to generate a transparent covering layer under the condition that the distance between the key point of the hand and the large screen is smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture; the covering layer is used for capturing the triggering operation of a user on the large screen;
analyzing the trigger operation to obtain touch information;
determining an interaction event of the user with a window based on the touch information;
and controlling the operating system to perform control operation corresponding to the interaction event on the window.
2. The method according to claim 1, wherein in the case that it is detected that the distance between the hand key point and the large screen is smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture, controlling the operating system to generate a masking layer with a transparent color comprises:
calculating the distance between each hand key point in the image frame and a preset rectangle to obtain each first numerical value; the preset rectangle is an imaging area of the large screen in the image frame;
taking the first value with the minimum value as a target value;
and under the condition that the target value is smaller than a first preset threshold value and the gesture type of the image frame is a preset double-finger gesture, controlling the operating system to generate a globally-topped covering layer with a transparent color.
3. The method according to claim 2, wherein controlling the operating system to generate a globally topmost transparent overlay layer comprises:
controlling the operating system to create a window representing the overlay layer and to set the window style of the window to null;
controlling the operating system to resize the window so that the window covers a preset area;
controlling the operating system to set the background color of the window to transparent;
controlling the operating system to set the Topmost attribute of the window to true; and
controlling the operating system to bind a touch callback event to the window, wherein the touch callback event is used to intercept the trigger operation performed by the user on the large screen.
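To make claim 3 concrete, here is a hedged Python/tkinter sketch of a borderless, topmost, (near-)transparent full-screen window that intercepts pointer events, standing in for the claimed overlay layer. The patent does not name tkinter or any specific window API; the callback signature and the near-zero alpha value are assumptions (a fully transparent window may stop receiving input on some platforms).

    import tkinter as tk

    def create_overlay(on_touch):
        # on_touch(event_type, x, y) is invoked for every intercepted trigger operation.
        root = tk.Tk()
        root.overrideredirect(True)                    # window style "null": no title bar or border
        w, h = root.winfo_screenwidth(), root.winfo_screenheight()
        root.geometry(f"{w}x{h}+0+0")                  # cover the preset (here: full-screen) area
        root.attributes("-alpha", 0.01)                # visually transparent background
        root.attributes("-topmost", True)              # globally topmost
        # Bind the "touch callback event"; touch points arrive here as pointer events.
        root.bind("<ButtonPress-1>",   lambda e: on_touch("down", e.x, e.y))
        root.bind("<B1-Motion>",       lambda e: on_touch("move", e.x, e.y))
        root.bind("<ButtonRelease-1>", lambda e: on_touch("up", e.x, e.y))
        return root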
4. The method according to claim 1, wherein the touch information comprises the touch positions, motion tracks, and touch duration of two fingers;
and determining, based on the touch information, an interaction event between the user and a window comprises:
judging whether the touch duration is greater than a second preset threshold;
calculating a change value of the relative distance between the two fingers based on the touch positions and the motion tracks when the touch duration is greater than the second preset threshold;
judging whether the change value is greater than a third preset threshold; and
determining that the interaction event between the user and a target window is a zoom event when the change value is greater than the third preset threshold, wherein the target window is the original window closest to the touch positions among the original windows, and the zoom event represents zooming the target window.
5. The method according to claim 4, further comprising:
determining that the interaction event between the user and the target window is a drag event when the change value is not greater than the third preset threshold, wherein the drag event represents moving the target window.
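The zoom/drag decision of claims 4 and 5 can be sketched as below; the threshold values, the sampling of the motion tracks, and the use of math.dist as the distance measure are illustrative assumptions rather than values taken from the patent.

    import math

    def classify_long_touch(track_a, track_b, duration,
                            second_threshold=0.3, third_threshold=40.0):
        # track_a / track_b: lists of (x, y) samples for each finger; duration in seconds.
        if duration <= second_threshold:
            return None                                # short touches are handled as swipes (claim 6)
        start_gap = math.dist(track_a[0], track_b[0])  # initial distance between the two fingers
        end_gap = math.dist(track_a[-1], track_b[-1])  # final distance between the two fingers
        change = abs(end_gap - start_gap)              # change value of the relative distance
        return "zoom" if change > third_threshold else "drag"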
6. The method according to claim 4, further comprising:
calculating a sliding direction of the two fingers based on the touch positions and the motion tracks when the touch duration is not greater than the second preset threshold;
determining that the interaction event between the user and the target window is a slide-up event when the sliding direction indicates an upward slide, wherein the slide-up event represents maximizing the target window;
determining that the interaction event between the user and the target window is a slide-down event when the sliding direction indicates a downward slide, wherein the slide-down event represents minimizing the target window;
determining that the interaction event between the user and the target window is a slide-left event when the sliding direction indicates a leftward slide, wherein the slide-left event represents closing the target window; and
determining that the interaction event between the user and the target window is a slide-right event when the sliding direction indicates a rightward slide, wherein the slide-right event represents restoring the target window.
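A brief sketch of the direction test in claim 6, assuming the sliding direction is taken from the dominant axis of the displacement between the first and last sampled touch positions; the function name and the screen-coordinate convention (y grows downward) are assumptions.

    def classify_swipe(start, end):
        # start / end: (x, y) touch positions at the beginning and end of the slide.
        dx, dy = end[0] - start[0], end[1] - start[1]
        if abs(dy) >= abs(dx):                         # mostly vertical slide
            return "maximize" if dy < 0 else "minimize"    # slide up / slide down
        return "close" if dx < 0 else "restore"            # slide left / slide right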
7. The method according to claim 4, wherein, after controlling the operating system to perform the control operation corresponding to the interaction event on the window, the method further comprises:
controlling the operating system to close the overlay layer.
8. A large-screen window control device, characterized by comprising:
an acquisition unit, configured to acquire an image frame captured by a camera device preset on a large screen, wherein the image frame depicts a human hand;
a recognition unit, configured to perform gesture recognition on the human hand to obtain plane coordinates of hand key points;
a classification unit, configured to input the plane coordinates of the hand key points, as feature variables of the image frame, into a classification model to obtain a classification result output by the classification model, wherein the classification model is trained with the feature variables of sample image frames as input and the pre-labelled gesture types of the sample image frames as training targets, and the classification result comprises the probability of each gesture type;
a first determining unit, configured to take the gesture type with the highest probability as the gesture type of the image frame;
a first control unit, configured to control an operating system to generate a transparent overlay layer when the distance between the hand key points and the large screen is smaller than a first preset threshold and the gesture type of the image frame is a preset two-finger gesture, wherein the overlay layer is used to capture a trigger operation performed by a user on the large screen;
a parsing unit, configured to parse the trigger operation to obtain touch information;
a second determining unit, configured to determine, based on the touch information, an interaction event between the user and a window; and
a second control unit, configured to control the operating system to perform a control operation corresponding to the interaction event on the window.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a program, wherein the program, when executed, performs the large-screen window control method according to any one of claims 1 to 7.
10. A large-screen window control apparatus, characterized by comprising: a processor, a memory, and a bus, wherein the processor and the memory are connected through the bus;
the memory is configured to store a program, and the processor is configured to run the program, wherein the program, when running, performs the large-screen window control method according to any one of claims 1 to 7.
CN202110785561.7A 2021-07-12 2021-07-12 Large-screen window control method and device, storage medium and equipment Pending CN113448485A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110785561.7A CN113448485A (en) 2021-07-12 2021-07-12 Large-screen window control method and device, storage medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110785561.7A CN113448485A (en) 2021-07-12 2021-07-12 Large-screen window control method and device, storage medium and equipment

Publications (1)

Publication Number Publication Date
CN113448485A (en) 2021-09-28

Family

ID=77815881

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110785561.7A Pending CN113448485A (en) 2021-07-12 2021-07-12 Large-screen window control method and device, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN113448485A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109255324A (en) * 2018-09-05 2019-01-22 北京航空航天大学青岛研究院 Gesture processing method, interaction control method and equipment
CN109710071A (en) * 2018-12-26 2019-05-03 青岛小鸟看看科技有限公司 A kind of screen control method and device
CN111273778A (en) * 2020-02-14 2020-06-12 北京百度网讯科技有限公司 Method and device for controlling electronic equipment based on gestures

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116311209A (en) * 2023-03-28 2023-06-23 北京匠数科技有限公司 Window detection system method and system and electronic equipment
CN116311209B (en) * 2023-03-28 2024-01-19 北京匠数科技有限公司 Window detection method, system and electronic equipment
CN116931735A (en) * 2023-08-03 2023-10-24 北京行者无疆科技有限公司 AR (augmented reality) glasses display terminal equipment key suspension position identification system and method
CN117572984A (en) * 2024-01-15 2024-02-20 南京极域信息科技有限公司 Operation window positioning method for large touch screen

Similar Documents

Publication Publication Date Title
CN113448485A (en) Large-screen window control method and device, storage medium and equipment
US9104242B2 (en) Palm gesture recognition method and device as well as human-machine interaction method and apparatus
US11221681B2 (en) Methods and apparatuses for recognizing dynamic gesture, and control methods and apparatuses using gesture interaction
KR101526644B1 (en) Method system and software for providing image sensor based human machine interfacing
AU2022202817B2 (en) Method for identifying an object within an image and mobile device for executing the method
JP5297530B2 (en) Image processing apparatus and interface apparatus
TWI489317B (en) Method and system for operating electric apparatus
US20130169530A1 (en) Human eye controlled computer mouse interface
CN111062312A (en) Gesture recognition method, gesture control method, device, medium and terminal device
CN109697394B (en) Gesture detection method and gesture detection device
US20120069168A1 (en) Gesture recognition system for tv control
CN102200830A (en) Non-contact control system and control method based on static gesture recognition
CN107357414B (en) Click action recognition method and device
US20140369559A1 (en) Image recognition method and image recognition system
CN107797748B (en) Virtual keyboard input method and device and robot
CN109376618B (en) Image processing method and device and electronic equipment
WO2019037257A1 (en) Password input control device and method, and computer readable storage medium
KR101433543B1 (en) Gesture-based human-computer interaction method and system, and computer storage media
CN113448442A (en) Large screen false touch prevention method and device, storage medium and equipment
CN113031464A (en) Device control method, device, electronic device and storage medium
CN115826764A (en) Gesture control method and system based on thumb
Ghodichor et al. Virtual mouse using hand gesture and color detection
KR20190132885A (en) Apparatus, method and computer program for detecting hand from video
CN114816045A (en) Method and device for determining interaction gesture and electronic equipment
CN115421590A (en) Gesture control method, storage medium and camera device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination