CN115904063A - Non-contact human-computer interaction pen handwriting generation method, device, equipment and system - Google Patents

Non-contact human-computer interaction pen handwriting generation method, device, equipment and system

Info

Publication number
CN115904063A
Authority
CN
China
Prior art keywords
pen
human
computer interaction
interaction pen
hand detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211346692.6A
Other languages
Chinese (zh)
Inventor
王文通
王梦魁
赵洁
滕达
刘强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Petrochemical Technology
Original Assignee
Beijing Institute of Petrochemical Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Petrochemical Technology filed Critical Beijing Institute of Petrochemical Technology
Priority to CN202211346692.6A
Publication of CN115904063A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a non-contact human-computer interaction pen handwriting generation method, device, equipment and system, and belongs to the technical field of human-computer interaction. The method acquires a current frame image in real time; detects a sub-image of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model; determines the current pen point position of the human-computer interaction pen in the sub-image according to a preset pen point detection model; and, based on a preset multi-target tracking algorithm, tracks the current and historical pen point positions of the human-computer interaction pen in a target video to generate the human-computer interaction pen handwriting. With this scheme, the pen point position is determined after the position of the human-computer interaction pen is obtained, so that the handwriting of the interaction pen is generated. Because the position of the interaction pen is determined in combination with the position of the user's hand, any pen can be tracked well, more continuous handwriting can be generated, and handwriting drawing can be completed in complex scenes.

Description

Non-contact human-computer interaction pen handwriting generation method, device, equipment and system
Technical Field
The invention relates to the technical field of human-computer interaction, in particular to a non-contact human-computer interaction pen handwriting generation method, device, equipment and system.
Background
Human-computer interaction (HCI) technology establishes "communication" between humans and machines on the basis of computer technology. Through specific interaction actions and programs, instruction information is sent to the machine, and the machine feeds back results according to the received instructions to meet the user's needs. Among the various human-computer interaction devices, the human-computer interaction pen is an important carrier for realizing mutual communication and feedback between a human and a computer.
Traditional human-computer interaction pens generally rely on a large number of inertial sensing devices for information acquisition. However, so many sensing devices make the pen expensive to manufacture; in addition, various kinds of noise and external interference arise during operation, and these spurious signals cause errors in the interaction process, reducing the accuracy of the handwriting generated by the interactive pen. Drawing handwriting by means of computer vision avoids the high manufacturing cost of the interactive pen and the low handwriting-generation accuracy caused by sensor signal interference. However, when computer vision is used to detect an object against a complex background, the object is often disturbed or occluded by cluttered, confusable backgrounds or varying scenes, so the target region is hard to extract, handwriting cannot be drawn accurately, and the handwriting becomes discontinuous; such methods therefore have poor universality and cannot be applied to real-life scenes.
Disclosure of Invention
In view of the above, the present invention provides a method, an apparatus, a device and a system for generating handwriting of a non-contact human-computer interaction pen, so as to overcome the problems of low handwriting drawing accuracy, discontinuous handwriting and poor universality.
To achieve the above purpose, the invention adopts the following technical scheme:
In one aspect, a method for generating handwriting of a non-contact human-computer interaction pen comprises the following steps:
acquiring a current frame image in real time;
detecting a sub-image of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model;
determining the current pen point position of the human-computer interaction pen in the sub-image of the human-computer interaction pen according to a preset pen point detection model;
tracking and detecting the current pen point position and the historical pen point positions of the human-computer interaction pen in a target video based on a preset multi-target tracking algorithm to generate the human-computer interaction pen handwriting; wherein the target video comprises the current frame image and historical frame images.
Optionally, the detecting a sub-image of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model includes:
determining a hand detection frame of a user in the current frame image by adopting a hand detection model;
and detecting the current frame image in the user hand detection frame by using the preset interactive pen detection model to obtain a boundary frame of the human-computer interactive pen, and taking the image in the boundary frame of the human-computer interactive pen as a subimage of the human-computer interactive pen.
Optionally, the determining, by using the hand detection model, a user hand detection box in the current frame image includes:
determining a hand detection box in the current frame image as an original user hand detection box by adopting the hand detection model;
and enlarging the original user hand detection box according to a coordinate expansion algorithm to obtain an enlarged user hand detection box, and taking the enlarged user hand detection box as the user hand detection box.
Optionally, the enlarging the original user hand detection box according to the coordinate expansion algorithm to obtain the enlarged user hand detection box includes:
determining the coordinates of each hand key point in the original user hand detection box according to the hand detection model;
determining the minimum value of the abscissa and the maximum value of the ordinate among the hand key point coordinates;
and reducing the minimum value of the abscissa by a preset value and increasing the maximum value of the ordinate by the preset value, so as to obtain the enlarged user hand detection box according to the modified abscissa and ordinate.
Optionally, the preset multi-target tracking algorithm includes: the DeepSORT algorithm.
Optionally, the hand detection model includes a MediaPipe hand detection model; the preset interactive pen detection model comprises a first YOLOv5 convolutional neural network model, wherein the first YOLOv5 convolutional neural network model is obtained by training on interactive pen sample images and predetermined interactive pen bounding box labels;
the preset pen point detection model comprises a second YOLOv5 convolutional neural network model, wherein the second YOLOv5 convolutional neural network model is obtained by training on the interactive pen sample images and predetermined interactive pen tip labels.
In another aspect, a non-contact human-computer interaction pen handwriting generating device comprises:
the acquisition module is used for acquiring a current frame image in real time;
the first detection module is used for detecting a sub-image of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model;
the second detection module is used for determining the current pen point position of the human-computer interaction pen in the sub-image of the human-computer interaction pen according to a preset pen point detection model;
the generating module is used for tracking and detecting the current pen point position and the historical pen point positions of the human-computer interaction pen in a target video based on a preset multi-target tracking algorithm to generate the human-computer interaction pen handwriting; the target video comprises the current frame image and historical frame images.
In another aspect, a system for generating handwriting of a non-contact human-computer interaction pen includes: a camera assembly and a control component connected with each other;
the camera assembly is used for capturing frame images of the human-computer interaction pen at a preset distance;
the control component is used for executing the above non-contact human-computer interaction pen handwriting generation method.
In yet another aspect, a human-computer interaction system includes: a human-computer interaction pen and the non-contact human-computer interaction pen handwriting generating system.
In another aspect, a non-contact human-computer interaction pen handwriting generating device comprises a processor and a memory, the processor being connected with the memory, wherein:
the processor is used for calling and executing the program stored in the memory;
the memory is used for storing the program, and the program is at least used for executing the non-contact man-machine interaction pen handwriting generating method.
The invention provides at least the following beneficial effects:
according to the non-contact human-computer interaction pen handwriting generation method, device, equipment and system provided by the embodiment of the invention, the current frame image is obtained in real time; detecting a subimage of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model; determining the pen point position of the current man-machine interaction pen according to a preset pen point detection model in a subimage of the man-machine interaction pen; tracking and detecting the current pen point position of the human-computer interaction pen and the historical pen point position of the human-computer interaction pen in a target video based on a preset multi-target tracking algorithm to generate a human-computer interaction pen handwriting; the target video comprises a current frame image and a historical frame image. Therefore, by adopting the technical scheme of the application, after the position of the human-computer interaction pen is obtained, the position of the pen point is determined, so that the interaction pen handwriting is generated, wherein the position of the interaction pen can be determined by combining the hand position of a user, any pen can be well tracked, more continuous handwriting can be generated, and handwriting drawing can be completed in a complex scene.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for generating a handwriting of a non-contact human-computer interaction pen according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a human-computer interaction system according to an embodiment of the present invention;
fig. 3 is a schematic diagram illustrating an enlarging process of a hand detection box of a user according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a non-contact human-computer interaction pen handwriting generating device according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of a non-contact human-computer interaction pen handwriting generating system according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a non-contact human-computer interaction pen handwriting generating device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail below. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without any inventive step, are within the scope of the present invention.
Currently available interactive pens can be classified as contact or non-contact:
contact human-computer interaction pen: the technology based on a contact type human-computer interactive pen is applied to mobile computing tools represented by PDA (Personal Digital Assistant), such as Apple Pencil and Logitech Cryon. The Apple Pencil is a multifunctional and visual contact type man-machine interactive pen applied to the iPad, and can provide pixel-level precision when notes, prototypes, paintings, marked documents and the like are taken in the iPad application program. Apple Pencil is expensive, single in applicable equipment, large in size, and meanwhile the pen body is not easy to hold due to excessive slip in the using process.
Non-contact human-computer interaction pen: contactless interactive pens can in principle be divided into sensor-based and vision-based.
A non-contact infrared interactive pen developed by Infinite Z uses virtual reality and real-time human-computer interaction technology in a tablet computer that supports 3D stereoscopic scenes and human-computer interaction. With the infrared interactive pen, the user can perform various operations on a 3D model in the tablet, such as rotation, translation and zooming. In 2019, Logitech released a VR stylus named "Logitech VR Ink Pilot Edition", the first VR pen to work both on two-dimensional surfaces and in mid-air in three-dimensional space, aiming to help professionals design more easily in VR. The stylus is built on SteamVR tracking technology, similar to a typical SteamVR controller, but offers natural precision through finger control.
These two sensor-based non-contact human-computer interaction pens fit people's everyday usage habits and can perform a variety of operations on complex virtual objects. However, sensor-based non-contact interactive pens require a large number of sensors, are expensive to manufacture, and are suitable only for a single scene, so they lack universality.
Based on the above, the embodiment of the invention provides a non-contact human-computer interaction pen handwriting generation method, device, equipment and system.
Fig. 1 is a schematic flowchart of a method for generating handwriting of a non-contact human-computer interaction pen according to an embodiment of the present invention. The execution subject of the method provided by the present application may be a control component, which may be a single-chip microcomputer, a programmable controller, or the like. Referring to fig. 1, this embodiment may include the following steps:
s1, acquiring a current frame image in real time.
In a specific implementation process, the non-contact human-computer interaction pen handwriting generating method provided by the application can be applied to a human-computer interaction system.
Fig. 2 is a schematic structural diagram of a human-computer interaction system according to an embodiment of the present invention. Referring to fig. 2, the system provided in the present application may include: a human-computer interaction pen 21, a camera assembly 22 and a control component 23; the camera assembly is connected with the control component and captures frame images of the human-computer interaction pen at a preset distance. A frame image is a single frame of the captured picture, and a sequence of frame images forms a video.
It should be noted that, to ensure accurate frame image acquisition, when applying the technical solution described in the embodiments of the present application, the user needs to stand at least a preset distance (for example, 1 meter) away from the camera assembly so that it can capture the person's hand posture and the human-computer interaction pen in the hand. The camera assembly can be mounted on a notebook computer, a desktop computer, a mobile phone, and the like.
When generating non-contact human-computer interaction pen handwriting, the camera assembly (for example, a camera) captures the current frame image in real time, and the control component obtains the current frame image once it has been captured.
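For illustration only, the following is a minimal Python sketch of this real-time acquisition, assuming the camera assembly is a standard webcam accessible to OpenCV; the device index 0 is an assumption:

import cv2

cap = cv2.VideoCapture(0)          # open the camera assembly (index 0 assumed)
while cap.isOpened():
    ok, frame = cap.read()         # grab the current frame image in real time
    if not ok:
        break
    # "frame" (an H x W x 3 BGR array) is handed to the detection steps below
    if cv2.waitKey(1) & 0xFF == ord('q'):   # press q to stop capturing
        break
cap.release()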
And S2, detecting a sub-image of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model.
After the current frame image is obtained, it can be input into the preset interaction pen detection model to detect the human-computer interaction pen in the current frame image and obtain a sub-image of the human-computer interaction pen.
It should be noted that the preset interaction pen detection model may be a first YOLOv5 convolutional neural network model, obtained by training on interactive pen sample images and predetermined interactive pen bounding box labels. The training process is not described in detail in this application.
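As the filing does not disclose inference code, the following sketch is an illustration only of invoking such a detector, assuming the first YOLOv5 model was trained with the standard ultralytics/yolov5 pipeline; the weights filename pen_yolov5.pt is hypothetical, and frame comes from step S1:

import torch

# load the custom-trained interaction pen detector (first YOLOv5 model)
pen_model = torch.hub.load('ultralytics/yolov5', 'custom', path='pen_yolov5.pt')

results = pen_model(frame)                 # frame acquired in step S1
boxes = results.xyxy[0].cpu().numpy()      # rows: x1, y1, x2, y2, conf, class
if len(boxes):
    x1, y1, x2, y2 = boxes[0, :4].astype(int)
    pen_subimage = frame[y1:y2, x1:x2]     # sub-image of the interaction pen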
In some embodiments, detecting a sub-image of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model comprises:
determining a user hand detection box in the current frame image by adopting a hand detection model;
and detecting, within the user hand detection box, the current frame image by using the preset interaction pen detection model to obtain a bounding box of the human-computer interaction pen, and taking the image in the bounding box as the sub-image of the human-computer interaction pen.
Notably, the hand detection model includes a MediaPipe hand detection model (e.g., based on the MediaPipe gesture algorithm), and the preset interaction pen detection model is the first YOLOv5 convolutional neural network model described above, trained on interactive pen sample images and predetermined interactive pen bounding box labels.
To improve the accuracy of interactive pen detection, a MediaPipe hand detection model is used in this application to obtain the hand detection box. Once the user hand detection box in the current frame image is acquired, the detection range for the human-computer interaction pen can be narrowed to the area around the user's hand, removing unnecessary interference and improving detection accuracy. The content within the user hand detection box is then processed by the first YOLOv5 convolutional neural network model to obtain the bounding box of the human-computer interaction pen, and the image within that bounding box is taken as the sub-image of the human-computer interaction pen.
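A minimal Python sketch of this two-stage narrowing, assuming the MediaPipe Hands solution; the helper name hand_box and the single-hand setting are assumptions, and pen_model is the detector from the previous sketch:

import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=1)

def hand_box(frame):
    # Return the (x1, y1, x2, y2) pixel box enclosing the detected hand, or None.
    h, w = frame.shape[:2]
    res = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # MediaPipe expects RGB
    if not res.multi_hand_landmarks:
        return None
    lm = res.multi_hand_landmarks[0].landmark        # 21 normalized hand keypoints
    xs = [int(p.x * w) for p in lm]
    ys = [int(p.y * h) for p in lm]
    return min(xs), min(ys), max(xs), max(ys)

box = hand_box(frame)
if box is not None:
    x1, y1, x2, y2 = box
    roi = frame[y1:y2, x1:x2]       # narrow detection to the region around the hand
    pen_results = pen_model(roi)    # pen detector from the previous sketch

In practice, the box returned here would first be enlarged by the coordinate expansion algorithm described next, so that the pen is not clipped at the box border.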
In some embodiments, determining a user hand detection box in the current frame image using a hand detection model includes:
determining a hand detection box in the current frame image as an original user hand detection box by adopting the hand detection model;
and enlarging the original user hand detection box according to a coordinate expansion algorithm to obtain an enlarged user hand detection box, and taking the enlarged user hand detection box as the user hand detection box.
In the present application, to further improve detection accuracy, after the hand detection box is obtained through the hand detection model it is taken as the original user hand detection box, and the original hand detection box is then enlarged by the coordinate expansion algorithm.
In some embodiments, enlarging the original user hand detection box according to the coordinate expansion algorithm to obtain the enlarged user hand detection box comprises:
determining the coordinates of each hand key point in the original user hand detection box according to the hand detection model;
determining the minimum value of the abscissa and the maximum value of the ordinate among the hand key point coordinates;
and reducing the minimum value of the abscissa by a preset value and increasing the maximum value of the ordinate by the preset value, so as to obtain the enlarged user hand detection box according to the modified abscissa and ordinate.
For example, hand skeleton points can be recognized based on the MediaPipe gesture algorithm, which accurately acquires hand position information from a single picture, supports real-time detection, and yields the coordinates of 21 hand and finger joints of the user. From the hand key point data generated by MediaPipe, the hand detection box is enlarged with the coordinate expansion algorithm until it is large enough to fully cover the interactive pen in the hand. Specifically, the coordinates of the 21 hand key points are collected into a list; the minimum and maximum values in the x-axis and y-axis directions are computed over the list; the point with the minimum x value is shifted further in the negative x direction and the maximum y value is shifted further along the y-axis, yielding an enlarged detection box, which is taken as the user hand detection box. To guarantee the expansion effect, the preset value is adjusted so that the enlarged user hand detection box completely contains both the hand and the pen.
Fig. 3 is a schematic diagram of the enlarging process of the user hand detection box according to an embodiment of the present invention: one side of the arrow shows the original user hand detection box generated by MediaPipe, and the right side of the arrow shows the enlarged user hand detection box, which completely contains both the hand and the pen.
In the present application, the hand coordinate expansion algorithm is exemplified below.
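Because the filing presents the pseudocode only as an image, the following Python sketch reconstructs the expansion from the textual description above; the margin of 60 pixels and the function name expand_hand_box are assumptions:

def expand_hand_box(keypoints, margin=60, img_w=None, img_h=None):
    # keypoints: the 21 (x, y) pixel coordinates produced by MediaPipe.
    # Per the description, the minimum abscissa is reduced and the maximum
    # ordinate is increased by a preset value so the box also covers the pen.
    xs = [x for x, _ in keypoints]
    ys = [y for _, y in keypoints]
    x_min = min(xs) - margin            # shift the minimum-x edge outward
    y_max = max(ys) + margin            # shift the maximum-y edge outward
    x_max, y_min = max(xs), min(ys)
    if img_w is not None:               # clamp to the image bounds if known
        x_min, x_max = max(0, x_min), min(img_w, x_max)
    if img_h is not None:
        y_min, y_max = max(0, y_min), min(img_h, y_max)
    return x_min, y_min, x_max, y_max

Increasing margin until the expanded box always contains both the hand and the pen mirrors the tuning of the preset value described above.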
And S3, determining the current pen point position of the human-computer interaction pen in the sub-image of the human-computer interaction pen according to the preset pen point detection model.
Here, the preset pen point detection model comprises a second YOLOv5 convolutional neural network model, obtained by training on the interactive pen sample images and predetermined interactive pen tip labels.
The current pen point position of the human-computer interaction pen is determined in the sub-image of the human-computer interaction pen according to the second YOLOv5 convolutional neural network model.
S4, tracking and detecting the current pen point position and the historical pen point positions of the human-computer interaction pen in the target video based on a preset multi-target tracking algorithm to generate the human-computer interaction pen handwriting; the target video comprises the current frame image and historical frame images.
The target video contains the current frame image and at least two historical frame images, the frames together forming the video. After the pen point positions of the human-computer interaction pen in the current frame image and in the historical frame images are obtained, the multi-target tracking algorithm is applied to track the pen point positions and generate the handwriting of the human-computer interaction pen.
In some embodiments, the preset multi-target tracking algorithm comprises the DeepSORT algorithm. DeepSORT is an improved version of the SORT multi-target tracking algorithm; by improving the association step, it increases the accuracy of tracking objects occluded for a long time and reduces frequent switching of tracked-object IDs.
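A minimal Python sketch of this tracking step, assuming the third-party deep-sort-realtime package as one available DeepSORT implementation; the tip_boxes input format and the OpenCV trajectory drawing are illustrative:

import cv2
from deep_sort_realtime.deepsort_tracker import DeepSort

tracker = DeepSort(max_age=30)    # tolerate short occlusions of the pen tip
trajectory = []                   # historical pen point positions across frames

def track_tip(frame, tip_boxes):
    # tip_boxes: (x1, y1, x2, y2, conf) pen tip detections for the current frame;
    # the package expects ([left, top, width, height], confidence, class) tuples
    detections = [([x1, y1, x2 - x1, y2 - y1], conf, 'tip')
                  for x1, y1, x2, y2, conf in tip_boxes]
    for track in tracker.update_tracks(detections, frame=frame):
        if not track.is_confirmed():
            continue
        l, t, r, b = track.to_ltrb()
        trajectory.append((int((l + r) / 2), int((t + b) / 2)))  # tip centre
    # connect current and historical tip positions into the handwriting
    for p, q in zip(trajectory, trajectory[1:]):
        cv2.line(frame, p, q, (0, 255, 0), 2)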
According to the non-contact human-computer interaction pen handwriting generation method provided by the embodiment of the invention, the current frame image is acquired in real time; a sub-image of the human-computer interaction pen is detected in the current frame image according to a preset interaction pen detection model; the current pen point position of the human-computer interaction pen is determined in the sub-image according to a preset pen point detection model; and the current and historical pen point positions of the human-computer interaction pen in a target video, which comprises the current frame image and historical frame images, are tracked based on a preset multi-target tracking algorithm to generate the human-computer interaction pen handwriting. With this scheme, the pen point position is determined after the position of the human-computer interaction pen is obtained, so that the handwriting of the interaction pen is generated. Because the position of the interaction pen is determined in combination with the position of the user's hand, any pen can be tracked well, more continuous handwriting can be generated, and handwriting drawing can be completed in complex scenes.
To demonstrate the technical effects of the solutions provided in the present application, a verification embodiment is also provided:
To test the invention, ten users were randomly selected, and each user attempted to draw ten shapes. In the experiment, the user stood 1 m from the camera, facing it directly. The drawn trajectory of each shape was judged: if the trajectory matched the preset trajectory it was counted as correct and recorded as T, otherwise as F. The sum is the number of times each trace was drawn successfully. The specific results are shown in table 1.
Table 1: track test chart
Figure BDA0003917449010000111
As table 1 shows, the overall test accuracy across the 10 testers is 88%; the shape of the interactive pen and the writing habits of the randomly selected users both influence the accuracy of trajectory determination. The experiment shows that the method works well in practice, requires no special hardware, can be applied to any pen, and therefore has universality.
Based on a general inventive concept, the embodiment of the present invention further provides a non-contact human-computer interaction pen handwriting generating device, which is used for implementing the above method embodiment.
Fig. 4 is a schematic structural diagram of a non-contact human-computer interaction pen handwriting generating device according to an embodiment of the present invention, and referring to fig. 4, the device according to the embodiment of the present application may include:
an obtaining module 41, configured to obtain a current frame image in real time;
a first detection module 42, configured to detect a sub-image of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model;
a second detection module 43, configured to determine the current pen point position of the human-computer interaction pen in the sub-image of the human-computer interaction pen according to a preset pen point detection model;
and a generating module 44, configured to track and detect the current and historical pen point positions of the human-computer interaction pen in a target video based on a preset multi-target tracking algorithm to generate the human-computer interaction pen handwriting; the target video comprises the current frame image and historical frame images.
Optionally, the first detection module is specifically configured to determine a user hand detection box in the current frame image by using a hand detection model;
and to detect, within the user hand detection box, the current frame image by using the preset interaction pen detection model to obtain a bounding box of the human-computer interaction pen, taking the image in the bounding box as the sub-image of the human-computer interaction pen.
Optionally, the first detection module is specifically configured to determine a hand detection box in the current frame image as an original user hand detection box by using the hand detection model;
and to enlarge the original user hand detection box according to a coordinate expansion algorithm to obtain an enlarged user hand detection box, taking the enlarged user hand detection box as the user hand detection box.
Optionally, the first detection module is specifically configured to determine the coordinates of each hand key point in the original user hand detection box according to the hand detection model;
to determine the minimum value of the abscissa and the maximum value of the ordinate among the hand key point coordinates;
and to reduce the minimum value of the abscissa by a preset value and increase the maximum value of the ordinate by the preset value, so as to obtain the enlarged user hand detection box according to the modified abscissa and ordinate.
With regard to the apparatus in the above embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be described in detail here.
The non-contact human-computer interaction pen handwriting generating device provided by the embodiment of the invention acquires the current frame image in real time; detects a sub-image of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model; determines the current pen point position of the human-computer interaction pen in the sub-image according to a preset pen point detection model; and tracks the current and historical pen point positions of the human-computer interaction pen in a target video, which comprises the current frame image and historical frame images, based on a preset multi-target tracking algorithm to generate the human-computer interaction pen handwriting. With this scheme, the pen point position is determined after the position of the human-computer interaction pen is obtained, so that the handwriting of the interaction pen is generated. Because the position of the interaction pen is determined in combination with the position of the user's hand, any pen can be tracked well, more continuous handwriting can be generated, and handwriting drawing can be completed in complex scenes.
Based on a general inventive concept, the embodiment of the invention also provides a non-contact human-computer interaction pen handwriting generating system.
Fig. 5 is a schematic structural diagram of a non-contact human-computer interaction pen handwriting generating system according to an embodiment of the present invention, and referring to fig. 5, the system according to the embodiment of the present application may include:
a camera assembly 22 and a control component 23 connected with each other;
the camera assembly is used for capturing frame images of the human-computer interaction pen at a preset distance;
the control component is used for executing the non-contact human-computer interaction pen handwriting generation method described in any one of the above embodiments.
Based on a general inventive concept, the embodiment of the present invention further provides a non-contact human-computer interaction pen handwriting generating device.
Fig. 6 is a schematic structural diagram of a non-contact human-computer interaction pen handwriting generating device according to an embodiment of the present invention. Referring to fig. 6, the device according to the embodiment of the present application may include a processor 61 and a memory 62, the processor 61 being connected to the memory 62. The processor 61 is used for calling and executing the program stored in the memory 62; the memory 62 is used for storing the program, and the program is at least used for executing the non-contact human-computer interaction pen handwriting generation method in the above embodiments.
For the specific implementation of the non-contact human-computer interaction pen handwriting generating device provided in the embodiment of the present application, reference may be made to the implementation of the non-contact human-computer interaction pen handwriting generation method in any of the above embodiments; details are not repeated here.
It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar contents in other embodiments may be referred to for the contents which are not described in detail in some embodiments.
It should be noted that the terms "first," "second," and the like in the description of the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present invention, the meaning of "a plurality" means at least two unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are exemplary and not to be construed as limiting the present invention, and that changes, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. A non-contact human-computer interaction pen handwriting generation method, characterized by comprising the following steps:
acquiring a current frame image in real time;
detecting a sub-image of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model;
determining the current pen point position of the human-computer interaction pen in the sub-image of the human-computer interaction pen according to a preset pen point detection model;
tracking and detecting the current pen point position and the historical pen point positions of the human-computer interaction pen in a target video based on a preset multi-target tracking algorithm to generate the human-computer interaction pen handwriting; wherein the target video comprises the current frame image and historical frame images.
2. The method according to claim 1, wherein the detecting a sub-image of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model comprises:
determining a user hand detection box in the current frame image by adopting a hand detection model;
and detecting, within the user hand detection box, the current frame image by using the preset interaction pen detection model to obtain a bounding box of the human-computer interaction pen, and taking the image in the bounding box as the sub-image of the human-computer interaction pen.
3. The method of claim 2, wherein determining a user hand detection box in the current frame image using a hand detection model comprises:
determining a hand detection box in the current frame image as an original user hand detection box by adopting the hand detection model;
and enlarging the original user hand detection box according to a coordinate expansion algorithm to obtain an enlarged user hand detection box, and taking the enlarged user hand detection box as the user hand detection box.
4. The method of claim 3, wherein the enlarging the original user hand detection box according to a coordinate expansion algorithm to obtain an enlarged user hand detection box comprises:
determining the coordinates of each hand key point in the original user hand detection box according to the hand detection model;
determining the minimum value of the abscissa and the maximum value of the ordinate among the hand key point coordinates;
and reducing the minimum value of the abscissa by a preset value and increasing the maximum value of the ordinate by the preset value, so as to obtain the enlarged user hand detection box according to the modified abscissa and ordinate.
5. The method of claim 1, wherein the preset multi-target tracking algorithm comprises: the DeepSORT algorithm.
6. The method of claim 2, wherein the hand detection model comprises a MediaPipe hand detection model; the preset interaction pen detection model comprises a first YOLOv5 convolutional neural network model, wherein the first YOLOv5 convolutional neural network model is obtained by training on interactive pen sample images and predetermined interactive pen bounding box labels;
and the preset pen point detection model comprises a second YOLOv5 convolutional neural network model, wherein the second YOLOv5 convolutional neural network model is obtained by training on the interactive pen sample images and predetermined interactive pen tip labels.
7. A non-contact human-computer interaction pen handwriting generating device is characterized by comprising:
the acquisition module is used for acquiring a current frame image in real time;
the first detection module is used for detecting a sub-image of the human-computer interaction pen in the current frame image according to a preset interaction pen detection model;
the second detection module is used for determining the current pen point position of the human-computer interaction pen in the sub-image of the human-computer interaction pen according to a preset pen point detection model;
and the generating module is used for tracking and detecting the current pen point position and the historical pen point positions of the human-computer interaction pen in a target video based on a preset multi-target tracking algorithm to generate the human-computer interaction pen handwriting; wherein the target video comprises the current frame image and historical frame images.
8. A non-contact human-computer interaction pen handwriting generation system, characterized by comprising: a camera assembly and a control component connected with each other;
the camera assembly is used for capturing frame images of the human-computer interaction pen at a preset distance;
and the control component is used for executing the non-contact human-computer interaction pen handwriting generation method of any one of claims 1-6.
9. A human-computer interaction system, characterized by comprising: a human-computer interaction pen and the non-contact human-computer interaction pen handwriting generation system of claim 8.
10. A non-contact human-computer interaction pen handwriting generating device, characterized by comprising a processor and a memory, the processor being connected with the memory, wherein:
the processor is used for calling and executing the program stored in the memory;
the memory is used for storing the program, and the program is at least used for executing the non-contact man-machine interaction pen handwriting generation method of any one of claims 1-6.
CN202211346692.6A 2022-10-31 2022-10-31 Non-contact human-computer interaction pen handwriting generation method, device, equipment and system Pending CN115904063A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211346692.6A CN115904063A (en) 2022-10-31 2022-10-31 Non-contact human-computer interaction pen handwriting generation method, device, equipment and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211346692.6A CN115904063A (en) 2022-10-31 2022-10-31 Non-contact human-computer interaction pen handwriting generation method, device, equipment and system

Publications (1)

Publication Number Publication Date
CN115904063A 2023-04-04

Family

ID=86471644

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211346692.6A Pending CN115904063A (en) 2022-10-31 2022-10-31 Non-contact human-computer interaction pen handwriting generation method, device, equipment and system

Country Status (1)

Country Link
CN (1) CN115904063A (en)

Similar Documents

Publication Publication Date Title
US20220129060A1 (en) Three-dimensional object tracking to augment display area
US11379996B2 (en) Deformable object tracking
CN107710111B (en) Determining pitch angle for proximity sensitive interaction
US9430093B2 (en) Monitoring interactions between two or more objects within an environment
US8941587B2 (en) Method and device for gesture recognition diagnostics for device orientation
CN112926423B (en) Pinch gesture detection and recognition method, device and system
EP2790089A1 (en) Portable device and method for providing non-contact interface
US8194926B1 (en) Motion estimation for mobile device user interaction
US20110267258A1 (en) Image based motion gesture recognition method and system thereof
KR20160124786A (en) In-air ultrasound pen gestures
US10346992B2 (en) Information processing apparatus, information processing method, and program
CN112363629B (en) Novel non-contact man-machine interaction method and system
Ye et al. 3D curve creation on and around physical objects with mobile AR
EP2618237B1 (en) Gesture-based human-computer interaction method and system, and computer storage media
US10114469B2 (en) Input method touch device using the input method, gesture detecting device, computer-readable recording medium, and computer program product
CN115904063A (en) Non-contact human-computer interaction pen handwriting generation method, device, equipment and system
CN114360047A (en) Hand-lifting gesture recognition method and device, electronic equipment and storage medium
Grzejszczak et al. Tracking of dynamic gesture fingertips position in video sequence
US11789543B2 (en) Information processing apparatus and information processing method
US20240053835A1 (en) Pen state detection circuit and method, and input system
WO2021075103A1 (en) Information processing device, information processing method, and program
CN110609626B (en) Virtual reality control system and method
CN115981492A (en) Three-dimensional handwriting generation method, equipment and system
Fritz et al. Markerless 3d interaction in an unconstrained handheld mixed reality setup
CN115393427A (en) Method and device for determining position and posture of camera, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination