CN112363629B - A new non-contact human-computer interaction method and system
- Publication number
- CN112363629B (application CN202011395956.8A)
- Authority
- CN
- China
- Prior art keywords
- point
- computer interaction
- target
- display screen
- depth camera
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The present invention relates to the technical field of human-computer interaction and provides a new non-contact human-computer interaction method and system. The method includes the steps of: S100, automatically detecting, in the two-dimensional image of a depth camera, the three vertices A, B, C of the display screen and a target point F; S200, converting the plane pixel coordinates of the vertices A, B, C and the target point F into three-dimensional coordinates in the depth camera coordinate system using the intrinsic parameters of the depth camera; S300, calculating the three-dimensional coordinates, in the depth camera coordinate system, of the projection point F' of the target point F on the display screen; S400, calculating the plane pixel coordinates of the projection point F' on the display screen; S500, recognizing the action of the target point F and calling the relevant system mouse interface to trigger mouse events. The invention solves the problem of controlling on-screen content with a detectable target object such as a fingertip in a mouse-free environment; it requires no calibration and has the advantages of a small amount of calculation and simple hardware.
Description
Technical Field
The invention belongs to the technical field of human-computer interaction, and in particular relates to a new non-contact human-computer interaction method and system.
Background
At present, non-contact human-computer interaction technology is widely used in interactive games, interactive museums, interactive exhibition halls, VR/AR (virtual reality / augmented reality), and other fields. Standing in front of a computer monitor, smart TV, or other display area, a user can operate the electronic product with various non-contact gestures, for example adjusting the volume by hand. Such mouse-free human-computer interaction technology has great market and economic value.
The patent application published as CN102841679A, entitled "A non-contact human-computer interaction method and device", discloses a method and device that rely mainly on the camera calibration principle: the position and orientation of the camera must be obtained to compute the calibration result, and a gravity-sensing module and a sliding resistance-measurement module must also be introduced, so the hardware design is relatively complicated.
With the emergence of depth cameras and the development of deep learning in computer vision, human-computer interaction systems based on depth cameras have become increasingly common. A depth camera differs from an ordinary camera in that, besides capturing a plane image, it also acquires the depth of the photographed object, that is, its three-dimensional position and size, so that the computing system can obtain three-dimensional data about the environment and the objects in it. Depth cameras are capable of three-dimensional sensing and recognition and, with further processing, can support applications such as three-dimensional modeling.
Summary of the Invention
The purpose of the embodiments of the present invention is to provide a new non-contact human-computer interaction method, aiming to solve the prior-art problems that non-contact human-computer interaction based on the camera calibration principle involves a complicated calculation process and complicated hardware.
The embodiments of the present invention provide a new non-contact human-computer interaction method that includes the following steps:
S100. In the two-dimensional image of the depth camera, use a deep learning system to automatically detect the three vertices A, B, C of the display screen and the target point F, obtaining the plane pixel coordinates of each point;
S200. Using the intrinsic parameters of the depth camera, convert the plane pixel coordinates of the vertices A, B, C and the target point F into three-dimensional coordinates in the depth camera coordinate system;
S300. Calculate the three-dimensional coordinates, in the depth camera coordinate system, of the projection point F' of the target point F on the display screen;
S400. Calculate the plane pixel coordinates of the projection point F' on the display screen;
S500. Recognize the action of the target point F and call the relevant system mouse interface to trigger mouse events.
Further, step S100 includes the following sub-steps:
S110. Automatically detect the vertices A, B, C of the display screen with an object detection algorithm based on deep learning, obtaining the plane pixel coordinates of each point;
S120. Build a two-stage object detection deep neural network structure to automatically detect the target point F, including:
S121. Build a target object detection deep neural network that detects the target object in the two-dimensional image, expands the detected target object region, and locates the target object within the expanded region;
S122. Build a target point detection deep neural network that detects and locates the target point of the target object, obtaining the plane pixel coordinates of the target point F.
Further, step S120 also includes: building a neural network with a dual-channel attention mechanism to improve the localization accuracy of the detected target point.
Further, step S300 includes:
From the vertices A, B, C, calculate the normal vector n of the screen plane through vertex A, and calculate the three-dimensional coordinates of the projection point F' from the constraints that segment FF' is parallel to n and segment AF' is perpendicular to n.
Further, step S400 includes:
Calculating the abscissa u and the ordinate v of the plane pixel coordinates of the projection point F' as fractions of the screen, and combining them with the screen resolution of the display screen to obtain the plane pixel coordinates of F'.
Further, step S400 includes:
Calculating the distance D_FAB from the projection point F' to segment AB and the distance D_FAC from the projection point F' to segment AC, which give the abscissa u and the ordinate v of the plane pixel coordinates of F' as fractions of the screen; combining them with the screen resolution of the display screen, u = (D_FAB/|AC|) * horizontal screen pixels and v = (D_FAC/|AB|) * vertical screen pixels yield the plane pixel coordinates F'(u, v).
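As a rough sketch of this mapping (an illustration, not part of the patent text), and assuming NumPy plus the convention that A and B span one screen edge while A and C span the adjacent edge:

```python
import numpy as np

def point_to_line_distance(p, a, b):
    """Distance from 3D point p to the line through a and b."""
    ab = b - a
    return np.linalg.norm(np.cross(ab, p - a)) / np.linalg.norm(ab)

def projection_to_screen_pixels(f_proj, a, b, c, width_px, height_px):
    """Map F' (camera coordinates) to screen pixels via the patent's
    formulas u = (D_FAB/|AC|) * width, v = (D_FAC/|AB|) * height."""
    d_fab = point_to_line_distance(f_proj, a, b)  # distance F' -> line AB
    d_fac = point_to_line_distance(f_proj, a, c)  # distance F' -> line AC
    u = d_fab / np.linalg.norm(c - a) * width_px
    v = d_fac / np.linalg.norm(b - a) * height_px
    return u, v
```

Because both distances are computed in the camera coordinate system, no camera-to-screen calibration is needed; the detected screen vertices carry all the geometry.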
Another object of the embodiments of the present invention is to provide a new non-contact human-computer interaction system that includes a computer, a depth camera, and a display screen, the computer being connected to the depth camera and to the display screen; the system performs human-computer interaction with the new non-contact method described above.
A further object of the embodiments of the present invention is to provide a computer-readable storage medium storing a program for electronic data exchange, the program executing the new non-contact human-computer interaction method described above.
Compared with the prior art, the beneficial effect of the new non-contact human-computer interaction method and system provided by the present invention is that they solve the problem of controlling on-screen content with a detectable target object, such as a fingertip, in a mouse-free environment. The new method requires no calibration and has the advantages of a small amount of calculation and simple hardware. Based on deep learning, the invention obtains two-dimensional plane information and three-dimensional depth information through a depth camera, calculates the coordinates of the projection of the target object on the display screen, and triggers mouse events, thereby realizing non-contact human-computer interaction.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings required by the technical description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the new non-contact human-computer interaction method provided by an embodiment of the present invention.
Fig. 2 is a two-dimensional image captured by the depth camera in an embodiment of the present invention.
Fig. 3 is a schematic diagram of the two-stage object detection deep neural network structure in an embodiment of the present invention.
Fig. 4 is a schematic diagram of the positional relationship between the depth camera, the fingertip, and the display screen in an embodiment of the present invention.
Fig. 5 is a flowchart of calculating the three-dimensional coordinates of the projection point F' in the depth camera coordinate system in an embodiment of the present invention.
Fig. 6 is a schematic diagram of the relationship between the current position and the original position of the fingertip when a mouse double-click event is triggered in an embodiment of the present invention.
Fig. 7 is a flowchart of triggering a mouse double-click event in an embodiment of the present invention.
Fig. 8 is a schematic structural diagram of the new non-contact human-computer interaction system provided by an embodiment of the present invention.
Detailed Description
To make the technical problems, technical solutions, and beneficial effects addressed by the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it.
Referring to Fig. 1, a flowchart of the new non-contact human-computer interaction method provided by an embodiment of the present invention, the method includes the following steps:
S100. In the two-dimensional image of the depth camera, use a deep learning system to automatically detect the three vertices A, B, C of the display screen and the target point F, obtaining the plane pixel coordinates of each point: A(ua, va), B(ub, vb), C(uc, vc), F(u0, v0);
S200. Using the intrinsic parameters of the depth camera, convert the plane pixel coordinates of each point into three-dimensional coordinates in the depth camera coordinate system: A(x1, y1, z1), B(x2, y2, z2), C(x3, y3, z3), F(x0, y0, z0) (a back-projection sketch follows this step list);
S300. Calculate the three-dimensional coordinates F'(x', y', z') of the projection point F' of the target point F on the display screen, in the depth camera coordinate system;
S400. Calculate the plane pixel coordinates F'(u, v) of the projection point F' on the display screen, including: calculating the distance D_FAB from F' to segment AB and the distance D_FAC from F' to segment AC, which give the abscissa u and the ordinate v as fractions of the screen; combining them with the screen resolution of the display screen (horizontal pixels * vertical pixels), u = (D_FAB/|AC|) * horizontal screen pixels and v = (D_FAC/|AB|) * vertical screen pixels yield the plane pixel coordinates F'(u, v);
S500. Recognize the action of the target point F, display a mouse marker at the position of the projection point F', and call the relevant system mouse interface to trigger mouse events that operate the display screen.
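The patent does not spell out the pixel-to-3D conversion of step S200; under the standard pinhole camera model that depth cameras expose through their intrinsic parameters (focal lengths fx, fy and principal point cx, cy), a minimal back-projection sketch, offered here as an assumption, is:

```python
import numpy as np

def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with its measured depth into the depth
    camera coordinate system (standard pinhole model)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])
```

Applying this to A, B, C, and F with their measured depths yields the four three-dimensional points used in the remaining steps.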
Fig. 2 is a two-dimensional image captured by the depth camera in an embodiment of the present invention; it shows a rectangular display screen and a hand with the index finger extended. In this embodiment the fingertip is taken as the target point; in other embodiments, other detectable target objects can be chosen as the target point.
Referring to Fig. 2, the above step S100 includes the following sub-steps:
S110. Automatically detect the three vertices A, B, C of the display screen with an object detection algorithm based on deep learning (such as YOLO or SSD); the three vertices A, B, C define the screen plane, and vertex A is taken as the origin of the pixel coordinate system of the screen plane, giving the plane pixel coordinates A(ua, va), B(ub, vb), C(uc, vc);
S120. Referring to Fig. 3, build a two-stage object detection deep neural network structure to automatically detect the fingertip F, including:
S121. Build a finger detection deep neural network that detects the finger in the two-dimensional image, expands the detected finger region, and locates the finger within the expanded region;
S122. Build a fingertip detection deep neural network that detects and locates the fingertip of the finger, obtaining the plane pixel coordinates F(u0, v0) of the fingertip F.
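A minimal sketch of this two-stage inference flow is given below; finger_net and fingertip_net stand in for the two trained networks of S121 and S122, and their detect interfaces are illustrative assumptions rather than APIs defined by the patent.

```python
def locate_fingertip(image, finger_net, fingertip_net, margin=0.2):
    """Two-stage fingertip localization (S121-S122): detect the finger,
    expand its bounding box, then detect the fingertip inside the crop."""
    x, y, w, h = finger_net.detect(image)        # stage 1: finger box
    dx, dy = int(w * margin), int(h * margin)    # expand the detected region
    x0, y0 = max(0, x - dx), max(0, y - dy)
    crop = image[y0:y + h + dy, x0:x + w + dx]
    u, v = fingertip_net.detect(crop)            # stage 2: fingertip in crop
    return u + x0, v + y0                        # back to full-image coordinates
```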
To improve the localization accuracy of the detected fingertip, the embodiment of the present invention also builds a neural network with a dual-channel attention mechanism. The attention mechanism is a model that imitates the attention of the human brain: it uses limited attention to quickly filter out the important information, improving the efficiency and accuracy of the brain in processing visual information.
Referring to Fig. 4, a schematic diagram of the positions of the depth camera, the fingertip, and the display screen in an embodiment of the present invention, the figure shows the projection point F' of the fingertip F on the display screen.
Referring to Fig. 5, the above step S300 includes the following sub-steps:
S310. From the vertices A, B, C, calculate the normal vector n = (a, b, c) of the screen plane through vertex A, where:
a=y1(z2-z3)+y2(z3-z1)+y3(z1-z2),a=y 1 (z 2 -z 3 )+y 2 (z 3 -z 1 )+y 3 (z 1 -z 2 ),
b=z1(x2-x3)+z2(x3-x1)+z3(x1-x2),b=z 1 (x 2 -x 3 )+z 2 (x 3 -x 1 )+z 3 (x 1 -x 2 ),
c=x1(y2-y3)+x2(y3-y1)+x3(y1-y2);c=x 1 (y 2 -y 3 )+x 2 (y 3 -y 1 )+x 3 (y 1 -y 2 );
S320. Let the three-dimensional coordinates of the projection point F' in the depth camera coordinate system be F'(x', y', z'). Since segment FF' is parallel to the normal vector n, there is a scalar t such that
x' = x0 + a·t, y' = y0 + b·t, z' = z0 + c·t; (*)
S330. Since segment AF' lies in the screen plane and is therefore perpendicular to the normal vector n, the equation
a(x' - x1) + b(y' - y1) + c(z' - z1) = 0 (**)
is obtained;
S340. Solving the system (*) together with equation (**) gives t; substituting t back into (*) yields the three-dimensional coordinates F'(x', y', z') of the projection point F'.
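Assuming NumPy arrays for the four three-dimensional points, steps S310 to S340 reduce to a few lines; the cross product below expands to exactly the components a, b, c given in S310. This is a sketch under those assumptions, not the patent's reference implementation.

```python
import numpy as np

def project_onto_screen_plane(f, A, B, C):
    """Project target point F onto the screen plane through A, B, C
    (steps S310-S340): F' = F + t * n, with n the plane normal."""
    n = np.cross(B - A, C - A)            # normal vector n = (a, b, c)
    t = np.dot(A - f, n) / np.dot(n, n)   # solve (*) and (**) for t
    return f + t * n                      # 3D coordinates of F'
```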
Specifically, recognizing the action of the target point F in step S500 includes:
While the vertices A, B, C remain fixed, the user moves the finger, and steps S100 to S200 above are used to recompute the three-dimensional coordinates of the fingertip F. The actions of the target point F include clicking, double-clicking, holding the pressed state, and being released after pressing; the correspondingly triggered mouse events include the mouse click event (click), the mouse double-click event (dbclick), the event triggered when the mouse button is pressed (mousedown), and the event triggered when the mouse button is released (mouseup).
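The patent only states that "the relevant system mouse interface" is called, without naming one; as one concrete possibility, offered purely as an assumption, the cross-platform pyautogui package could dispatch the recognized actions:

```python
import pyautogui  # assumed stand-in for "the relevant system mouse interface"

def trigger_mouse_event(action, u, v):
    """Move the cursor to the projection point F'(u, v) and fire the
    mouse event corresponding to the recognized action of F."""
    pyautogui.moveTo(u, v)
    if action == "click":
        pyautogui.click()
    elif action == "dbclick":
        pyautogui.doubleClick()
    elif action == "mousedown":
        pyautogui.mouseDown()
    elif action == "mouseup":
        pyautogui.mouseUp()
```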
The triggering of a mouse double-click event is described below as an example.
Fig. 6 is a schematic diagram of the relationship between the current position of the fingertip and its original position when a mouse double-click event is triggered; in the figure, D1 and D2 are distances from the current position of the fingertip to the original position, with D1 > D2. Referring to Fig. 6, a double-click action is defined in this embodiment as the fingertip leaving its original position and then returning to its vicinity, where leaving the original position means that the distance between the fingertip and the original position is greater than D1, and returning to the vicinity of the original position means that this distance is less than D2.
Fig. 7 is a flowchart of triggering a mouse double-click event in an embodiment of the present invention. Referring to Fig. 7, since a mouse double-click event generally finishes within 2 seconds, the click action occurs within a sequence of 50 image frames. In this embodiment the fingertip coordinates are detected every 5 frames; the flag FAR = TRUE means that in frame tn the distance between the fingertip coordinates and the original position is greater than D1, and the flag NEAR = TRUE means that in frame tm (0 < tn < tm < 50) this distance is less than D2. When FAR = TRUE and NEAR = TRUE, the mouse double-click event is triggered and the display screen is operated. The specific flow is as follows:
First read the image and check whether 50 frames have passed since FAR = TRUE without NEAR = TRUE becoming true. If so, set FAR = FALSE and NEAR = FALSE; that is, if the fingertip left the original position in frame tn and has not returned to its vicinity by frame (tn + 50), the fingertip state of frame tn is reset. If not, detect the fingertip coordinates every 5 frames; once the detected coordinates satisfy FAR = TRUE, keep detecting until they also satisfy NEAR = TRUE, then trigger the mouse double-click event, set FAR = FALSE and NEAR = FALSE, and repeat the above steps.
Let the three-dimensional coordinates of the fingertip at the original position be (x0, y0, z0), the fingertip coordinates in frame tn be (x_tn, y_tn, z_tn), and the fingertip coordinates in frame tm be (x_tm, y_tm, z_tm). The mouse double-click event is triggered when both
(x_tn - x0)² + (y_tn - y0)² + (z_tn - z0)² > D1² and
(x_tm - x0)² + (y_tm - y0)² + (z_tm - z0)² < D2²
are satisfied.
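Combining the FAR/NEAR flow of Fig. 7 with the two distance conditions above, a minimal sketch of the double-click detector might look as follows; the list-of-fingertip-coordinates interface is an assumption for illustration:

```python
import numpy as np

def detect_double_click(fingertips, origin, D1, D2, step=5, window=50):
    """Return True if, within a 50-frame window sampled every 5 frames,
    the fingertip first moves farther than D1 from its original position
    (FAR = TRUE at frame tn) and later returns to within D2 of it
    (NEAR = TRUE at frame tm > tn)."""
    origin = np.asarray(origin, dtype=float)
    far = False
    for t in range(0, min(len(fingertips), window), step):
        dist_sq = np.sum((np.asarray(fingertips[t], dtype=float) - origin) ** 2)
        if not far and dist_sq > D1 ** 2:
            far = True                 # FAR = TRUE
        elif far and dist_sq < D2 ** 2:
            return True                # NEAR = TRUE: trigger the double-click
    return False                       # window elapsed: reset FAR/NEAR
```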
The embodiment of the present invention provides a non-contact human-computer interaction method for controlling on-screen content with a detectable target object, such as a fingertip, in a mouse-free environment. This new non-contact method requires no calibration; instead, based on deep learning, it obtains two-dimensional plane information and three-dimensional depth information through a depth camera, calculates the coordinates of the projection of the target object on the display screen, and triggers mouse events, realizing non-contact human-computer interaction with a small amount of calculation and simple hardware.
Referring to Fig. 8, an embodiment of the present invention also provides a new non-contact human-computer interaction system that includes a computer, a depth camera, and a display screen, the computer being connected to the depth camera and to the display screen. The system performs human-computer interaction with the method of the above embodiment: the depth camera captures the position information of the vertices of the display screen and of the target point, and the computer calculates from this the position of the projection of the target point on the display screen and calls the relevant system mouse interface to trigger mouse events that operate the display screen.
An embodiment of the present invention also provides a computer-readable storage medium storing a program for electronic data exchange, the program executing the new non-contact human-computer interaction method of the present invention.
The above are only preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the principles of the present invention shall fall within the protection scope of the present invention.
Claims (8)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011395956.8A CN112363629B (en) | 2020-12-03 | 2020-12-03 | A new non-contact human-computer interaction method and system |
PCT/CN2020/137285 WO2022116281A1 (en) | 2020-12-03 | 2020-12-17 | New non-contact human-computer interaction method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011395956.8A CN112363629B (en) | 2020-12-03 | 2020-12-03 | A new non-contact human-computer interaction method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112363629A CN112363629A (en) | 2021-02-12 |
CN112363629B true CN112363629B (en) | 2021-05-28 |
Family
ID=74536668
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011395956.8A Expired - Fee Related CN112363629B (en) | 2020-12-03 | 2020-12-03 | A new non-contact human-computer interaction method and system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112363629B (en) |
WO (1) | WO2022116281A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113095243B (en) * | 2021-04-16 | 2022-02-15 | 推想医疗科技股份有限公司 | Mouse control method and device, computer equipment and medium |
CN115885238A (en) * | 2021-07-26 | 2023-03-31 | 广州视源电子科技股份有限公司 | Implementation method and system of fingertip mouse |
CN113807191B (en) * | 2021-08-23 | 2024-06-14 | 南京航空航天大学 | Non-invasive visual test script automatic recording method |
CN114647361B (en) * | 2022-03-02 | 2025-04-01 | 北京当红齐天国际文化科技发展集团有限公司 | A touch screen object positioning method and device based on artificial intelligence |
CN115617178B (en) * | 2022-11-08 | 2023-04-25 | 润芯微科技(江苏)有限公司 | Method for completing key and function triggering by no contact between finger and vehicle |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102354345A (en) * | 2011-10-21 | 2012-02-15 | 北京理工大学 | Medical image browse device with somatosensory interaction mode |
CN102749991A (en) * | 2012-04-12 | 2012-10-24 | 广东百泰科技有限公司 | Non-contact free space eye-gaze tracking method suitable for man-machine interaction |
CN103345301A (en) * | 2013-06-18 | 2013-10-09 | 华为技术有限公司 | Depth information acquisition method and device |
CN103914152A (en) * | 2014-04-11 | 2014-07-09 | 周光磊 | Recognition method and system for multi-point touch and gesture movement capturing in three-dimensional space |
CN109683699A (en) * | 2019-01-07 | 2019-04-26 | 深圳增强现实技术有限公司 | The method, device and mobile terminal of augmented reality are realized based on deep learning |
CN110782532A (en) * | 2019-10-23 | 2020-02-11 | 北京达佳互联信息技术有限公司 | Image generation method, image generation device, electronic device, and storage medium |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011081480A (en) * | 2009-10-05 | 2011-04-21 | Seiko Epson Corp | Image input system |
US20120242806A1 (en) * | 2011-03-23 | 2012-09-27 | Tk Holdings Inc. | Dynamic stereo camera calibration system and method |
CN102968222A (en) * | 2012-11-07 | 2013-03-13 | 电子科技大学 | Multi-point touch equipment based on depth camera |
CN103207709A (en) * | 2013-04-07 | 2013-07-17 | 布法罗机器人科技(苏州)有限公司 | Multi-touch system and method |
CN103793060B (en) * | 2014-02-14 | 2017-07-28 | 杨智 | A kind of user interactive system and method |
AU2019308228B2 (en) * | 2018-07-16 | 2021-06-03 | Accel Robotics Corporation | Autonomous store tracking system |
- 2020-12-03 CN CN202011395956.8A patent/CN112363629B/en not_active Expired - Fee Related
- 2020-12-17 WO PCT/CN2020/137285 patent/WO2022116281A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2022116281A1 (en) | 2022-06-09 |
CN112363629A (en) | 2021-02-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | |
Granted publication date: 20210528 |