CN111258411A - User interaction method and device

User interaction method and device

Info

Publication number
CN111258411A
Authority
CN
China
Prior art keywords
user
operation data
computing board
information
interface
Prior art date
Legal status
Granted
Application number
CN202010370009.7A
Other languages
Chinese (zh)
Other versions
CN111258411B (en)
Inventor
冯翀
郭嘉伟
罗观洲
马宇航
Current Assignee
Beijing Shenguang Technology Co Ltd
Original Assignee
Beijing Shenguang Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Shenguang Technology Co Ltd filed Critical Beijing Shenguang Technology Co Ltd
Priority to CN202010370009.7A priority Critical patent/CN111258411B/en
Publication of CN111258411A publication Critical patent/CN111258411A/en
Application granted granted Critical
Publication of CN111258411B publication Critical patent/CN111258411B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/002 Specific input/output arrangements not covered by G06F 3/01 - G06F 3/16
    • G06F 3/005 Input arrangements through a video camera
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention provides a user interaction method and device. The method mainly comprises the following steps: acquiring first operation data of a user on a user interface through an infrared grating by using an infrared camera; acquiring second operation data of the user on the user interface by using a depth camera; fusing the first operation data and the second operation data to obtain user operation data; and updating the display content of the projection unit on the user operation interface based on the user operation data. Because each action is judged not only from the current frame but also from the states of the previous frames, the judgment of the user action is more accurate, so that more accurate control is realized. Rich gesture actions are obtained from the acquired depth images by an advanced depth judgment model, so that more and richer interaction methods can be realized based on the user's gestures and subsequent functions can be expanded. The gesture actions are determined jointly from the infrared signals and the depth camera, which further improves gesture recognition precision; in addition, automatic correction of the projector is realized.

Description

User interaction method and device
Technical Field
The invention relates to the technical field of human-computer interaction, in particular to a user interaction method and user interaction equipment.
Background
Human-computer interaction is the study of the interactive relationship between a system and its users. The system may be any of a variety of machines, or a computerized system and its software. The human-computer interaction interface generally refers to the portion of the system visible to the user, through which the user communicates with the system and performs operations, such as the play button of a radio, the instrument panel of an airplane or the control room of a power plant. A human-machine interface is designed to match the user's understanding of the system (i.e., the user's mental model) so that the system is usable and user-friendly.
In the prior art, the touch control scheme used by interactive projectors is basically an infrared planar-scanning scheme: an infrared emitter is placed at a fixed height above a desktop, and the blocking of the beam by an object (such as a finger) is identified as a click event. The disadvantages of this solution are: the emitter must be placed on the interactive plane, which limits the form of the projector; occlusion in the horizontal direction cannot be handled, and objects with height cannot be handled; any object may trigger a false touch; the precision needs to be improved; and the projector interface cannot be automatically corrected.
In addition, in the prior art, whether the user action is obtained from infrared signals or from video, the action is captured from the current frame only, so the recognition precision is low. Moreover, the user action is obtained from a single modality; it cannot be captured from two or more signals at the same time, and the gesture precision obtained from a single signal is low. How to improve the recognition precision of user actions is therefore a key and difficult point of human-computer interaction.
Disclosure of Invention
The present invention provides the following technical solutions to overcome the above-mentioned drawbacks in the prior art.
A method of user interaction, the method comprising:
an initialization step of projecting a user operation interface on a plane by using a projection unit and generating, through a signal emission unit, an infrared grating parallel to the user interface, wherein the infrared grating is adjacent to the user operation interface;
a first acquisition step, namely acquiring first operation data of a user on a user interface through an infrared grating by using an infrared camera;
a second acquisition step of acquiring second operation data of the user on the user interface by using the depth camera;
a fusion step, fusing the first operation data and the second operation data to obtain user operation data;
and an updating step of updating the display content of the projection unit on the user operation interface based on the user operation data.
Further, the first acquiring step includes:
when a user operates on the user operation interface with a hand, the infrared light emitted by the signal emitting unit is blocked by a finger and reflected to form a light spot; the infrared camera continuously photographs the grating state, obtains the information forming the light spot through filtering processing, and then transmits each frame of information to the computing board for storage and analysis; when the computing board judges that the current frame of light-spot information is a pressing event, it retrieves the stored light-spot blocking information of the previous N frames, and determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information.
Further, the second acquiring step includes: the depth camera shoots the scene of the user operation interface with a binocular camera and sends the captured images to the computing board for storage; the computing board calculates the depth information of each part of the scene from the images captured by the two cameras and obtains the hand position of the user in the image; the computing board determines the action of the user's hand based on the depth information and the hand position; when the computing board judges that the action of the user's hand in the current frame image is a pressing event, it retrieves the stored images of the previous N frames, and determines the second operation data from the current frame image and the previous N frames of images.
Further, the step of fusing the first operation data and the second operation data to obtain user operation data is as follows: and processing the first operation data and the second operation data by a Kalman filtering method to obtain user operation data.
Further, the user operation data is user marking data or a user call to another function.
Further, the operation of the computing board determining the first operation data from the current frame of light-spot information and the previous N frames of light-spot information is: the computing board determines the finger action of the user from the current frame of light-spot information and the previous N frames of light-spot information to obtain the hand trajectory information of the user; it then acquires the projection content currently on the user operation interface of the projection unit and judges, based on the trajectory information, the function related to the pressing position, so as to determine the first operation data, wherein the pressing position is the spot center position calculated as a mean value.
Further, the operation of the computing board determining the second operation data from the current frame image and the previous N frames of images is: the computing board obtains the hand trajectory information of the user from the specific hand motions of the user in the current frame image and the previous N frames of images; it then acquires the projection content currently on the user operation interface of the projection unit and judges, based on the trajectory information, the function related to the pressing position, so as to determine the second operation data, wherein the pressing position is the fingertip position.
Further, the updating step includes: the computing board sends the user operation data to the projection unit; after acquiring the user operation data, the projection unit determines its type; if the user operation data is user marking data, the corresponding mark is drawn directly on the projection content; if the user operation data is a call to another function, the application or function stored in the computing board is called to obtain the updated display content, which is then displayed on the user operation interface.
The invention also proposes a user interaction device, comprising: the system comprises a projection unit, a signal emission unit, an infrared camera, a depth camera and a calculation board;
the projection unit is configured to project a user operation interface on a plane, the signal emission unit is configured to generate an infrared grating parallel to the user interface, and the infrared grating is adjacent to the user operation interface;
the infrared camera acquires first operation data of a user on a user interface through an infrared grating and sends the first operation data to the computing board;
the depth camera acquires second operation data of a user on a user interface and sends the second operation data to the computing board;
the computing board fuses the first operation data and the second operation data to obtain user operation data, and sends the user operation data to the projection unit;
and after receiving the user operation data, the projection unit updates the display content of the projection unit on the user operation interface based on the user operation.
Further, the step of the infrared camera acquiring first operation data of a user on the user interface through the infrared grating and sending the first operation data to the computing board includes: when a user operates on the user operation interface with a hand, the infrared light emitted by the signal emitting unit is blocked by a finger and reflected to form a light spot; the infrared camera continuously photographs the grating state, obtains the information forming the light spot through filtering processing, and then transmits each frame of information to the computing board for storage and analysis; when the computing board judges that the current frame of light-spot information is a pressing event, it retrieves the stored light-spot blocking information of the previous N frames, and determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information.
Further, the acquiring, by the depth camera, of second operation data of the user on the user interface and sending of the second operation data to the computing board includes: the depth camera shoots the scene of the user operation interface with a binocular camera and sends the captured images to the computing board for storage; the computing board calculates the depth information of each part of the scene from the images captured by the two cameras and obtains the hand position of the user in the image; the computing board determines the motion of the user's hand based on the depth information and the hand position; when the computing board judges that the motion of the user's hand in the current frame image is a pressing event, it retrieves the stored images of the previous N frames, and determines the second operation data from the current frame image and the previous N frames of images.
Further, the step of fusing the first operation data and the second operation data to obtain user operation data is as follows: and the computing board carries out Kalman filtering processing on the first operation data and the second operation data to obtain user operation data.
Further, the user operation data is user marking data or a user call to another function.
Further, the operation of the computing board determining the first operation data from the current frame of light-spot information and the previous N frames of light-spot information is: the computing board determines the finger action of the user from the current frame of light-spot information and the previous N frames of light-spot information to obtain the hand trajectory information of the user; it then acquires the projection content currently on the user operation interface of the projection unit and judges, based on the trajectory information, the function related to the pressing position, so as to determine the first operation data, wherein the pressing position is the spot center position calculated as a mean value.
Further, the operation of the computing board determining the second operation data from the current frame image and the previous N frames of images is: the computing board obtains the hand trajectory information of the user from the specific hand motions of the user in the current frame image and the previous N frames of images; it then acquires the projection content currently on the user operation interface of the projection unit and judges, based on the trajectory information, the function related to the pressing position, so as to determine the second operation data, wherein the pressing position is the fingertip position.
Further, the updating of the display content of the projection unit on the user operation interface includes: the computing board sends the user operation data to the projection unit; after acquiring the user operation data, the projection unit determines its type; if the user operation data is user marking data, the corresponding mark is drawn directly on the projection content; if the user operation data is a call to another function, the application or function stored in the computing board is called to obtain the updated display content, which is then displayed on the user operation interface.
The invention has the following technical effects. The disclosed user interaction method comprises: an initialization step of projecting a user operation interface on a plane by using a projection unit and generating, through a signal emission unit, an infrared grating parallel to the user interface, wherein the infrared grating is adjacent to the user operation interface; a first acquisition step of acquiring first operation data of a user on the user interface through the infrared grating by using an infrared camera; a second acquisition step of acquiring second operation data of the user on the user interface by using a depth camera; a fusion step of fusing the first operation data and the second operation data to obtain user operation data; and an updating step of updating the display content of the projection unit on the user operation interface based on the user operation data. The main advantages of the invention are: in the action analysis, not only the current frame but also the states of the previous frames are analyzed jointly, and this analysis of the dynamics makes the judgment of the user action more accurate, so that more accurate control is realized; rich gesture actions are obtained from the acquired depth images by an advanced depth judgment model, so that more and richer interaction methods can be realized based on the user's gestures and subsequent functions can be expanded; the gesture actions are determined jointly from the infrared signals and the depth camera, which further improves gesture recognition precision; in addition, automatic correction of the projector is realized.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings.
FIG. 1 is a flow chart of a method of user interaction according to one of the embodiments of the invention.
FIG. 2 is a schematic diagram of a user interaction device in accordance with one of the embodiments of the present invention.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
FIG. 1 illustrates a user interaction method of the present invention, the method comprising:
and an initialization step S101, projecting a user operation interface on a plane by using a projection unit, and generating an infrared grating parallel to the user interface through a signal emission unit, wherein the infrared grating is adjacent to the user operation interface.
The method can be applied to an intelligent desk lamp. The upper part of the desk lamp is provided with the projection unit (i.e. a projector), the infrared camera and the depth camera, and a computing board is arranged inside the desk lamp; the computing board has at least a processor and a memory and is used for completing data processing and the like. The signal emission unit is arranged at the bottom of the desk lamp. The projection unit projects the operation interface on a desktop, and the signal emission unit (for example an infrared laser) generates an infrared grating parallel to the user interface, the infrared grating being close to the user operation interface, where "close" generally means a distance of 1-2 mm.
The initializing step S101 may specifically include the following steps:
the first step is as follows: initializing the projector, focusing, performing trapezoidal correction, performing coincidence and calibration judgment of picture signals until the projection is clear, and displaying a loaded user operation interface.
The second step is that: an infrared laser located at the bottom end of the device emits infrared beams in a diffuse manner, each beam being at a prescribed distance of 1mm from the plane.
The third step: the infrared camera shoots the grating state and processes the grating state to obtain light spot information, if the light spot information is judged to be non-planar by the computing board, the projection content is updated to be in an error state, and a user is reminded to adjust the position until the position becomes a normal planar grating.
The fourth step: the projector acquires the setting of the current user from the computing board and projects a formal user operation interface according to the setting of the current user.
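The four initialization steps above can be sketched as follows. This is a minimal illustrative sketch only: the projector, ir_laser, ir_camera and compute_board interfaces are assumptions made for readability, not an API defined by the patent.

```python
# Hypothetical initialization flow; all interface names are illustrative assumptions.
def initialize(projector, ir_laser, ir_camera, compute_board):
    # Step 1: focus, keystone-correct and calibrate until the projection is clear,
    # then show the loading interface.
    projector.autofocus()
    projector.keystone_correct()
    projector.show_loading_interface()

    # Step 2: the bottom-mounted infrared laser emits a grating about 1 mm above the plane.
    ir_laser.enable(height_mm=1.0)

    # Step 3: verify that the captured grating corresponds to a flat surface;
    # otherwise show an error state and wait for the user to adjust the device.
    while True:
        spots = ir_camera.capture_grating()
        if compute_board.grating_is_planar(spots):
            break
        projector.show_error("Place the device on a flat surface")

    # Step 4: load the current user's settings and project the formal interface.
    settings = compute_board.load_user_settings()
    projector.show_interface(settings)
```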
Through these steps, automatic correction of the projector is achieved, and a user operation interface corresponding to the user's settings is projected, which facilitates user operation. Placing the signal emitting unit at the bottom of the intelligent desk lamp also solves the prior-art problem that the emitter must be placed on the interactive plane, which limits the form of the projector; as a result, occlusion in the horizontal direction can be handled and objects with height can also be handled. This is one of the important inventive points.
A first obtaining step S102, obtaining first operation data of a user on a user interface through an infrared grating by using an infrared camera.
In one embodiment, when a user operates on the user operation interface with a hand, the infrared light emitted by the signal emitting unit is blocked by a finger and reflected to form a light spot; the infrared camera continuously photographs the grating state, obtains the information forming the light spot through filtering processing, and then transmits each frame of information to the computing board for storage and analysis; when the computing board judges that the current frame of light-spot information is a pressing event, it retrieves the stored light-spot blocking information of the previous N frames, and determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information.
Specifically, the computing board determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information as follows; the flow is described for infrared information with a camera frame rate of 50 frames per second:
when the computing board judges that the current reflected-light position corresponds to a pressing behavior of the user, a duration must also be judged: the behavior is counted as a real pressing event only if it lasts for 100 ms (i.e. 5 frames), and the corresponding processing method is then called;
when a pressing behavior has so far been detected in only one frame, the computing board starts a query: it first obtains the user's behavior type in the previous frame, and if that frame is a pressing behavior at the same position, it continues to obtain the behavior type in the frame before that. When an illegal behavior is encountered (a pressing action at a different position, or a non-pressing action), special handling is applied: that frame is skipped and one more frame further back is read.
There are two cases at this point: 1. If that earlier frame is also illegal, the query terminates, the current frame is not counted as a real pressing event, and the multi-frame judgment ends; the computing board then waits for the user behavior of the next frame and judges again. 2. If that earlier frame is a pressing behavior at the same position, the illegal behavior encountered before it is marked as error data and treated as a pressing behavior at the same position. After this query and special handling, if the computing board finds that there are five consecutive frames of pressing at the same position, it regards the action as a real pressing event and the multi-frame judgment ends.
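A minimal sketch of this multi-frame confirmation is given below. The 5-frame window (100 ms at 50 fps) and the tolerance for a single illegal frame follow the description above; the frame-history representation (a list of (is_press, position) tuples) and the same_position comparator are illustrative assumptions, not the patent's data structures.

```python
# Sketch of the multi-frame press confirmation; data layout is an assumption.
def is_real_press(history, same_position, required_frames=5):
    """history[-1] is the current frame, already detected as a press.
    history entries are (is_press, position) tuples, oldest first."""
    current_pos = history[-1][1]
    count = 1                      # the current frame itself
    i = len(history) - 2           # start querying the previous frame
    while count < required_frames and i >= 0:
        is_press, pos = history[i]
        if is_press and same_position(pos, current_pos):
            count += 1
            i -= 1
            continue
        # Illegal frame: skip it and read one more frame further back.
        if i - 1 >= 0:
            prev_press, prev_pos = history[i - 1]
            if prev_press and same_position(prev_pos, current_pos):
                # Mark the illegal frame as error data and treat it as a
                # same-position press, then continue the query.
                count += 2
                i -= 2
                continue
        # Two illegal frames in a row (or no earlier frame): not a real press.
        return False
    return count >= required_frames
```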
Preferably, the formation of the light spot by the infrared light reflected after being blocked by the finger is as follows: the user presses, with a finger (or another occluder), the position at which interaction is needed; when the distance to the plane is less than 1 mm the infrared beam is blocked, and the blocked part acts as a reflecting surface that reflects the emitted infrared light to form the so-called light spot, whose position can be captured by the infrared camera. The infrared camera continuously photographs the grating state, and the information forming the light spot is obtained through filtering processing as follows: the infrared camera continuously photographs and records the infrared light distribution on the current plane; having acquired the distribution, it processes the captured image with several filtering algorithms to obtain the position and shape of the blocked part of the infrared light; the infrared camera then performs standardized adjustment on the obtained light-spot information and transmits it over a connected data cable to the computing board, where it is stored.
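As an illustration of the filtering step, the sketch below thresholds the infrared frame and takes the mean of the spot pixels as the press position, matching the "spot center calculated as a mean" described earlier. The OpenCV calls are standard library functions, but the threshold value and blur kernel are illustrative assumptions rather than values from the patent.

```python
import cv2
import numpy as np

# Hedged sketch: extract the reflected spot and its center from one IR frame.
def extract_spot(ir_frame_gray, threshold=200):
    blurred = cv2.GaussianBlur(ir_frame_gray, (5, 5), 0)          # suppress noise
    _, mask = cv2.threshold(blurred, threshold, 255, cv2.THRESH_BINARY)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None                                               # no spot in this frame
    center = (float(xs.mean()), float(ys.mean()))                 # mean spot position
    return {"center": center, "shape_mask": mask}
```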
A second obtaining step S103, obtaining second operation data of the user on the user interface by using the depth camera.
In one embodiment, the depth camera shoots the scene of the user operation interface with a binocular camera and sends the captured images to the computing board for storage; the computing board calculates the depth information of each part of the scene from the images captured by the two cameras and obtains the hand position of the user in the image; the computing board determines the motion of the user's hand based on the depth information and the hand position; when the computing board judges that the motion of the user's hand in the current frame image is a pressing event, it retrieves the stored images of the previous N frames, and determines the second operation data from the current frame image and the previous N frames of images. The multi-frame judgment is the same as the infrared multi-frame judgment described above and is not repeated here.
Preferably, the acquisition of the user's hand position in the image by the computing board is as follows: the depth camera shoots the scene with a binocular camera; a preliminary distance is obtained from the light reflection, and the detailed depth information of each part of the scene is calculated by further aggregating and processing the information from the two cameras, i.e. an image plus complete RGB-D information of the current scene is obtained; this information is preprocessed and slightly corrected using white balance and histogram equalization. After the overall depth information is obtained, the computing board processes the captured scene picture with a deployed mobile-ssd detection network to obtain the rough position of the user's hand; the hand position is then combined with the depth information, and the positions of the skeletal joint points are further predicted with a convolutional neural network with an hourglass structure, so that the current hand posture of the user is obtained, and the hand action is derived from it and stored.
First, the hourglass network is used to generate a heat map for each hand joint point k (the heat map is a probability map with the same pixel layout as the image, but the value at each pixel is the probability that this pixel is the given joint; further joint information is analyzed from these probabilities). From the predicted heat map, the position P of hand joint point k in the image is then obtained (a further correction is applied to the predicted position to obtain more accurate position information); the corresponding formulas are given only as figures in the original document. The gesture is then classified: each gesture class specifies a position region for each joint point, and the current action is determined once every joint point lies in its corresponding region. This hand-action pipeline and its formulas are referred to as the advanced depth judgment model.
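Because the formula images are not reproduced in this text, the following is only a hedged sketch of the standard hourglass heat-map formulation that the surrounding description appears to follow; the symbols I (RGB-D input), H_k (heat map of joint k) and P_k (predicted joint position) are illustrative and are not taken from the original formulas.

```latex
% Hedged reconstruction (assumption), not the patent's original formulas.
\[
  H_k = f_{\mathrm{hourglass}}(I)_k , \qquad H_k(x, y) \in [0, 1]
\]
\[
  P_k = \operatorname*{arg\,max}_{(x, y)} H_k(x, y)
\]
```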
The computing board judges that the motion of the user's hand in the current frame image is a pressing event as follows: once the hand action has been analyzed, if the distance between the hand and the projection plane is judged to be less than 1 mm, the user action is judged to be pressing the plane; once a pressing event has been judged, in order to analyze the specific action of the user, the user action information of the previous frames is retrieved from storage and also used as source data for the next analysis.
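The rule-based gesture classification (each class specifying a region per joint) and the 1 mm press test can be sketched as below. GESTURE_REGIONS, the joint indexing and the depth representation are illustrative assumptions, not structures defined by the patent; only the "all joints inside their regions" rule and the 1 mm tolerance come from the description above.

```python
# Illustrative gesture regions: gesture name -> {joint index: (x_min, y_min, x_max, y_max)}.
GESTURE_REGIONS = {
    "press": {0: (0.40, 0.40, 0.60, 0.60)},   # placeholder values, assumption only
}

def classify_gesture(joint_positions):
    """joint_positions: {joint index: (x, y)} taken from the heat-map predictions."""
    for gesture, regions in GESTURE_REGIONS.items():
        matched = True
        for k, (x_min, y_min, x_max, y_max) in regions.items():
            pos = joint_positions.get(k)
            if pos is None or not (x_min <= pos[0] <= x_max and y_min <= pos[1] <= y_max):
                matched = False
                break
        if matched:
            return gesture
    return None

def is_press(hand_depth_mm, plane_depth_mm, tolerance_mm=1.0):
    # The action is judged to be a press when the hand is within 1 mm of the plane.
    return abs(hand_depth_mm - plane_depth_mm) < tolerance_mm
```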
In the invention, the action analysis is not performed on the current frame alone but jointly with the states of the previous frames; this analysis of the dynamics makes the judgment of the user action more accurate, so that more accurate control is realized. The invention obtains rich gesture actions from the acquired depth images through the advanced depth judgment model, and more and richer interaction methods can be realized based on the user's gestures, so that subsequent functions can be expanded. This is another important inventive point of the invention.
A fusion step S104 fuses the first operation data and the second operation data to obtain user operation data: the first operation data and the second operation data are processed by a Kalman filtering method to obtain the user operation data.
Kalman filtering is used to fuse low-level, real-time, dynamic, redundant multi-sensor data; the statistical characteristics of the measurement model are used recursively to determine the statistically optimal fusion and data estimate. The process of fusing the first operation data and the second operation data is as follows:
the sensors on the device (i.e. the infrared camera and the depth camera) acquire the data of the infrared-observed target and the depth-observed target (i.e. the first operation data and the second operation data);
the computing board performs a feature-extraction transformation on the two output data streams (i.e. the first operation data and the second operation data, which may be discrete or continuous time-function data, output vectors, imaging data or a direct attribute description), extracting feature vectors Yi that represent the two data streams;
pattern recognition is performed on the feature vectors Yi to complete each sensor's description of the target; the sensors' descriptions are then grouped, i.e. associated by target (associating the first operation data with the second operation data); finally, the data of all sensors for the target are synthesized by a stochastic algorithm, the Kalman filtering method, to obtain a consistent interpretation and description of the target. In this way the gesture action is determined jointly from the depth camera and the infrared signals, further improving gesture recognition precision, which is another important inventive point of the invention.
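As an illustration of this fusion step, the sketch below applies a simple constant-position Kalman filter that treats the infrared spot center and the depth-camera fingertip position as two noisy measurements of the same 2-D press position. The class interface and the noise covariances are illustrative assumptions; the patent does not specify these values.

```python
import numpy as np

# Hedged sketch of Kalman-filter fusion of the two operation-data positions.
class PositionFuser:
    def __init__(self, q=1e-3, r_ir=4.0, r_depth=2.0):
        self.x = None                        # fused 2-D position estimate
        self.P = np.eye(2) * 1e3             # estimate covariance (initially uncertain)
        self.Q = np.eye(2) * q               # process noise (static-position model)
        self.R_ir = np.eye(2) * r_ir         # IR spot-center measurement noise
        self.R_depth = np.eye(2) * r_depth   # depth fingertip measurement noise

    def _update(self, z, R):
        z = np.asarray(z, dtype=float)
        if self.x is None:
            self.x = z
            return self.x
        # Predict (position assumed static between measurements), then correct with z.
        self.P = self.P + self.Q
        K = self.P @ np.linalg.inv(self.P + R)      # Kalman gain
        self.x = self.x + K @ (z - self.x)
        self.P = (np.eye(2) - K) @ self.P
        return self.x

    def fuse(self, ir_position, depth_position):
        # Sequentially incorporate both measurements for the current frame.
        self._update(ir_position, self.R_ir)
        return self._update(depth_position, self.R_depth)
```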
An updating step S105 updates the display content of the projection unit on the user operation interface based on the user operation data. In the invention, the user operation data is either user marking data or a user call to another function.
In one embodiment, the update is implemented as follows: the computing board sends the user operation data to the projection unit; after acquiring the user operation data, the projection unit determines its type; if the user operation data is user marking data, the corresponding mark is drawn directly on the projection content; if the user operation data is a call to another function, the application or function stored in the computing board is called to obtain the updated display content, which is then displayed on the user operation interface.
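A minimal sketch of this dispatch is shown below. The projector and compute_board interfaces and the dictionary layout of the fused user operation data are illustrative assumptions, not names taken from the patent.

```python
# Hedged sketch of the update step: draw a mark or invoke a stored function.
def update_display(projector, compute_board, user_operation):
    if user_operation["type"] == "mark":
        # Marking data: draw the stroke directly onto the projected content.
        projector.draw_mark(user_operation["points"])
    else:
        # Function call: run the application/function stored on the computing
        # board and project the display content it returns.
        content = compute_board.call_function(user_operation["function"],
                                              user_operation.get("args", {}))
        projector.show_interface(content)
```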
The operation panel (i.e. the user operation interface) of the invention offers various schemes and styles for the user to choose from, meeting the different interaction requirements of different types of users; once configured, the panel is loaded automatically. It also supports more flexible region selection (after a region is delimited with a brush or range-selection tool, the content in the region can be stored, recognized, transmitted and so on) and automatic playback of interaction trajectories: for later teaching or demonstration, once the operation flow has been stored in the computing board, the trajectory of the operation can first be displayed by projection and the corresponding operations executed at the appropriate times, realizing automatic display and execution. This is another important inventive point of the invention.
Fig. 2 shows a user interaction device of the present invention, the device comprising: the system comprises a projection unit, a signal emission unit, an infrared camera, a depth camera and a calculation board; the projection unit is configured to project a user operation interface on a plane, the signal emission unit is configured to generate an infrared grating parallel to the user interface, and the infrared grating is adjacent to the user operation interface; the infrared camera acquires first operation data of a user on a user interface through an infrared grating and sends the first operation data to the computing board; the depth camera acquires second operation data of a user on a user interface and sends the second operation data to the computing board; the computing board fuses the first operation data and the second operation data to obtain user operation data, and sends the user operation data to the projection unit; and after receiving the user operation data, the projection unit updates the display content of the projection unit on the user operation interface based on the user operation.
The device of the invention can be an intelligent desk lamp. The upper part of the desk lamp is provided with the projection unit (i.e. a projector), the infrared camera and the depth camera, and a computing board is arranged inside the desk lamp; the computing board has at least a processor and a memory and is used for completing data processing and the like. The signal emission unit is arranged at the bottom of the desk lamp. The projection unit projects the operation interface on a desktop, and the signal emission unit (for example an infrared laser) generates an infrared grating parallel to the user interface, the infrared grating being close to the user operation interface, where "close" generally means a distance of 1-2 mm.
The projection unit is configured to project a user operation interface on a plane, the signal emission unit is configured to generate an infrared grating parallel to the user interface, and the infrared grating is adjacent to the user operation interface, which can be realized by the following operations:
the first step is as follows: initializing the projector, focusing, performing trapezoidal correction, performing coincidence and calibration judgment of picture signals until the projection is clear, and displaying a loaded user operation interface.
The second step is that: an infrared laser located at the bottom end of the device emits infrared beams in a diffuse manner, each beam being at a prescribed distance of 1mm from the plane.
The third step: the infrared camera shoots the grating state and processes the grating state to obtain light spot information, if the light spot information is judged to be non-planar by the computing board, the projection content is updated to be in an error state, and a user is reminded to adjust the position until the position becomes a normal planar grating.
The fourth step: the projector acquires the setting of the current user from the computing board and projects a formal user operation interface according to the setting of the current user.
Through these operations, automatic correction of the projector is realized, and a user operation interface corresponding to the user's settings is projected, which facilitates user operation. Placing the signal emitting unit at the bottom of the intelligent desk lamp also solves the prior-art problem that the emitter must be placed on the interactive plane, which limits the form of the projector; as a result, occlusion in the horizontal direction can be handled and objects with height can also be handled. This is one of the important inventive points.
In one embodiment, the step of the infrared camera acquiring first operation data of a user on the user interface through the infrared grating and sending the first operation data to the computing board includes: when a user operates on the user operation interface with a hand, the infrared light emitted by the signal emitting unit is blocked by a finger and reflected to form a light spot; the infrared camera continuously photographs the grating state, obtains the information forming the light spot through filtering processing, and then transmits each frame of information to the computing board for storage and analysis; when the computing board judges that the current frame of light-spot information is a pressing event, it retrieves the stored light-spot blocking information of the previous N frames, and determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information.
Preferably, the formation of the light spot by the infrared light reflected after being blocked by the finger is as follows: the user presses, with a finger (or another occluder), the position at which interaction is needed; when the distance to the plane is less than 1 mm the infrared beam is blocked, and the blocked part acts as a reflecting surface that reflects the emitted infrared light to form the so-called light spot, whose position can be captured by the infrared camera. The infrared camera continuously photographs the grating state, and the information forming the light spot is obtained through filtering processing as follows: the infrared camera continuously photographs and records the infrared light distribution on the current plane; having acquired the distribution, it processes the captured image with several filtering algorithms to obtain the position and shape of the blocked part of the infrared light; the infrared camera then performs standardized adjustment on the obtained light-spot information and transmits it over a connected data cable to the computing board, where it is stored.
In one embodiment, the step of acquiring second operation data of the user on the user interface by the depth camera and sending the second operation data to the computing board includes: the depth camera shoots a scene of a user operation interface by using the binocular camera and sends the shot image to the computing board for storage, the computing board calculates depth information of each part in the scene of the user operation interface through the images shot by the two cameras, the computing board is used for obtaining the hand position of a user in the image, the computing board determines the motion of the hand of the user based on the depth information and the hand position, when the computing board judges that the motion of the hand of the user in the current frame image is the user operation pressing event, the computing board obtains the stored image of the first N frames of the current frame image from the computing board, and the computing board determines second operation data through the current frame image and the image of the first N frames and sends the second operation data to the computing board.
Specifically, the computing board determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information as follows; the flow is described for infrared information with a camera frame rate of 50 frames per second:
when the computing board judges that the current reflected-light position corresponds to a pressing behavior of the user, a duration must also be judged: the behavior is counted as a real pressing event only if it lasts for 100 ms (i.e. 5 frames), and the corresponding processing method is then called;
when a pressing behavior has so far been detected in only one frame, the computing board starts a query: it first obtains the user's behavior type in the previous frame, and if that frame is a pressing behavior at the same position, it continues to obtain the behavior type in the frame before that. When an illegal behavior is encountered (a pressing action at a different position, or a non-pressing action), special handling is applied: that frame is skipped and one more frame further back is read.
There are two cases at this point: 1. If that earlier frame is also illegal, the query terminates, the current frame is not counted as a real pressing event, and the multi-frame judgment ends; the computing board then waits for the user behavior of the next frame and judges again. 2. If that earlier frame is a pressing behavior at the same position, the illegal behavior encountered before it is marked as error data and treated as a pressing behavior at the same position. After this query and special handling, if the computing board finds that there are five consecutive frames of pressing at the same position, it regards the action as a real pressing event and the multi-frame judgment ends. The multi-frame judgment for the depth images is the same as the infrared multi-frame judgment described above and is not repeated here.
Preferably, the acquisition of the user's hand position in the image by the computing board is as follows: the depth camera shoots the scene with a binocular camera; a preliminary distance is obtained from the light reflection, and the detailed depth information of each part of the scene is calculated by further aggregating and processing the information from the two cameras, i.e. an image plus complete RGB-D information of the current scene is obtained; this information is preprocessed and slightly corrected using white balance and histogram equalization. After the overall depth information is obtained, the computing board processes the captured scene picture with a deployed mobile-ssd detection network to obtain the rough position of the user's hand; the hand position is then combined with the depth information, and the positions of the skeletal joint points are further predicted with a convolutional neural network with an hourglass structure, so that the current hand posture of the user is obtained, and the hand action is derived from it and stored.
In determining the hand action, the hourglass network is first used to generate a heat map for each hand joint point k (the heat map is a probability map with the same pixel layout as the image, but the value at each pixel is the probability that this pixel is the given joint; further joint information is analyzed from these probabilities). From the predicted heat map, the position P of hand joint point k in the image is then obtained (a further correction is applied to the predicted position to obtain more accurate position information); the corresponding formulas are given only as figures in the original document. The gesture is then classified: each gesture class specifies a position region for each joint point, and the current action is determined once every joint point lies in its corresponding region. This hand-action pipeline and its formulas are referred to as the advanced depth judgment model.
The computing board judges that the motion of the user's hand in the current frame image is a pressing event as follows: once the hand action has been analyzed, if the distance between the hand and the projection plane is judged to be less than 1 mm, the user action is judged to be pressing the plane; once a pressing event has been judged, in order to analyze the specific action of the user, the user action information of the previous frames is retrieved from storage and also used as source data for the next analysis.
In the invention, the action analysis is not performed on the current frame alone but jointly with the states of the previous frames; this analysis of the dynamics makes the judgment of the user action more accurate, so that more accurate control is realized. The invention obtains rich gesture actions from the acquired depth images through the advanced depth judgment model, and more and richer interaction methods can be realized based on the user's gestures, so that subsequent functions can be expanded. This is another important inventive point of the invention.
The computing board fuses the first operation data and the second operation data to obtain user operation data as follows: the first operation data and the second operation data are processed by a Kalman filtering method to obtain the user operation data.
Kalman filtering is used to fuse low-level, real-time, dynamic, redundant multi-sensor data; the statistical characteristics of the measurement model are used recursively to determine the statistically optimal fusion and data estimate. The process of fusing the first operation data and the second operation data is as follows:
the sensors on the device (i.e. the infrared camera and the depth camera) acquire the data of the infrared-observed target and the depth-observed target (i.e. the first operation data and the second operation data);
the computing board performs a feature-extraction transformation on the two output data streams (i.e. the first operation data and the second operation data, which may be discrete or continuous time-function data, output vectors, imaging data or a direct attribute description), extracting feature vectors Yi that represent the two data streams;
pattern recognition is performed on the feature vectors Yi to complete each sensor's description of the target; the sensors' descriptions are then grouped, i.e. associated by target (associating the first operation data with the second operation data); finally, the data of all sensors for the target are synthesized by a stochastic algorithm, the Kalman filtering method, to obtain a consistent interpretation and description of the target. In this way the gesture action is determined jointly from the depth camera and the infrared signals, further improving gesture recognition precision, which is another important inventive point of the invention.
In the invention, the user operation data is either user marking data or a user call to another function. In one embodiment, the update is implemented as follows: the computing board sends the user operation data to the projection unit; after acquiring the user operation data, the projection unit determines its type; if the user operation data is user marking data, the corresponding mark is drawn directly on the projection content; if the user operation data is a call to another function, the application or function stored in the computing board is called to obtain the updated display content, which is then displayed on the user operation interface.
The operation panel (i.e. the user operation interface) of the invention offers various schemes and styles for the user to choose from, meeting the different interaction requirements of different types of users; once configured, the panel is loaded automatically. It also supports more flexible region selection (after a region is delimited with a brush or range-selection tool, the content in the region can be stored, recognized, transmitted and so on) and automatic playback of interaction trajectories: for later teaching or demonstration, once the operation flow has been stored in the computing board, the trajectory of the operation can first be displayed by projection and the corresponding operations executed at the appropriate times, realizing automatic display and execution. This is another important inventive point of the invention.
The main technical effects of the invention are as follows: in the action analysis, not only the current frame but also the states of the previous frames are analyzed jointly, and this analysis of the dynamics makes the judgment of the user action more accurate, so that more accurate control is realized; rich gesture actions are obtained from the acquired depth images by the advanced depth judgment model, so that more and richer interaction methods can be realized based on the user's gestures and subsequent functions can be expanded; the gesture actions are determined jointly from the infrared signals and the depth camera, which further improves gesture recognition precision; in addition, automatic correction of the projector is realized.
For convenience of description, the above device is described as being divided into various units by function, each described separately. Of course, the functionality of these units may be implemented in one or more pieces of software and/or hardware when implementing the present application.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments of the present application.
Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art will understand that modifications and equivalents may be made without departing from the spirit and scope of the invention, which is defined by the appended claims.

Claims (10)

1. A method of user interaction, the method comprising:
the method comprises the following steps of initializing, projecting a user operation interface on a plane by using a projection unit, and generating an infrared grating parallel to the user interface through a signal emission unit, wherein the infrared grating is adjacent to the user operation interface;
a first acquisition step, namely acquiring first operation data of a user on a user interface through an infrared grating by using an infrared camera;
a second acquisition step of acquiring second operation data of the user on the user interface by using the depth camera;
a fusion step, fusing the first operation data and the second operation data to obtain user operation data;
and updating the display content of the projection unit on a user operation interface based on the user operation.
2. The method of claim 1, wherein the first obtaining step comprises:
when a user operates on a user operation interface through a hand, infrared light emitted by the signal emitting unit is shielded by fingers and then reflected to form light spots, the infrared camera continuously shoots a grating state, information forming the light spots is obtained through filtering processing, then each frame of information is transmitted to the computing board to be stored and analyzed, when the computing board judges that the current frame of light spot information is a pressing event, the computing board obtains the light spot shielding information of the first N frames of the stored current frame of light spot information from the computing board, and the computing board determines the first operation data through the current frame of light spot information and the light spot information of the first N frames.
3. The method of claim 2, wherein the second acquisition step comprises: the depth camera photographs the scene of the user operation interface with its binocular cameras and sends the captured images to the computing board for storage; the computing board calculates depth information for each part of the scene from the images captured by the two cameras, obtains the position of the user's hand in the image, and determines the action of the user's hand based on the depth information and the hand position; when the computing board judges that the hand action in the current frame image is a pressing event, it retrieves the stored images of the N frames preceding the current frame and determines the second operation data from the current frame image together with the images of the previous N frames.
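For illustration only (not part of the claim): a minimal sketch of deriving scene depth from the two camera images with OpenCV block matching; the focal length and baseline are hypothetical calibration values, and locating the hand in the image is omitted.

    import cv2
    import numpy as np

    def depth_map(left_gray: np.ndarray, right_gray: np.ndarray,
                  focal_px: float = 700.0, baseline_m: float = 0.06) -> np.ndarray:
        stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
        disparity = stereo.compute(left_gray, right_gray).astype(np.float32) / 16.0
        disparity[disparity <= 0] = np.nan         # reject invalid matches
        return focal_px * baseline_m / disparity   # depth = f * b / disparity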
4. The method according to claim 3, wherein fusing the first operation data and the second operation data to obtain the user operation data comprises: processing the first operation data and the second operation data with a Kalman filtering method to obtain the user operation data.
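For illustration only (not part of the claim): a minimal sketch of a scalar Kalman update that fuses an infrared-derived and a depth-derived press coordinate; the measurement variances r_ir and r_depth and the process noise q are assumed values, not specified by the claims.

    def kalman_fuse(x, p, z_ir, z_depth, r_ir=4.0, r_depth=9.0, q=1.0):
        # x, p: fused coordinate and its variance from the previous frame.
        p = p + q                          # predict: add process noise
        for z, r in ((z_ir, r_ir), (z_depth, r_depth)):
            k = p / (p + r)                # Kalman gain for this measurement
            x = x + k * (z - x)            # correct toward the measurement
            p = (1.0 - k) * p              # variance shrinks after the update
        return x, p

One such update per frame, applied to each coordinate axis, weights the two sensors by their assumed noise, which is one common way to realise the claimed fusion.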
5. The method according to any one of claims 1-4, wherein the user operation data is user marking data or a user invocation of another function.
6. The method according to claim 2, wherein determining, by the computing board, the first operation data from the current frame of light spot information and the light spot information of the previous N frames comprises: the computing board determines the user's finger action from the current frame of light spot information and the previous N frames of light spot information to obtain the user's hand trajectory information; it then acquires the projection content currently displayed by the projection unit on the user operation interface and, based on the trajectory information, judges the function associated with the pressing position to determine the first operation data, wherein the pressing position is the light spot center point calculated as the mean of the spot coordinates.
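For illustration only (not part of the claim): a minimal sketch, assuming NumPy, of computing the pressing position as the mean of the spot pixel coordinates and hit-testing it against the currently projected controls; the controls dictionary and its rectangles are hypothetical.

    import numpy as np

    def press_position(spot_pixels: np.ndarray) -> tuple:
        # spot_pixels: array of shape (K, 2) holding (x, y) spot coordinates.
        return tuple(spot_pixels.mean(axis=0))    # centre point via the mean

    def function_at(pos, controls):
        # controls: {name: (x0, y0, x1, y1)} rectangles in interface coordinates.
        x, y = pos
        for name, (x0, y0, x1, y1) in controls.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                return name
        return None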
7. The method of claim 3, wherein determining, by the computing board, the second operation data from the current frame image and the images of the previous N frames comprises: the computing board obtains the user's hand trajectory information from the user's hand motions in the current frame image and the previous N frames of images; it then acquires the projection content currently displayed by the projection unit on the user operation interface and, based on the trajectory information, judges the function associated with the pressing position to determine the second operation data, wherein the pressing position is the fingertip position.
8. The method of claim 5, wherein the updating step comprises: the computing board sends the user operation data to the projection unit; after receiving the user operation data, the projection unit determines its type; if the user operation data is user marking data, the corresponding mark is drawn directly on the projected content; and if the user operation data is a user invocation of another function, the application or function stored on the computing board is called to obtain the updated display content for the user operation interface, and that content is displayed on the user operation interface.
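For illustration only (not part of the claim): a minimal sketch of the updating step as a dispatch on the type of the user operation data; the data classes and the draw_mark/invoke/show calls are hypothetical placeholders.

    from dataclasses import dataclass

    @dataclass
    class MarkData:
        points: list          # stroke to draw on the projected content

    @dataclass
    class FunctionCall:
        name: str             # application or function stored on the board

    def apply_update(op, projector, board):
        if isinstance(op, MarkData):
            projector.draw_mark(op.points)      # draw the mark directly
        elif isinstance(op, FunctionCall):
            content = board.invoke(op.name)     # run the stored function
            projector.show(content)             # display the updated content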
9. A user interaction device, the device comprising: a projection unit, a signal emission unit, an infrared camera, a depth camera, and a computing board;
wherein the projection unit is configured to project a user operation interface onto a plane, the signal emission unit is configured to generate an infrared grating parallel to the user operation interface, and the infrared grating is adjacent to the user operation interface;
the infrared camera acquires first operation data of a user on the user operation interface through the infrared grating and sends the first operation data to the computing board;
the depth camera acquires second operation data of the user on the user operation interface and sends the second operation data to the computing board;
the computing board fuses the first operation data and the second operation data to obtain user operation data, and sends the user operation data to the projection unit;
and after receiving the user operation data, the projection unit updates its display content on the user operation interface based on the user operation data.
10. The device of claim 9, wherein the infrared camera acquiring the first operation data of the user on the user operation interface through the infrared grating and sending the first operation data to the computing board comprises: when the user operates on the user operation interface with a hand, infrared light emitted by the signal emission unit is blocked by a finger and reflected to form a light spot; the infrared camera continuously captures the state of the grating, obtains the information forming the light spot through filtering, and transmits each frame of information to the computing board for storage and analysis; when the computing board judges from the current frame of light spot information that a pressing event has occurred, it retrieves the stored light spot occlusion information of the N frames preceding the current frame and determines the first operation data from the current frame of light spot information together with the light spot information of the previous N frames.
CN202010370009.7A 2020-05-06 2020-05-06 User interaction method and device Active CN111258411B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010370009.7A CN111258411B (en) 2020-05-06 2020-05-06 User interaction method and device


Publications (2)

Publication Number Publication Date
CN111258411A true CN111258411A (en) 2020-06-09
CN111258411B CN111258411B (en) 2020-08-14

Family

ID=70950027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010370009.7A Active CN111258411B (en) 2020-05-06 2020-05-06 User interaction method and device

Country Status (1)

Country Link
CN (1) CN111258411B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106204604A (en) * 2016-04-29 2016-12-07 北京仁光科技有限公司 Projection touch control display apparatus and exchange method thereof
CN108537827A (en) * 2018-03-23 2018-09-14 上海数迹智能科技有限公司 A kind of real-time low complex degree finger motion locus shape recognition algorithm based on depth map
CN110221732A (en) * 2019-05-15 2019-09-10 青岛小鸟看看科技有限公司 A kind of touch control projection system and touch action recognition methods
CN110310336A (en) * 2019-06-10 2019-10-08 青岛小鸟看看科技有限公司 A kind of touch control projection system and image processing method
CN110308817A (en) * 2019-06-10 2019-10-08 青岛小鸟看看科技有限公司 A kind of touch action recognition methods and touch control projection system

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112558818A (en) * 2021-02-19 2021-03-26 北京深光科技有限公司 Projection-based remote live broadcast interaction method and system
CN112558818B (en) * 2021-02-19 2021-06-08 北京深光科技有限公司 Projection-based remote live broadcast interaction method and system
WO2022174706A1 (en) * 2021-02-19 2022-08-25 北京深光科技有限公司 Remote live streaming interaction method and system based on projection
CN114138121A (en) * 2022-02-07 2022-03-04 北京深光科技有限公司 User gesture recognition method, device and system, storage medium and computing equipment
CN114138121B (en) * 2022-02-07 2022-04-22 北京深光科技有限公司 User gesture recognition method, device and system, storage medium and computing equipment
CN114245093A (en) * 2022-02-25 2022-03-25 北京深光科技有限公司 Projection operation method based on infrared and thermal sensing, electronic device and storage medium
CN114721552A (en) * 2022-05-23 2022-07-08 北京深光科技有限公司 Touch identification method, device, equipment and medium based on infrared and visible light
CN114721552B (en) * 2022-05-23 2022-08-23 北京深光科技有限公司 Touch identification method, device, equipment and medium based on infrared and visible light
CN117075730A (en) * 2023-08-18 2023-11-17 广东早安文化发展有限公司 3D virtual exhibition hall control system based on image recognition technology
CN117075730B (en) * 2023-08-18 2024-04-30 广东早安文化发展有限公司 3D virtual exhibition hall control system based on image recognition technology

Also Published As

Publication number Publication date
CN111258411B (en) 2020-08-14

Similar Documents

Publication Publication Date Title
CN111258411B (en) User interaction method and device
US7161596B2 (en) Display location calculation means
US7256772B2 (en) Auto-aligning touch system and method
US8602887B2 (en) Synthesis of information from multiple audiovisual sources
US6624833B1 (en) Gesture-based input interface system with shadow detection
EP0961965B1 (en) Method and system for gesture based option selection
US8854433B1 (en) Method and system enabling natural user interface gestures with an electronic system
CN104364733A (en) Position-of-interest detection device, position-of-interest detection method, and position-of-interest detection program
KR20020086931A (en) Single camera system for gesture-based input and target indication
JP2005353071A (en) Pointing input system and method using array sensors
JP2014170511A (en) System, image projection device, information processing device, information processing method, and program
CN114138121B (en) User gesture recognition method, device and system, storage medium and computing equipment
CN111078018A (en) Touch control method of display, terminal device and storage medium
US20160259402A1 (en) Contact detection apparatus, projector apparatus, electronic board apparatus, digital signage apparatus, projector system, and contact detection method
WO2021040896A1 (en) Automatically generating an animatable object from various types of user input
CN111258410B (en) Man-machine interaction equipment
CN113760131B (en) Projection touch processing method and device and computer readable storage medium
EP3910451A1 (en) Display systems and methods for aligning different tracking means
JP2017219942A (en) Contact detection device, projector device, electronic blackboard system, digital signage device, projector device, contact detection method, program and recording medium
JP4296607B2 (en) Information input / output device and information input / output method
CN110213407B (en) Electronic device, operation method thereof and computer storage medium
CN114167997B (en) Model display method, device, equipment and storage medium
CN211827195U (en) Interactive device
CN114721552B (en) Touch identification method, device, equipment and medium based on infrared and visible light
JP2002107127A (en) Shape-measuring apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant