CN111258411A - User interaction method and device

User interaction method and device

Info

Publication number
CN111258411A
Authority
CN
China
Prior art keywords
user
operation data
computing board
information
interface
Prior art date
Legal status
Granted
Application number
CN202010370009.7A
Other languages
Chinese (zh)
Other versions
CN111258411B (en)
Inventor
冯翀
郭嘉伟
罗观洲
马宇航
Current Assignee
Beijing Shenguang Technology Co Ltd
Original Assignee
Beijing Shenguang Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Shenguang Technology Co Ltd filed Critical Beijing Shenguang Technology Co Ltd
Priority to CN202010370009.7A priority Critical patent/CN111258411B/en
Publication of CN111258411A publication Critical patent/CN111258411A/en
Application granted granted Critical
Publication of CN111258411B publication Critical patent/CN111258411B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/002 Specific input/output arrangements not covered by G06F 3/01 - G06F 3/16
    • G06F 3/005 Input arrangements through a video camera
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention provides a user interaction method and device. The method mainly comprises the following steps: acquiring first operation data of a user on a user interface through an infrared grating by using an infrared camera; acquiring second operation data of the user on the user interface by using a depth camera; fusing the first operation data and the second operation data to obtain user operation data; and updating the display content of the projection unit on the user operation interface based on the user operation data. Because each action is judged not only from the current frame but also from the states of the previous frames, the judgment of the user action is more accurate, so that more accurate control is realized. Rich gesture actions are obtained from the acquired depth images by an advanced depth judgment model, so that more and richer interaction methods can be realized based on the user's gestures and subsequent functions can be expanded. The gesture actions are determined jointly from the infrared signals and the depth camera, which further improves gesture recognition precision; in addition, automatic correction of the projector is realized.

Description

User interaction method and device
Technical Field
The invention relates to the technical field of human-computer interaction, in particular to a user interaction method and user interaction equipment.
Background
Human-computer interaction is the study of the interactive relationship between a system and its users. The system may be any of a variety of machines, or a computerized system and its software. The human-computer interaction interface generally refers to the portion of the system visible to the user, through which the user communicates with the system and performs operations, such as the play button of a radio, the instrument panel of an airplane or the control room of a power plant. A human-machine interface is designed to match the user's understanding of the system (i.e., the user's mental model) so that the system is usable and user-friendly.
In the prior art, the touch control scheme used by interactive projectors is basically an infrared planar-scanning scheme: an infrared emitter is placed at a fixed height above a desktop, and the blocking of the beam by an object (such as a finger) is identified as a click event. The disadvantages of this solution are: the emitter must be placed on the interactive plane, which limits the form of the projector; occlusion in the horizontal direction cannot be handled, and objects with height cannot be handled; any object may trigger a false touch; the precision needs to be improved; and the projector interface cannot be automatically corrected.
In addition, in the prior art, whether the user action is obtained from infrared signals or from video, the action is captured from the current frame only, so the recognition precision is low. Moreover, the user action is obtained from a single modality; it cannot be captured from two or more signals at the same time, and the gesture precision obtained from a single signal is low. How to improve the recognition precision of user actions is therefore a key and difficult point of human-computer interaction.
Disclosure of Invention
The present invention provides the following technical solutions to overcome the above-mentioned drawbacks in the prior art.
A method of user interaction, the method comprising:
an initialization step of projecting a user operation interface on a plane by using a projection unit and generating, through a signal emission unit, an infrared grating parallel to the user interface, wherein the infrared grating is adjacent to the user operation interface;
a first acquisition step, namely acquiring first operation data of a user on a user interface through an infrared grating by using an infrared camera;
a second acquisition step of acquiring second operation data of the user on the user interface by using the depth camera;
a fusion step, fusing the first operation data and the second operation data to obtain user operation data;
and an updating step of updating the display content of the projection unit on the user operation interface based on the user operation data.
Further, the first acquiring step includes:
when a user operates on the user operation interface with a hand, the infrared light emitted by the signal emitting unit is blocked by a finger and reflected to form a light spot; the infrared camera continuously photographs the grating state, obtains the information forming the light spot through filtering processing, and then transmits each frame of information to the computing board for storage and analysis; when the computing board judges that the current frame of light-spot information is a pressing event, it retrieves the stored light-spot blocking information of the previous N frames, and determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information.
Further, the second acquiring step includes: the depth camera shoots the scene of the user operation interface with a binocular camera and sends the captured images to the computing board for storage; the computing board calculates the depth information of each part of the scene from the images captured by the two cameras and obtains the hand position of the user in the image; the computing board determines the action of the user's hand based on the depth information and the hand position; when the computing board judges that the action of the user's hand in the current frame image is a pressing event, it retrieves the stored images of the previous N frames, and determines the second operation data from the current frame image and the previous N frames of images.
Further, the step of fusing the first operation data and the second operation data to obtain user operation data is as follows: and processing the first operation data and the second operation data by a Kalman filtering method to obtain user operation data.
Further, the user operation data is user marking data or a user call to another function.
Further, the operation of the computing board determining the first operation data from the current frame of light-spot information and the previous N frames of light-spot information is: the computing board determines the finger action of the user from the current frame of light-spot information and the previous N frames of light-spot information to obtain the hand trajectory information of the user; it then acquires the projection content currently on the user operation interface of the projection unit and judges, based on the trajectory information, the function related to the pressing position, so as to determine the first operation data, wherein the pressing position is the spot center position calculated as a mean value.
Further, the operation of the computing board determining the second operation data from the current frame image and the previous N frames of images is: the computing board obtains the hand trajectory information of the user from the specific hand motions of the user in the current frame image and the previous N frames of images; it then acquires the projection content currently on the user operation interface of the projection unit and judges, based on the trajectory information, the function related to the pressing position, so as to determine the second operation data, wherein the pressing position is the fingertip position.
Further, the updating step includes: the computing board sends the user operation data to the projection unit; after acquiring the user operation data, the projection unit determines its type; if the user operation data is user marking data, the corresponding mark is drawn directly on the projection content; if the user operation data is a call to another function, the application or function stored in the computing board is called to obtain the updated display content, which is then displayed on the user operation interface.
The invention also proposes a user interaction device, comprising: the system comprises a projection unit, a signal emission unit, an infrared camera, a depth camera and a calculation board;
the projection unit is configured to project a user operation interface on a plane, the signal emission unit is configured to generate an infrared grating parallel to the user interface, and the infrared grating is adjacent to the user operation interface;
the infrared camera acquires first operation data of a user on a user interface through an infrared grating and sends the first operation data to the computing board;
the depth camera acquires second operation data of a user on a user interface and sends the second operation data to the computing board;
the computing board fuses the first operation data and the second operation data to obtain user operation data, and sends the user operation data to the projection unit;
and after receiving the user operation data, the projection unit updates the display content of the projection unit on the user operation interface based on the user operation.
Further, the step of the infrared camera acquiring first operation data of a user on the user interface through the infrared grating and sending the first operation data to the computing board includes: when a user operates on the user operation interface with a hand, the infrared light emitted by the signal emitting unit is blocked by a finger and reflected to form a light spot; the infrared camera continuously photographs the grating state, obtains the information forming the light spot through filtering processing, and then transmits each frame of information to the computing board for storage and analysis; when the computing board judges that the current frame of light-spot information is a pressing event, it retrieves the stored light-spot blocking information of the previous N frames, and determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information.
Further, the acquiring, by the depth camera, of second operation data of the user on the user interface and sending of the second operation data to the computing board includes: the depth camera shoots the scene of the user operation interface with a binocular camera and sends the captured images to the computing board for storage; the computing board calculates the depth information of each part of the scene from the images captured by the two cameras and obtains the hand position of the user in the image; the computing board determines the motion of the user's hand based on the depth information and the hand position; when the computing board judges that the motion of the user's hand in the current frame image is a pressing event, it retrieves the stored images of the previous N frames, and determines the second operation data from the current frame image and the previous N frames of images.
Further, the step of fusing the first operation data and the second operation data to obtain user operation data is as follows: and the computing board carries out Kalman filtering processing on the first operation data and the second operation data to obtain user operation data.
Further, the user operation data is user marking data or a user call to another function.
Further, the operation of the computing board determining the first operation data from the current frame of light-spot information and the previous N frames of light-spot information is: the computing board determines the finger action of the user from the current frame of light-spot information and the previous N frames of light-spot information to obtain the hand trajectory information of the user; it then acquires the projection content currently on the user operation interface of the projection unit and judges, based on the trajectory information, the function related to the pressing position, so as to determine the first operation data, wherein the pressing position is the spot center position calculated as a mean value.
Further, the operation of the computing board determining the second operation data from the current frame image and the previous N frames of images is: the computing board obtains the hand trajectory information of the user from the specific hand motions of the user in the current frame image and the previous N frames of images; it then acquires the projection content currently on the user operation interface of the projection unit and judges, based on the trajectory information, the function related to the pressing position, so as to determine the second operation data, wherein the pressing position is the fingertip position.
Further, the updating of the display content of the projection unit on the user operation interface includes: the computing board sends the user operation data to the projection unit; after acquiring the user operation data, the projection unit determines its type; if the user operation data is user marking data, the corresponding mark is drawn directly on the projection content; if the user operation data is a call to another function, the application or function stored in the computing board is called to obtain the updated display content, which is then displayed on the user operation interface.
The invention has the following technical effects. The disclosed user interaction method comprises: an initialization step of projecting a user operation interface on a plane by using a projection unit and generating, through a signal emission unit, an infrared grating parallel to the user interface, wherein the infrared grating is adjacent to the user operation interface; a first acquisition step of acquiring first operation data of a user on the user interface through the infrared grating by using an infrared camera; a second acquisition step of acquiring second operation data of the user on the user interface by using a depth camera; a fusion step of fusing the first operation data and the second operation data to obtain user operation data; and an updating step of updating the display content of the projection unit on the user operation interface based on the user operation data. The main advantages of the invention are: in the action analysis, not only the current frame but also the states of the previous frames are analyzed jointly, and this analysis of the dynamics makes the judgment of the user action more accurate, so that more accurate control is realized; rich gesture actions are obtained from the acquired depth images by an advanced depth judgment model, so that more and richer interaction methods can be realized based on the user's gestures and subsequent functions can be expanded; the gesture actions are determined jointly from the infrared signals and the depth camera, which further improves gesture recognition precision; in addition, automatic correction of the projector is realized.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings.
FIG. 1 is a flow chart of a method of user interaction according to one of the embodiments of the invention.
FIG. 2 is a schematic diagram of a user interaction device in accordance with one of the embodiments of the present invention.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
FIG. 1 illustrates a user interaction method of the present invention, the method comprising:
and an initialization step S101, projecting a user operation interface on a plane by using a projection unit, and generating an infrared grating parallel to the user interface through a signal emission unit, wherein the infrared grating is adjacent to the user operation interface.
The method can be applied to an intelligent desk lamp. The upper part of the desk lamp is provided with the projection unit (i.e. a projector), the infrared camera and the depth camera, and a computing board is arranged inside the desk lamp; the computing board has at least a processor and a memory and is used for completing data processing and the like. The signal emission unit is arranged at the bottom of the desk lamp. The projection unit projects the operation interface on a desktop, and the signal emission unit (for example an infrared laser) generates an infrared grating parallel to the user interface, the infrared grating being close to the user operation interface, where "close" generally means a distance of 1-2 mm.
The initializing step S101 may specifically include the following steps:
the first step is as follows: initializing the projector, focusing, performing trapezoidal correction, performing coincidence and calibration judgment of picture signals until the projection is clear, and displaying a loaded user operation interface.
The second step is that: an infrared laser located at the bottom end of the device emits infrared beams in a diffuse manner, each beam being at a prescribed distance of 1mm from the plane.
The third step: the infrared camera shoots the grating state and processes the grating state to obtain light spot information, if the light spot information is judged to be non-planar by the computing board, the projection content is updated to be in an error state, and a user is reminded to adjust the position until the position becomes a normal planar grating.
The fourth step: the projector acquires the setting of the current user from the computing board and projects a formal user operation interface according to the setting of the current user.
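The four initialization steps above can be sketched as follows. This is a minimal illustrative sketch only: the projector, ir_laser, ir_camera and compute_board interfaces are assumptions made for readability, not an API defined by the patent.

```python
# Hypothetical initialization flow; all interface names are illustrative assumptions.
def initialize(projector, ir_laser, ir_camera, compute_board):
    # Step 1: focus, keystone-correct and calibrate until the projection is clear,
    # then show the loading interface.
    projector.autofocus()
    projector.keystone_correct()
    projector.show_loading_interface()

    # Step 2: the bottom-mounted infrared laser emits a grating about 1 mm above the plane.
    ir_laser.enable(height_mm=1.0)

    # Step 3: verify that the captured grating corresponds to a flat surface;
    # otherwise show an error state and wait for the user to adjust the device.
    while True:
        spots = ir_camera.capture_grating()
        if compute_board.grating_is_planar(spots):
            break
        projector.show_error("Place the device on a flat surface")

    # Step 4: load the current user's settings and project the formal interface.
    settings = compute_board.load_user_settings()
    projector.show_interface(settings)
```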
Through these steps, automatic correction of the projector is achieved, and a user operation interface corresponding to the user's settings is projected, which facilitates user operation. Placing the signal emitting unit at the bottom of the intelligent desk lamp also solves the prior-art problem that the emitter must be placed on the interactive plane, which limits the form of the projector; as a result, occlusion in the horizontal direction can be handled and objects with height can also be handled. This is one of the important inventive points.
A first obtaining step S102, obtaining first operation data of a user on a user interface through an infrared grating by using an infrared camera.
In one embodiment, when a user operates on the user operation interface with a hand, the infrared light emitted by the signal emitting unit is blocked by a finger and reflected to form a light spot; the infrared camera continuously photographs the grating state, obtains the information forming the light spot through filtering processing, and then transmits each frame of information to the computing board for storage and analysis; when the computing board judges that the current frame of light-spot information is a pressing event, it retrieves the stored light-spot blocking information of the previous N frames, and determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information.
Specifically, the computing board determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information as follows; the flow is described for infrared information with a camera frame rate of 50 frames per second:
when the computing board judges that the current reflected-light position corresponds to a pressing behavior of the user, a duration must also be judged: the behavior is counted as a real pressing event only if it lasts for 100 ms (i.e. 5 frames), and the corresponding processing method is then called;
when a pressing behavior has so far been detected in only one frame, the computing board starts a query: it first obtains the user's behavior type in the previous frame, and if that frame is a pressing behavior at the same position, it continues to obtain the behavior type in the frame before that. When an illegal behavior is encountered (a pressing action at a different position, or a non-pressing action), special handling is applied: that frame is skipped and one more frame further back is read.
There are two cases at this point: 1. If that earlier frame is also illegal, the query terminates, the current frame is not counted as a real pressing event, and the multi-frame judgment ends; the computing board then waits for the user behavior of the next frame and judges again. 2. If that earlier frame is a pressing behavior at the same position, the illegal behavior encountered before it is marked as error data and treated as a pressing behavior at the same position. After this query and special handling, if the computing board finds that there are five consecutive frames of pressing at the same position, it regards the action as a real pressing event and the multi-frame judgment ends.
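A minimal sketch of this multi-frame confirmation is given below. The 5-frame window (100 ms at 50 fps) and the tolerance for a single illegal frame follow the description above; the frame-history representation (a list of (is_press, position) tuples) and the same_position comparator are illustrative assumptions, not the patent's data structures.

```python
# Sketch of the multi-frame press confirmation; data layout is an assumption.
def is_real_press(history, same_position, required_frames=5):
    """history[-1] is the current frame, already detected as a press.
    history entries are (is_press, position) tuples, oldest first."""
    current_pos = history[-1][1]
    count = 1                      # the current frame itself
    i = len(history) - 2           # start querying the previous frame
    while count < required_frames and i >= 0:
        is_press, pos = history[i]
        if is_press and same_position(pos, current_pos):
            count += 1
            i -= 1
            continue
        # Illegal frame: skip it and read one more frame further back.
        if i - 1 >= 0:
            prev_press, prev_pos = history[i - 1]
            if prev_press and same_position(prev_pos, current_pos):
                # Mark the illegal frame as error data and treat it as a
                # same-position press, then continue the query.
                count += 2
                i -= 2
                continue
        # Two illegal frames in a row (or no earlier frame): not a real press.
        return False
    return count >= required_frames
```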
Preferably, the formation of the light spot by the infrared light reflected after being blocked by the finger is as follows: the user presses, with a finger (or another occluder), the position at which interaction is needed; when the distance to the plane is less than 1 mm the infrared beam is blocked, and the blocked part acts as a reflecting surface that reflects the emitted infrared light to form the so-called light spot, whose position can be captured by the infrared camera. The infrared camera continuously photographs the grating state, and the information forming the light spot is obtained through filtering processing as follows: the infrared camera continuously photographs and records the infrared light distribution on the current plane; having acquired the distribution, it processes the captured image with several filtering algorithms to obtain the position and shape of the blocked part of the infrared light; the infrared camera then performs standardized adjustment on the obtained light-spot information and transmits it over a connected data cable to the computing board, where it is stored.
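As an illustration of the filtering step, the sketch below thresholds the infrared frame and takes the mean of the spot pixels as the press position, matching the "spot center calculated as a mean" described earlier. The OpenCV calls are standard library functions, but the threshold value and blur kernel are illustrative assumptions rather than values from the patent.

```python
import cv2
import numpy as np

# Hedged sketch: extract the reflected spot and its center from one IR frame.
def extract_spot(ir_frame_gray, threshold=200):
    blurred = cv2.GaussianBlur(ir_frame_gray, (5, 5), 0)          # suppress noise
    _, mask = cv2.threshold(blurred, threshold, 255, cv2.THRESH_BINARY)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None                                               # no spot in this frame
    center = (float(xs.mean()), float(ys.mean()))                 # mean spot position
    return {"center": center, "shape_mask": mask}
```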
A second obtaining step S103, obtaining second operation data of the user on the user interface by using the depth camera.
In one embodiment, the depth camera shoots the scene of the user operation interface with a binocular camera and sends the captured images to the computing board for storage; the computing board calculates the depth information of each part of the scene from the images captured by the two cameras and obtains the hand position of the user in the image; the computing board determines the motion of the user's hand based on the depth information and the hand position; when the computing board judges that the motion of the user's hand in the current frame image is a pressing event, it retrieves the stored images of the previous N frames, and determines the second operation data from the current frame image and the previous N frames of images. The multi-frame judgment is the same as the infrared multi-frame judgment described above and is not repeated here.
Preferably, the acquisition of the user's hand position in the image by the computing board is as follows: the depth camera shoots the scene with a binocular camera; a preliminary distance is obtained from the light reflection, and the detailed depth information of each part of the scene is calculated by further aggregating and processing the information from the two cameras, i.e. an image plus complete RGB-D information of the current scene is obtained; this information is preprocessed and slightly corrected using white balance and histogram equalization. After the overall depth information is obtained, the computing board processes the captured scene picture with a deployed mobile-ssd detection network to obtain the rough position of the user's hand; the hand position is then combined with the depth information, and the positions of the skeletal joint points are further predicted with a convolutional neural network with an hourglass structure, so that the current hand posture of the user is obtained, and the hand action is derived from it and stored.
First, the hourglass network is used to generate a heat map for each hand joint point k (the heat map is a probability map with the same pixel layout as the image, but the value at each pixel is the probability that this pixel is the given joint; further joint information is analyzed from these probabilities). From the predicted heat map, the position P of hand joint point k in the image is then obtained (a further correction is applied to the predicted position to obtain more accurate position information); the corresponding formulas are given only as figures in the original document. The gesture is then classified: each gesture class specifies a position region for each joint point, and the current action is determined once every joint point lies in its corresponding region. This hand-action pipeline and its formulas are referred to as the advanced depth judgment model.
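Because the formula images are not reproduced in this text, the following is only a hedged sketch of the standard hourglass heat-map formulation that the surrounding description appears to follow; the symbols I (RGB-D input), H_k (heat map of joint k) and P_k (predicted joint position) are illustrative and are not taken from the original formulas.

```latex
% Hedged reconstruction (assumption), not the patent's original formulas.
\[
  H_k = f_{\mathrm{hourglass}}(I)_k , \qquad H_k(x, y) \in [0, 1]
\]
\[
  P_k = \operatorname*{arg\,max}_{(x, y)} H_k(x, y)
\]
```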
The computing board judges that the motion of the user's hand in the current frame image is a pressing event as follows: once the hand action has been analyzed, if the distance between the hand and the projection plane is judged to be less than 1 mm, the user action is judged to be pressing the plane; once a pressing event has been judged, in order to analyze the specific action of the user, the user action information of the previous frames is retrieved from storage and also used as source data for the next analysis.
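The rule-based gesture classification (each class specifying a region per joint) and the 1 mm press test can be sketched as below. GESTURE_REGIONS, the joint indexing and the depth representation are illustrative assumptions, not structures defined by the patent; only the "all joints inside their regions" rule and the 1 mm tolerance come from the description above.

```python
# Illustrative gesture regions: gesture name -> {joint index: (x_min, y_min, x_max, y_max)}.
GESTURE_REGIONS = {
    "press": {0: (0.40, 0.40, 0.60, 0.60)},   # placeholder values, assumption only
}

def classify_gesture(joint_positions):
    """joint_positions: {joint index: (x, y)} taken from the heat-map predictions."""
    for gesture, regions in GESTURE_REGIONS.items():
        matched = True
        for k, (x_min, y_min, x_max, y_max) in regions.items():
            pos = joint_positions.get(k)
            if pos is None or not (x_min <= pos[0] <= x_max and y_min <= pos[1] <= y_max):
                matched = False
                break
        if matched:
            return gesture
    return None

def is_press(hand_depth_mm, plane_depth_mm, tolerance_mm=1.0):
    # The action is judged to be a press when the hand is within 1 mm of the plane.
    return abs(hand_depth_mm - plane_depth_mm) < tolerance_mm
```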
In the invention, the action analysis is not performed on the current frame alone but jointly with the states of the previous frames; this analysis of the dynamics makes the judgment of the user action more accurate, so that more accurate control is realized. The invention obtains rich gesture actions from the acquired depth images through the advanced depth judgment model, and more and richer interaction methods can be realized based on the user's gestures, so that subsequent functions can be expanded. This is another important inventive point of the invention.
A fusion step S104 fuses the first operation data and the second operation data to obtain user operation data: the first operation data and the second operation data are processed by a Kalman filtering method to obtain the user operation data.
Kalman filtering is used to fuse low-level, real-time, dynamic, redundant multi-sensor data; the statistical characteristics of the measurement model are used recursively to determine the statistically optimal fusion and data estimate. The process of fusing the first operation data and the second operation data is as follows:
the sensors on the device (i.e. the infrared camera and the depth camera) acquire the data of the infrared-observed target and the depth-observed target (i.e. the first operation data and the second operation data);
the computing board performs a feature-extraction transformation on the two output data streams (i.e. the first operation data and the second operation data, which may be discrete or continuous time-function data, output vectors, imaging data or a direct attribute description), extracting feature vectors Yi that represent the two data streams;
pattern recognition is performed on the feature vectors Yi to complete each sensor's description of the target; the sensors' descriptions are then grouped, i.e. associated by target (associating the first operation data with the second operation data); finally, the data of all sensors for the target are synthesized by a stochastic algorithm, the Kalman filtering method, to obtain a consistent interpretation and description of the target. In this way the gesture action is determined jointly from the depth camera and the infrared signals, further improving gesture recognition precision, which is another important inventive point of the invention.
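As an illustration of this fusion step, the sketch below applies a simple constant-position Kalman filter that treats the infrared spot center and the depth-camera fingertip position as two noisy measurements of the same 2-D press position. The class interface and the noise covariances are illustrative assumptions; the patent does not specify these values.

```python
import numpy as np

# Hedged sketch of Kalman-filter fusion of the two operation-data positions.
class PositionFuser:
    def __init__(self, q=1e-3, r_ir=4.0, r_depth=2.0):
        self.x = None                        # fused 2-D position estimate
        self.P = np.eye(2) * 1e3             # estimate covariance (initially uncertain)
        self.Q = np.eye(2) * q               # process noise (static-position model)
        self.R_ir = np.eye(2) * r_ir         # IR spot-center measurement noise
        self.R_depth = np.eye(2) * r_depth   # depth fingertip measurement noise

    def _update(self, z, R):
        z = np.asarray(z, dtype=float)
        if self.x is None:
            self.x = z
            return self.x
        # Predict (position assumed static between measurements), then correct with z.
        self.P = self.P + self.Q
        K = self.P @ np.linalg.inv(self.P + R)      # Kalman gain
        self.x = self.x + K @ (z - self.x)
        self.P = (np.eye(2) - K) @ self.P
        return self.x

    def fuse(self, ir_position, depth_position):
        # Sequentially incorporate both measurements for the current frame.
        self._update(ir_position, self.R_ir)
        return self._update(depth_position, self.R_depth)
```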
An updating step S105 updates the display content of the projection unit on the user operation interface based on the user operation data. In the invention, the user operation data is either user marking data or a user call to another function.
In one embodiment, the update is implemented as follows: the computing board sends the user operation data to the projection unit; after acquiring the user operation data, the projection unit determines its type; if the user operation data is user marking data, the corresponding mark is drawn directly on the projection content; if the user operation data is a call to another function, the application or function stored in the computing board is called to obtain the updated display content, which is then displayed on the user operation interface.
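A minimal sketch of this dispatch is shown below. The projector and compute_board interfaces and the dictionary layout of the fused user operation data are illustrative assumptions, not names taken from the patent.

```python
# Hedged sketch of the update step: draw a mark or invoke a stored function.
def update_display(projector, compute_board, user_operation):
    if user_operation["type"] == "mark":
        # Marking data: draw the stroke directly onto the projected content.
        projector.draw_mark(user_operation["points"])
    else:
        # Function call: run the application/function stored on the computing
        # board and project the display content it returns.
        content = compute_board.call_function(user_operation["function"],
                                              user_operation.get("args", {}))
        projector.show_interface(content)
```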
The operation panel (i.e. the user operation interface) of the invention offers various schemes and styles for the user to choose from, meeting the different interaction requirements of different types of users; once configured, the panel is loaded automatically. It also supports more flexible region selection (after a region is delimited with a brush or range-selection tool, the content in the region can be stored, recognized, transmitted and so on) and automatic playback of interaction trajectories: for later teaching or demonstration, once the operation flow has been stored in the computing board, the trajectory of the operation can first be displayed by projection and the corresponding operations executed at the appropriate times, realizing automatic display and execution. This is another important inventive point of the invention.
Fig. 2 shows a user interaction device of the present invention, the device comprising: the system comprises a projection unit, a signal emission unit, an infrared camera, a depth camera and a calculation board; the projection unit is configured to project a user operation interface on a plane, the signal emission unit is configured to generate an infrared grating parallel to the user interface, and the infrared grating is adjacent to the user operation interface; the infrared camera acquires first operation data of a user on a user interface through an infrared grating and sends the first operation data to the computing board; the depth camera acquires second operation data of a user on a user interface and sends the second operation data to the computing board; the computing board fuses the first operation data and the second operation data to obtain user operation data, and sends the user operation data to the projection unit; and after receiving the user operation data, the projection unit updates the display content of the projection unit on the user operation interface based on the user operation.
The device of the invention can be an intelligent desk lamp. The upper part of the desk lamp is provided with the projection unit (i.e. a projector), the infrared camera and the depth camera, and a computing board is arranged inside the desk lamp; the computing board has at least a processor and a memory and is used for completing data processing and the like. The signal emission unit is arranged at the bottom of the desk lamp. The projection unit projects the operation interface on a desktop, and the signal emission unit (for example an infrared laser) generates an infrared grating parallel to the user interface, the infrared grating being close to the user operation interface, where "close" generally means a distance of 1-2 mm.
The projection unit is configured to project a user operation interface on a plane, the signal emission unit is configured to generate an infrared grating parallel to the user interface, and the infrared grating is adjacent to the user operation interface, which can be realized by the following operations:
the first step is as follows: initializing the projector, focusing, performing trapezoidal correction, performing coincidence and calibration judgment of picture signals until the projection is clear, and displaying a loaded user operation interface.
The second step is that: an infrared laser located at the bottom end of the device emits infrared beams in a diffuse manner, each beam being at a prescribed distance of 1mm from the plane.
The third step: the infrared camera shoots the grating state and processes the grating state to obtain light spot information, if the light spot information is judged to be non-planar by the computing board, the projection content is updated to be in an error state, and a user is reminded to adjust the position until the position becomes a normal planar grating.
The fourth step: the projector acquires the setting of the current user from the computing board and projects a formal user operation interface according to the setting of the current user.
Through these operations, automatic correction of the projector is realized, and a user operation interface corresponding to the user's settings is projected, which facilitates user operation. Placing the signal emitting unit at the bottom of the intelligent desk lamp also solves the prior-art problem that the emitter must be placed on the interactive plane, which limits the form of the projector; as a result, occlusion in the horizontal direction can be handled and objects with height can also be handled. This is one of the important inventive points.
In one embodiment, the step of the infrared camera acquiring first operation data of a user on the user interface through the infrared grating and sending the first operation data to the computing board includes: when a user operates on the user operation interface with a hand, the infrared light emitted by the signal emitting unit is blocked by a finger and reflected to form a light spot; the infrared camera continuously photographs the grating state, obtains the information forming the light spot through filtering processing, and then transmits each frame of information to the computing board for storage and analysis; when the computing board judges that the current frame of light-spot information is a pressing event, it retrieves the stored light-spot blocking information of the previous N frames, and determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information.
Preferably, the formation of the light spot by the infrared light reflected after being blocked by the finger is as follows: the user presses, with a finger (or another occluder), the position at which interaction is needed; when the distance to the plane is less than 1 mm the infrared beam is blocked, and the blocked part acts as a reflecting surface that reflects the emitted infrared light to form the so-called light spot, whose position can be captured by the infrared camera. The infrared camera continuously photographs the grating state, and the information forming the light spot is obtained through filtering processing as follows: the infrared camera continuously photographs and records the infrared light distribution on the current plane; having acquired the distribution, it processes the captured image with several filtering algorithms to obtain the position and shape of the blocked part of the infrared light; the infrared camera then performs standardized adjustment on the obtained light-spot information and transmits it over a connected data cable to the computing board, where it is stored.
In one embodiment, the step of acquiring second operation data of the user on the user interface by the depth camera and sending the second operation data to the computing board includes: the depth camera shoots a scene of a user operation interface by using the binocular camera and sends the shot image to the computing board for storage, the computing board calculates depth information of each part in the scene of the user operation interface through the images shot by the two cameras, the computing board is used for obtaining the hand position of a user in the image, the computing board determines the motion of the hand of the user based on the depth information and the hand position, when the computing board judges that the motion of the hand of the user in the current frame image is the user operation pressing event, the computing board obtains the stored image of the first N frames of the current frame image from the computing board, and the computing board determines second operation data through the current frame image and the image of the first N frames and sends the second operation data to the computing board.
Specifically, the computing board determines the first operation data from the current frame of light-spot information and the previous N frames of light-spot information as follows; the flow is described for infrared information with a camera frame rate of 50 frames per second:
when the computing board judges that the current reflected-light position corresponds to a pressing behavior of the user, a duration must also be judged: the behavior is counted as a real pressing event only if it lasts for 100 ms (i.e. 5 frames), and the corresponding processing method is then called;
when a pressing behavior has so far been detected in only one frame, the computing board starts a query: it first obtains the user's behavior type in the previous frame, and if that frame is a pressing behavior at the same position, it continues to obtain the behavior type in the frame before that. When an illegal behavior is encountered (a pressing action at a different position, or a non-pressing action), special handling is applied: that frame is skipped and one more frame further back is read.
There are two cases at this point: 1. If that earlier frame is also illegal, the query terminates, the current frame is not counted as a real pressing event, and the multi-frame judgment ends; the computing board then waits for the user behavior of the next frame and judges again. 2. If that earlier frame is a pressing behavior at the same position, the illegal behavior encountered before it is marked as error data and treated as a pressing behavior at the same position. After this query and special handling, if the computing board finds that there are five consecutive frames of pressing at the same position, it regards the action as a real pressing event and the multi-frame judgment ends. The multi-frame judgment for the depth images is the same as the infrared multi-frame judgment described above and is not repeated here.
Preferably, the acquisition of the user's hand position in the image by the computing board is as follows: the depth camera shoots the scene with a binocular camera; a preliminary distance is obtained from the light reflection, and the detailed depth information of each part of the scene is calculated by further aggregating and processing the information from the two cameras, i.e. an image plus complete RGB-D information of the current scene is obtained; this information is preprocessed and slightly corrected using white balance and histogram equalization. After the overall depth information is obtained, the computing board processes the captured scene picture with a deployed mobile-ssd detection network to obtain the rough position of the user's hand; the hand position is then combined with the depth information, and the positions of the skeletal joint points are further predicted with a convolutional neural network with an hourglass structure, so that the current hand posture of the user is obtained, and the hand action is derived from it and stored.
In determining the hand action, the hourglass network is first used to generate a heat map for each hand joint point k (the heat map is a probability map with the same pixel layout as the image, but the value at each pixel is the probability that this pixel is the given joint; further joint information is analyzed from these probabilities). From the predicted heat map, the position P of hand joint point k in the image is then obtained (a further correction is applied to the predicted position to obtain more accurate position information); the corresponding formulas are given only as figures in the original document. The gesture is then classified: each gesture class specifies a position region for each joint point, and the current action is determined once every joint point lies in its corresponding region. This hand-action pipeline and its formulas are referred to as the advanced depth judgment model.
The computing board judges that the motion of the user's hand in the current frame image is a pressing event as follows: once the hand action has been analyzed, if the distance between the hand and the projection plane is judged to be less than 1 mm, the user action is judged to be pressing the plane; once a pressing event has been judged, in order to analyze the specific action of the user, the user action information of the previous frames is retrieved from storage and also used as source data for the next analysis.
In the invention, the action analysis is not performed on the current frame alone but jointly with the states of the previous frames; this analysis of the dynamics makes the judgment of the user action more accurate, so that more accurate control is realized. The invention obtains rich gesture actions from the acquired depth images through the advanced depth judgment model, and more and richer interaction methods can be realized based on the user's gestures, so that subsequent functions can be expanded. This is another important inventive point of the invention.
The computing board fuses the first operation data and the second operation data to obtain user operation data as follows: the first operation data and the second operation data are processed by a Kalman filtering method to obtain the user operation data.
Kalman filtering is used to fuse low-level, real-time, dynamic, redundant multi-sensor data; the statistical characteristics of the measurement model are used recursively to determine the statistically optimal fusion and data estimate. The process of fusing the first operation data and the second operation data is as follows:
the sensors on the device (i.e. the infrared camera and the depth camera) acquire the data of the infrared-observed target and the depth-observed target (i.e. the first operation data and the second operation data);
the computing board performs a feature-extraction transformation on the two output data streams (i.e. the first operation data and the second operation data, which may be discrete or continuous time-function data, output vectors, imaging data or a direct attribute description), extracting feature vectors Yi that represent the two data streams;
pattern recognition is performed on the feature vectors Yi to complete each sensor's description of the target; the sensors' descriptions are then grouped, i.e. associated by target (associating the first operation data with the second operation data); finally, the data of all sensors for the target are synthesized by a stochastic algorithm, the Kalman filtering method, to obtain a consistent interpretation and description of the target. In this way the gesture action is determined jointly from the depth camera and the infrared signals, further improving gesture recognition precision, which is another important inventive point of the invention.
In the invention, the user operation data is either user marking data or a user call to another function. In one embodiment, the update is implemented as follows: the computing board sends the user operation data to the projection unit; after acquiring the user operation data, the projection unit determines its type; if the user operation data is user marking data, the corresponding mark is drawn directly on the projection content; if the user operation data is a call to another function, the application or function stored in the computing board is called to obtain the updated display content, which is then displayed on the user operation interface.
The operation panel (i.e. the user operation interface) of the invention offers various schemes and styles for the user to choose from, meeting the different interaction requirements of different types of users; once configured, the panel is loaded automatically. It also supports more flexible region selection (after a region is delimited with a brush or range-selection tool, the content in the region can be stored, recognized, transmitted and so on) and automatic playback of interaction trajectories: for later teaching or demonstration, once the operation flow has been stored in the computing board, the trajectory of the operation can first be displayed by projection and the corresponding operations executed at the appropriate times, realizing automatic display and execution. This is another important inventive point of the invention.
The main technical effects of the invention are as follows: in the action analysis, not only the current frame but also the states of the previous frames are analyzed jointly, and this analysis of the dynamics makes the judgment of the user action more accurate, so that more accurate control is realized; rich gesture actions are obtained from the acquired depth images by the advanced depth judgment model, so that more and richer interaction methods can be realized based on the user's gestures and subsequent functions can be expanded; the gesture actions are determined jointly from the infrared signals and the depth camera, which further improves gesture recognition precision; in addition, automatic correction of the projector is realized.
For convenience of description, the above device is described as being divided into various units by function, each described separately. Of course, the functionality of these units may be implemented in one or more pieces of software and/or hardware when implementing the present application.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments of the present application.
Finally, it should be noted that although the present invention has been described in detail with reference to the above embodiments, those skilled in the art will understand that modifications and equivalents may be made without departing from the spirit and scope of the invention, which is defined by the appended claims.

Claims (10)

1. A method of user interaction, the method comprising:
the method comprises the following steps of initializing, projecting a user operation interface on a plane by using a projection unit, and generating an infrared grating parallel to the user interface through a signal emission unit, wherein the infrared grating is adjacent to the user operation interface;
a first acquisition step, namely acquiring first operation data of a user on a user interface through an infrared grating by using an infrared camera;
a second acquisition step of acquiring second operation data of the user on the user interface by using the depth camera;
a fusion step, fusing the first operation data and the second operation data to obtain user operation data;
and updating the display content of the projection unit on a user operation interface based on the user operation.
2. The method of claim 1, wherein the first obtaining step comprises:
when a user operates on a user operation interface through a hand, infrared light emitted by the signal emitting unit is shielded by fingers and then reflected to form light spots, the infrared camera continuously shoots a grating state, information forming the light spots is obtained through filtering processing, then each frame of information is transmitted to the computing board to be stored and analyzed, when the computing board judges that the current frame of light spot information is a pressing event, the computing board obtains the light spot shielding information of the first N frames of the stored current frame of light spot information from the computing board, and the computing board determines the first operation data through the current frame of light spot information and the light spot information of the first N frames.
3. The method of claim 2, wherein the second acquisition step comprises: the depth camera photographs the scene of the user operation interface with its binocular cameras and sends the captured images to the computing board for storage; the computing board calculates depth information for each part of the scene from the images captured by the two cameras, obtains the position of the user's hand in the image, and determines the action of the user's hand based on the depth information and the hand position; when the computing board judges that the hand action in the current frame image is a pressing event, it retrieves the stored images of the N frames preceding the current frame and determines the second operation data from the current frame image together with the images of the previous N frames.
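For illustration only (not part of the claim): a minimal sketch of deriving scene depth from the two camera images with OpenCV block matching; the focal length and baseline are hypothetical calibration values, and locating the hand in the image is omitted.

    import cv2
    import numpy as np

    def depth_map(left_gray: np.ndarray, right_gray: np.ndarray,
                  focal_px: float = 700.0, baseline_m: float = 0.06) -> np.ndarray:
        stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
        disparity = stereo.compute(left_gray, right_gray).astype(np.float32) / 16.0
        disparity[disparity <= 0] = np.nan         # reject invalid matches
        return focal_px * baseline_m / disparity   # depth = f * b / disparity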
4. The method according to claim 3, wherein fusing the first operation data and the second operation data to obtain the user operation data comprises: processing the first operation data and the second operation data with a Kalman filtering method to obtain the user operation data.
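For illustration only (not part of the claim): a minimal sketch of a scalar Kalman update that fuses an infrared-derived and a depth-derived press coordinate; the measurement variances r_ir and r_depth and the process noise q are assumed values, not specified by the claims.

    def kalman_fuse(x, p, z_ir, z_depth, r_ir=4.0, r_depth=9.0, q=1.0):
        # x, p: fused coordinate and its variance from the previous frame.
        p = p + q                          # predict: add process noise
        for z, r in ((z_ir, r_ir), (z_depth, r_depth)):
            k = p / (p + r)                # Kalman gain for this measurement
            x = x + k * (z - x)            # correct toward the measurement
            p = (1.0 - k) * p              # variance shrinks after the update
        return x, p

One such update per frame, applied to each coordinate axis, weights the two sensors by their assumed noise, which is one common way to realise the claimed fusion.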
5. The method according to any one of claims 1-4, wherein the user operation data is user marking data or a user invocation of another function.
6. The method according to claim 2, wherein determining, by the computing board, the first operation data from the current frame of light spot information and the light spot information of the previous N frames comprises: the computing board determines the user's finger action from the current frame of light spot information and the previous N frames of light spot information to obtain the user's hand trajectory information; it then acquires the projection content currently displayed by the projection unit on the user operation interface and, based on the trajectory information, judges the function associated with the pressing position to determine the first operation data, wherein the pressing position is the light spot center point calculated as the mean of the spot coordinates.
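For illustration only (not part of the claim): a minimal sketch, assuming NumPy, of computing the pressing position as the mean of the spot pixel coordinates and hit-testing it against the currently projected controls; the controls dictionary and its rectangles are hypothetical.

    import numpy as np

    def press_position(spot_pixels: np.ndarray) -> tuple:
        # spot_pixels: array of shape (K, 2) holding (x, y) spot coordinates.
        return tuple(spot_pixels.mean(axis=0))    # centre point via the mean

    def function_at(pos, controls):
        # controls: {name: (x0, y0, x1, y1)} rectangles in interface coordinates.
        x, y = pos
        for name, (x0, y0, x1, y1) in controls.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                return name
        return None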
7. The method of claim 3, wherein determining, by the computing board, the second operation data from the current frame image and the images of the previous N frames comprises: the computing board obtains the user's hand trajectory information from the user's hand motions in the current frame image and the previous N frames of images; it then acquires the projection content currently displayed by the projection unit on the user operation interface and, based on the trajectory information, judges the function associated with the pressing position to determine the second operation data, wherein the pressing position is the fingertip position.
8. The method of claim 5, wherein the updating step comprises: the computing board sends the user operation data to the projection unit; after receiving the user operation data, the projection unit determines its type; if the user operation data is user marking data, the corresponding mark is drawn directly on the projected content; and if the user operation data is a user invocation of another function, the application or function stored on the computing board is called to obtain the updated display content for the user operation interface, and that content is displayed on the user operation interface.
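For illustration only (not part of the claim): a minimal sketch of the updating step as a dispatch on the type of the user operation data; the data classes and the draw_mark/invoke/show calls are hypothetical placeholders.

    from dataclasses import dataclass

    @dataclass
    class MarkData:
        points: list          # stroke to draw on the projected content

    @dataclass
    class FunctionCall:
        name: str             # application or function stored on the board

    def apply_update(op, projector, board):
        if isinstance(op, MarkData):
            projector.draw_mark(op.points)      # draw the mark directly
        elif isinstance(op, FunctionCall):
            content = board.invoke(op.name)     # run the stored function
            projector.show(content)             # display the updated content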
9. A user interaction device, the device comprising: a projection unit, a signal emission unit, an infrared camera, a depth camera, and a computing board;
wherein the projection unit is configured to project a user operation interface onto a plane, the signal emission unit is configured to generate an infrared grating parallel to the user operation interface, and the infrared grating is adjacent to the user operation interface;
the infrared camera acquires first operation data of a user on the user operation interface through the infrared grating and sends the first operation data to the computing board;
the depth camera acquires second operation data of the user on the user operation interface and sends the second operation data to the computing board;
the computing board fuses the first operation data and the second operation data to obtain user operation data, and sends the user operation data to the projection unit;
and after receiving the user operation data, the projection unit updates its display content on the user operation interface based on the user operation data.
10. The device of claim 9, wherein the infrared camera acquiring the first operation data of the user on the user operation interface through the infrared grating and sending the first operation data to the computing board comprises: when the user operates on the user operation interface with a hand, infrared light emitted by the signal emission unit is blocked by a finger and reflected to form a light spot; the infrared camera continuously captures the state of the grating, obtains the information forming the light spot through filtering, and transmits each frame of information to the computing board for storage and analysis; when the computing board judges from the current frame of light spot information that a pressing event has occurred, it retrieves the stored light spot occlusion information of the N frames preceding the current frame and determines the first operation data from the current frame of light spot information together with the light spot information of the previous N frames.
CN202010370009.7A 2020-05-06 2020-05-06 User interaction method and device Active CN111258411B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010370009.7A CN111258411B (en) 2020-05-06 2020-05-06 User interaction method and device


Publications (2)

Publication Number Publication Date
CN111258411A true CN111258411A (en) 2020-06-09
CN111258411B CN111258411B (en) 2020-08-14

Family

ID=70950027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010370009.7A Active CN111258411B (en) 2020-05-06 2020-05-06 User interaction method and device

Country Status (1)

Country Link
CN (1) CN111258411B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106204604A (en) * 2016-04-29 2016-12-07 北京仁光科技有限公司 Projection touch control display apparatus and exchange method thereof
CN108537827A (en) * 2018-03-23 2018-09-14 上海数迹智能科技有限公司 A kind of real-time low complex degree finger motion locus shape recognition algorithm based on depth map
CN110221732A (en) * 2019-05-15 2019-09-10 青岛小鸟看看科技有限公司 A kind of touch control projection system and touch action recognition methods
CN110310336A (en) * 2019-06-10 2019-10-08 青岛小鸟看看科技有限公司 A kind of touch control projection system and image processing method
CN110308817A (en) * 2019-06-10 2019-10-08 青岛小鸟看看科技有限公司 A kind of touch action recognition methods and touch control projection system

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112558818A (en) * 2021-02-19 2021-03-26 北京深光科技有限公司 Projection-based remote live broadcast interaction method and system
CN112558818B (en) * 2021-02-19 2021-06-08 北京深光科技有限公司 Projection-based remote live broadcast interaction method and system
WO2022174706A1 (en) * 2021-02-19 2022-08-25 北京深光科技有限公司 Remote live streaming interaction method and system based on projection
CN114138121A (en) * 2022-02-07 2022-03-04 北京深光科技有限公司 User gesture recognition method, device and system, storage medium and computing equipment
CN114138121B (en) * 2022-02-07 2022-04-22 北京深光科技有限公司 User gesture recognition method, device and system, storage medium and computing equipment
CN114245093A (en) * 2022-02-25 2022-03-25 北京深光科技有限公司 Projection operation method based on infrared and thermal sensing, electronic device and storage medium
CN114721552A (en) * 2022-05-23 2022-07-08 北京深光科技有限公司 Touch identification method, device, equipment and medium based on infrared and visible light
CN114721552B (en) * 2022-05-23 2022-08-23 北京深光科技有限公司 Touch identification method, device, equipment and medium based on infrared and visible light
CN117075730A (en) * 2023-08-18 2023-11-17 广东早安文化发展有限公司 3D virtual exhibition hall control system based on image recognition technology
CN117075730B (en) * 2023-08-18 2024-04-30 广东早安文化发展有限公司 3D virtual exhibition hall control system based on image recognition technology

Also Published As

Publication number Publication date
CN111258411B (en) 2020-08-14

Similar Documents

Publication Publication Date Title
CN111258411B (en) User interaction method and device
US7161596B2 (en) Display location calculation means
US7256772B2 (en) Auto-aligning touch system and method
US8602887B2 (en) Synthesis of information from multiple audiovisual sources
US6624833B1 (en) Gesture-based input interface system with shadow detection
EP0961965B1 (en) Method and system for gesture based option selection
US8854433B1 (en) Method and system enabling natural user interface gestures with an electronic system
CN104364733A (en) Position-of-interest detection device, position-of-interest detection method, and position-of-interest detection program
KR20020086931A (en) Single camera system for gesture-based input and target indication
JP2005353071A (en) Pointing input system and method using array sensors
JP2014170511A (en) System, image projection device, information processing device, information processing method, and program
CN114138121B (en) User gesture recognition method, device and system, storage medium and computing equipment
CN111078018A (en) Touch control method of display, terminal device and storage medium
US20160259402A1 (en) Contact detection apparatus, projector apparatus, electronic board apparatus, digital signage apparatus, projector system, and contact detection method
WO2021040896A1 (en) Automatically generating an animatable object from various types of user input
CN111258410B (en) Man-machine interaction equipment
CN113760131B (en) Projection touch processing method and device and computer readable storage medium
EP3910451A1 (en) Display systems and methods for aligning different tracking means
JP2017219942A (en) Contact detection device, projector device, electronic blackboard system, digital signage device, projector device, contact detection method, program and recording medium
JP4296607B2 (en) Information input / output device and information input / output method
CN110213407B (en) Electronic device, operation method thereof and computer storage medium
CN114167997B (en) Model display method, device, equipment and storage medium
CN211827195U (en) Interactive device
CN114721552B (en) Touch identification method, device, equipment and medium based on infrared and visible light
JP2002107127A (en) Shape-measuring apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant