CN117631825A - Human-computer interaction device and method of use thereof - Google Patents

Human-computer interaction device and method of use thereof

Info

Publication number: CN117631825A
Application number: CN202311374551.XA
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: user, detected, positive, module, target image
Legal status: Pending (the listed status is an assumption, not a legal conclusion)
Inventors: 方兆忠, 闫斌, 吴志宏, 郭韩健, 朱旭锋, 陆地
Current and original assignee: Zhejiang Communications Services Co Ltd
Priority and filing date: 2023-10-23
Publication date: 2024-03-01


Abstract

The invention discloses a human-computer interaction device and a method of using it. Interaction is performed by combining multiple motion detection modes, which reduces the ambiguity of recognizing interaction operations and improves their accuracy without requiring an additional input device such as a touch screen. For example, enlarging and reducing a display target can be achieved without a touch-screen input device. The motion detection capabilities of computer vision technology are thus fully exploited, giving the user a better interactive experience.

Description

Human-computer interaction device and method of use thereof
Technical Field
The invention relates to the technical field of human-computer interaction, and in particular to a human-computer interaction device and a method of using it.
Background
Human-computer interaction based on computer vision can capture user input visually through various image acquisition and processing methods. This interaction mode has become a hot topic in next-generation human-computer interaction technology and is widely applied in leisure and entertainment. In this mode, the user interacts with the computer through body posture, head pose, gaze, or body movements, freeing the user from traditional input devices such as the keyboard and mouse and providing an unprecedented interaction experience.
Various computer-vision-based human-computer interaction approaches have been proposed. In one existing approach, 3D objects can be generated, modified, and manipulated using touch input together with three-dimensional (3D) gesture input. In another, users interact with a virtual user interface through gesture detection.
However, existing human-computer interaction devices and methods rely on relatively few types of motion detection, typically require a touch-based input means, and require the user to memorize a large number of prescribed actions. Because of the limits of gesture, posture, and depth sensing ranges, preprocessing or manual setup is often needed; for example, the sensors must be calibrated and the interaction space predefined. This is inconvenient for the user. There is therefore a need for a human-computer interaction method that combines multiple motion detection modes and does not depend on additional input devices.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a human-computer interaction device and a method of using it in which the classifier is corrected, and the positive and negative samples of the target image are supplemented, using images to be detected, so that the stability of interaction is continuously strengthened during use and a more effective human-computer interaction function is achieved.
The technical aim of the invention is achieved by the following technical scheme: a human-computer interaction device comprising a motion detection module for detecting multiple types of user actions and gestures from image data;
an interaction determination module for determining, from the actions and gestures detected by the motion detection module, the interaction operation the user intends to perform, and sending a corresponding display operation instruction to the display control module;
a training module for responding to a learning instruction by acquiring a target image specified by the user, collecting positive and negative samples, and training a classifier comprising a plurality of decision trees based on a random forest;
a storage module for storing the classifiers comprising a plurality of decision trees and for storing the positive and negative samples to form a positive and negative sample set;
a display control module for controlling the display device to present the corresponding interaction operation on the display screen according to the instruction determined by the interaction determination module;
a comparison module for computing the correlation between an image to be detected and the positive and negative sample set, and for judging whether the image to be detected is the target image according to this correlation and a preset first correlation threshold;
a judgment module for finally determining that the image to be detected is the target image when it is judged to be the same as the target image under both the judgment threshold and the first correlation threshold;
and an update module for adjusting the parameters of the decision-tree classifiers using images finally determined to be the same as the target image, for adding an image to the sample set as a positive sample when its correlation satisfies a preset second correlation threshold, and for adding an image to the sample set as a negative sample when the probability that it differs from the target image reaches a preset correction threshold (a minimal wiring of these modules is sketched after this list).
The invention further provides that, when collecting positive and negative samples, the training module rotates, projects, scales, or translates the target image and collects each transformed result as a positive sample.
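The following minimal Python skeleton suggests one way these modules might be wired together; every class and method name is an assumption made for this sketch, not a definition from the patent:

```python
class InteractionDevice:
    """Assumed wiring of the modules named above (illustrative only)."""

    def __init__(self, motion, resolver, trainer, store, display):
        self.motion = motion        # motion detection module
        self.resolver = resolver    # interaction determination module
        self.trainer = trainer      # training module
        self.store = store          # storage module
        self.display = display      # display control module

    def process_frame(self, frame):
        """One pass of the interaction loop for a camera frame."""
        detections = self.motion.detect(frame)         # actions, gestures, gaze
        operation = self.resolver.resolve(detections)  # intended interaction
        if operation is not None:
            self.display.execute(operation)            # show it on the screen

    def learn(self, target_image):
        """Learning mode: build and store a classifier for a user-defined target."""
        pos, neg = self.trainer.collect_samples(target_image)
        forest = self.trainer.train(pos, neg)          # random-forest classifier
        self.store.save(forest, pos, neg)              # positive/negative sample set
```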
The technical aim of the invention is also achieved by the following technical scheme: a method of using the human-computer interaction device, comprising the following steps. Step S1, determining the interaction: detecting multiple types of user actions and gestures from image data;
determining the interaction operation to be performed from the detected actions and gestures, and issuing the display operation instruction corresponding to that operation;
controlling the display device to present the corresponding interaction operation on the display screen according to the determined instruction.
Step S2, detecting the user's actions and gestures: detecting the user's gaze direction from the image data, and tracking and recognizing the gesture actions of parts of the user's body.
Step S3: receiving a learning instruction input by the user and entering learning mode.
Step S4: acquiring the target image specified by the user, collecting positive and negative samples, and training a classifier comprising a plurality of decision trees based on a random forest.
Step S5: storing the classifiers comprising the plurality of decision trees, and storing the positive and negative samples to form a positive and negative sample set.
In summary, the invention has the following beneficial effects:
the invention can accept a user-defined target image as the recognition object, giving the user great flexibility. Because the classifier is corrected and the positive and negative samples of the target image are supplemented using images to be detected, the stability of human-computer interaction is continuously strengthened during use, achieving a more effective interaction function.
Drawings
FIG. 1 is a flowchart of the steps of an action-writing method based on the human-computer interaction method;
FIG. 2 is a flowchart of a human-computer interaction method according to an embodiment of the invention.
Detailed Description
In order to better understand the technical solutions of the present invention, a detailed description is given below with reference to the accompanying drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features in those embodiments may be combined with each other.
In the description of the present invention, it should be noted that orientation or positional relationships indicated by terms such as "upper", "lower", "inner", "outer", and "top/bottom" are based on the orientations or positional relationships shown in the drawings, are used merely for convenience and simplicity of description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation or be constructed and operated in a specific orientation; they should therefore not be construed as limiting the invention.
In the description of the present invention, it should also be noted that, unless otherwise explicitly specified and limited, the terms "mounted", "provided", "sleeved", and "connected" are to be construed broadly: a connection may be fixed, detachable, or integral; mechanical or electrical; direct, indirect through an intermediate medium, or internal communication between two elements. The specific meanings of these terms in the present invention can be understood by those of ordinary skill in the art according to the specific circumstances.
The present invention will be described in detail below with reference to the accompanying drawings.
A human-computer interaction device, as shown in FIGS. 1-2, comprises a motion detection module for detecting multiple types of user actions and gestures from image data;
an interaction determination module for determining, from the actions and gestures detected by the motion detection module, the interaction operation the user intends to perform, and sending a corresponding display operation instruction to the display control module;
a training module for responding to a learning instruction by acquiring a target image specified by the user, collecting positive and negative samples, and training a classifier comprising a plurality of decision trees based on a random forest;
a storage module for storing the classifiers comprising a plurality of decision trees and for storing the positive and negative samples to form a positive and negative sample set;
a display control module for controlling the display device to present the corresponding interaction operation on the display screen according to the instruction determined by the interaction determination module;
a comparison module for computing the correlation between an image to be detected and the positive and negative sample set, and for judging whether the image to be detected is the target image according to this correlation and a preset first correlation threshold;
a judgment module for finally determining that the image to be detected is the target image when it is judged to be the same as the target image under both the judgment threshold and the first correlation threshold;
and an update module for adjusting the parameters of the decision-tree classifiers using images finally determined to be the same as the target image, for adding an image to the sample set as a positive sample when its correlation satisfies a preset second correlation threshold, and for adding an image to the sample set as a negative sample when the probability that it differs from the target image reaches a preset correction threshold (the compare/judge/update flow is sketched after this list).
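The interplay of the comparison, judgment, and update modules can be summarized in the following minimal Python sketch. The correlation measure (normalized cross-correlation against the stored positives), the use of a fitted scikit-learn random forest for the tree vote, the refit standing in for "adjusting classifier parameters", and all threshold values are assumptions for illustration:

```python
import numpy as np

def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized cross-correlation between two equal-sized grayscale patches."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float((a * b).mean())

def compare_judge_update(image, forest, positives, negatives,
                         first_corr=0.5,          # first correlation threshold (assumed)
                         judge_thresh=0.6,        # judgment threshold on the tree vote (assumed)
                         second_corr=0.8,         # second correlation threshold (assumed)
                         correction_thresh=0.9):  # correction threshold (assumed)
    """One pass of the comparison, judgment, and update modules for a new
    image to be detected. `forest` is assumed to be a fitted
    sklearn.ensemble.RandomForestClassifier."""
    # Comparison module: correlation against the stored positive samples.
    corr = max((ncc(image, p) for p in positives), default=0.0)

    # Similarity probability voted by the decision trees (class 1 = target).
    # The flattened patch stands in for whatever descriptor was used at training.
    p_target = forest.predict_proba(image.reshape(1, -1))[0, 1]

    # Judgment module: final target decision requires both thresholds to hold.
    is_target = corr >= first_corr and p_target >= judge_thresh

    # Update module: supplement the positive and negative sample set.
    if is_target and corr >= second_corr:
        positives.append(image)
    elif not is_target and (1.0 - p_target) >= correction_thresh:
        negatives.append(image)

    # Refit on the enlarged sample set as a simple stand-in for
    # "adjusting the parameters of the decision-tree classifiers".
    X = np.array([s.ravel() for s in positives + negatives])
    y = np.array([1] * len(positives) + [0] * len(negatives))
    if len(set(y)) == 2:
        forest.fit(X, y)
    return is_target
```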
To collect positive and negative samples, the training module rotates, projects, scales, or translates the target image and collects each transformed result as a positive sample; a sketch of this augmentation follows.
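A sketch of that augmentation using OpenCV warps; the rotation angles, scales, shifts, and the mild perspective distortion standing in for "projection" are assumed parameter choices:

```python
import cv2
import numpy as np

def augment_positives(target: np.ndarray) -> list:
    """Generate extra positive samples by rotating, projecting (perspective
    warping), scaling, and translating the user-specified target image."""
    h, w = target.shape[:2]
    samples = [target]

    # Rotations about the image center (angles assumed).
    for angle in (-15, -5, 5, 15):
        M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        samples.append(cv2.warpAffine(target, M, (w, h)))

    # Scaling about the center (factors assumed).
    for scale in (0.9, 1.1):
        M = cv2.getRotationMatrix2D((w / 2, h / 2), 0, scale)
        samples.append(cv2.warpAffine(target, M, (w, h)))

    # Translations (offsets in pixels, assumed).
    for dx, dy in ((5, 0), (-5, 0), (0, 5), (0, -5)):
        M = np.float32([[1, 0, dx], [0, 1, dy]])
        samples.append(cv2.warpAffine(target, M, (w, h)))

    # A mild perspective distortion as the "projection" transform.
    src = np.float32([[0, 0], [w, 0], [0, h], [w, h]])
    dst = np.float32([[w * 0.05, 0], [w * 0.95, 0], [0, h], [w, h]])
    P = cv2.getPerspectiveTransform(src, dst)
    samples.append(cv2.warpPerspective(target, P, (w, h)))

    return samples
```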
The method of using the human-computer interaction device comprises the following steps. Step S1, determining the interaction: detecting multiple types of user actions and gestures from image data;
determining the interaction operation to be performed from the detected actions and gestures, and issuing the display operation instruction corresponding to that operation;
controlling the display device to present the corresponding interaction operation on the display screen according to the determined instruction.
Step S2, detecting the user's actions and gestures: detecting the user's gaze direction from the image data, and tracking and recognizing the gesture actions of parts of the user's body.
Step S3: receiving a learning instruction input by the user and entering learning mode.
Step S4: acquiring the target image specified by the user, collecting positive and negative samples, and training a classifier comprising a plurality of decision trees based on a random forest.
Step S5: storing the classifiers comprising the plurality of decision trees, and storing the positive and negative samples to form a positive and negative sample set.
The target image specified by the user is thus a user-defined target image. Suppose a user wishes to perform human-computer interaction with the palm: in learning mode, a palm image can be provided as the target image through the camera. To make the collection of positive samples more comprehensive, the training module (unit 402) rotates, projects, scales, and translates the target image to obtain richer positive samples. The decision-tree classifier first performs a point-feature description of the target image and then builds classifiers for image recognition from a preset number of decision trees (such as 10), which together compute the similarity probability, as sketched below.
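A minimal sketch of this training step, assuming fern/BRIEF-style binary pixel-pair comparisons as the point-feature description and scikit-learn's RandomForestClassifier as the ensemble of decision trees; the patch size, feature count, and all names here are illustrative, not taken from the patent:

```python
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
PATCH = 32                    # patches normalized to 32x32 (assumed size)
N_PAIRS = 128                 # number of point-pair features (assumed)
PAIRS = rng.integers(0, PATCH * PATCH, size=(N_PAIRS, 2))

def point_features(patch: np.ndarray) -> np.ndarray:
    """Point-feature description of a grayscale patch: binary brightness
    comparisons between fixed random pixel pairs."""
    flat = cv2.resize(patch, (PATCH, PATCH)).ravel()
    return (flat[PAIRS[:, 0]] > flat[PAIRS[:, 1]]).astype(np.uint8)

def train_forest(positives, negatives) -> RandomForestClassifier:
    """Build the image-recognition classifier from a preset number of
    decision trees (10, matching the example in the text)."""
    X = np.array([point_features(p) for p in positives + negatives])
    y = np.array([1] * len(positives) + [0] * len(negatives))
    forest = RandomForestClassifier(n_estimators=10)
    forest.fit(X, y)
    return forest

# Similarity probability of a new patch = fraction of trees voting "target":
# p = forest.predict_proba(point_features(patch).reshape(1, -1))[0, 1]
```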
The device can accept a user-defined target image as the recognition object, giving the user great flexibility. Because the classifier is corrected and the positive and negative samples of the target image are supplemented using images to be detected, the stability of human-computer interaction is continuously strengthened during use, achieving a more effective interaction function.
The human-computer interaction processing device analyzes the image data acquired by the image acquisition device to recognize and interpret the user's gestures and actions. The processing device then controls the display device to display content according to the analysis result. The display device may be, for example, a television (TV) or a projector.
Here, the processing device may determine the interaction operation the user intends to perform from the detected actions and gestures. For example, the user may point a finger at a specific object (OBJ2) among several objects (e.g., OBJ1, OBJ2, and OBJ3) displayed by the display device while looking at that object, thereby starting the interaction; a sketch of such two-cue selection follows. That is, the processing device may detect the user's gaze direction, gestures, and the actions and postures of various body parts. The user may also manipulate a displayed object by moving a finger, for example changing its display position, or may move a body part (e.g., an arm) or the whole body for interactive input. Although the image acquisition device, the processing device, and the display device are shown as separate devices, they may be combined arbitrarily into one or two devices; for example, the image acquisition device and the processing device may be implemented in a single device.
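As an illustration of combining two cues, the following sketch selects a displayed object only when an estimated on-screen gaze point and a fingertip position agree; the coordinates, radius, and function name are assumptions:

```python
import numpy as np

def select_object(gaze_point, finger_point, objects, radius=40):
    """Pick the displayed object that both the estimated gaze point and the
    fingertip point fall on (screen coordinates in pixels)."""
    gaze = np.asarray(gaze_point, dtype=float)
    finger = np.asarray(finger_point, dtype=float)
    for name, center in objects.items():
        c = np.asarray(center, dtype=float)
        # Requiring agreement of two cues reduces recognition ambiguity.
        if np.linalg.norm(gaze - c) < radius and np.linalg.norm(finger - c) < radius:
            return name
    return None

# Example: both cues land near OBJ2, so OBJ2 is selected.
# select_object((310, 205), (318, 198),
#               {"OBJ1": (100, 200), "OBJ2": (320, 200), "OBJ3": (540, 200)})
# -> "OBJ2"
```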
Interaction can thus be performed by combining multiple motion detection modes, which reduces the ambiguity of recognizing interaction operations and improves their accuracy without an additional input device (such as a touch screen). For example, enlarging and reducing a display target can be achieved without a touch-screen input device, as sketched below. The motion detection capabilities of computer vision technology are thereby fully exploited, giving the user a better interactive experience.
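For instance, touch-free enlarging and reducing could be driven by the distance between two tracked fingertips; this sketch and its calibration constant are assumptions, not the patent's specified method:

```python
import numpy as np

def zoom_factor(thumb_tip, index_tip, ref_distance=120.0):
    """Map the thumb-to-index fingertip distance (pixels) to a display scale
    factor, giving a touch-free enlarge/reduce gesture. `ref_distance` is an
    assumed calibration constant for the neutral hand pose."""
    d = float(np.linalg.norm(np.asarray(thumb_tip, float) - np.asarray(index_tip, float)))
    return d / ref_distance  # > 1 enlarges the display target, < 1 reduces it
```

A factor above 1 enlarges the displayed target and below 1 shrinks it, replacing the pinch gesture of a touch screen.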
The above describes only a preferred embodiment of the present invention, and the scope of protection is not limited to the above examples; all technical solutions falling under the concept of the invention belong to its scope of protection. Modifications and adaptations made by those skilled in the art without departing from the principles of the invention should also be regarded as within its scope of protection.

Claims (3)

1. A human-computer interaction device, comprising: a motion detection module for detecting multiple types of user actions and gestures from image data;
an interaction determination module for determining, from the actions and gestures detected by the motion detection module, the interaction operation the user intends to perform, and sending a corresponding display operation instruction to the display control module;
a training module for responding to a learning instruction by acquiring a target image specified by the user, collecting positive and negative samples, and training a classifier comprising a plurality of decision trees based on a random forest;
a storage module for storing the classifiers comprising a plurality of decision trees and for storing the positive and negative samples to form a positive and negative sample set;
a display control module for controlling the display device to present the corresponding interaction operation on the display screen according to the instruction determined by the interaction determination module;
a comparison module for computing the correlation between an image to be detected and the positive and negative sample set, and for judging whether the image to be detected is the target image according to this correlation and a preset first correlation threshold;
a judgment module for finally determining that the image to be detected is the target image when it is judged to be the same as the target image under both the judgment threshold and the first correlation threshold;
and an update module for adjusting the parameters of the decision-tree classifiers using images finally determined to be the same as the target image, for adding an image to the sample set as a positive sample when the correlation of an image finally determined to be the same as the target image satisfies a preset second correlation threshold, and for adding an image to the sample set as a negative sample when the probability that the image differs from the target image reaches a preset correction threshold.
2. The human-computer interaction device according to claim 1, wherein the positive and negative samples collected by the training module include samples obtained by rotating, projecting, scaling, or translating the target image, each transformed result being collected as a positive sample.
3. A method of using the human-computer interaction device according to any one of claims 1-2, comprising the following steps. Step S1, determining the interaction: detecting multiple types of user actions and gestures from the image data;
determining the interaction operation to be performed from the detected actions and gestures, and issuing the display operation instruction corresponding to that operation;
controlling the display device to present the corresponding interaction operation on the display screen according to the determined instruction.
Step S2, detecting the user's actions and gestures: detecting the user's gaze direction from the image data, and tracking and recognizing the gesture actions of parts of the user's body.
Step S3: receiving a learning instruction input by the user and entering learning mode.
Step S4: acquiring the target image specified by the user, collecting positive and negative samples, and training a classifier comprising a plurality of decision trees based on a random forest.
Step S5: storing the classifiers comprising the plurality of decision trees, and storing the positive and negative samples to form the positive and negative sample set.
Priority Applications (1)

CN202311374551.XA, priority and filing date 2023-10-23: Human-computer interaction device and method of use thereof (status: Pending)

Publications (1)

CN117631825A, published 2024-03-01

Family ID: 90031073

Country status: CN - CN117631825A


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination