CN108829233B - Interaction method and device - Google Patents

Interaction method and device

Info

Publication number
CN108829233B
CN108829233B (application number CN201810387822.8A)
Authority
CN
China
Prior art keywords
target person
key point
image
human skeleton
skeleton key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810387822.8A
Other languages
Chinese (zh)
Other versions
CN108829233A (en)
Inventor
陈圆
黄亮
彭中兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHENZHEN TONGWEI COMMUNICATION TECHNOLOGY Co.,Ltd.
Original Assignee
Shenzhen Tongwei Communication Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Tongwei Communication Technology Co ltd filed Critical Shenzhen Tongwei Communication Technology Co ltd
Priority to CN201810387822.8A
Publication of CN108829233A
Application granted
Publication of CN108829233B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20: Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • General Engineering & Computer Science (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of human-computer interaction and discloses an interaction method and device. The method includes the following steps: identifying the human skeleton key point coordinate set of all people in an image captured by a camera; acquiring the coordinate data of a target person from the human skeleton key point coordinate set; and tracking the target person and interacting according to the target person's real-time coordinate data. The deep-learning-based convolutional neural network algorithm improves the speed and accuracy of human-computer interaction and automatically tracks the target person; only an ordinary camera is needed to capture images, so the cost is low and compatibility is high.

Description

Interaction method and device
Technical Field
The invention relates to the technical field of human-computer interaction, in particular to an interaction method and device.
Background
A motion sensing game is a novel type of game in which the player interacts with a smart device through body movements. Compared with traditional games that rely on keys or touch for interaction, motion sensing games increase player engagement and are increasingly accepted by players and the market.
With the rapid development of artificial intelligence, new technologies and applications emerge constantly. Deep learning, a popular research direction in artificial intelligence, enables machines to mimic the learning mechanisms of the human brain when processing data such as images and sound; for images in particular, algorithms based on deep learning clearly outperform traditional image processing algorithms.
The core technology of a motion sensing game is how the computer acquires the player's body motion information. There are currently two main implementations. One is Microsoft's Kinect camera, which captures the player's body motion information directly; it recognizes motion well, but the hardware is expensive and the equipment configuration is complex. The other captures images of the player with an ordinary camera and then extracts the player's motion information with a deep-learning body motion recognition algorithm; the difference between the two is that an ordinary camera can only capture two-dimensional images, whereas a Kinect camera captures three-dimensional images. This second approach has low hardware cost, but the player's experience is poor and easily disturbed by the environment: for example, after the player's body is briefly occluded, the motion sensing game cannot re-identify the player's body motion information when the player reappears in the image.
Disclosure of Invention
The invention mainly aims to provide an interaction method and an interaction device that improve the speed and accuracy of human-computer interaction and automatically track a target person through a deep-learning-based convolutional neural network algorithm, while requiring only an ordinary camera to capture images, so that the cost is low and compatibility is high.
In order to achieve the above object, an interaction method provided by the present invention includes:
identifying a human skeleton key point coordinate set of all people in the image collected by the camera;
acquiring coordinate data of a target person from the human skeleton key point coordinate set;
and tracking the target person, and performing interaction according to the real-time coordinate data of the target person.
Optionally, identifying the human skeleton key point coordinate set of all people in the image captured by the camera includes:
acquiring an image acquired by a camera;
and identifying a human skeleton key point coordinate set of all people in the image by using a human skeleton key point identification algorithm.
Optionally, tracking the target person and interacting according to the real-time coordinate data of the target person includes:
acquiring an image of the rectangular area where the target person is located, and determining an initial tracking area;
estimating, through a target tracking algorithm, the rectangular area of the target person in the image captured by the camera at a given moment;
calculating the real-time coordinate data of the target person through the human skeleton key point identification algorithm;
calculating the intersection-over-union (IoU) of the rectangular area corresponding to the real-time coordinate data and the estimated rectangular area;
judging whether the IoU is greater than a preset threshold, and if so, using the real-time coordinate data as control data for interaction between the target person and the system;
otherwise, recording the action of the target person at that moment as an abnormal action, and accumulating the abnormal count.
Optionally, after recording the action of the target person at that moment as an abnormal action and accumulating the abnormal count, the method further includes:
judging whether the abnormal count is greater than a preset count threshold;
if so, acquiring the human skeleton key point coordinate set of all people in the image currently captured by the camera, and calculating the similarity between each person's rectangular area and the initial tracking area;
and taking the person with the highest similarity as the target person.
Optionally, the target person is a person who completes a preset action in the image.
As another aspect of the present invention, there is provided an interactive apparatus, including:
the identification module is used for identifying a human skeleton key point coordinate set of all people in the image acquired by the camera;
the interaction module is used for acquiring coordinate data of a target person from the human skeleton key point coordinate set;
and the tracking module is used for tracking the target person and carrying out interaction according to the real-time coordinate data of the target person.
Optionally, the identification module comprises:
the image acquisition unit is used for acquiring images acquired by the camera;
and the coordinate calculation unit is used for identifying the human skeleton key point coordinate set of all people in the image by using a human skeleton key point identification algorithm.
Optionally, the tracking module comprises:
the initial unit is used for acquiring an image of a rectangular area where a target person is located and determining an initial tracking area;
the estimation unit is used for estimating a rectangular area of the target person in the image acquired by the camera at a certain moment through a target tracking algorithm;
the real-time coordinate calculation unit is used for calculating real-time coordinate data of the target person through the human skeleton key point identification algorithm;
the IoU calculation unit, used for calculating the intersection-over-union (IoU) of the rectangular area corresponding to the real-time coordinate data and the estimated rectangular area;
the first judgment unit, used for judging whether the IoU is greater than a preset threshold, and if so, using the real-time coordinate data as control data for interaction between the target person and the system; otherwise, recording the action of the target person at that moment as an abnormal action and accumulating the abnormal count.
Optionally, the tracking module further comprises:
the second judgment unit, used for judging whether the abnormal count is greater than a preset count threshold;
the similarity calculation unit, used for, when the abnormal count is greater than the preset count threshold, acquiring the human skeleton key point coordinate set of all people in the image currently captured by the camera and calculating the similarity between each person's rectangular area and the initial tracking area;
and the target re-acquisition unit, used for taking the person with the highest similarity as the target person.
Optionally, the target person is a person who completes a preset action in the image.
The invention provides an interaction method and an interaction device. The method includes: identifying the human skeleton key point coordinate set of all people in an image captured by a camera; acquiring the coordinate data of a target person from the human skeleton key point coordinate set; and tracking the target person and interacting according to the target person's real-time coordinate data. The deep-learning-based convolutional neural network algorithm improves the speed and accuracy of human-computer interaction and automatically tracks the target person; only an ordinary camera is needed to capture images, so the cost is low and compatibility is high.
Drawings
Fig. 1 is a flowchart of an interaction method according to an embodiment of the present invention;
FIG. 2 is a flowchart of the method of step S10 in FIG. 1;
FIG. 3 is a flowchart of a method of step S30 of FIG. 1;
FIG. 4 is a flowchart of another method of step S30 of FIG. 1;
fig. 5 is a block diagram illustrating an exemplary structure of an interactive apparatus according to a second embodiment of the present invention;
FIG. 6 is a block diagram of an exemplary structure of the identification module of FIG. 5;
FIG. 7 is a block diagram of an exemplary structure of the tracking module of FIG. 5;
fig. 8 is a block diagram of another exemplary structure of the tracking module of fig. 5.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only to facilitate the explanation of the present invention and have no specific meaning in themselves. Thus, "module" and "component" may be used interchangeably.
Example one
As shown in fig. 1, in this embodiment, an interaction method includes:
s10, recognizing a human skeleton key point coordinate set of all people in the image collected by the camera;
s20, acquiring coordinate data of a target person from the human skeleton key point coordinate set;
and S30, tracking the target person, and performing interaction according to the real-time coordinate data of the target person.
In this embodiment, a deep-learning-based convolutional neural network algorithm improves the speed and accuracy of human-computer interaction and automatically tracks the target person; only an ordinary camera is needed to capture images, so the cost is low and compatibility is high.
In this embodiment, the human skeleton key points generally refer to key points such as the eyes, nose, neck, left shoulder, left elbow, left wrist, right shoulder, right elbow, right wrist, left hip, left knee, left ankle, right hip, right knee, and right ankle. The coordinate data of these key points is relative to the captured image: for an image of size H × W, take the top-left vertex of the image as the origin (0,0), the direction from the origin toward the top-right vertex as the X axis, and the direction from the origin toward the bottom-left vertex as the Y axis; a key point's coordinate is then (x, y), where 0 ≤ x ≤ W and 0 ≤ y ≤ H. Zero or more people may appear in an image, and the human skeleton key point coordinate data set collects the coordinate data of those zero or more people into one set.
In this embodiment, the target person is a person who completes a preset action in the image.
When the motion sensing game starts, the player is prompted to perform a preset designated action to start the game. Designated actions such as raising a hand or standing have distinguishable characteristics when mapped to human skeleton key point coordinates, so the target player's action can be selected from the human skeleton key point coordinate data set, as illustrated by the sketch below.
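As a concrete illustration, the following minimal Python sketch selects the target player from the key point coordinate data set by testing for a raised hand. The joint names ("nose", "left_wrist", "right_wrist") and the wrist-above-nose rule are assumptions made for illustration, not details fixed by this embodiment.

```python
# Minimal sketch (illustrative, not part of the embodiment): pick the
# target player as the first person whose wrist key point lies above the
# nose key point. Keypoints are assumed to be dicts of joint -> (x, y).

def is_hand_raised(keypoints):
    """Return True if either wrist is above the nose in image coordinates."""
    nose = keypoints.get("nose")
    if nose is None:
        return False
    for joint in ("left_wrist", "right_wrist"):
        wrist = keypoints.get(joint)
        if wrist is not None and wrist[1] < nose[1]:  # Y grows downward
            return True
    return False

def select_target(coordinate_set):
    """coordinate_set: list of per-person keypoint dicts {P1, ..., Pn}."""
    for person in coordinate_set:
        if is_hand_raised(person):
            return person  # Pk, the target player's coordinate data
    return None  # nobody has performed the designated action yet
```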
As shown in fig. 2, in the present embodiment, the step S10 includes:
s11, acquiring an image acquired by the camera;
and S12, identifying the human skeleton key point coordinate set of all people in the image by using a human skeleton key point identification algorithm.
In the present embodiment, the human skeleton key point identification algorithm includes the deep-learning-based OpenPose algorithm.
In this embodiment, an ordinary camera is used as the image acquisition device, so the hardware cost is low, compatibility is high, and the system is easy to popularize. The ordinary camera captures an image Ft, and the human skeleton key point identification algorithm identifies the human skeleton key point coordinate data set {P1, P2, …, Pn} of all people in Ft. Ft denotes the image captured at the current moment: the camera's frame rate determines the time unit, and its resolution determines the size of Ft. Pn denotes the human skeleton key point coordinate data of the nth person. The human skeleton key point coordinate data Pk of the target player, who performed the designated action, is selected from {P1, P2, …, Pn}; designated actions include raising a hand and standing, but are not limited to specific actions.
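The per-frame acquisition step can be sketched as follows, assuming an ordinary webcam read through OpenCV. The function identify_keypoints is a hypothetical placeholder for the skeleton key point detector (e.g. an OpenPose-style network), not a real library call.

```python
import cv2

def identify_keypoints(image):
    """Hypothetical placeholder for the human skeleton key point
    identification algorithm (e.g. an OpenPose-style network). A real
    implementation returns the per-person keypoint dicts {P1, ..., Pn}."""
    return []  # stub so the sketch runs end to end

cap = cv2.VideoCapture(0)    # an ordinary camera is sufficient
ok, frame_t = cap.read()     # image Ft at the current moment
if ok:
    people = identify_keypoints(frame_t)   # coordinate set {P1, P2, ..., Pn}
cap.release()
```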
As shown in fig. 3, in the present embodiment, the step S30 includes:
s31, acquiring an image of a rectangular area where the target person is located, and determining an initial tracking area;
According to the human skeleton key point coordinate data Pk, the rectangular area O where target player k is located is cropped from image Ft and saved, and the image of this rectangular area is used as the initialization parameter of the target tracking algorithm to initialize the tracker.
S32, estimating a rectangular area of the target person in the image collected by the camera at a certain moment through a target tracking algorithm;
In the present embodiment, the target tracking algorithm includes the deep-learning-based GOTURN algorithm.
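A sketch of steps S31 and S32 under the assumptions above: region O is taken as the axis-aligned bounding box of the target's key points (an illustrative choice), and the tracker is OpenCV's GOTURN implementation, which requires the opencv-contrib build together with the goturn.prototxt and goturn.caffemodel files.

```python
import cv2
import numpy as np

def bounding_box(keypoints):
    """Axis-aligned rectangle (x, y, w, h) around one person's key points,
    where keypoints maps joint name -> (x, y) image coordinates."""
    pts = np.array(list(keypoints.values()), dtype=np.float32)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return (int(x_min), int(y_min), int(x_max - x_min), int(y_max - y_min))

def init_tracking(frame_t, target_keypoints):
    """S31: crop region O around the target player (coordinate data Pk)
    and initialize the target tracking algorithm on it."""
    region_o = bounding_box(target_keypoints)
    tracker = cv2.TrackerGOTURN_create()  # needs opencv-contrib + GOTURN model files
    tracker.init(frame_t, region_o)
    x, y, w, h = region_o
    region_o_image = frame_t[y:y + h, x:x + w].copy()  # saved for re-acquisition
    return tracker, region_o_image

def estimate_region(tracker, frame_t1):
    """S32: estimate rectangle M of the target player in image Ft+1."""
    ok, rect_m = tracker.update(frame_t1)
    return rect_m if ok else None
```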
S33, calculating real-time coordinate data of the target person through the human skeleton key point identification algorithm;
s34, calculating the intersection ratio of the rectangular area corresponding to the real-time coordinate data and the rectangular area obtained by estimation;
the camera acquires an image Ft +1, a target tracking algorithm estimates a rectangle M of a region where a target player is located in the image Ft +1, a human bone key point identification algorithm identifies a human bone key point coordinate data set { P1, P2, …, Pn } of all people in the image Ft +1, a region { N1, N2, …, Nn } where a human body corresponding to { P1, P2, …, Pn } is located is calculated, then { IoU1, IoU2, …, IoUn } is calculated, and finally IoUk is taken as max { IoU1, IoU2, …, IoUn }, wherein IoU (Intersection over Unit) is defined as:
IoU(M, N) = Area(M ∩ N) / Area(M ∪ N)
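In code, the IoU of two axis-aligned rectangles in (x, y, w, h) form follows directly from this definition:

```python
def iou(box_m, box_n):
    """Intersection over Union of two (x, y, w, h) rectangles."""
    mx, my, mw, mh = box_m
    nx, ny, nw, nh = box_n
    ix = max(mx, nx)
    iy = max(my, ny)
    iw = max(0, min(mx + mw, nx + nw) - ix)
    ih = max(0, min(my + mh, ny + nh) - iy)
    intersection = iw * ih
    union = mw * mh + nw * nh - intersection
    return intersection / union if union > 0 else 0.0
```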
S35, judging whether the IoU is greater than a preset threshold; if so, S36, using the real-time coordinate data as control data for interaction between the target person and the system;
otherwise, S37, recording the action of the target person at that moment as an abnormal action, and accumulating the abnormal count.
If IoUk ≥ T1, where T1 is a preset threshold, Pk is output as the control data for the actual interaction between the target player and the motion sensing game; if IoUk < T1, the occurrence is recorded and the abnormal count C of IoUk < T1 is accumulated.
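Steps S34 to S37 can then be sketched as one per-frame decision. T1 = 0.5 is an illustrative value only, the embodiment leaves the threshold to be preset, and iou() is the helper from the previous sketch.

```python
def match_target(tracked_box, person_boxes, coordinate_sets, t1=0.5):
    """Return Pk as control data when IoUk = max{IoU1..IoUn} clears the
    preset threshold T1, else None (an abnormal action for this frame).
    person_boxes are the regions {N1..Nn}; iou() is defined above."""
    if not person_boxes:
        return None
    ious = [iou(tracked_box, box) for box in person_boxes]
    k = max(range(len(ious)), key=ious.__getitem__)
    return coordinate_sets[k] if ious[k] >= t1 else None

# Caller side: accumulate the abnormal count C when no match is returned.
# control = match_target(rect_m, regions_n, keypoint_sets)
# if control is None:
#     abnormal_count += 1
```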
As shown in fig. 4, in this embodiment, after step S37, the method further includes:
S38, judging whether the abnormal count is greater than a preset count threshold;
if so, S39, acquiring the human skeleton key point coordinate set of all people in the image currently captured by the camera, and calculating the similarity between each person's rectangular area and the initial tracking area; otherwise, going to step S310 and continuing to accumulate the abnormal count;
and S311, taking the person with the highest similarity as the target person.
The camera captures the current image Ft′ again. The human skeleton key point identification algorithm identifies the human skeleton key point coordinate data set {P1, P2, …, Pn} of all people in Ft′, and the regions {N1, N2, …, Nn} where the corresponding human bodies are located are calculated. Then {S1, S2, …, Sn} is calculated, where Sm denotes the similarity between the image of region O and the image of region Nm, m ∈ {1, 2, …, n}. {S1, S2, …, Sn} is traversed and an Sk > T2 is selected, where T2 is a preset threshold; the image of region Nk is then used as the initialization parameter of the target tracking algorithm.
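The embodiment does not fix a particular similarity measure for Sm. The sketch below uses HSV colour-histogram correlation from OpenCV as one plausible choice, with T2 = 0.7 purely as an illustrative threshold; the index returned identifies region Nk, whose image then re-initializes the tracker as described above.

```python
import cv2

def region_similarity(image_o, image_n):
    """Similarity Sm between the saved region O image and a candidate
    region Nm image, via HSV histogram correlation (an assumed measure)."""
    hists = []
    for img in (image_o, image_n):
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [50, 60], [0, 180, 0, 256])
        cv2.normalize(hist, hist)
        hists.append(hist)
    return cv2.compareHist(hists[0], hists[1], cv2.HISTCMP_CORREL)

def reacquire_target(frame, region_o_image, person_boxes, t2=0.7):
    """Among the regions {N1..Nn}, pick the one most similar to region O,
    provided Sk exceeds the preset threshold T2 (0.7 is illustrative)."""
    best_k, best_s = None, t2
    for k, (x, y, w, h) in enumerate(person_boxes):
        s = region_similarity(region_o_image, frame[y:y + h, x:x + w])
        if s > best_s:
            best_k, best_s = k, s
    return best_k  # index of the new target person, or None if none qualify
```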
The method of this embodiment automatically tracks the target player and effectively solves the problem of the tracked target being lost after the player's body is briefly occluded during the game: the target player is quickly re-tracked, which greatly improves the player's sense of participation in the motion sensing game and gives the player a better game experience.
Example two
As shown in fig. 5, in the present embodiment, an interaction apparatus includes:
the identification module 10 is used for identifying a human skeleton key point coordinate set of all people in the image acquired by the camera;
the interaction module 20 is used for acquiring coordinate data of a target person from the human skeleton key point coordinate set;
and the tracking module 30 is used for tracking the target person and performing interaction according to the real-time coordinate data of the target person.
In this embodiment, a deep-learning-based convolutional neural network algorithm improves the speed and accuracy of human-computer interaction and automatically tracks the target person; only an ordinary camera is needed to capture images, so the cost is low and compatibility is high.
In this embodiment, the human skeleton key points generally refer to key points such as the eyes, nose, neck, left shoulder, left elbow, left wrist, right shoulder, right elbow, right wrist, left hip, left knee, left ankle, right hip, right knee, and right ankle. The coordinate data of these key points is relative to the captured image: for an image of size H × W, take the top-left vertex of the image as the origin (0,0), the direction from the origin toward the top-right vertex as the X axis, and the direction from the origin toward the bottom-left vertex as the Y axis; a key point's coordinate is then (x, y), where 0 ≤ x ≤ W and 0 ≤ y ≤ H. Zero or more people may appear in an image, and the human skeleton key point coordinate data set collects the coordinate data of those zero or more people into one set.
In this embodiment, the target person is a person who completes a preset action in the image.
When the motion sensing game starts, the player is prompted to perform a preset designated action to start the game. Designated actions such as raising a hand or standing have distinguishable characteristics when mapped to human skeleton key point coordinates, so the target player's action can be selected from the human skeleton key point coordinate data set.
As shown in fig. 6, in this embodiment, the identification module includes:
the image acquisition unit 11 is used for acquiring images acquired by the camera;
and the coordinate calculation unit 12 is used for identifying the human skeleton key point coordinate set of all people in the image by using a human skeleton key point identification algorithm.
In the present embodiment, the human skeleton key point identification algorithm includes the deep-learning-based OpenPose algorithm.
In this embodiment, an ordinary camera is used as the image acquisition device, so the hardware cost is low, compatibility is high, and the system is easy to popularize. The ordinary camera captures an image Ft, and the human skeleton key point identification algorithm identifies the human skeleton key point coordinate data set {P1, P2, …, Pn} of all people in Ft. Ft denotes the image captured at the current moment: the camera's frame rate determines the time unit, and its resolution determines the size of Ft. Pn denotes the human skeleton key point coordinate data of the nth person. The human skeleton key point coordinate data Pk of the target player, who performed the designated action, is selected from {P1, P2, …, Pn}; designated actions include raising a hand and standing, but are not limited to specific actions.
As shown in fig. 7, in this embodiment, the tracking module includes:
an initial unit 31, configured to acquire an image of a rectangular area where a target person is located, and determine an initial tracking area;
According to the human skeleton key point coordinate data Pk, the rectangular area O where target player k is located is cropped from image Ft and saved, and the image of this rectangular area is used as the initialization parameter of the target tracking algorithm to initialize the tracker.
The estimation unit 32 is used for estimating a rectangular area of the target person in the image acquired by the camera at a certain moment through a target tracking algorithm;
In the present embodiment, the target tracking algorithm includes the deep-learning-based GOTURN algorithm.
The real-time coordinate calculation unit 33 is used for calculating real-time coordinate data of the target person through the human skeleton key point identification algorithm;
an IoU calculation unit 34, configured to calculate the intersection-over-union (IoU) between the rectangular region corresponding to the real-time coordinate data and the estimated rectangular region;
the camera acquires an image Ft +1, a target tracking algorithm estimates a rectangle M of a region where a target player is located in the image Ft +1, a human bone key point identification algorithm identifies a human bone key point coordinate data set { P1, P2, …, Pn } of all people in the image Ft +1, a region { N1, N2, …, Nn } where a human body corresponding to { P1, P2, …, Pn } is located is calculated, then { IoU1, IoU2, …, IoUn } is calculated, and finally IoUk is taken as max { IoU1, IoU2, …, IoUn }, wherein IoU (Intersection over Unit) is defined as:
IoU(M, N) = Area(M ∩ N) / Area(M ∪ N)
the first judging unit 35 is configured to judge whether the intersection ratio is greater than a preset threshold, and if so, use the real-time coordinate data as control data for interaction between the target person and the system; otherwise, recording the action of the target person at a certain moment as abnormal action, and accumulating abnormal times.
If IoUk ≥ T1, where T1 is a preset threshold, Pk is output as the control data for the actual interaction between the target player and the motion sensing game; if IoUk < T1, the occurrence is recorded and the abnormal count C of IoUk < T1 is accumulated.
As shown in fig. 8, in this embodiment, the tracking module further includes:
a second judgment unit 36, configured to judge whether the abnormal count is greater than a preset count threshold;
a similarity calculation unit 37, configured to, when the abnormal count is greater than the preset count threshold, acquire the human skeleton key point coordinate set of all people in the image currently captured by the camera and calculate the similarity between each person's rectangular area and the initial tracking area;
and a target re-acquisition unit 38, configured to take the person with the highest similarity as the target person.
The camera captures the current image Ft′ again. The human skeleton key point identification algorithm identifies the human skeleton key point coordinate data set {P1, P2, …, Pn} of all people in Ft′, and the regions {N1, N2, …, Nn} where the corresponding human bodies are located are calculated. Then {S1, S2, …, Sn} is calculated, where Sm denotes the similarity between the image of region O and the image of region Nm, m ∈ {1, 2, …, n}. {S1, S2, …, Sn} is traversed and an Sk > T2 is selected, where T2 is a preset threshold; the image of region Nk is then used as the initialization parameter of the target tracking algorithm.
The device of this embodiment automatically tracks the target player and effectively solves the problem of the tracked target being lost after the player's body is briefly occluded during the game: the target player is quickly re-tracked, which greatly improves the player's sense of participation in the motion sensing game and gives the player a better game experience.
It should be noted that, in this document, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by "comprising a/an" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
The above description is only a preferred embodiment of the present invention and is not intended to limit the scope of the invention. Any equivalent structural or process transformation made using the contents of this specification and the accompanying drawings, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of protection of the present invention.

Claims (8)

1. An interaction method, comprising:
identifying a human skeleton key point coordinate set of all people in the image collected by the camera;
acquiring coordinate data of a target person from the human skeleton key point coordinate set;
tracking the target person, and performing interaction according to the real-time coordinate data of the target person;
the step of tracking the target person and interacting according to the real-time coordinate data of the target person comprises the following steps:
acquiring an image of a rectangular area where a target person is located, and determining an initial tracking area;
estimating a rectangular area of the target person in the image collected by the camera at a certain moment through a target tracking algorithm;
calculating the real-time coordinate data of the target person through a human skeleton key point identification algorithm;
calculating the intersection-over-union (IoU) of the rectangular area corresponding to the real-time coordinate data and the estimated rectangular area;
judging whether the IoU is greater than a preset threshold, and if so, using the real-time coordinate data as control data for interaction between the target person and the system;
otherwise, recording the action of the target person at that moment as an abnormal action, and accumulating the abnormal count.
2. The interaction method according to claim 1, wherein the identifying the set of human skeleton key point coordinates of all people in the image captured by the camera comprises:
acquiring an image acquired by a camera;
and identifying a human skeleton key point coordinate set of all people in the image by using a human skeleton key point identification algorithm.
3. The interaction method according to claim 1, wherein after recording the action of the target person at that moment as an abnormal action and accumulating the abnormal count, the method further comprises:
judging whether the abnormal count is greater than a preset count threshold;
if so, acquiring the human skeleton key point coordinate set of all people in the image currently captured by the camera, and calculating the similarity between each person's rectangular area and the initial tracking area;
and taking the person with the highest similarity as the target person.
4. The interaction method of claim 1, wherein the target person is a person in the image who has completed a preset action.
5. An interactive apparatus, comprising:
the identification module is used for identifying a human skeleton key point coordinate set of all people in the image acquired by the camera;
the interaction module is used for acquiring coordinate data of a target person from the human skeleton key point coordinate set;
the tracking module is used for tracking the target person and carrying out interaction according to the real-time coordinate data of the target person;
wherein the tracking module comprises:
the initial unit is used for acquiring an image of a rectangular area where a target person is located and determining an initial tracking area;
the estimation unit is used for estimating a rectangular area of the target person in the image acquired by the camera at a certain moment through a target tracking algorithm;
the real-time coordinate calculation unit is used for calculating real-time coordinate data of the target person through the human skeleton key point identification algorithm;
the IoU calculation unit, used for calculating the intersection-over-union (IoU) of the rectangular area corresponding to the real-time coordinate data and the estimated rectangular area;
the first judgment unit, used for judging whether the IoU is greater than a preset threshold, and if so, using the real-time coordinate data as control data for interaction between the target person and the system; otherwise, recording the action of the target person at that moment as an abnormal action and accumulating the abnormal count.
6. The interaction device of claim 5, wherein the identification module comprises:
the image acquisition unit is used for acquiring images acquired by the camera;
and the coordinate calculation unit is used for identifying the human skeleton key point coordinate set of all people in the image by using a human skeleton key point identification algorithm.
7. The interaction device of claim 5, wherein the tracking module further comprises:
the second judgment unit, used for judging whether the abnormal count is greater than a preset count threshold;
the similarity calculation unit, used for, when the abnormal count is greater than the preset count threshold, acquiring the human skeleton key point coordinate set of all people in the image currently captured by the camera and calculating the similarity between each person's rectangular area and the initial tracking area;
and the target re-acquisition unit, used for taking the person with the highest similarity as the target person.
8. The interactive device as claimed in claim 5, wherein the target person is a person in the image who completes a preset action.
CN201810387822.8A 2018-04-26 2018-04-26 Interaction method and device Active CN108829233B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810387822.8A CN108829233B (en) 2018-04-26 2018-04-26 Interaction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810387822.8A CN108829233B (en) 2018-04-26 2018-04-26 Interaction method and device

Publications (2)

Publication Number Publication Date
CN108829233A CN108829233A (en) 2018-11-16
CN108829233B true CN108829233B (en) 2021-06-15

Family

ID=64154163

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810387822.8A Active CN108829233B (en) 2018-04-26 2018-04-26 Interaction method and device

Country Status (1)

Country Link
CN (1) CN108829233B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110502986A (en) * 2019-07-12 2019-11-26 平安科技(深圳)有限公司 Identify character positions method, apparatus, computer equipment and storage medium in image
CN110555404A (en) * 2019-08-29 2019-12-10 西北工业大学 Flying wing unmanned aerial vehicle ground station interaction device and method based on human body posture recognition
CN113450534A (en) * 2020-03-27 2021-09-28 海信集团有限公司 Device and method for detecting approach of children to dangerous goods
CN113362324B (en) * 2021-07-21 2023-02-24 上海脊合医疗科技有限公司 Bone health detection method and system based on video image
CN115955603B (en) * 2022-12-06 2024-05-03 广州紫为云科技有限公司 Intelligent camera device based on intelligent screen somatosensory interaction and implementation method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129293A (en) * 2010-01-15 2011-07-20 微软公司 Tracking groups of users in motion capture system
CN103559491A (en) * 2013-10-11 2014-02-05 北京邮电大学 Human body motion capture and posture analysis system
CN104978029A (en) * 2015-06-30 2015-10-14 北京嘿哈科技有限公司 Screen manipulation method and apparatus
CN105469113A (en) * 2015-11-19 2016-04-06 广州新节奏智能科技有限公司 Human body bone point tracking method and system in two-dimensional video stream

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012181688A (en) * 2011-03-01 2012-09-20 Sony Corp Information processing device, information processing method, information processing system, and program


Also Published As

Publication number Publication date
CN108829233A (en) 2018-11-16


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200629

Address after: Building 1, No.2, Danzi North Road, Kengzi street, Pingshan District, Shenzhen City, Guangdong Province

Applicant after: SHENZHEN TONGWEI COMMUNICATION TECHNOLOGY Co.,Ltd.

Address before: 518000 A 305-307, Nanshan medical instrument Park, 1019 Nanhai Road, Nanshan District merchants street, Shenzhen, Guangdong.

Applicant before: SHENZHEN DEEPCONV TECHNOLOGIES Co.,Ltd.

GR01 Patent grant