CN113377192B

CN113377192B - Somatosensory game tracking method and device based on deep learning

Info

Publication number: CN113377192B
Application number: CN202110551769.2A
Authority: CN
Inventors: 顾友良; 张哲为; 李观喜; 程煜均; 张磊; 林伟; 苏鹏; 赵乾; 丁博文
Original assignee: Guangzhou Ziweiyun Technology Co ltd
Current assignee: Guangzhou Ziweiyun Technology Co ltd
Priority date: 2021-05-20
Filing date: 2021-05-20
Publication date: 2023-06-20
Anticipated expiration: 2041-05-20
Also published as: CN113377192A

Abstract

The invention discloses a motion sensing game tracking method based on deep learning, which is used for continuously collecting pictures by using a depth or monocular motion sensing camera so as to realize pedestrian detection and human body key point detection; constructing a filter, establishing a state vector x_ (t-1) for each pedestrian detection frame of the picture with the frame number of t-1 and corresponding human body key point coordinates, predicting a state result x_ (t) at the current t moment by using the filter, and finally updating the state at the t-1 moment by using a real observation result at the t moment; determining pedestrian tracking and all pedestrian ID numbers; selecting an ID number of a game operator, and judging whether the ID number of the game operator exists in a tracking result; and determining a game operator, selecting pedestrians corresponding to the ID numbers of the game operator as the game operator according to the result, and tracking the game operator through continuous circulation after the game operator is selected, so that the somatosensory game system can stably and accurately identify the game operator.

Description

Somatosensory game tracking method and device based on deep learning

Technical Field

The invention relates to the technical field of computer interaction, in particular to a motion sensing game tracking method and device based on deep learning.

Background

With the increasing popularity of digitization, networking and intelligent life concepts, somatosensory games based on artificial intelligence are also increasingly accepted. However, when people perform a motion sensing game process of single person or multi-person interaction, after other pedestrians who do not participate in the game enter the motion sensing game camera lens, how the motion sensing game system stably and accurately identifies the operator of the game without being interfered by other pedestrians is of great importance to the experience of the motion sensing game.

Most motion sensing games currently on the market use a depth camera to estimate how far a pedestrian is from the motion sensing game camera (e.g., the human player closest to the motion sensing game camera lens) appears in the lens while the motion sensing game experience is being played, use other auxiliary devices such as a game handle, or use a face recognition tracking module to determine the player. The scheme of estimating the distance of pedestrians from the camera using the depth camera to determine the players is very easily disturbed during the game, and the cost of purchasing somatosensory games is increased as with the scheme of purchasing auxiliary devices. The face recognition tracking module is added in the system, so that the requirements on the quality of the obtained face image and the head gesture of the face are high, and the face image and the head gesture are very easily affected in the actual experience process. It is therefore an urgent need to develop a body-feel game system that can stably and accurately identify the operator device of the game without increasing the cost of purchasing the body-feel game.

Disclosure of Invention

Aiming at the problem that the scheme of a game operator is identified when multiple persons appear in the existing somatosensory game system, the invention provides a tracking device of a somatosensory game to solve the problem that the game operator is accurately identified when the multiple persons appear in the somatosensory game lens.

The invention mainly combines pictures acquired by the somatosensory game camera, and determines an operator of the game through an algorithm device based on pedestrian detection and pedestrian tracking of deep learning. Firstly, the scheme of utilizing the camera of the somatosensory game to collect pictures does not need a specific auxiliary device, so that the purchase cost of the somatosensory game is saved. Meanwhile, the algorithm device which combines the pictures acquired by the somatosensory game camera with pedestrian detection and pedestrian tracking is more stable and accurate than the algorithm device which uses the distance between the pedestrians and the camera to determine the game operator.

The method solves the defects and the shortcomings of the existing somatosensory game system in determining the scheme of a game operator to a great extent, and is applicable to somatosensory games based on a depth camera and a monocular RGB (red, green and blue) camera.

The present invention aims to solve at least one of the technical problems existing in the prior art. Therefore, the invention discloses a motion sensing game tracking method based on deep learning, which comprises the following steps:

step 1, continuously collecting pictures by using a depth or monocular somatosensory camera to realize pedestrian detection and human body key point detection;

step 2, constructing a filter, establishing a state vector x_ (t-1) for each pedestrian detection frame of the picture with the frame number of t-1 and corresponding human body key point coordinates, predicting a state result x_ (t) at the current t moment by using the filter, and finally updating the state at the t-1 moment by using a real observation result at the t moment;

step 3, determining pedestrian tracking and all pedestrian ID numbers;

step 4, selecting an ID number of the game operator, and judging whether the ID number of the game operator exists in the tracking result in the step 3;

and 5, determining a game operator, selecting pedestrians corresponding to the ID numbers of the game operator as the game operator according to the result of the step 4, and tracking the game operator through continuous circulation after the game operator is selected, so that the motion sensing game system can stably and accurately identify the game operator.

Still further, the step 1 further includes: in the process of starting a motion sensing game by using a depth or monocular motion sensing camera, the motion sensing game camera continuously collects pictures, the collected pictures are preprocessed and then sent to a pedestrian detector based on deep learning, the pedestrian detector calculates coordinates of all pedestrian rectangular frames in the pictures shot by the current motion sensing game camera, and then the coordinates of key points of each pedestrian are detected by using the obtained coordinates of the pedestrian rectangular frames and a human body key point detector based on deep learning.

Still further, the step 2 further includes a filtering algorithm prediction model as follows:

x _t ＝Ax _t-1 +Bu _t-1 +w _t-1

wherein, each parameter in the formula has the meaning of: a: a system state transition matrix; x is x _t-1 : a system state matrix at time t-1; b: controlling an input matrix; u (u) _t-1 : a controller vector at time t-1; w (w) _t-1 : process noise at time t-1.

Still further, the step 3 further includes: after the coordinates of the detection frames of all pedestrians in the current lens picture and the coordinates of the key points corresponding to each pedestrian are obtained in the step 1 and the step 2, filtering prediction is carried out on the coordinates, modeling is carried out on each pedestrian detection frame and the corresponding coordinates of the key points of the human body by using a tracking algorithm based on deep learning, each pedestrian in the current lens picture is associated with a unique corresponding ID number, and then the corresponding ID number is built or updated by continuously tracking pedestrians appearing in the lens picture by using the tracking algorithm.

Still further, the step 4 further includes: if the ID number of the current player exists in the tracking result obtained in the step 3, the ID number of the current player is not changed at the moment when the current player is still in front of the somatosensory camera lens, and if the ID number of the current player is not selected yet or the ID number corresponding to the current player cannot be found in the tracking result of the third process, one ID is selected as the ID number of the player in combination with a strategy in the tracking result of the step 3.

The invention also discloses a motion sensing game tracking device based on deep learning, a depth or monocular motion sensing camera, and the depth or monocular motion sensing camera is used for continuously collecting pictures so as to realize pedestrian detection and human body key point detection; the filter is used for establishing a state vector x_ (t-1) for each pedestrian detection frame of the picture with the frame number of t-1 and corresponding human body key point coordinates, predicting a state result x_ (t) at the current t moment by using the filter, and finally updating the state at the t moment by using a real observation result at the t moment; the pedestrian ID confirming module is used for determining pedestrian tracking and all pedestrian ID numbers; the player selection module is used for selecting the ID number of the player and judging whether the ID number of the player exists in the result tracked by the pedestrian ID confirmation module; and determining a game operator, selecting pedestrians corresponding to the ID numbers of the game operator as the game operator according to the result of the game operator selection module, and tracking the game operator through continuous circulation after the game operator is selected, so that the motion sensing game system can stably and accurately identify the game operator.

Still further, the depth or monocular motion camera further comprises: in the process of starting a motion sensing game by using a depth or monocular motion sensing camera, the motion sensing game camera continuously collects pictures, the collected pictures are preprocessed and then sent to a pedestrian detector based on deep learning, the pedestrian detector calculates coordinates of all pedestrian rectangular frames in the pictures shot by the current motion sensing game camera, and then the coordinates of key points of each pedestrian are detected by using the obtained coordinates of the pedestrian rectangular frames and a human body key point detector based on deep learning.

Still further, the filter further comprises a filtering algorithm prediction model as follows:

x _t ＝Ax _t-1 +Bu _t-1 +w _t-1

Still further, the pedestrian ID confirmation module further includes: after the depth or monocular somatosensory camera and the filter obtain the detection frame coordinates of all pedestrians in the current lens picture and the corresponding key point coordinates of each pedestrian, filtering and predicting the detection frame coordinates, modeling each pedestrian detection frame and the corresponding key point coordinates of a human body by using a tracking algorithm based on deep learning, associating each pedestrian in the current lens picture with a unique corresponding ID number, and then tracking pedestrians appearing in the lens picture by using the tracking algorithm continuously so as to establish or update the corresponding ID numbers.

Still further, the game operator selection module further comprises: if the ID number of the current player exists in the tracking result of the pedestrian ID confirmation module, the current player is indicated to be in front of the somatosensory camera lens, the ID number of the player is unchanged at the moment, and if the ID number of the player is not selected or the ID number corresponding to the current player cannot be found in the tracking result of the third process, one ID is selected as the ID number of the player in the tracking result of the pedestrian ID confirmation module in combination with a strategy.

Drawings

The invention will be further understood from the following description taken in conjunction with the accompanying drawings. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the embodiments. In the figures, like reference numerals designate corresponding parts throughout the different views.

FIG. 1 is a schematic diagram of a logic flow of the present invention.

Detailed Description

Example 1

As shown in fig. 1, the technology of the present embodiment is mainly divided into five processes: pedestrian detection, human body key point detection, filter construction, pedestrian tracking, all pedestrian ID number determination, selecting an ID number of a game operator, and determining the game operator.

The first process: pedestrian detection and human body key point detection. In the process of starting the somatosensory game by using the depth or monocular somatosensory camera, the somatosensory game camera continuously collects pictures, the collected pictures are preprocessed and then sent to a pedestrian detector based on deep learning, and the pedestrian detector calculates coordinates of all pedestrian rectangular frames in the pictures shot by the current somatosensory game camera. And then detecting the key point coordinates of each pedestrian by using the obtained rectangular frame coordinates of the pedestrians and a human key point detector based on deep learning.

The second process: and (3) constructing a filter. Due to the fact that noise interference exists in the lens pictures in the real-time game process, some picture frames are not output with corresponding detection results, and game experience is finally affected. The invention establishes a state vector x for each pedestrian detection frame and corresponding human body key point coordinates of a picture with the frame number of t-1 moment _t-1 And predicts the state result x at the current t moment by using a filter _t And finally, updating the state at the time t by using the real observation result at the time t. Preventing the game experience from being degraded because a certain frame has no detection result. Wherein the filtering algorithm prediction model is as follows.

x _t ＝Ax _t-1 +Bu _t-1 +w _t-1

The third process: pedestrian tracking and all pedestrian ID number determination. After the detection frame coordinates of all pedestrians in the current lens picture and the corresponding key point coordinates of each pedestrian are obtained through the first and second processes, filtering prediction is carried out on the detection frame coordinates, and modeling is carried out on each pedestrian detection frame and the corresponding key point coordinates of the human body by using a tracking algorithm based on deep learning, so that each pedestrian in the current lens picture is associated with a unique corresponding ID number. And then continuously tracking pedestrians appearing in the lens picture by using a tracking algorithm so as to establish or update the corresponding ID number.

Fourth process: the game operator's ID number is selected. Judging whether the ID number of the player exists in the result of the third process tracking, if the ID number of the current player exists in the result obtained by the third process tracking, indicating that the current player is still in front of the somatosensory camera lens, and the ID number of the player is unchanged at the moment. If the ID number of the game operator is not selected or the ID number corresponding to the current player is not found in the tracking result of the third process, selecting an ID as the ID number of the game operator by combining strategies in the tracking result of the third process.

A fifth process: determination of the game operator. According to the result of the fourth process, the pedestrian corresponding to the ID number of the game operator is selected as the game operator, and the game operator can be tracked through continuous circulation after the game operator is selected, so that the somatosensory game system can stably and accurately identify the game operator.

Example two

The embodiment provides a motion sensing game tracking method based on deep learning, which comprises the following steps:

step 2, constructing a filter, establishing a state vector x_ (t-1) for each pedestrian detection frame and corresponding human body key point coordinates of a picture with the frame number of t-1, predicting a state result x_ t at the current t moment by using the filter, and finally updating the state at the t moment by using a real observation result at the t moment;

step 3, determining pedestrian tracking and all pedestrian ID numbers;

x _t ＝Ax _t-1 +Bu _t-1 +w _t-1

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.

It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

While the invention has been described above with reference to various embodiments, it should be understood that many changes and modifications can be made without departing from the scope of the invention. It is therefore intended that the foregoing detailed description be regarded as illustrative rather than limiting, and that it be understood that it is the following claims, including all equivalents, that are intended to define the spirit and scope of this invention. The above examples should be understood as illustrative only and not limiting the scope of the invention. Various changes and modifications to the present invention may be made by one skilled in the art after reading the teachings herein, and such equivalent changes and modifications are intended to fall within the scope of the invention as defined in the appended claims.

Claims

1. A motion sensing game tracking method based on deep learning is characterized by comprising the following steps:

step 1, continuously collecting pictures by using a depth or monocular somatosensory camera to realize pedestrian detection and human body key point detection, wherein the somatosensory game camera continuously collects pictures in the process of starting the somatosensory game by using the depth or monocular somatosensory camera, preprocessing the collected pictures and then sending the preprocessed pictures to a pedestrian detector based on deep learning, the pedestrian detector calculates all pedestrian rectangular frame coordinates in the pictures shot by the current somatosensory game camera, and then detecting the key point coordinates of each pedestrian by using the obtained pedestrian rectangular frame coordinates and the human body key point detector based on deep learning;

step 2, constructing a filter, establishing a state vector x_ (t-1) for each pedestrian detection frame of a picture with the frame number of t-1 and corresponding human body key point coordinates, predicting a state result x_ (t) at the current t moment by using the filter, and finally updating the state at the t-1 moment by using a real observation result at the t moment, wherein a filtering algorithm prediction model is as follows:

x _t ＝Ax _t-1 +Bu _t-1 +w _t-1

wherein, each parameter in the formula has the meaning of: a: a system state transition matrix; x is x _t-1 : a system state matrix at time t-1; b: controlling an input matrix; u (u) _t-1 : a controller vector at time t-1; w (w) _t-1 : process noise at time t-1;

step 3, determining pedestrian tracking and all pedestrian ID numbers, wherein after the detection frame coordinates of all pedestrians in the current lens picture and the corresponding key point coordinates of each pedestrian are obtained in the step 1 and the step 2, and filtering and predicting the detection frame coordinates, modeling each pedestrian detection frame and the corresponding key point coordinates of the human body by using a tracking algorithm based on deep learning, associating each pedestrian in the current lens picture with a unique corresponding ID number, and then continuously tracking the pedestrians appearing in the lens picture by using the tracking algorithm so as to establish or update the corresponding ID numbers;

step 4, selecting an ID number of a game operator, judging whether the ID number of the game operator exists in the tracking result in the step 3, wherein if the ID number of the current game operator exists in the tracking result in the step 3, the ID number of the game operator is unchanged at the moment when the current game operator is still in front of a somatosensory camera lens, and if the ID number of the game operator is not selected or the ID number of the current game operator cannot find the corresponding ID number in the tracking result, selecting an ID as the ID number of the game operator in the tracking result in the step 3 by combining strategies;

2. The motion sensing game tracking device based on the deep learning is characterized in that a depth or monocular motion sensing camera is used for continuously collecting pictures to realize pedestrian detection and human body key point detection; the filter is used for establishing a state vector x_ (t-1) for each pedestrian detection frame of the picture with the frame number of t-1 and corresponding human body key point coordinates, predicting a state result x_ (t) at the current t moment by using the filter, and finally updating the state at the t moment by using a real observation result at the t moment; the pedestrian ID confirming module is used for determining pedestrian tracking and all pedestrian ID numbers; the player selection module is used for selecting the ID number of the player and judging whether the ID number of the player exists in the result tracked by the pedestrian ID confirmation module; determining a game operator, selecting pedestrians corresponding to the ID numbers of the game operator as the game operator according to the result of the game operator selection module, and tracking the game operator through continuous circulation after the game operator is selected, so that the motion sensing game system can stably and accurately identify the game operator, wherein the depth or monocular motion sensing camera further comprises: in the process of starting a somatosensory game by using a depth or monocular somatosensory camera, continuously acquiring pictures by the somatosensory game camera, preprocessing the acquired pictures, sending the preprocessed pictures to a pedestrian detector based on deep learning, calculating coordinates of all pedestrian rectangular frames in the pictures shot by the current somatosensory game camera by the pedestrian detector, and detecting the coordinates of key points of each pedestrian by using the coordinates of the obtained pedestrian rectangular frames and a human body key point detector based on deep learning; the filter further includes a filtering algorithm prediction model as follows:

x _t ＝Ax _t-1 +Bu _t-1 +w _t-1

the pedestrian ID confirmation module further includes: after the depth or monocular somatosensory camera and the filter obtain the detection frame coordinates of all pedestrians in the current lens picture and the corresponding key point coordinates of each pedestrian, filtering and predicting the detection frame coordinates, modeling each pedestrian detection frame and the corresponding key point coordinates of a human body by using a tracking algorithm based on deep learning, associating each pedestrian in the current lens picture with a unique corresponding ID number, and then continuously tracking pedestrians appearing in the lens picture by using the tracking algorithm so as to establish or update the corresponding ID number; the game operator selection module further comprises: if the ID number of the current player exists in the tracking result of the pedestrian ID confirmation module, the current player is indicated to be in front of the somatosensory camera lens, the ID number of the player is unchanged at the moment, and if the ID number of the player is not selected or the ID number corresponding to the current player cannot be found in the tracking result, an ID is selected as the ID number of the player in the tracking result of the pedestrian ID confirmation module in combination with a strategy.