CN107194361B - Two-dimensional posture detection method and device - Google Patents

Two-dimensional posture detection method and device

Info

Publication number: CN107194361B (application CN201710391398.XA)
Authority: CN (China)
Prior art keywords: human body, key point
Legal status: Active (the legal status is an assumption, not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Other versions: CN107194361A
Inventors: 陈志超, 李轩, 杨贤立, 周剑
Original and current assignee: Chengdu Topplusvision Science & Technology Co., Ltd. (the listed assignees may be inaccurate)
Application filed by Chengdu Topplusvision Science & Technology Co., Ltd.
Publication of application CN107194361A; application granted and published as CN107194361B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components

Abstract

The invention provides a two-dimensional posture detection method and device, relating to the field of computer vision. The two-dimensional posture detection method includes: obtaining a plurality of key points, and the position information of each key point, from two-dimensional human body image data; mapping the key points onto a pre-created feature map according to a preset rule; evaluating the affinity between any two key points mapped into the feature map; and connecting each pair of key points evaluated as having an affinity, generating a posture skeleton line with which the posture of the human body is detected. The posture skeleton line can be generated accurately and quickly without relying on depth information from the image, which lowers the hardware requirements. Processing is fast and the detection results are accurate.

Description

Two-dimensional posture detection method and device
Technical Field
The invention relates to the field of computer vision, in particular to a two-dimensional posture detection method and device.
Background
Human body posture detection has wide application prospects in fitness, security, medical treatment and national defense, and human body posture detection technology is one of the current hot research topics.
A posture skeleton line can be used to recognize human postures. Detecting the human body posture with a posture skeleton line gives more accurate results and is less likely to expose privacy.
However, even after years of development, human posture detection remains a challenge in computer vision. The current mainstream approach first detects a human body in an image and then detects key points within the detected body; the process is complex, the computation heavy, and the hardware requirements extremely high. On the GPU side, multiple graphics cards are typically used, each no weaker than an NVIDIA Tesla K80; on the CPU side, an Intel Core(TM) i5 or better is typically required. Ordinary everyday applications are difficult under such conditions, which hampers popularization and adoption.
Disclosure of Invention
In order to solve the above problems, the embodiments of the present invention adopt the following technical solutions:
An embodiment of the invention provides a two-dimensional posture detection method comprising the following steps: obtaining a plurality of key points and the position information of each key point from two-dimensional human body image data; mapping the key points onto a pre-created feature map according to a preset rule; evaluating the affinity between any two key points mapped into the feature map; and connecting each pair of key points evaluated as having an affinity, generating a posture skeleton line with which the posture of the human body is detected.
An embodiment of the present invention further provides a two-dimensional posture detection apparatus, comprising: an acquisition module, a processing module, a key point extraction module, a first mapping module, an evaluation module, a generation module and a second mapping module. The acquisition module is used for acquiring a frame of RGB image data including a human body at a preset time interval. The processing module is used for normalizing the RGB image data to generate the two-dimensional human body image data. The key point extraction module is used for obtaining a plurality of key points and the position information of each key point from the two-dimensional human body image data. The first mapping module is used for mapping the key points onto a pre-created feature map according to a preset rule. The evaluation module is used for evaluating the affinity between any two key points mapped into the feature map. The generation module is used for connecting each pair of key points evaluated as having an affinity, generating a posture skeleton line with which the posture of the human body is detected. The second mapping module is used for mapping the posture skeleton line back into the RGB image data using a preset transformation relation matrix, obtaining the posture of the human body in the RGB image.
Compared with the prior art, the two-dimensional posture detection method and device provided by the invention work as follows. The method comprises: obtaining a plurality of key points and the position information of each key point from two-dimensional human body image data; mapping the key points onto a pre-created feature map according to a preset rule; evaluating the affinity between any two key points mapped into the feature map; and connecting each pair of key points evaluated as having an affinity, generating a posture skeleton line with which the posture of the human body is detected. Processing based on two-dimensional images is fast; accuracy is preserved while the computational load is reduced, giving good real-time performance. The hardware requirements are lowered, which facilitates adoption.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings illustrate only some embodiments of the invention and therefore should not be considered limiting of its scope; those skilled in the art can obtain other related drawings from them without inventive effort.
Fig. 1 is a block diagram of a server according to a preferred embodiment of the present invention.
Fig. 2 shows a flowchart of a two-dimensional posture detection method provided by an embodiment of the present invention.
Fig. 3 is a flowchart illustrating sub-steps of step S103 in fig. 2.
Fig. 4 is a flowchart illustrating sub-steps of step S104 in fig. 2.
FIG. 5 is an exemplary diagram of keypoints mapped to a feature map.
Fig. 6 is a flowchart illustrating sub-steps of step S105 in fig. 2.
Fig. 7 is a schematic diagram illustrating a two-dimensional posture detection apparatus according to an embodiment of the present invention.
Fig. 8 is a functional sub-module diagram of the key point extraction module in fig. 7.
FIG. 9 is a functional sub-block diagram of the evaluation module of FIG. 7.
Icon: 100-a server; 111-a memory; 112-a processor; 113-a communication unit; 200-a two-dimensional posture detection apparatus; 201-an acquisition module; 202-a processing module; 203-a key point extraction module; 2031-an extraction submodule; 2032-a first obtaining submodule; 2033-a comparison submodule; 204-a first mapping module; 205-an evaluation module; 2051-a second obtaining submodule; 2052-a judgment submodule; 2053-a calculation submodule; 2054-an evaluation submodule; 206-a generation module; 207-a second mapping module.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Fig. 1 is a block diagram of a server 100. The server 100 includes a two-dimensional posture detection apparatus 200, a memory 111, a processor 112, and a communication unit 113.
The memory 111, the processor 112 and the communication unit 113 are electrically connected to one another, directly or indirectly, to realize data transmission or interaction. For example, these components may be electrically connected via one or more communication buses or signal lines. The two-dimensional posture detection apparatus 200 includes at least one software function module, which may be stored in the memory 111 in the form of software or Firmware or solidified in the Operating System (OS) of the server 100. The processor 112 executes the executable modules stored in the memory 111, such as the software function modules and computer programs included in the two-dimensional posture detection apparatus 200.
The memory 111 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 111 is used to store programs or data. The communication unit 113 establishes a communication connection between the server 100 and other communication terminals via the network, and transmits and receives data over the network.
It should be understood that the configuration shown in fig. 1 is merely a schematic diagram of the configuration of the server 100, and that the server 100 may include more or less components than those shown in fig. 1, or have a different configuration than that shown in fig. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
First embodiment
Referring to fig. 2, fig. 2 is a flowchart of a two-dimensional posture detection method according to a preferred embodiment of the invention. The two-dimensional posture detection method comprises the following steps:
Step S101: acquire a frame of RGB image data including a human body at a preset time interval.
In the present embodiment, the RGB image data may be received through the communication unit 113 from a terminal communicatively connected to the server 100. The terminal can be an image acquisition device (e.g., a camera or a mobile phone) or an electronic device capable of storing images. In this embodiment, the terminal is a camera, and the server 100 acquires a frame of real-time RGB image data from the camera at preset time intervals.
Step S102: normalize the RGB image data to generate the two-dimensional human body image data.
In this embodiment, the mean of all pixels of the RGB image data is computed, along with their standard deviation. Each pixel is normalized by subtracting the mean and dividing by the standard deviation, bringing pixel values into roughly the [-1, 1] interval. This forms a specific type of image data that facilitates the extraction of feature point types and can serve as model training-set data. Normalizing and converting the RGB image data into this specific type reduces the complexity of subsequent computation and helps improve computation speed. For example, the specific type of image data may be LMDB-format image data, but is not limited thereto.
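As a concrete sketch of this normalization, assuming per-image z-scoring; the function name `normalize_rgb` and the flat-image guard are illustrative choices, not details from the patent:

```python
import numpy as np

def normalize_rgb(rgb: np.ndarray) -> np.ndarray:
    """Normalize an H x W x 3 RGB image by subtracting the mean of all
    pixels and dividing by their standard deviation, as described in
    step S102; values then fall roughly into the [-1, 1] interval."""
    pixels = rgb.astype(np.float64)
    std = pixels.std()
    if std == 0.0:
        # Flat image: every pixel equals the mean, so return zeros
        # instead of dividing by zero.
        return np.zeros_like(pixels)
    return (pixels - pixels.mean()) / std
```

The normalized array would then be serialized into the chosen training format (e.g. LMDB) for the detection network.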
Step S103: obtain a plurality of key points, and the position information of each key point, from the two-dimensional human body image data.
In this embodiment, the two-dimensional human body image data is input into a human body detection network to obtain the key points appearing in the data and the position information corresponding to each key point. The human body detection network can be a convolutional neural network and may be a pre-trained model. Key points may include, but are not limited to: nose, left eye, right eye, left shoulder, right shoulder, mid-clavicular point, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle, right ankle. As shown in fig. 3, step S103 includes the following sub-steps:
and a substep S1031 of extracting human body feature points from the two-dimensional human body image data.
In this embodiment, all the human body feature points appearing in the two-dimensional human body image data and the position information of the human body feature points are extracted through the human body detection network. One key point may correspond to a plurality of human body feature points, and the position information of each human body feature point can accurately represent the position information of the corresponding key point.
Sub-step S1032: using a convolutional neural network, obtain from each human body feature point a feature output function and the key point to which the feature point corresponds.
In this embodiment, the key point class of each human body feature point is assigned in turn using the label categories C of the convolutional neural network. Specifically, a typical set of label categories C may be: nose, left eye, right eye, left shoulder, right shoulder, mid-clavicular point, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle, right ankle. Feeding a human body feature point into the convolutional neural network yields a corresponding feature output function.
Sub-step S1033: compare the feature output function with the preset ideal output of the corresponding key point.
In this embodiment, the ideal output of a key point is obtained from the key point and an ideal output matrix. The ideal output matrix is established in advance from the convolutional neural network and from human body feature points that accurately represent the key points; feeding a key point into the ideal output matrix yields that key point's ideal output. The ideal output can be understood as the feature output function produced when a human body feature point that accurately represents the key point is input into the convolutional neural network. When the difference between the feature output function of a human body feature point and the ideal output of its corresponding key point lies within a preset error range, the process goes to sub-step S1034; when it does not, the position information of that feature point is judged unable to represent the key point's position, and the search simply continues among the other human body feature points representing that key point for one that can.
Sub-step S1034: take the position information of the human body feature point as the position information of the key point, thereby locating the key point.
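Sub-steps S1033 and S1034 amount to a selection rule, which can be sketched as follows. The name `locate_keypoint`, the scalar feature outputs, and the error threshold `eps` are illustrative assumptions; the patent does not specify the concrete shape of the feature output function.

```python
def locate_keypoint(candidates, ideal_output, eps):
    """candidates: iterable of (position, feature_output) pairs for the
    human body feature points that represent one key point label.
    Accept the first candidate whose feature output differs from the
    preset ideal output by no more than eps (sub-step S1033); its
    position becomes the key point position (sub-step S1034)."""
    for position, feature_output in candidates:
        if abs(feature_output - ideal_output) <= eps:
            return position
    # No candidate represented the key point accurately enough.
    return None
```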
Step S104: map the plurality of key points onto a pre-created feature map according to a preset rule.
In this embodiment, the obtained key points are mapped onto a feature map much smaller than the two-dimensional human body image data, which reduces the workload of subsequent processing. As shown in fig. 4, step S104 includes the following sub-steps:
and a substep S1041 of mapping the obtained key points on the central axis of the human body to the central line in the characteristic diagram.
In this embodiment, the key points located on the central axis of the human body according to the human body structure include, but are not limited to, the nose, the neck, the center of the clavicle, etc. In this embodiment, the key points on the central axis of the body are selected from the nose and the center of the clavicle. The nose is connected with the center of the clavicle to form the central axis of the human body. Specifically, affine transformation may be employed for mapping.
Sub-step S1042: map each obtained key point lying on one side of the central axis of the human body in the two-dimensional human body image data to a first region on one side of the center line of the feature map, and each key point lying on the other side of the central axis to a second region on the other side of the center line.
In this embodiment, the center line divides the feature map into two equal regions: the first region on one side of the line and the second region on the other. All key points on one side of the central axis in the two-dimensional human body image are mapped to the first region, and all key points on the other side to the second region. In the feature map shown in fig. 5, the dotted line in the middle is the center line, the ellipses are joint points, and the two key points on the center line are, from top to bottom, the nose and the clavicle center. Specifically, affine transformation may be used for the mapping.
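One way to realize the affine mapping of sub-steps S1041 and S1042 is sketched below. The feature-map size and the choice to place the nose at one quarter of the map height and the clavicle center at three quarters are assumptions for illustration; the patent only states that affine transformation may be used.

```python
import numpy as np

def axis_alignment_affine(nose, clavicle, fmap_w, fmap_h):
    """Build a 2x3 affine matrix that maps the body's central axis
    (nose -> clavicle center) onto the vertical center line of a
    fmap_w x fmap_h feature map, with the nose landing at 1/4 height
    and the clavicle center at 3/4 height (hypothetical placement)."""
    nose = np.asarray(nose, dtype=float)
    clav = np.asarray(clavicle, dtype=float)
    axis = clav - nose
    # Rotation turning the axis vector into (0, |axis|), i.e. vertical.
    c, s = axis[1], axis[0]
    norm = np.hypot(c, s)
    scale = (fmap_h / 2.0) / norm            # axis spans half the map height
    R = (scale / norm) * np.array([[c, -s], [s, c]])
    t = np.array([fmap_w / 2.0, fmap_h / 4.0]) - R @ nose
    return np.hstack([R, t[:, None]])

def map_point(A, p):
    """Apply the 2x3 affine matrix A to a 2-D point."""
    return A[:, :2] @ np.asarray(p, dtype=float) + A[:, 2]
```

Because the transform is a rotation plus uniform scaling and translation, points on one side of the body's central axis land on one side of the feature map's center line, so the first/second region split can be read directly off the mapped x-coordinate.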
Step S105: evaluate the affinity between any two of the key points mapped into the feature map.
In this embodiment, two key points in the same region are considered to have an affinity when their weighted Euclidean distance is the smallest among the distances to the other joint points in that region. Affinity is therefore not determined by distance alone: because only points on the same side are compared, only key points on the same side can have an affinity. Searching for a point's closely related partner within the same region also reduces the traversal range and shortens the traversal time. As shown in fig. 6, step S105 includes the following sub-steps:
and a substep S1051 of sequentially acquiring each of the key points.
In this embodiment, the key points that are closely related to each key point need to be found by evaluation, so each key point is processed by sequentially acquiring each key point.
And a substep S1052 that determines the region to which the acquired key point belongs.
In the present embodiment, it is determined whether the key point is mapped in the first region or the second region in the feature map. When the acquired key point belongs to the first region, the process goes to substep S1053; when the acquired key point belongs to the second region, the process proceeds to substep S1054.
It should be noted that, if the key point is located in the central line of the feature map, the weighted euler distances between the obtained key point and other key points in the feature map are sequentially calculated.
Sub-step S1053: calculate in turn the weighted Euclidean distances between the acquired key point and the other key points in the first region.
In this embodiment, the key points in the first region are traversed in turn, and the weighted Euclidean distance between the acquired key point and each traversed key point is calculated. For example, if the key points in the first region are the left eye, left shoulder, left elbow, left wrist, left hip, left knee and left ankle, and the acquired key point is the left elbow, then the weighted Euclidean distances between the left elbow and the left eye, left shoulder, left wrist, left hip, left knee and left ankle are calculated in turn.
Sub-step S1054: calculate in turn the weighted Euclidean distances between the acquired key point and the other key points in the second region.
In this embodiment, the key points in the second region are traversed in turn, and the weighted Euclidean distance between the acquired key point and each traversed key point is calculated. For example, if the key points in the second region are the right eye, right shoulder, right elbow, right wrist, right hip, right knee and right ankle, and the acquired key point is the right elbow, then the weighted Euclidean distances between the right elbow and the right eye, right shoulder, right wrist, right hip, right knee and right ankle are calculated in turn.
Sub-step S1055: evaluate the key point at the minimum weighted Euclidean distance as having an affinity with the acquired key point. For example, if, among the weighted Euclidean distances from the left elbow to the other key points in the first region, the distance to the left wrist is smallest, then the left wrist and the left elbow are evaluated as having an affinity.
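The evaluation in sub-steps S1051 to S1055 can be sketched as a nearest-neighbour search within one region. The per-axis weights of the weighted Euclidean distance (the source's "weighted Euler distance") are not specified by the patent, so unit weights are assumed here, and the function names are illustrative.

```python
import math

def weighted_distance(p, q, weights=(1.0, 1.0)):
    """Weighted Euclidean distance between two 2-D points; the per-axis
    weights are hypothetical placeholders."""
    return math.sqrt(weights[0] * (p[0] - q[0]) ** 2
                     + weights[1] * (p[1] - q[1]) ** 2)

def closest_affinity(name, region_points):
    """region_points: dict mapping key point name -> (x, y), all lying
    in the SAME region of the feature map. Returns the other key point
    at minimum weighted distance: the one evaluated as having an
    affinity with `name`."""
    p = region_points[name]
    others = {k: q for k, q in region_points.items() if k != name}
    return min(others, key=lambda k: weighted_distance(p, others[k]))
```

Restricting the search to one region halves the candidate set per query, which is the traversal saving the patent describes.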
Step S106: connect each pair of key points evaluated as having an affinity, generating a posture skeleton line with which the posture of the human body is detected.
In this embodiment, every key point is expanded into a key point region: a disk of preset radius R centered on the key point. The two key point regions corresponding to a pair of closely related key points are then taken as the fitting targets of the posture skeleton line, and fitting a connection between them generates the skeleton line. Expanding each fitting target from a point to a region increases the feasibility of the straight-line fit and shortens the time needed to complete it.
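A minimal sketch of this region-based straight-line fit: each key point is expanded to a disk of radius R, points from both disks are pooled, and a least-squares line is fitted through them. The uniform disk sampling and the non-vertical-limb assumption baked into `np.polyfit` are illustrative choices, not details from the patent.

```python
import numpy as np

def disk_samples(center, radius, n=60, seed=0):
    """Sample n points uniformly inside the expanded key point region,
    a disk of radius R around the key point (hypothetical stand-in for
    the region's pixels)."""
    rng = np.random.default_rng(seed)
    r = radius * np.sqrt(rng.random(n))
    theta = 2.0 * np.pi * rng.random(n)
    return np.column_stack([center[0] + r * np.cos(theta),
                            center[1] + r * np.sin(theta)])

def fit_skeleton_segment(p, q, radius):
    """Least-squares fit a straight line through the two expanded key
    point regions (assumes the limb is not vertical in image x) and
    return the segment endpoints at the key points' x-coordinates."""
    pts = np.vstack([disk_samples(p, radius, seed=1),
                     disk_samples(q, radius, seed=2)])
    slope, intercept = np.polyfit(pts[:, 0], pts[:, 1], 1)
    xs = np.array([p[0], q[0]], dtype=float)
    return np.column_stack([xs, slope * xs + intercept])
```

Fitting against two regions instead of two exact points tolerates small localization errors in either key point, which is the feasibility gain the patent describes.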
Step S107: map the posture skeleton line back into the RGB image data using a preset transformation relation matrix, obtaining the posture of the human body in the RGB image.
In this embodiment, because the posture skeleton line and the RGB image data are not in the same coordinate system, the skeleton line must be mapped onto the portrait coordinates where the human body appears in the RGB image data, so that the skeleton line and the body are fused for display. Specifically, the human body key points and the posture skeleton lines are mapped into the RGB image data through a preset transformation relation matrix between feature-map coordinates and RGB-image coordinates and then displayed, fitting the key points and skeleton lines to the human body image.
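Mapping the skeleton line back is a coordinate change by the preset transformation relation matrix. Representing that matrix as a 3x3 homogeneous transform is an assumption for illustration; the patent does not give its exact form.

```python
import numpy as np

def map_back_to_rgb(points_fmap, T):
    """Map feature-map coordinates back into RGB-image coordinates with
    a preset 3x3 homogeneous transformation relation matrix T (assumed
    here; in practice it would invert the feature-map mapping)."""
    pts = np.asarray(points_fmap, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ T.T
    return mapped[:, :2] / mapped[:, 2:3]   # de-homogenize
```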
In this embodiment, each frame of RGB image data is acquired by the camera in real time and processed to generate and display RGB image data annotated with the corresponding key points and posture skeleton lines. From the annotated frames, the motion trend of the posture skeleton line is obtained, which in turn allows the postures in subsequent RGB image data to be predicted.
Second embodiment
Referring to fig. 7, fig. 7 is a functional module schematic diagram of a two-dimensional posture detection apparatus 200 according to an embodiment of the present invention. The two-dimensional posture detection apparatus 200 includes: an acquisition module 201, a processing module 202, a key point extraction module 203, a first mapping module 204, an evaluation module 205, a generation module 206 and a second mapping module 207.
The acquisition module 201 is configured to acquire a frame of RGB image data including a human body at a preset time interval.
In this embodiment of the present invention, the step S101 may be executed by the obtaining module 201.
The processing module 202 is configured to normalize the RGB image data to generate the two-dimensional human body image data.
In the embodiment of the present invention, the step S102 may be executed by the processing module 202.
The key point extracting module 203 is configured to obtain a plurality of key points and position information of each key point from the two-dimensional human body image data.
In this embodiment of the present invention, the step S103 may be performed by the key point extracting module 203. As shown in fig. 8, the keypoint extraction module 203 comprises the following sub-modules:
the extracting sub-module 2031 is configured to extract human feature points from the two-dimensional human image data.
In the embodiment of the present invention, the sub-step S1031 may be performed by the extracting sub-module 2031.
The first obtaining submodule 2032 is configured to obtain, from each human body feature point and using a convolutional neural network, a feature output function and the key point corresponding to that feature point.
In this embodiment of the present invention, the sub-step S1032 may be executed by the first obtaining sub-module 2032.
The comparison sub-module 2033 is configured to compare the feature output function with the preset ideal output of the corresponding key point.
In an embodiment of the present invention, the sub-step S1033 may be performed by the comparison sub-module 2033.
The first obtaining sub-module 2032 is further configured to obtain, when a difference between the feature output function and the ideal output of the key point is within a preset error range, position information of the human body feature point as position information of the key point.
In this embodiment of the present invention, the sub-step S1034 may be performed by the first obtaining sub-module 2032.
The first mapping module 204 is configured to map the plurality of key points to a pre-created feature map according to a preset rule.
In the embodiment of the present invention, step S104, sub-step S1041 and sub-step S1042 can be performed by the first mapping module 204. The first mapping module 204 maps the plurality of key points into the pre-created feature map by: mapping the obtained key points on the central axis of the human body to the center line of the feature map; and mapping each obtained key point lying on one side of the central axis of the human body in the two-dimensional human body image data to a first region on one side of the center line of the feature map, and each key point lying on the other side of the central axis to a second region on the other side of the center line.
The evaluation module 205 is configured to evaluate the affinity between any two of the key points mapped into the feature map.
In the embodiment of the present invention, the step S105 may be performed by the evaluation module 205. As shown in FIG. 9, the evaluation module 205 includes the following sub-modules:
and a second obtaining sub-module 2051, configured to obtain each of the key points in turn.
In this embodiment of the present invention, the sub-step S1051 may be performed by the second obtaining sub-module 2051.
The judgment sub-module 2052 is configured to determine the region to which the acquired key point belongs.
In the embodiment of the present invention, the sub-step S1052 may be executed by the determining sub-module 2052.
The calculation sub-module 2053 is configured to, when the acquired key point belongs to the first region, calculate in turn the weighted Euclidean distances between the acquired key point and the other key points in the first region.
In the embodiment of the present invention, the sub-step S1053 may be performed by the calculation sub-module 2053.
The calculation sub-module 2053 is further configured to, when the acquired key point belongs to the second region, calculate in turn the weighted Euclidean distances between the acquired key point and the other key points in the second region.
In the embodiment of the present invention, the sub-step S1054 may be performed by the calculation sub-module 2053.
The evaluation sub-module 2054 is configured to evaluate the key point at the minimum weighted Euclidean distance as having an affinity with the acquired key point.
In an embodiment of the present invention, the sub-step S1055 may be performed by the evaluation sub-module 2054.
A generating module 206, configured to connect the two key points evaluated as having an affinity to generate a posture skeleton line, so as to detect the posture of the human body.
In this embodiment of the present invention, the step S106 may be performed by the generating module 206.
A second mapping module 207, configured to map the posture skeleton line back to the RGB image data by using a preset transformation relation matrix, so as to obtain the posture of the human body in the RGB image.
In this embodiment of the present invention, the step S107 may be performed by the second mapping module 207.
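If the normalization that produced the two-dimensional human body image data is represented by a 3×3 homogeneous transformation matrix, the second mapping module 207 can map the skeleton line back by applying the inverse of that matrix. This is a sketch under that assumption; the patent's concrete transformation relation matrix is not given in this excerpt.

```python
import numpy as np

def map_skeleton_back(points, transform):
    """Map skeleton-line points from normalized image coordinates back to the
    original RGB image, given the 3x3 homogeneous matrix assumed to have
    performed the normalization."""
    pts = np.hstack([np.asarray(points, dtype=float),
                     np.ones((len(points), 1))])      # to homogeneous coords
    mapped = pts @ np.linalg.inv(transform).T         # apply inverse transform
    return mapped[:, :2] / mapped[:, 2:3]             # back to Cartesian
```

For a pure scaling-and-translation normalization the inverse exists trivially; a projective transform would be handled the same way by the homogeneous division.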
In summary, the present invention provides a two-dimensional posture detection method and apparatus. The method comprises: obtaining a plurality of key points and the position information of each key point from two-dimensional human body image data; mapping the plurality of key points to a pre-created feature map according to a preset rule; evaluating an affinity between any two of the key points mapped into the feature map; and connecting the two key points evaluated as having an affinity to generate a posture skeleton line, so as to detect the posture of the human body. Because the processing operates on two-dimensional images, it is fast: accuracy is maintained while the amount of computation is reduced, giving good real-time performance. The posture skeleton line can be generated quickly without the depth information of the images, which lowers the requirements on hardware and makes the method easy to popularize.
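For the key-point extraction summarized above (and recited in claim 1 below), the acceptance test reduces to checking whether the network's feature output function lies within a preset error range of the key point's ideal output. A minimal sketch; the tolerance value and the vector representation of the outputs are assumed placeholders.

```python
def accept_keypoint(feature_output, ideal_output, tol=0.05):
    # Accept the human body feature point as a key point when every component
    # of the feature output function deviates from the corresponding ideal
    # output by no more than the preset error range (tol is assumed).
    return all(abs(f - i) <= tol
               for f, i in zip(feature_output, ideal_output))
```

Feature points that fail this check are discarded, and only the positions of accepted feature points are taken as key-point positions.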
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A two-dimensional posture detection method, the method comprising:
extracting human body feature points from two-dimensional human body image data;
acquiring, by using a convolutional neural network and according to the human body feature points, a feature output function and key points corresponding to the human body feature points;
comparing the feature output function with a preset ideal output corresponding to the key points, wherein the ideal output of a key point is obtained according to the key point and an ideal output matrix; the ideal output matrix is established in advance according to the convolutional neural network and the human body feature points that accurately represent the key points, and a key point is input into the ideal output matrix to obtain the corresponding ideal output of that key point; the ideal output of a key point is the feature output function obtained after human body feature points that accurately represent the key point are input into the convolutional neural network;
when the difference between the feature output function and the ideal output of the key point is within a preset error range, acquiring the position information of the human body feature point as the position information of the key point;
mapping the plurality of key points to a pre-created feature map according to a preset rule;
evaluating an affinity between any two of the key points mapped into the feature map; and
connecting the two key points evaluated as having an affinity to generate a posture skeleton line, so as to detect the posture of the human body.
2. The two-dimensional posture detection method according to claim 1, wherein the method further comprises the steps of:
acquiring a frame of RGB image data comprising a human body according to a preset time interval;
and carrying out normalization processing on the RGB image data to generate the two-dimensional human body image data.
3. The two-dimensional posture detection method according to claim 2, wherein the method further comprises the step of:
mapping the posture skeleton line back to the RGB image data by using a preset transformation relation matrix, so as to obtain the posture of the human body in the RGB image.
4. The two-dimensional posture detection method according to claim 1, wherein the step of mapping the key points into the pre-created feature map according to the preset rule comprises:
mapping the obtained key points located on the central axis of the human body to the center line in the feature map; and
mapping the obtained key points located on one side of the central axis of the human body in the two-dimensional human body image data to a first region on one side of the center line of the feature map, and mapping the key points located on the other side of the central axis to a second region on the other side of the center line of the feature map.
5. The two-dimensional posture detection method according to claim 4, wherein the step of evaluating the affinity between any two of the key points mapped into the feature map comprises:
sequentially acquiring each key point;
determining the region to which the acquired key point belongs;
when the acquired key point belongs to the first region, sequentially calculating weighted Euclidean distances between the acquired key point and the other key points in the first region;
when the acquired key point belongs to the second region, sequentially calculating weighted Euclidean distances between the acquired key point and the other key points in the second region; and
evaluating the key point corresponding to the minimum weighted Euclidean distance as having an affinity with the acquired key point.
6. A two-dimensional posture detection apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire a frame of RGB image data comprising a human body at a preset time interval;
a processing module, configured to perform normalization processing on the RGB image data to generate two-dimensional human body image data;
a key point extraction module, configured to obtain a plurality of key points and the position information of each key point from the two-dimensional human body image data, wherein the key point extraction module comprises:
an extraction sub-module, configured to extract human body feature points from the two-dimensional human body image data;
a first obtaining sub-module, configured to acquire, by using a convolutional neural network and according to the human body feature points, a feature output function and key points corresponding to the human body feature points;
a comparison sub-module, configured to compare the feature output function with a preset ideal output corresponding to the key points, wherein the ideal output of a key point is obtained according to the key point and an ideal output matrix; the ideal output matrix is established in advance according to the convolutional neural network and the human body feature points that accurately represent the key points, and a key point is input into the ideal output matrix to obtain the corresponding ideal output of that key point; the ideal output of a key point is the feature output function obtained after human body feature points that accurately represent the key point are input into the convolutional neural network;
the first obtaining sub-module being further configured to acquire the position information of the human body feature point as the position information of the key point when the difference between the feature output function and the ideal output of the key point is within a preset error range;
a first mapping module, configured to map the plurality of key points to a pre-created feature map according to a preset rule;
an evaluation module, configured to evaluate an affinity between any two of the key points mapped into the feature map;
a generating module, configured to connect the two key points evaluated as having an affinity to generate a posture skeleton line, so as to detect the posture of the human body; and
a second mapping module, configured to map the posture skeleton line back to the RGB image data by using a preset transformation relation matrix, so as to obtain the posture of the human body in the RGB image.
7. The two-dimensional posture detection apparatus according to claim 6, wherein the first mapping module maps the plurality of key points into the pre-created feature map by: mapping the obtained key points located on the central axis of the human body to the center line in the feature map; and mapping the obtained key points located on one side of the central axis of the human body in the two-dimensional human body image data to a first region on one side of the center line of the feature map, and mapping the key points located on the other side of the central axis to a second region on the other side of the center line of the feature map.
8. The two-dimensional posture detection apparatus according to claim 7, wherein the evaluation module comprises:
a second obtaining sub-module, configured to sequentially acquire each key point;
a judgment sub-module, configured to determine the region to which the acquired key point belongs;
a calculation sub-module, configured to sequentially calculate weighted Euclidean distances between the acquired key point and the other key points in the first region when the acquired key point belongs to the first region;
the calculation sub-module being further configured to sequentially calculate weighted Euclidean distances between the acquired key point and the other key points in the second region when the acquired key point belongs to the second region; and
an evaluation sub-module, configured to evaluate the key point corresponding to the minimum weighted Euclidean distance as having an affinity with the acquired key point.
CN201710391398.XA 2017-05-27 2017-05-27 Two-dimensional posture detection method and device Active CN107194361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710391398.XA CN107194361B (en) 2017-05-27 2017-05-27 Two-dimensional posture detection method and device


Publications (2)

Publication Number Publication Date
CN107194361A CN107194361A (en) 2017-09-22
CN107194361B true CN107194361B (en) 2021-04-02

Family

ID=59875614

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710391398.XA Active CN107194361B (en) 2017-05-27 2017-05-27 Two-dimensional posture detection method and device

Country Status (1)

Country Link
CN (1) CN107194361B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108229305B (en) * 2017-11-21 2021-06-04 北京市商汤科技开发有限公司 Method and device for determining outer frame of target object and electronic equipment
CN108597578B (en) * 2018-04-27 2021-11-05 广东省智能制造研究所 Human motion assessment method based on two-dimensional skeleton sequence
CN110866417A (en) * 2018-08-27 2020-03-06 阿里巴巴集团控股有限公司 Image processing method and device and electronic equipment
CN109614867A (en) * 2018-11-09 2019-04-12 北京市商汤科技开发有限公司 Human body critical point detection method and apparatus, electronic equipment, computer storage medium
CN111222379A (en) * 2018-11-27 2020-06-02 株式会社日立制作所 Hand detection method and device
CN109583363B (en) * 2018-11-27 2022-02-11 湖南视觉伟业智能科技有限公司 Method and system for detecting and improving posture and body movement of lecturer based on human body key points
CN111797791A (en) * 2018-12-25 2020-10-20 上海智臻智能网络科技股份有限公司 Human body posture recognition method and device
CN109902659B (en) * 2019-03-15 2021-08-20 北京字节跳动网络技术有限公司 Method and apparatus for processing human body image
CN110349240A (en) * 2019-06-26 2019-10-18 华中科技大学 It is a kind of it is unsupervised under based on posture conversion pedestrian's picture synthetic method and system
CN111103981A (en) * 2019-12-20 2020-05-05 北京奇艺世纪科技有限公司 Control instruction generation method and device
CN112906512A (en) * 2021-02-03 2021-06-04 北京海迩西医疗科技有限公司 Method, device and storage medium for determining human body joint
CN113379144B (en) * 2021-06-24 2021-11-30 深圳开思信息技术有限公司 Store purchase order generation method and system for online automobile distribution purchase platform

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103679175A (en) * 2013-12-13 2014-03-26 电子科技大学 Fast 3D skeleton model detecting method based on depth camera
CN105069413A (en) * 2015-07-27 2015-11-18 电子科技大学 Human body gesture identification method based on depth convolution neural network
CN106295533A (en) * 2016-08-01 2017-01-04 厦门美图之家科技有限公司 Optimization method, device and the camera terminal of a kind of image of autodyning

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0449516B1 (en) * 1990-03-23 1997-01-08 Canon Kabushiki Kaisha Image processing using neural network
JP2007088557A (en) * 2005-09-20 2007-04-05 Fuji Xerox Co Ltd Method of calculating inter-color crosstalk correction formula, and image reading apparatus
US8929600B2 (en) * 2012-12-19 2015-01-06 Microsoft Corporation Action recognition based on depth maps
CN105512684B (en) * 2015-12-09 2018-08-28 江苏航天大为科技股份有限公司 Logo automatic identifying method based on principal component analysis convolutional neural networks


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
《一种提取物体线性骨架的新方法》;刘俊涛等;《自动化学报》;20080630;第34卷(第6期);正文第1.4部分 *
《基于视频流的异常人体行为分析》;胡永亮;《中国优秀硕士学位论文全文数据库信息科技辑》;20150715(第2015年第07期);正文第4部分 *

Also Published As

Publication number Publication date
CN107194361A (en) 2017-09-22

Similar Documents

Publication Publication Date Title
CN107194361B (en) Two-dimensional posture detection method and device
WO2018028546A1 (en) Key point positioning method, terminal, and computer storage medium
CN109508688B (en) Skeleton-based behavior detection method, terminal equipment and computer storage medium
US11610394B2 (en) Neural network model training method and apparatus, living body detecting method and apparatus, device and storage medium
CN107784282B (en) Object attribute identification method, device and system
CN106203242B (en) Similar image identification method and equipment
Tsai et al. Implementation of fall detection system based on 3D skeleton for deep learning technique
CN109376631B (en) Loop detection method and device based on neural network
JP6815707B2 (en) Face posture detection method, device and storage medium
Gangrade et al. Vision-based hand gesture recognition for Indian sign language using convolution neural network
Luo et al. Real-time head pose estimation and face modeling from a depth image
Min et al. Support vector machine approach to fall recognition based on simplified expression of human skeleton action and fast detection of start key frame using torso angle
CN113177468A (en) Human behavior detection method and device, electronic equipment and storage medium
CN112633084A (en) Face frame determination method and device, terminal equipment and storage medium
CN111459269A (en) Augmented reality display method, system and computer readable storage medium
CN110503643A (en) A kind of object detection method and device based on the retrieval of multiple dimensioned rapid scene
CN111444850A (en) Picture detection method and related device
CN111738199A (en) Image information verification method, image information verification device, image information verification computing device and medium
CN108875901B (en) Neural network training method and universal object detection method, device and system
Xi et al. Finger vein recognition based on the hyperinformation feature
CN111695405B (en) Dog face feature point detection method, device and system and storage medium
CN110929583A (en) High-detection-precision face recognition method
CN112329663A (en) Micro-expression time detection method and device based on face image sequence
CN105574535B (en) Graphic symbol recognition method based on indirect distance angle histogram space relational model
CN115273177A (en) Method, device and equipment for recognizing face types of heterogeneous faces and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Two-dimensional pose detection method and device

Effective date of registration: 20230116

Granted publication date: 20210402

Pledgee: Industrial Bank Limited by Share Ltd. Chengdu branch

Pledgor: CHENGDU TOPPLUSVISION TECHNOLOGY Co.,Ltd.

Registration number: Y2023510000027