CN112464791B - Gesture recognition method, device, equipment and storage medium based on two-dimensional camera - Google Patents

Gesture recognition method, device, equipment and storage medium based on two-dimensional camera Download PDF

Info

Publication number
CN112464791B
Authority
CN
China
Prior art keywords
human body
information
dimensional camera
node
extracting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011339565.4A
Other languages
Chinese (zh)
Other versions
CN112464791A (en)
Inventor
颜泽龙
王健宗
吴天博
程宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011339565.4A priority Critical patent/CN112464791B/en
Publication of CN112464791A publication Critical patent/CN112464791A/en
Priority to PCT/CN2021/084543 priority patent/WO2021208740A1/en
Application granted granted Critical
Publication of CN112464791B publication Critical patent/CN112464791B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects

Abstract

The application relates to the field of artificial intelligence and provides a gesture recognition method, device, equipment and storage medium based on a two-dimensional camera. A human body picture of a human body to be recognized is obtained; body contour information, posture information and gender information of the human body to be identified are extracted according to the human body picture; an SMPL model is generated according to the body contour information, the posture information and the gender information; 3D node characteristics in the SMPL model are acquired; 2D node characteristics of the human body to be identified in the human body picture are extracted; an error function is generated according to the 3D node characteristics and the 2D node characteristics; the SMPL model is adjusted according to the error function, and target 3D joint characteristics of the adjusted SMPL model are extracted; skeleton information is acquired according to the target 3D joint characteristics; and gesture analysis is carried out according to the skeleton information. The gesture recognition method, device, equipment and storage medium based on the two-dimensional camera can be applied to the field of blockchain and can accurately recognize the gesture of the human body to be recognized.

Description

Gesture recognition method, device, equipment and storage medium based on two-dimensional camera
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a gesture recognition method, device, equipment and storage medium based on a two-dimensional camera.
Background
Because three-dimensional human body gesture recognition technology has wide application scenarios and value, it has attracted more and more attention in recent years. During gesture analysis, human limbs can be irregularly interwoven, and multiple postures may occlude one another, especially when several people are photographed together. Existing human body gesture analysis mainly locates the position of the human body from a two-dimensional or three-dimensional image and extracts a skeleton for gesture recognition. With the progress of technology, particularly the development of computer technology, human three-dimensional gesture recognition has been widely applied in many fields, tightly linking real objects with mathematical models and virtual objects with reality. Estimating the complete three-dimensional human shape and pose (motion) from images or video has challenged the computer field for decades.
Currently, the most widely used method of human body pose estimation is marker tracking, which requires multiple calibrated cameras and markers carefully attached to the subject's body. This technique achieves high accuracy but is costly. In view of cost, marker-free capture methods based on multi-view two-dimensional cameras for three-dimensional human reconstruction, or on depth cameras, have been developed over the last 20 years. However, these methods still need additional equipment to estimate the three-dimensional posture of the human body, so the technology cannot be popularized in many application scenarios. In addition, gesture analysis based on a three-dimensional human model depends on deep learning, and the existing amount of training data for three-dimensional human motion is very scarce, which means that the accuracy of gesture analysis using deep learning directly on a three-dimensional human model is very low.
Disclosure of Invention
The application mainly aims to provide a gesture recognition method, device, equipment and storage medium based on a two-dimensional camera, so as to solve the technical problem of low gesture recognition accuracy in the prior art.
In order to achieve the above object, the present application provides a gesture recognition method based on a two-dimensional camera, comprising the following steps:
acquiring a human body picture of a human body to be identified through a two-dimensional camera;
extracting body contour information, posture information and sex information of the human body to be identified according to the human body picture;
generating an SMPL model according to the body contour information, the posture information and the gender information;
acquiring 3D node characteristics in the SMPL model;
extracting 2D node characteristics of the human body to be identified in the human body picture;
generating an error function according to the 3D node characteristics and the 2D node characteristics;
adjusting the SMPL model according to the error function, and extracting target 3D joint characteristics of the adjusted SMPL model;
acquiring skeleton information according to the characteristics of the target 3D joint;
and carrying out gesture analysis according to the skeleton information.
Further, the step of extracting body contour information, posture information and sex information of the human body to be identified according to the human body picture includes:
performing mask processing on the human body picture to obtain a mask of the human body to be identified;
extracting potential edge points of the mask by adopting a Sobel operator;
connecting each potential edge point by using one potential edge point as a starting point and adopting an edge tracking algorithm to obtain a closed curve connected end to end as the body contour information of the human body to be identified;
inputting the human body picture into a preset gesture recognition model to extract the gesture information of the human body to be recognized;
inputting the human body picture into a pre-trained gender classifier to extract the gender information.
Further, the step of generating an error function from the 3D node feature and the 2D node feature comprises:
acquiring the parameters of the two-dimensional camera;
inputting the parameters of the two-dimensional camera, the 3D node characteristics and the 2D node characteristics into a preset error formula to generate the error function, wherein the error function is E_all = E_j(a, b; k, J) + E_1(b) + E_2(b; a) + E_3(a), wherein a is the body contour information, b is the pose information, k is the two-dimensional camera parameters, and J is the 2D node feature.
Further, the step of adjusting the SMPL model according to the error function and extracting the target 3D joint characteristics of the adjusted SMPL model includes:
optimizing the error function through a Powell dogleg algorithm to obtain an optimal solution of the error function;
and adjusting the SMPL model according to the optimal solution, and extracting the target 3D joint characteristics of the adjusted SMPL model.
Further, the step of acquiring the two-dimensional camera parameters includes:
acquiring EXIF information of the human body picture;
determining a focal length of the two-dimensional camera according to the EXIF information;
and determining the shooting depth of the two-dimensional camera through the principle of similar triangles.
Further, the step of extracting the 2D node characteristics of the human body to be identified in the human body picture includes:
extracting the 2D node characteristics through a pre-trained node characteristic extraction model; the node characteristic extraction model is trained based on a fully-connected convolutional neural network model.
The application also provides a gesture recognition device based on the two-dimensional camera, which comprises:
a first acquisition unit for acquiring a human body picture of a human body to be identified through a two-dimensional camera;
a first extracting unit for extracting body contour information, posture information and sex information of the human body to be identified according to the human body picture;
a generation unit for generating an SMPL model from the body profile information, the pose information and the gender information;
a second obtaining unit, configured to obtain a 3D node characteristic in the SMPL model;
the second extraction unit is used for extracting the 2D node characteristics of the human body to be identified in the human body picture;
a generating unit, configured to generate an error function according to the 3D node characteristic and the 2D node characteristic;
the adjusting unit is used for adjusting the SMPL model according to the error function and extracting target 3D joint characteristics of the adjusted SMPL model;
the third acquisition unit is used for acquiring skeleton information according to the characteristics of the target 3D joint;
and the analysis unit is used for carrying out gesture analysis according to the skeleton information.
Further, the first extraction unit includes:
a mask subunit, configured to perform mask processing on the human body picture to obtain a mask of the human body to be identified;
a first extraction subunit, configured to extract potential edge points of the mask by using a Sobel operator;
the connection subunit is used for connecting each potential edge point by adopting an edge tracking algorithm by taking one of the potential edge points as a starting point to obtain a closed curve connected end to end as the body contour information of the human body to be identified;
the second extraction subunit is used for inputting the human body picture into a preset gesture recognition model to extract the gesture information of the human body to be recognized;
and the third extraction subunit is used for inputting the human body picture into a pre-trained sex classifier to extract the sex information.
The application also provides a computer device comprising a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps of the gesture recognition method based on the two-dimensional camera when executing the computer program.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the two-dimensional camera-based gesture recognition method of any one of the above.
According to the gesture recognition method, device, equipment and storage medium based on the two-dimensional camera, the human body picture is acquired through a single two-dimensional camera, so human body pictures from multiple angles are not needed; the 3D node features are combined with the 2D node features of the human body picture to generate an error function, the SMPL model is adjusted according to the error function, the target 3D joint features of the adjusted SMPL model are extracted, and the skeleton information is acquired according to the target 3D joint features; the obtained three-dimensional skeleton information of the human body can well handle the gesture-analysis difficulties caused by occlusion by foreign objects and crossing of body parts, so the gesture analysis result is more accurate.
Drawings
FIG. 1 is a schematic diagram of steps of a gesture recognition method based on a two-dimensional camera according to an embodiment of the present application;
FIG. 2 is a block diagram of a gesture recognition apparatus based on a two-dimensional camera according to an embodiment of the present application;
fig. 3 is a schematic block diagram of a computer device according to an embodiment of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
Referring to fig. 1, an embodiment of the present application provides a gesture recognition method based on a two-dimensional camera, including the steps of:
step S1, acquiring a human body picture of a human body to be identified through a two-dimensional camera;
step S2, extracting body contour information, posture information and sex information of the human body to be identified according to the human body picture;
step S3, generating an SMPL model according to the body contour information, the posture information and the gender information;
step S4, obtaining 3D node characteristics in the SMPL model;
s5, extracting 2D node characteristics of the human body to be identified in the human body picture;
s6, generating an error function according to the 3D node characteristics and the 2D node characteristics;
step S7, adjusting the SMPL model according to the error function, and extracting target 3D joint characteristics of the adjusted SMPL model;
s8, acquiring skeleton information according to the characteristics of the target 3D joint;
and S9, carrying out gesture analysis according to the skeleton information.
In this embodiment, as described in the above steps S1-S2, a single two-dimensional camera is used to obtain the human body picture; the camera can be fixedly placed at a location to take the picture. Body contour information, posture information and gender information in the human body picture are then extracted. The posture information includes standing, running, jumping, squatting, doing the splits, and the like, and the body contour is the bounding box of the human body to be identified in the human body picture. Specifically, the Mask R-CNN segmentation algorithm may be used to segment the human body to be identified at the pixel level to extract the body contour information; in another embodiment, an edge extraction operator such as the Canny operator is used to extract the body contour information; in yet another embodiment, a flood-fill algorithm is used to fill the small holes produced during edge detection, and a morphological processing algorithm is used to connect discontinuous contour boundaries, so as to extract the body contour information, as sketched below.
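For illustration only, the following Python/OpenCV sketch shows the flood-fill hole filling and morphological clean-up just mentioned. It assumes a binary person mask as input (for example from a Mask R-CNN style segmenter), that the image corner belongs to the background, and that the kernel size is illustrative; none of these values are prescribed by the application.

```python
# Minimal sketch (not the application's reference implementation) of filling
# small holes with a flood fill and connecting broken contour boundaries with
# morphological closing. Input: a binary person mask; parameters illustrative.
import cv2
import numpy as np

def clean_body_mask(person_mask: np.ndarray) -> np.ndarray:
    mask = (person_mask > 0).astype(np.uint8) * 255
    h, w = mask.shape

    # Flood fill from the (assumed background) corner, then invert, to find
    # interior holes left by edge detection / segmentation.
    flood = mask.copy()
    ff_mask = np.zeros((h + 2, w + 2), np.uint8)
    cv2.floodFill(flood, ff_mask, (0, 0), 255)
    holes = cv2.bitwise_not(flood)            # pixels unreachable from the corner
    filled = cv2.bitwise_or(mask, holes)      # mask with small cavities filled

    # Morphological closing to connect discontinuous contour boundaries.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    return cv2.morphologyEx(filled, cv2.MORPH_CLOSE, kernel)

# Usage sketch: the external contours of the cleaned mask approximate the
# body contour information.
# contours, _ = cv2.findContours(clean_body_mask(mask),
#                                cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
```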
As described in the above steps S3-S4, the SMPL (Skinned Multi-Person Linear) model is a three-dimensional human body model, and the joints of the three-dimensional human body are the 3D node features; the SMPL model may be expressed in the form M(a, b, k), where a is the body contour information, b is the posture information, and k is the two-dimensional camera parameter.
As described in the above steps S5-S6, the 2D node features, that is, the feature points of the body parts of the human body to be identified, such as wrists, elbows, shoulders, ankles and knees, are extracted. Because the SMPL model is generated from the body contour information, posture information and gender information, which are themselves extracted from the human body picture, a certain error exists between the generated 3D node features and the 2D node features; an error function can therefore be generated according to the 3D node features and the 2D node features, and the correct 3D joint features can be determined according to the error function.
As described in step S7, the SMPL model is adjusted through the error function, so that the difference between the 3D node features and the corresponding 2D node features is reduced and the generated three-dimensional human model becomes more accurate.
As described in the above steps S8-S9, the target 3D joint features are close to the real joint features, so the skeleton information of the human body to be identified can be accurately obtained according to the target 3D joint features, and the posture of the human body to be identified can be accurately analysed according to the skeleton information, determining which posture the human body to be identified is in, such as standing, running or jumping.
In this embodiment, the human body picture is obtained with only a single two-dimensional camera and human body pictures from multiple angles are not needed; by combining the 3D node features with the 2D node features of the human body picture, the obtained three-dimensional skeleton information of the human body can well handle the gesture-analysis difficulties caused by occlusion by foreign objects and crossing of body parts.
In an embodiment, the step S2 of extracting body contour information, posture information and sex information of the human body to be identified according to the human body picture includes:
step S21, performing mask processing on the human body picture to obtain a mask of the human body to be identified;
step S22, extracting potential edge points of the mask by adopting a Sobel operator;
step S23, using one of the potential edge points as a starting point, and adopting an edge tracking algorithm to connect each potential edge point to obtain a closed curve connected end to end as the body contour information of the human body to be identified;
step S24, inputting the human body picture into a preset gesture recognition model to extract the gesture information of the human body to be recognized;
step S25, inputting the human body picture into a pre-trained gender sorter to extract the gender information.
In this embodiment, as described in the above steps S21-S22, the connected region with the largest area may be found using a maximum-connectivity algorithm based on four-neighbourhoods, so as to obtain the mask of the human body to be identified. The Sobel operator, a discrete differential operator used to compute an approximation of the gradient of the image brightness function, is then adopted to extract the potential edge points of the mask. Because it introduces an operation similar to local averaging, it has a smoothing effect on noise and can largely eliminate its influence, so extracting the potential edge points with the Sobel operator allows them to be located more accurately.
As described in step S23 above, one potential edge point is selected from the extracted potential edge points as a starting point, and, starting from the selected point, the edge points are connected under the eight-neighbourhood condition to obtain an end-to-end closed curve.
As described in the above steps S24-S25, the preset gesture recognition model can recognise the gesture of the human body to be recognised in the two-dimensional human body picture, and the sex classifier can classify the human body to be recognised as male or female.
In this embodiment, the body contour information of the human body to be identified can be accurately extracted by using the mask and the edge tracking algorithm. Meanwhile, the gesture recognition model and the gender classifier are preset, and the gesture information and the gender information can be accurately extracted.
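As a hedged sketch of steps S21-S23, the following Python/OpenCV code keeps the largest 4-connected region as the mask, marks potential edge points with the Sobel operator, and uses OpenCV's border-following contour tracer as a stand-in for the eight-neighbourhood edge-tracking algorithm; the stand-in and the parameter values are assumptions, not the application's prescribed implementation.

```python
# Sketch of steps S21-S23 under stated assumptions. Assumes the input binary
# image contains at least one foreground (person) region.
import cv2
import numpy as np

def body_contour(binary_person: np.ndarray) -> np.ndarray:
    # S21: keep the largest 4-connected region as the mask of the person.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(
        (binary_person > 0).astype(np.uint8), connectivity=4)
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])   # label 0 is background
    mask = np.uint8(labels == largest) * 255

    # S22: Sobel gradient magnitude marks the potential edge points.
    gx = cv2.Sobel(mask, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(mask, cv2.CV_32F, 0, 1, ksize=3)
    edges = (cv2.magnitude(gx, gy) > 0).astype(np.uint8) * 255

    # S23: trace the edge points into a closed curve; the largest external
    # contour stands in for the end-to-end closed body contour.
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    return max(contours, key=cv2.contourArea)
```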
In an embodiment, the step S6 of generating an error function according to the 3D node characteristic and the 2D node characteristic includes:
step S61, acquiring the parameters of the two-dimensional camera;
step S62, inputting the two-dimensional camera parameters, the 3D node characteristics and the 2D node characteristics into a preset error formula to generate the error function, wherein the error function is E_all = E_j(a, b; k, J) + E_1(b) + E_2(b; a) + E_3(a), wherein a is the body contour information, b is the pose information, k is the two-dimensional camera parameters, and J is the 2D node feature.
In the present embodiment, as described above, E_j(a, b; k, J) = ΣE_J(k(R_b(J(a))) - J);
E_2(a, b) = λ_b E_b(b) + λ_a E_a(a); wherein R is a rotation equation.
In this embodiment, three energy functions E_1(b), E_2(b; a) and E_3(a) are introduced to compensate for the error of E_j, which avoids the problems that arise when the limbs of the human body to be identified are abnormally interlaced and the postures occlude one another.
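For illustration only, the following Python sketch shows the additive structure of the error function described above. The projection function and the three energy terms are hypothetical placeholders, since their concrete form is an assumption here rather than something fixed by the application.

```python
# Numerical sketch of how E_all could be assembled. project_fn stands in for
# the camera mapping k(.), and e1, e2, e3 for the energy functions E_1, E_2,
# E_3; all are placeholders supplied by the caller.
import numpy as np

def total_error(a, b, k, joints_2d, smpl_joints, project_fn, e1, e2, e3):
    """E_all = E_j(a, b; k, J) + E_1(b) + E_2(b; a) + E_3(a)."""
    joints_3d = smpl_joints(a, b)                            # 3D node features of the SMPL model
    projected = np.array([project_fn(p, k) for p in joints_3d])
    e_j = np.sum((projected - np.asarray(joints_2d)) ** 2)   # reprojection term E_j
    return e_j + e1(b) + e2(b, a) + e3(a)
```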
In an embodiment, the step S7 of adjusting the SMPL model according to the error function and extracting the target 3D joint feature of the adjusted SMPL model includes:
step S71, optimizing the error function through a Baweil dog leg algorithm to obtain an optimal solution of the error function;
and step S72, adjusting the SMPL model according to the optimal solution, and extracting the target 3D joint characteristics of the adjusted SMPL model.
In this embodiment, as described in steps S71-S72 above, the Powell dogleg algorithm can find the minimum point of a quadratic function in a finite number of steps. Specifically, each round of iteration has one starting point (the starting point of the first round is an arbitrarily chosen initial point) and n linearly independent search directions. One-dimensional searches are carried out sequentially along the n directions from the starting point to obtain the end point, and a new search direction is determined by the starting and end points. It is then determined whether one of the original directions needs to be replaced by the new search direction; if so, the worst vector in the original set is identified and replaced by the newly generated vector, so that conjugate directions are generated successively. The optimal solution of the error function can thus be found accurately and quickly through the Powell dogleg algorithm, the SMPL model is adjusted according to the optimal solution, the adjusted SMPL model can accurately simulate the three-dimensional figure of the human body to be identified, and the target 3D node characteristics of the adjusted SMPL model are extracted.
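The application names the Powell dogleg algorithm for this optimisation; as a readily available, hedged stand-in, the sketch below uses SciPy's "dogbox" solver, which is likewise a dogleg-style trust-region method. The residual packing, the 72-dimensional pose vector and the prior functions are illustrative assumptions, not values taken from the application.

```python
# Sketch of steps S71-S72 under stated assumptions: SciPy's dogleg-style
# 'dogbox' solver stands in for the Powell dogleg algorithm, and the residual
# vector stacks reprojection errors with pose/shape priors.
import numpy as np
from scipy.optimize import least_squares

N_POSE = 72   # illustrative SMPL pose dimension (24 joints x 3 rotation params)

def fit_smpl(x0, joints_2d, smpl_joints, project_fn, pose_prior, shape_prior):
    # joints_2d: ndarray of shape (N, 2); the callables are placeholders.
    def residuals(x):
        b, a = x[:N_POSE], x[N_POSE:]                       # posture and contour params
        projected = np.array([project_fn(p) for p in smpl_joints(a, b)])
        return np.concatenate([
            (projected - joints_2d).ravel(),                # E_j reprojection residuals
            np.atleast_1d(pose_prior(b)),                   # E_1-style pose prior
            np.atleast_1d(shape_prior(a)),                  # E_3-style shape prior
        ])
    sol = least_squares(residuals, x0, method="dogbox")
    return sol.x    # optimal solution used to adjust the SMPL model (S72)
```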
In an embodiment, the step S61 of acquiring the two-dimensional camera parameters includes:
step S611, acquiring EXIF information of the human body picture;
step S612, determining the focal length of the two-dimensional camera according to the EXIF information;
step S613, determining the shooting depth of the two-dimensional camera by the principle of similar triangles.
In the present embodiment, EXIF (Exchangeable Image File format) information is designed specifically for photographs taken by digital cameras and records the attribute information and shooting parameters of a digital photograph. EXIF data can be embedded in files such as JPEG, TIFF and RIFF, adding information about the digital camera's shooting conditions and the version of the thumbnail or image-processing software, and the focal length at the time of shooting is determined based on this EXIF information. The shooting depth of the two-dimensional camera is then determined through the similar-triangles principle, namely by comparing the torso length between 3D node features with the torso length between the corresponding 2D node features, thereby determining the distance between the human body to be identified and the two-dimensional camera.
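A minimal sketch of this step is shown below, using Pillow to read the focal length from the EXIF data and a similar-triangles relation to estimate depth. The pixel-size value, the reference torso length and the example numbers are illustrative assumptions only.

```python
# Sketch under stated assumptions: read FocalLength from the EXIF sub-IFD and
# estimate shooting depth via similar triangles.
from PIL import Image
from PIL.ExifTags import TAGS

def focal_length_mm(path: str):
    exif = Image.open(path).getexif()
    sub_ifd = exif.get_ifd(0x8769)            # Exif sub-IFD holds FocalLength
    for tag_id, value in {**dict(exif), **dict(sub_ifd)}.items():
        if TAGS.get(tag_id) == "FocalLength":
            return float(value)               # focal length in millimetres
    return None

def shooting_depth_m(focal_mm, pixel_size_mm, torso_len_m, torso_len_px):
    # Similar triangles: torso_len_px * pixel_size = focal_mm * torso_len_m / depth
    focal_px = focal_mm / pixel_size_mm
    return torso_len_m * focal_px / torso_len_px

# Usage sketch (all numbers illustrative):
# f = focal_length_mm("person.jpg")
# depth = shooting_depth_m(f, pixel_size_mm=0.004, torso_len_m=0.5, torso_len_px=220)
```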
In an embodiment, the step of extracting 2D node features of the human body to be identified in the human body picture includes:
extracting the 2D node characteristics through a pre-trained node characteristic extraction model; the node characteristic extraction model is trained based on a fully-connected convolutional neural network model.
In this embodiment, each earlier layer and each later layer in the fully-connected convolutional neural network are densely connected, so feature reuse is achieved by concatenating features along the channel dimension and features can be identified accurately; the node feature extraction model trained on the fully-connected convolutional neural network model can therefore automatically extract the feature-point positions of human body parts, namely the 2D node features.
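The following toy PyTorch sketch illustrates the dense-connectivity idea described above: each layer receives the channel-wise concatenation of all preceding features, and a small head regresses 2D keypoint coordinates. The layer sizes and the number of keypoints are illustrative assumptions; the application does not specify a concrete architecture.

```python
# Toy densely connected keypoint regressor; sizes are illustrative only.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_ch: int, growth: int = 16, n_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(ch, growth, kernel_size=3, padding=1),
                nn.BatchNorm2d(growth), nn.ReLU(inplace=True)))
            ch += growth                       # channels grow as features are reused
        self.out_channels = ch

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))   # dense connectivity
        return torch.cat(feats, dim=1)

class KeypointNet(nn.Module):
    def __init__(self, n_keypoints: int = 17):
        super().__init__()
        self.n_keypoints = n_keypoints
        self.stem = nn.Conv2d(3, 32, kernel_size=7, stride=2, padding=3)
        self.dense = DenseBlock(32)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(self.dense.out_channels, n_keypoints * 2))

    def forward(self, img):                    # img: (B, 3, H, W) human body picture
        return self.head(self.dense(self.stem(img))).view(-1, self.n_keypoints, 2)
```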
The gesture recognition method based on the two-dimensional camera can be applied to the field of blockchain, with the trained node feature extraction model stored in a blockchain network. A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. The blockchain, essentially a decentralised database, is a chain of data blocks generated and linked by cryptographic means, each data block containing a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, and an application services layer.
The blockchain underlying platform may include processing modules for user management, basic services, smart contracts, operation monitoring, and the like. The user management module is responsible for identity information management of all blockchain participants, including maintaining the generation of public and private keys (account management), key management, and the correspondence between a user's real identity and the blockchain address (authority management), and, under authorization, supervising and auditing the transactions of certain real identities and providing rule configuration for risk control (risk-control audit). The basic service module is deployed on all blockchain node devices and is used to verify the validity of service requests and record valid requests to storage once they are confirmed; for a new service request, the basic service first analyses and authenticates the interface adaptation, encrypts the service information through an identification algorithm (identification management), transmits it completely and consistently to the shared ledger (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance, contract triggering and contract execution; a developer can define contract logic through a programming language and publish it to the blockchain (contract registration), and the contract is triggered by a key or other events to execute according to the logic of the contract clauses, completing the contract logic; the module also provides a function for registering contract upgrades. The operation monitoring module is mainly responsible for deployment during product release, modification of configuration, contract setting, cloud adaptation, and visual output of real-time states during product operation, for example alarms, monitoring network conditions, monitoring node device health status, and the like.
The embodiment of the application also provides a gesture recognition device based on the two-dimensional camera, which comprises:
a first acquiring unit 10 for acquiring a human body picture of a human body to be identified through a two-dimensional camera;
a first extracting unit 20 for extracting body contour information, posture information, and sex information of the human body to be identified according to the human body picture;
a generation unit 30 for generating an SMPL model from the body contour information, the posture information, and the gender information;
a second obtaining unit 40, configured to obtain a 3D node characteristic in the SMPL model;
a second extracting unit 50, configured to extract 2D node features of the human body to be identified in the human body picture;
a generating unit 60, configured to generate an error function according to the 3D node characteristic and the 2D node characteristic;
an adjustment unit 70, configured to adjust the SMPL model according to the error function, and extract a target 3D joint feature of the adjusted SMPL model;
a third obtaining unit 80, configured to obtain skeleton information according to the target 3D joint feature;
and an analysis unit 90, configured to perform gesture analysis according to the skeleton information.
In an embodiment, the first extraction unit 20 includes:
a mask subunit, configured to perform mask processing on the human body picture to obtain a mask of the human body to be identified;
a first extraction subunit, configured to extract potential edge points of the mask by using a Sobel operator;
the connection subunit is used for connecting each potential edge point by adopting an edge tracking algorithm by taking one of the potential edge points as a starting point to obtain a closed curve connected end to end as the body contour information of the human body to be identified;
the second extraction subunit is used for inputting the human body picture into a preset gesture recognition model to extract the gesture information of the human body to be recognized;
and the third extraction subunit is used for inputting the human body picture into a pre-trained sex classifier to extract the sex information.
In an embodiment, the generating unit 60 includes:
the first acquisition subunit is used for acquiring the two-dimensional camera parameters;
a generating subunit, configured to input the two-dimensional camera parameter, the 3D node feature, and the 2D node feature into a preset error formula to generate the error function, where the error function is E_all = E_j(a, b; k, J) + E_1(b) + E_2(b; a) + E_3(a), wherein a is the body contour information, b is the pose information, k is the two-dimensional camera parameters, and J is the 2D node feature.
In one embodiment, the adjusting unit 70 includes:
the optimizing subunit is used for optimizing the error function through a Powell dogleg algorithm to obtain an optimal solution of the error function;
and the second acquisition subunit is used for adjusting the SMPL model according to the optimal solution and extracting the target 3D joint characteristics of the adjusted SMPL model.
In an embodiment, the first acquisition subunit includes:
the acquisition module is used for acquiring EXIF information of the human body picture;
the first determining module is used for determining the focal length of the two-dimensional camera according to the EXIF information;
and the second determining module is used for determining the shooting depth of the two-dimensional camera through the principle of similar triangles.
In an embodiment, the second extraction unit 50 includes:
a fourth extraction subunit, configured to extract the 2D node features through a pre-trained node feature extraction model; the node characteristic extraction model is trained based on a fully-connected convolutional neural network model.
In this embodiment, the specific implementation of each unit, sub-unit, and module described in the foregoing method embodiment is referred to in the foregoing description, and will not be described in detail herein.
Referring to fig. 3, in an embodiment of the present application, a computer device is further provided, which may be a server, and an internal structure thereof may be as shown in fig. 3. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used for storing data such as human body pictures. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the gesture recognition method based on the two-dimensional camera.
It will be appreciated by those skilled in the art that the architecture shown in fig. 3 is merely a block diagram of a portion of the architecture in connection with the present inventive arrangements and is not intended to limit the computer devices to which the present inventive arrangements are applicable.
An embodiment of the present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements a two-dimensional camera-based gesture recognition method.
In summary, in the gesture recognition method, device, equipment and storage medium based on the two-dimensional camera provided by the embodiments of the application, a human body picture of a human body to be recognized is obtained through a two-dimensional camera; body contour information, posture information and gender information of the human body to be identified are extracted according to the human body picture; an SMPL model is generated according to the body contour information, the posture information and the gender information; 3D node characteristics in the SMPL model are acquired; 2D node characteristics of the human body to be identified in the human body picture are extracted; an error function is generated according to the 3D node characteristics and the 2D node characteristics; the SMPL model is adjusted according to the error function, and target 3D joint characteristics of the adjusted SMPL model are extracted; skeleton information is acquired according to the target 3D joint characteristics; and gesture analysis is carried out according to the skeleton information. According to the scheme provided by the application, the human body picture is obtained with only a single two-dimensional camera and human body pictures from multiple angles are not needed; by combining the 3D node characteristics with the 2D node characteristics of the human body picture, the obtained three-dimensional skeleton information of the human body can well handle the gesture-analysis difficulties caused by occlusion by foreign objects and crossing of body parts.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by instructing the relevant hardware through a computer program stored on a non-transitory computer-readable storage medium, which, when executed, may include the procedures of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided by the present application may include non-volatile and/or volatile memory. The non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, apparatus, article or method that comprises the element.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the scope of the application, and all equivalent structures or equivalent processes using the descriptions and drawings of the present application or direct or indirect application in other related technical fields are included in the scope of the present application.

Claims (8)

1. A gesture recognition method based on a two-dimensional camera, characterized by comprising the following steps:
acquiring a human body picture of a human body to be identified through a two-dimensional camera;
extracting body contour information, posture information and sex information of the human body to be identified according to the human body picture;
generating an SMPL model according to the body contour information, the posture information and the gender information;
acquiring 3D node characteristics in the SMPL model;
extracting 2D node characteristics of the human body to be identified in the human body picture;
generating an error function according to the 3D node characteristics and the 2D node characteristics;
adjusting the SMPL model according to the error function, and extracting target 3D joint characteristics of the adjusted SMPL model;
acquiring skeleton information according to the characteristics of the target 3D joint;
carrying out gesture analysis according to the skeleton information;
the step of generating an error function from the 3D node features and the 2D node features comprises:
acquiring parameters of the two-dimensional camera;
inputting the parameters of the two-dimensional camera, the 3D node characteristics and the 2D node characteristics into a preset error formula to generate the error function, wherein the error function is E_all = E_j(a, b; k, J) + E_1(b) + E_2(b; a) + E_3(a), wherein the E_1(b), the E_2(b; a) and the E_3(a) are energy functions, the E_1(b), the E_2(b; a) and the E_3(a) compensate for errors in the E_j(a, b; k, J), a is the body contour information, b is the pose information, k is the two-dimensional camera parameter, and J is the 2D node feature;
the step of adjusting the SMPL model according to the error function, and extracting the target 3D joint characteristics of the adjusted SMPL model includes:
optimizing the error function through a Powell dogleg algorithm to obtain an optimal solution of the error function;
and adjusting the SMPL model according to the optimal solution, and extracting the target 3D joint characteristics of the adjusted SMPL model.
2. The two-dimensional camera-based gesture recognition method according to claim 1, wherein the step of extracting body contour information, gesture information, and sex information of the human body to be recognized from the human body picture comprises:
performing mask processing on the human body picture to obtain a mask of the human body to be identified;
extracting potential edge points of the mask by adopting a Sobel operator;
connecting each potential edge point by using one potential edge point as a starting point and adopting an edge tracking algorithm to obtain a closed curve connected end to end as the body contour information of the human body to be identified;
inputting the human body picture into a preset gesture recognition model to extract the gesture information of the human body to be recognized;
inputting the human body picture into a pre-trained gender classifier to extract the gender information.
3. The two-dimensional camera-based gesture recognition method of claim 1, wherein the step of acquiring the two-dimensional camera parameters comprises:
acquiring EXIF information of the human body picture;
determining a focal length of the two-dimensional camera according to the EXIF information;
and determining the shooting depth of the two-dimensional camera through the principle of similar triangles.
4. The gesture recognition method based on a two-dimensional camera according to claim 1, wherein the step of extracting 2D node features of the human body to be recognized in the human body picture comprises:
extracting the 2D node characteristics through a pre-trained node characteristic extraction model; the node characteristic extraction model is trained based on a fully-connected convolutional neural network model.
5. A two-dimensional camera-based gesture recognition apparatus, comprising:
the first acquisition unit is used for acquiring a human body picture of a human body to be identified through a two-dimensional camera;
a first extracting unit for extracting body contour information, posture information and sex information of the human body to be identified according to the human body picture;
a generation unit for generating an SMPL model from the body profile information, the pose information and the gender information;
a second obtaining unit, configured to obtain a 3D node characteristic in the SMPL model;
the second extraction unit is used for extracting the 2D node characteristics of the human body to be identified in the human body picture;
a generating unit, configured to generate an error function according to the 3D node characteristic and the 2D node characteristic;
the adjusting unit is used for adjusting the SMPL model according to the error function and extracting target 3D joint characteristics of the adjusted SMPL model;
the third acquisition unit is used for acquiring skeleton information according to the characteristics of the target 3D joint;
the analysis unit is used for carrying out gesture analysis according to the skeleton information;
a first acquisition subunit, configured to acquire parameters of the two-dimensional camera;
a generating subunit, configured to input the parameters of the two-dimensional camera, the 3D node features, and the 2D node features into a preset error formula to generate the error function, where the error function is E_all = E_j(a, b; k, J) + E_1(b) + E_2(b; a) + E_3(a), wherein the E_1(b), the E_2(b; a) and the E_3(a) are energy functions, the E_1(b), the E_2(b; a) and the E_3(a) compensate for errors in the E_j(a, b; k, J), a is the body contour information, b is the pose information, k is the two-dimensional camera parameter, and J is the 2D node feature;
the optimizing subunit is used for optimizing the error function through a Powell dogleg algorithm to obtain an optimal solution of the error function;
and the second acquisition subunit is used for adjusting the SMPL model according to the optimal solution and extracting the target 3D joint characteristics of the adjusted SMPL model.
6. The two-dimensional camera-based gesture recognition apparatus of claim 5, wherein the first extraction unit comprises:
a mask subunit, configured to perform mask processing on the human body picture to obtain a mask of the human body to be identified;
a first extraction subunit, configured to extract potential edge points of the mask by using a Sobel operator;
the connection subunit is used for connecting each potential edge point by adopting an edge tracking algorithm by taking one of the potential edge points as a starting point to obtain a closed curve connected end to end as the body contour information of the human body to be identified;
the second extraction subunit is used for inputting the human body picture into a preset gesture recognition model to extract the gesture information of the human body to be recognized;
and the third extraction subunit is used for inputting the human body picture into a pre-trained sex classifier to extract the sex information.
7. A computer device comprising a memory and a processor, the memory having stored therein a computer program, characterized in that the processor, when executing the computer program, implements the steps of the two-dimensional camera-based gesture recognition method of any one of claims 1 to 4.
8. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the two-dimensional camera-based gesture recognition method of any one of claims 1 to 4.
CN202011339565.4A 2020-11-25 2020-11-25 Gesture recognition method, device, equipment and storage medium based on two-dimensional camera Active CN112464791B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011339565.4A CN112464791B (en) 2020-11-25 2020-11-25 Gesture recognition method, device, equipment and storage medium based on two-dimensional camera
PCT/CN2021/084543 WO2021208740A1 (en) 2020-11-25 2021-03-31 Pose recognition method and apparatus based on two-dimensional camera, and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011339565.4A CN112464791B (en) 2020-11-25 2020-11-25 Gesture recognition method, device, equipment and storage medium based on two-dimensional camera

Publications (2)

Publication Number Publication Date
CN112464791A CN112464791A (en) 2021-03-09
CN112464791B true CN112464791B (en) 2023-10-27

Family

ID=74807909

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011339565.4A Active CN112464791B (en) 2020-11-25 2020-11-25 Gesture recognition method, device, equipment and storage medium based on two-dimensional camera

Country Status (2)

Country Link
CN (1) CN112464791B (en)
WO (1) WO2021208740A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464791B (en) * 2020-11-25 2023-10-27 平安科技(深圳)有限公司 Gesture recognition method, device, equipment and storage medium based on two-dimensional camera

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110020633A (en) * 2019-04-12 2019-07-16 腾讯科技(深圳)有限公司 Training method, image-recognizing method and the device of gesture recognition model
CN111968217A (en) * 2020-05-18 2020-11-20 北京邮电大学 SMPL parameter prediction and human body model generation method based on picture

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007310707A (en) * 2006-05-19 2007-11-29 Toshiba Corp Apparatus and method for estimating posture
CN102385695A (en) * 2010-09-01 2012-03-21 索尼公司 Human body three-dimensional posture identifying method and device
CN110189397A (en) * 2019-03-29 2019-08-30 北京市商汤科技开发有限公司 A kind of image processing method and device, computer equipment and storage medium
CN112464791B (en) * 2020-11-25 2023-10-27 平安科技(深圳)有限公司 Gesture recognition method, device, equipment and storage medium based on two-dimensional camera

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110020633A (en) * 2019-04-12 2019-07-16 腾讯科技(深圳)有限公司 Training method, image-recognizing method and the device of gesture recognition model
CN111968217A (en) * 2020-05-18 2020-11-20 北京邮电大学 SMPL parameter prediction and human body model generation method based on picture

Also Published As

Publication number Publication date
CN112464791A (en) 2021-03-09
WO2021208740A1 (en) 2021-10-21

Similar Documents

Publication Publication Date Title
Revaud et al. Epicflow: Edge-preserving interpolation of correspondences for optical flow
KR102097016B1 (en) Apparatus and methdo for analayzing motion
CN105138980A (en) Identify authentication method and system based on identity card information and face identification
CN111680672B (en) Face living body detection method, system, device, computer equipment and storage medium
CN110874865A (en) Three-dimensional skeleton generation method and computer equipment
WO2021120961A1 (en) Brain addiction structure map evaluation method and apparatus
CN108229375B (en) Method and device for detecting face image
CN113033519B (en) Living body detection method, estimation network processing method, device and computer equipment
WO2022088572A1 (en) Model training method, image processing and alignment method, apparatus, device, and medium
JP5937823B2 (en) Image collation processing apparatus, image collation processing method, and image collation processing program
Wu et al. Single-shot face anti-spoofing for dual pixel camera
Yu et al. A video-based facial motion tracking and expression recognition system
CN112464791B (en) Gesture recognition method, device, equipment and storage medium based on two-dimensional camera
He et al. Linear approach for initial recovery of the exterior orientation parameters of randomly captured images by low-cost mobile mapping systems
CN114882537A (en) Finger new visual angle image generation method based on nerve radiation field
CN110766077A (en) Method, device and equipment for screening sketch in evidence chain image
Park et al. 3D face reconstruction from stereo video
CN113673308A (en) Object identification method, device and electronic system
CN111259700A (en) Method and apparatus for generating gait recognition model
CN113610969B (en) Three-dimensional human body model generation method and device, electronic equipment and storage medium
Kim Survey on registration techniques of visible and infrared images
CN113807451B (en) Panoramic image feature point matching model training method and device and server
CN112132960B (en) Three-dimensional reconstruction method and device and electronic equipment
CN110490950B (en) Image sample generation method and device, computer equipment and storage medium
CN114648604A (en) Image rendering method, electronic device, storage medium and program product

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant