CN111950496B - Masked person identity recognition method - Google Patents

Masked person identity recognition method

Info

Publication number
CN111950496B
CN111950496B
Authority
CN
China
Prior art keywords
image
features
mask
person
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010843398.0A
Other languages
Chinese (zh)
Other versions
CN111950496A (en)
Inventor
程良伦
杨颖
黄国恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202010843398.0A priority Critical patent/CN111950496B/en
Publication of CN111950496A publication Critical patent/CN111950496A/en
Application granted granted Critical
Publication of CN111950496B publication Critical patent/CN111950496B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Human Computer Interaction (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a masked-person identity recognition method, which comprises the following steps: inputting a to-be-identified masked-person image into a preset region segmentation network for segmentation processing to obtain a masked-person region image; inputting the masked-person region image into a preset encoding-decoding model for feature separation, deleting the extracted appearance features, and outputting the canonical features and pose features of the masked-person region image; averaging the canonical features of the masked-person region image to obtain static gait features, and inputting the pose features into a preset LSTM network for processing to obtain dynamic gait features; and inputting the static gait features and the dynamic gait features into a preset classifier for classification and recognition to obtain an identity recognition result for the to-be-identified masked-person image. This solves the technical problem that existing face recognition methods struggle to identify the identity of a masked person.

Description

Masked person identity recognition method
Technical Field
The application relates to the technical field of identity recognition, and in particular to a masked-person identity recognition method.
Background
In the prior art, identity recognition is usually performed through face recognition. However, when facing a masked person whose facial features are blocked by sunglasses, a hat, a mask or the like, face recognition methods cannot accurately extract the person's facial features, so the person's identity cannot be recognized. Identification becomes even more difficult when the capture device has low resolution.
Disclosure of Invention
The application provides a masked-person identity recognition method, which is used for solving the technical problem that existing face recognition methods struggle to identify the identity of a masked person.
In view of this, a first aspect of the present application provides a masked-person identity recognition method, comprising:
inputting a to-be-identified masked-person image into a preset region segmentation network for segmentation processing to obtain a masked-person region image;
inputting the masked-person region image into a preset encoding-decoding model for feature separation, deleting the extracted appearance features, and outputting the canonical features and pose features of the masked-person region image;
averaging the canonical features of the masked-person region image to obtain static gait features, and inputting the pose features into a preset LSTM network for processing to obtain dynamic gait features;
and inputting the static gait features and the dynamic gait features into a preset classifier for classification and recognition to obtain an identity recognition result for the to-be-identified masked-person image.
Optionally, before the step of inputting the to-be-identified masked-person image into a preset region segmentation network for segmentation processing to obtain a masked-person region image, the method further includes:
inputting the to-be-identified masked-person image into a preset super-resolution network for processing, and outputting a super-resolution masked-person image;
correspondingly, the step of inputting the to-be-identified masked-person image into a preset region segmentation network for segmentation processing to obtain a masked-person region image comprises:
inputting the super-resolution masked-person image into the preset region segmentation network for segmentation processing to obtain the masked-person region image.
Optionally, the preset super-resolution network includes a first convolution module and a second convolution module;
the first convolution module comprises 6 convolution layers and one sub-pixel convolution layer, and is used to increase the resolution of the to-be-identified masked-person image in the length and width directions;
the second convolution module comprises 4 convolution layers and one sub-pixel convolution layer, and is used to increase the resolution of the to-be-identified masked-person image in the height direction.
Optionally, the configuration process of the preset encoding-decoding model includes:
splitting the acquired video data into frames to obtain training sample images;
sequentially inputting the training sample images to an encoding-decoding network, so that an encoder in the encoding-decoding network encodes the training sample images and outputs the appearance features, pose features and canonical features of the training sample images, and a decoder in the encoding-decoding network performs image reconstruction based on the features output by the encoder and outputs reconstructed images, wherein the training sample images are masked-person region images obtained by segmenting to-be-trained masked-person images;
based on the reconstructed images and the training sample images, separating the non-pose features of the training sample images through a cross-reconstruction loss function, separating the pose features of the training sample images through a pose-similarity loss function, and separating the canonical features of the training sample images from the non-pose features through a canonical-consistency loss function, wherein the non-pose features comprise the appearance features and the canonical features.
Optionally, the cross-reconstruction loss function is:

$L_{\mathrm{xrecon}} = \left\| D\!\left(f_a^{t_1}, f_p^{t_2}, f_c^{t_1}\right) - X^{t_2} \right\|_2^2$

where $t_1, t_2$ are different moments in the same video, $f_a$ is the appearance feature, $f_p$ is the pose feature, $f_c$ is the canonical feature, $X^{t_2}$ is the training sample image at moment $t_2$, and $D(\cdot)$ is the decoding function.
Optionally, the pose-similarity loss function is:

$L_{\mathrm{pose\text{-}sim}} = \left\| \frac{1}{n_1} \sum_{t=1}^{n_1} f_p^{c_1,t} - \frac{1}{n_2} \sum_{t=1}^{n_2} f_p^{c_2,t} \right\|_2^2$

where $n_1$ is the number of video frames in video scene $c_1$ and $n_2$ is the number of video frames in video scene $c_2$.
Optionally, the canonical-consistency loss function is:

$L_{\mathrm{cano\text{-}cons}} = \mathbb{E}_{i,j}\!\left[ \left\| f_c^{c_1,t_i} - f_c^{c_1,t_j} \right\|_2^2 + \left\| f_c^{c_1,t_i} - f_c^{c_2,t_i} \right\|_2^2 \right]$, where $i, j \in [1, n_1]$.
optionally, the preset area dividing network is a trained Mask R-CNN network.
From the above technical solutions, the application has the following advantages:
The application provides a masked-person identity recognition method comprising: inputting a to-be-identified masked-person image into a preset region segmentation network for segmentation processing to obtain a masked-person region image; inputting the masked-person region image into a preset encoding-decoding model for feature separation, deleting the extracted appearance features, and outputting the canonical features and pose features of the masked-person region image; averaging the canonical features of the masked-person region image to obtain static gait features, and inputting the pose features into a preset LSTM network for processing to obtain dynamic gait features; and inputting the static gait features and the dynamic gait features into a preset classifier for classification and recognition to obtain an identity recognition result for the to-be-identified masked-person image.
In the masked-person identity recognition method of the application, the masked-person region and the background region are separated through the preset region segmentation network to obtain the masked-person region image, reducing interference from the background and other factors. The gait features of the masked person are obtained by extracting the person's canonical features and pose features, and identity is recognized from these gait features, which avoids the problem that face recognition methods cannot accurately extract the facial features of a masked person and therefore cannot recognize the person's identity. In addition, a masked person may dress differently in different scenes, so the appearance features extracted by a convolutional neural network differ across scenes, and letting these varying appearance features participate in identification would degrade the recognition result. To avoid this effect of appearance changes, the appearance features are separated out and deleted through the preset encoding-decoding model, improving the accuracy of masked-person identity recognition and thereby solving the technical problem that existing face recognition methods struggle to identify the identity of a masked person.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions of the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the application; other drawings can be derived from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic flow chart of a masked-person identity recognition method according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of a masked-person identity recognition method according to another embodiment of the present application;
FIG. 3 is a schematic structural diagram of a super-resolution network according to an embodiment of the present application.
Detailed Description
The application provides a masked-person identity recognition method, which is used for solving the technical problem that existing face recognition methods struggle to identify the identity of a masked person.
To make the present application better understood by those skilled in the art, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort shall fall within the scope of the application.
For ease of understanding, referring to FIG. 1, an embodiment of the masked-person identity recognition method provided by the present application includes:
Step 101, inputting a to-be-identified masked-person image into a preset region segmentation network for segmentation processing to obtain a masked-person region image.
If the to-be-identified masked-person image were input directly into the preset encoding-decoding model for feature extraction, unnecessary background features would be extracted, affecting the identification result. Therefore, in the embodiment of the application, the preset region segmentation network is used to segment the to-be-identified masked-person image and separate the masked-person region from the rest of the image, which reduces unnecessary features and improves the quality of the effective feature information. The to-be-identified masked-person image can be obtained by splitting a video stream captured by monitoring equipment into frames.
Step 102, inputting the masked-person region image into the preset encoding-decoding model for feature separation, deleting the extracted appearance features, and outputting the canonical features and pose features of the masked-person region image.
A masked person's appearance features, such as clothing, differ across scenes; in this case, a trained classifier may produce an erroneous identification result because of the change in appearance. For example, the same masked person may dress differently in different places, and the trained classifier may then identify that person as persons of different identities. Therefore, in the embodiment of the application, the input masked-person region image is subjected to feature separation by the preset encoding-decoding model to obtain canonical features, appearance features and pose features, and the appearance features are deleted so that they do not participate in subsequent identification. Canonical features are body-shape features such as height and arm length; pose features represent the person's gait information in a particular frame during motion.
Further, the preset encoding-decoding model includes an encoder and a decoder, and its configuration process specifically includes:
splitting the acquired video data into frames to obtain to-be-trained masked-person images, segmenting the to-be-trained masked-person images to obtain masked-person region images, and taking the masked-person region images as training sample images; sequentially inputting the training sample images into an encoding-decoding network, so that an encoder in the encoding-decoding network encodes the training sample images and outputs their appearance features, pose features and canonical features, and a decoder in the encoding-decoding network reconstructs images based on the features output by the encoder and outputs the reconstructed images; based on the reconstructed images and the training sample images, separating the non-pose features of the training sample images through a cross-reconstruction loss function, separating the pose features through a pose-similarity loss function, and separating the canonical features from the non-pose features through a canonical-consistency loss function, wherein the non-pose features comprise the appearance features and the canonical features.
It should be noted that the encoding-decoding network is a typical CNN structure comprising convolution layers and a max-pooling layer, where each convolution layer is followed by a ReLU activation function and the last layer uses a Sigmoid activation function to output values in [0, 1] for the subsequent operations. First, the encoder ε encodes an input masked-person region image X; the encoded feature representation is:

$f_a, f_p, f_c = \varepsilon(X)$

where $f_a$ is the appearance feature, $f_p$ is the pose feature, $f_c$ is the canonical feature, and $\varepsilon(\cdot)$ is the encoding function.
To fully extract the features of the masked-person region image, a decoder D reconstructs an image $\hat{X}$ approximating the original image X:

$\hat{X} = D(f_a, f_p, f_c)$

where $D(\cdot)$ is the decoding function.
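For illustration only, the following is a minimal PyTorch sketch of such an encoding-decoding model. The input resolution (64×64), channel widths and feature dimension are assumptions for the example rather than values specified in this application; only the convolution/max-pooling/ReLU/Sigmoid structure follows the description above.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """epsilon(X): encode a masked-person region image into appearance
    (f_a), pose (f_p) and canonical (f_c) features."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, 3 * feat_dim),
            nn.Sigmoid(),  # last layer outputs values in [0, 1]
        )
        self.feat_dim = feat_dim

    def forward(self, x):  # x: (B, 3, 64, 64)
        f = self.backbone(x)
        return torch.split(f, self.feat_dim, dim=1)  # f_a, f_p, f_c

class Decoder(nn.Module):
    """D(f_a, f_p, f_c): reconstruct an image from the three features."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 * feat_dim, 64 * 16 * 16),
            nn.Unflatten(1, (64, 16, 16)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, f_a, f_p, f_c):
        return self.net(torch.cat([f_a, f_p, f_c], dim=1))
```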
After the three kinds of features have been fully learned, the appearance features, canonical features and pose features of the masked person are separated by designing different loss functions, specifically:
(1) Cross-reconstruction loss function:

$L_{\mathrm{xrecon}} = \left\| D\!\left(f_a^{t_1}, f_p^{t_2}, f_c^{t_1}\right) - X^{t_2} \right\|_2^2$

where $t_1, t_2$ are different moments in the same video and $c_1, c_2$ are different video scenes.
The cross-reconstruction loss proposed by the present application uses the appearance feature $f_a$ and canonical feature $f_c$ at moment $t_1$ together with the pose feature $f_p$ at moment $t_2$ to reconstruct the image at moment $t_2$. Since $f_a$ and $f_c$ do not depend on pose, the $f_p$ of the current frame can be combined with the $f_a$, $f_c$ of any frame in the same video to reconstruct the same subject. This forces $f_a$ and $f_c$ to remain similar across all frames of the video, i.e. to behave as constant factors, which separates these features ($f_a$ and $f_c$) out as the non-pose features.
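As a hedged sketch of this loss (the tensor shapes and the MSE form are assumptions consistent with the reconstruction described above, using the illustrative Encoder and Decoder defined earlier):

```python
import torch.nn.functional as F

def cross_reconstruction_loss(encoder, decoder, x_t1, x_t2):
    # f_a, f_c come from frame t1; f_p comes from frame t2 of the same video
    f_a1, _, f_c1 = encoder(x_t1)
    _, f_p2, _ = encoder(x_t2)
    x_hat = decoder(f_a1, f_p2, f_c1)   # D(f_a^t1, f_p^t2, f_c^t1)
    return F.mse_loss(x_hat, x_t2)      # || x_hat - X^t2 ||^2
```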
(2) Pose-similarity loss function:

$L_{\mathrm{pose\text{-}sim}} = \left\| \frac{1}{n_1} \sum_{t=1}^{n_1} f_p^{c_1,t} - \frac{1}{n_2} \sum_{t=1}^{n_2} f_p^{c_2,t} \right\|_2^2$

where $n_1$ is the number of video frames in video scene $c_1$ and $n_2$ is the number of video frames in video scene $c_2$.
In different scenes, $f_p$ is affected by $f_a$. To ensure that $f_p$ contains only pose information, a pose-similarity loss function is proposed: exploiting the fact that a person's gait information stays the same across different scenes, the appearance components present in $f_p$ are removed.
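A sketch under the same assumptions: the mean pose feature over the $n_1$ frames of scene $c_1$ is matched to the mean over the $n_2$ frames of scene $c_2$ for the same subject.

```python
def pose_similarity_loss(encoder, frames_c1, frames_c2):
    # frames_ci: (n_i, 3, H, W) frames of the same subject in scene c_i
    mean_p1 = encoder(frames_c1)[1].mean(dim=0)  # (1/n1) * sum_t f_p^(c1,t)
    mean_p2 = encoder(frames_c2)[1].mean(dim=0)  # (1/n2) * sum_t f_p^(c2,t)
    return ((mean_p1 - mean_p2) ** 2).sum()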
(3) Canonical-consistency loss function:

$L_{\mathrm{cano\text{-}cons}} = \mathbb{E}_{i,j}\!\left[ \left\| f_c^{c_1,t_i} - f_c^{c_1,t_j} \right\|_2^2 + \left\| f_c^{c_1,t_i} - f_c^{c_2,t_i} \right\|_2^2 \right]$

where $i, j \in [1, n_1]$. Each individual's canonical feature is invariant across times and scenes; based on this property, the canonical-consistency loss function separates the canonical features from the non-pose features.
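A sketch under the same assumptions, penalising variation of $f_c$ across the frames of one scene and across scenes (the within-scene pair term is approximated via the mean):

```python
def canonical_consistency_loss(encoder, frames_c1, frames_c2):
    f_c1 = encoder(frames_c1)[2]   # (n1, feat_dim)
    f_c2 = encoder(frames_c2)[2]   # (n2, feat_dim)
    # within-scene consistency over frame pairs (i, j), via the mean
    intra = ((f_c1 - f_c1.mean(dim=0, keepdim=True)) ** 2).sum(dim=1).mean()
    # cross-scene consistency at matching time steps
    n = min(f_c1.shape[0], f_c2.shape[0])
    inter = ((f_c1[:n] - f_c2[:n]) ** 2).sum(dim=1).mean()
    return intra + inter
```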
During feature separation, the appearance features are deleted, and the pose features and canonical features of the masked person are kept for subsequent identification.
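Putting the configuration process together, one hedged training step might look as follows; the optimiser choice and the loss weights are assumptions, not values given in this application.

```python
def training_step(encoder, decoder, opt, x_t1, x_t2, frames_c1, frames_c2,
                  w_xr=1.0, w_ps=0.1, w_cc=0.1):
    # combine the three loss sketches above into one optimisation step
    l_xr = cross_reconstruction_loss(encoder, decoder, x_t1, x_t2)
    l_ps = pose_similarity_loss(encoder, frames_c1, frames_c2)
    l_cc = canonical_consistency_loss(encoder, frames_c1, frames_c2)
    loss = w_xr * l_xr + w_ps * l_ps + w_cc * l_cc
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```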
Step 103, averaging the canonical features of the masked-person region images to obtain static gait features, and inputting the pose features into a preset LSTM network for processing to obtain dynamic gait features.
After the canonical features and pose features are separated in step 102, the canonical features are averaged to obtain the static gait features of the masked person, while the dynamic gait features are obtained by feeding the pose features into a multi-layer LSTM network designed with an incremental identity loss function.
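A minimal sketch of the dynamic-gait branch follows; the layer count and dimensions are assumptions, and the incremental identity loss used for training is omitted here.

```python
import torch.nn as nn

class DynamicGait(nn.Module):
    def __init__(self, feat_dim=128, hidden=256, layers=3):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, num_layers=layers,
                            batch_first=True)

    def forward(self, pose_seq):       # pose_seq: (B, T, feat_dim)
        out, _ = self.lstm(pose_seq)   # per-time-step hidden states
        return out.mean(dim=1)         # one dynamic gait vector per clip
```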
Step 104, inputting the static gait features and the dynamic gait features into the preset classifier for classification and recognition to obtain an identity recognition result for the to-be-identified masked-person image.
Inputting both the static and dynamic gait features of the masked person into the preset classifier yields an identity recognition result that is more accurate and reliable than one obtained from static gait features or dynamic gait features alone.
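As a stand-in for the unspecified preset classifier, a sketch that concatenates the two gait features and applies a small fully connected head (the head itself is an assumption):

```python
import torch
import torch.nn as nn

class GaitClassifier(nn.Module):
    def __init__(self, static_dim=128, dynamic_dim=256, n_ids=100):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(static_dim + dynamic_dim, 256), nn.ReLU(),
            nn.Linear(256, n_ids),   # one logit per enrolled identity
        )

    def forward(self, f_static, f_dynamic):
        return self.head(torch.cat([f_static, f_dynamic], dim=1))
```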
In the masked-person identity recognition method of the application, the masked-person region and the background region are separated through the preset region segmentation network to obtain the masked-person region image, reducing interference from the background and other factors. The gait features of the masked person are obtained by extracting the person's canonical features and pose features, and identity is recognized from these gait features, which avoids the problem that face recognition methods cannot accurately extract the facial features of a masked person and therefore cannot recognize the person's identity. In addition, a masked person may dress differently in different scenes, so the appearance features extracted by a convolutional neural network differ across scenes, and letting these varying appearance features participate in identification would degrade the recognition result. To avoid this effect of appearance changes, the appearance features are separated out and deleted through the preset encoding-decoding model, improving the accuracy of masked-person identity recognition and thereby solving the technical problem that existing face recognition methods struggle to identify the identity of a masked person.
The above is one embodiment of the masked-person identity recognition method provided by the present application; another embodiment of the method provided by the present application follows.
For ease of understanding, referring to FIG. 2, another embodiment of the masked-person identity recognition method provided by the present application includes:
step 201, inputting the to-be-identified mask image into a preset super-resolution network for processing, and outputting the super-resolution mask image.
In consideration of the problem of monitoring equipment, the quality of the photographed images of the people with the face is poor, so that the extraction of effective characteristics is affected. According to the application, the super-resolution network is preset to process the to-be-identified mask image, and the super-resolution mask image is output, so that the quality of the to-be-identified mask image is improved.
Further, the preset super-resolution network in the embodiment of the present application includes a first convolution module and a second convolution module; see FIG. 3. The first convolution module comprises 6 convolution layers and one sub-pixel convolution layer (an upsampling layer) and is used to increase the resolution of the to-be-identified masked-person image in the length and width directions; the second convolution module comprises 4 convolution layers and one sub-pixel convolution layer and is used to increase the resolution in the height direction. The input masked-person image is handled with a patch-based approach: low-resolution patches of 7×7 pixels are input, each convolution layer performs feature learning with a 3×3 convolution kernel followed by a linear rectification function (ReLU), and the features learned by each convolution layer are the input of the next layer. To capture spatial information among the features of the to-be-identified masked-person image, the first convolution layer in the first convolution module is connected to the third convolution layer by a short skip connection and to the sixth convolution layer by a long skip connection. The activation maps produced by the sixth convolution layer are processed by the sub-pixel convolution layer, which outputs enlarged activation maps, increasing the resolution of the image in the length and width directions. The input of the second convolution module is the output of the first convolution module (the enlarged activation maps); its first convolution layer is connected to its fourth convolution layer by a long skip connection, and its sub-pixel convolution layer further enlarges the activation maps output by the fourth convolution layer, increasing the resolution in the height direction, after which the super-resolution masked-person image is output.
The sub-pixel convolution layers in the preset super-resolution network have scale factors for two directions, so scaling is performed along both axes; with r = 2, a sub-pixel convolution layer requires 4 activation maps as input. The low-resolution masked-person image is mapped to activation maps, whose pixels are then rearranged to achieve super-resolution along both axes, and the super-resolution masked-person image is output.
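The following PyTorch sketch mirrors that description: six and four 3×3 convolution layers with ReLU after each, the stated skip connections, and sub-pixel (PixelShuffle) layers with r = 2. The channel width is an assumption, and both sub-pixel layers here upscale isotropically, which simplifies the per-axis scaling described above.

```python
import torch
import torch.nn as nn

class SuperResolution(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        conv = lambda cin, cout: nn.Conv2d(cin, cout, 3, padding=1)
        self.m1 = nn.ModuleList([conv(3 if i == 0 else ch, ch)
                                 for i in range(6)])
        self.to_shuffle1 = conv(ch, 3 * 4)      # 4 maps per channel, r = 2
        self.shuffle = nn.PixelShuffle(2)
        self.m2 = nn.ModuleList([conv(3 if i == 0 else ch, ch)
                                 for i in range(4)])
        self.to_shuffle2 = conv(ch, 3 * 4)

    def forward(self, x):                       # x: (B, 3, 7, 7) patch
        f1 = torch.relu(self.m1[0](x))
        f = torch.relu(self.m1[1](f1))
        f = torch.relu(self.m1[2](f)) + f1      # short skip: conv1 -> conv3
        f = torch.relu(self.m1[3](f))
        f = torch.relu(self.m1[4](f))
        f = torch.relu(self.m1[5](f)) + f1      # long skip: conv1 -> conv6
        x = self.shuffle(self.to_shuffle1(f))   # first 2x upscaling
        g1 = torch.relu(self.m2[0](x))
        g = torch.relu(self.m2[1](g1))
        g = torch.relu(self.m2[2](g))
        g = torch.relu(self.m2[3](g)) + g1      # long skip: conv1 -> conv4
        return self.shuffle(self.to_shuffle2(g))  # second 2x upscaling
```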
Step 202, inputting the super-resolution masked-person image into the preset region segmentation network for segmentation processing to obtain a masked-person region image.
The preset region segmentation network in the embodiment of the application is preferably a trained Mask R-CNN network. The Mask R-CNN network performs feature extraction and region segmentation through a ResNeXt-101+FPN backbone, and RoIAlign is used in place of the plain pooling operation, introducing an interpolation step: bilinear interpolation is performed first and pooling afterwards, which avoids the data nonlinearity problem caused by sampling through pooling alone.
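For illustration, the segmentation step could be sketched with torchvision's Mask R-CNN. Note that torchvision ships a ResNet-50+FPN backbone, so the ResNeXt-101+FPN backbone named above would have to be substituted in practice, and the score threshold below is an assumption.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()

def segment_person(pil_image, score_thresh=0.7):
    with torch.no_grad():
        out = model([to_tensor(pil_image)])[0]
    # keep confident detections of the COCO "person" class (label 1)
    keep = [(m, s) for m, s, l in zip(out["masks"], out["scores"],
                                      out["labels"])
            if l.item() == 1 and s.item() >= score_thresh]
    if not keep:
        return None
    mask = keep[0][0][0] > 0.5                  # highest-scoring person mask
    return to_tensor(pil_image) * mask.float()  # masked-person region image
```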
Step 203, inputting the masked-person region image into the preset encoding-decoding model for feature separation, deleting the extracted appearance features, and outputting the canonical features and pose features of the masked-person region image.
Step 204, averaging the canonical features of the masked-person region image to obtain static gait features, and inputting the pose features into the preset LSTM network for processing to obtain dynamic gait features.
Step 205, inputting the static gait features and the dynamic gait features into the preset classifier for classification and recognition to obtain an identity recognition result for the to-be-identified masked-person image.
The details of steps 203 to 205 are the same as those of steps 102 to 104 and are not repeated here.
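A hedged end-to-end sketch chaining steps 201 to 205 with the illustrative modules from the sketches above; `segment` is a hypothetical wrapper that returns a fixed-size masked-person region crop per frame (e.g. built on the Mask R-CNN sketch).

```python
import torch

def identify(frames, sr, encoder, dyn_gait, classifier, segment):
    # frames: list of low-resolution (3, H, W) video frames of one person
    regions = torch.stack([segment(sr(f.unsqueeze(0))[0])  # steps 201-202
                           for f in frames])               # (T, 3, 64, 64)
    f_a, f_p, f_c = encoder(regions)            # step 203: f_a is discarded
    f_static = f_c.mean(dim=0, keepdim=True)    # step 204: static gait
    f_dynamic = dyn_gait(f_p.unsqueeze(0))      # step 204: dynamic gait
    logits = classifier(f_static, f_dynamic)    # step 205
    return logits.argmax(dim=1)                 # predicted identity
```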
The above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (7)

1. A masked-person identity recognition method, comprising:
inputting a to-be-identified masked-person image into a preset region segmentation network for segmentation processing to obtain a masked-person region image;
inputting the masked-person region image into a preset encoding-decoding model for feature separation, deleting the extracted appearance features, and outputting the canonical features and pose features of the masked-person region image;
averaging the canonical features of the masked-person region image to obtain static gait features, and inputting the pose features into a preset LSTM network for processing to obtain dynamic gait features;
inputting the static gait features and the dynamic gait features into a preset classifier for classification and recognition to obtain an identity recognition result for the to-be-identified masked-person image;
wherein the configuration process of the preset encoding-decoding model comprises:
splitting the acquired video data into frames to obtain training sample images;
sequentially inputting the training sample images to an encoding-decoding network, so that an encoder in the encoding-decoding network encodes the training sample images and outputs the appearance features, pose features and canonical features of the training sample images, and a decoder in the encoding-decoding network performs image reconstruction based on the features output by the encoder and outputs reconstructed images, wherein the training sample images are masked-person region images obtained by segmenting to-be-trained masked-person images;
based on the reconstructed images and the training sample images, separating the non-pose features of the training sample images through a cross-reconstruction loss function, separating the pose features of the training sample images through a pose-similarity loss function, and separating the canonical features of the training sample images from the non-pose features through a canonical-consistency loss function, wherein the non-pose features comprise the appearance features and the canonical features.
2. The masked-person identity recognition method according to claim 1, wherein before the step of inputting the to-be-identified masked-person image into a preset region segmentation network for segmentation processing to obtain a masked-person region image, the method further comprises:
inputting the to-be-identified masked-person image into a preset super-resolution network for processing, and outputting a super-resolution masked-person image;
correspondingly, the step of inputting the to-be-identified masked-person image into a preset region segmentation network for segmentation processing to obtain a masked-person region image comprises:
inputting the super-resolution masked-person image into the preset region segmentation network for segmentation processing to obtain the masked-person region image.
3. The method of claim 2, wherein the preset super-resolution network comprises a first convolution module and a second convolution module;
the first convolution module comprises 6 convolution layers and one sub-pixel convolution layer and is used to increase the resolution of the to-be-identified masked-person image in the length and width directions;
the second convolution module comprises 4 convolution layers and one sub-pixel convolution layer and is used to increase the resolution of the to-be-identified masked-person image in the height direction.
4. The method of claim 1, wherein the cross-reconstruction loss function is:

$L_{\mathrm{xrecon}} = \left\| D\!\left(f_a^{t_1}, f_p^{t_2}, f_c^{t_1}\right) - X^{t_2} \right\|_2^2$

where $t_1, t_2$ are different moments in the same video, $f_a$ is the appearance feature, $f_p$ is the pose feature, $f_c$ is the canonical feature, $X^{t_2}$ is the training sample image at moment $t_2$, and $D(\cdot)$ is the decoding function.
5. The method of claim 4, wherein the pose-similarity loss function is:

$L_{\mathrm{pose\text{-}sim}} = \left\| \frac{1}{n_1} \sum_{t=1}^{n_1} f_p^{c_1,t} - \frac{1}{n_2} \sum_{t=1}^{n_2} f_p^{c_2,t} \right\|_2^2$

where $n_1$ is the number of video frames in video scene $c_1$ and $n_2$ is the number of video frames in video scene $c_2$.
6. The method of claim 5, wherein the canonical-consistency loss function is:

$L_{\mathrm{cano\text{-}cons}} = \mathbb{E}_{i,j}\!\left[ \left\| f_c^{c_1,t_i} - f_c^{c_1,t_j} \right\|_2^2 + \left\| f_c^{c_1,t_i} - f_c^{c_2,t_i} \right\|_2^2 \right]$, where $i, j \in [1, n_1]$.
7. The method of claim 1, wherein the preset region segmentation network is a trained Mask R-CNN network.
CN202010843398.0A 2020-08-20 2020-08-20 Masked person identity recognition method Active CN111950496B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010843398.0A CN111950496B (en) 2020-08-20 2020-08-20 Masked person identity recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010843398.0A CN111950496B (en) 2020-08-20 2020-08-20 Masked person identity recognition method

Publications (2)

Publication Number Publication Date
CN111950496A CN111950496A (en) 2020-11-17
CN111950496B true CN111950496B (en) 2023-09-15

Family

ID=73358545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010843398.0A Active CN111950496B (en) 2020-08-20 2020-08-20 Mask person identity recognition method

Country Status (1)

Country Link
CN (1) CN111950496B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114494962A (en) * 2022-01-24 2022-05-13 上海商汤智能科技有限公司 Object identification method, network training method, device, equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109815874A * 2019-01-17 2019-05-28 苏州科达科技股份有限公司 Personnel identity recognition method, apparatus, device and readable storage medium
CN110084156A * 2019-04-12 2019-08-02 中南大学 Gait feature extraction method and pedestrian identity recognition method based on gait features
CN110222634A * 2019-06-04 2019-09-10 河海大学常州校区 Human posture recognition method based on convolutional neural networks
CN110991281A (en) * 2019-11-21 2020-04-10 电子科技大学 Dynamic face recognition method


Also Published As

Publication number Publication date
CN111950496A (en) 2020-11-17

Similar Documents

Publication Publication Date Title
Chen et al. Fsrnet: End-to-end learning face super-resolution with facial priors
CN108830855B (en) Full convolution network semantic segmentation method based on multi-scale low-level feature fusion
CN113362223B (en) Image super-resolution reconstruction method based on attention mechanism and two-channel network
CN108537754B (en) Face image restoration system based on deformation guide picture
CN112766160A (en) Face replacement method based on multi-stage attribute encoder and attention mechanism
Tang et al. Real-time neural radiance talking portrait synthesis via audio-spatial decomposition
CN111369565A (en) Digital pathological image segmentation and classification method based on graph convolution network
CN111488932B (en) Self-supervision video time-space characterization learning method based on frame rate perception
CN114723760B (en) Portrait segmentation model training method and device and portrait segmentation method and device
JP2010108494A (en) Method and system for determining characteristic of face within image
CN113343950B (en) Video behavior identification method based on multi-feature fusion
Krishnan et al. SwiftSRGAN-Rethinking super-resolution for efficient and real-time inference
CN113516604B (en) Image restoration method
CN117274059A (en) Low-resolution image reconstruction method and system based on image coding-decoding
CN111950496B (en) Mask person identity recognition method
CN112906675B (en) Method and system for detecting non-supervision human body key points in fixed scene
CN117409476A (en) Gait recognition method based on event camera
CN117097853A (en) Real-time image matting method and system based on deep learning
CN112488165A (en) Infrared pedestrian identification method and system based on deep learning model
CN116152710A (en) Video instance segmentation method based on cross-frame instance association
CN115330655A (en) Image fusion method and system based on self-attention mechanism
CN115100409A (en) Video portrait segmentation algorithm based on twin network
CN110853040B (en) Image collaborative segmentation method based on super-resolution reconstruction
JP2023092185A (en) Image processing apparatus, learning method, and program
CN114511487A (en) Image fusion method and device, computer readable storage medium and terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant