CN111782857A - Footprint image retrieval method based on mixed attention dense network - Google Patents

Footprint image retrieval method based on mixed attention dense network

Info

Publication number
CN111782857A
CN111782857A CN202010710865.2A CN202010710865A CN111782857A CN 111782857 A CN111782857 A CN 111782857A CN 202010710865 A CN202010710865 A CN 202010710865A CN 111782857 A CN111782857 A CN 111782857A
Authority
CN
China
Prior art keywords
footprint
layer
output
sample
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010710865.2A
Other languages
Chinese (zh)
Other versions
CN111782857B (en)
Inventor
朱明
赵琛
陈春
王年
唐俊
张艳
鲍文霞
江畅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University filed Critical Anhui University
Priority to CN202010710865.2A priority Critical patent/CN111782857B/en
Publication of CN111782857A publication Critical patent/CN111782857A/en
Application granted granted Critical
Publication of CN111782857B publication Critical patent/CN111782857B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a footprint image retrieval method based on a mixed attention dense network, which comprises the following steps: 1. preparing a footprint image data set; 2. establishing a footprint image preprocessing module; 3. establishing an initial feature extraction module; 4. establishing a mixed attention dense network module; 5. establishing a final feature output module; 6. initializing the weights; 7. training, testing and optimizing the network. The method acquires richer feature information from footprint images and extracts features that distinguish different individuals as far as possible, thereby improving the precision and speed of footprint image retrieval.

Description

Footprint image retrieval method based on mixed attention dense network
Technical Field
The invention relates to the field of image processing, and in particular to a footprint image retrieval method based on a mixed attention dense network.
Background
Because human skeletons and walking postures differ from person to person, every person's footprints are unique and harder to disguise than traits such as fingerprints. Research on footprints therefore has considerable scientific value and can be applied to criminal investigation, security protection and similar fields.
With the rapid development of deep learning, neural networks have been widely applied in computer vision, and the introduction of deep learning methods has also advanced footprint image retrieval. In the past, footprint image retrieval was mostly performed by footprint experts, an approach that is both susceptible to personal subjectivity and slow.
Disclosure of Invention
To overcome the defects of the prior art, the invention provides a footprint image retrieval method based on a mixed attention dense network, which acquires richer feature information from footprint images and extracts features that distinguish different individuals as far as possible, thereby improving the precision and speed of footprint image retrieval.
To achieve this purpose, the invention adopts the following technical scheme:
The footprint image retrieval method based on a mixed attention dense network of the invention is characterized by comprising the following steps:
step 1: constructing a training set and a test set;
step 1.1: acquiring a footprint image containing a plurality of footprints of a test object in a walking state;
step 1.2: denoising the footprint image to obtain a processed footprint image sample;
step 1.3: cutting the footprint image sample and extracting footprint outlines with a Canny operator to obtain a footprint sample set containing several single-footprint outlines;
step 1.4: assigning each single-footprint outline in the footprint sample set a label that distinguishes its ID information;
step 1.5: repeating the steps 1.1-1.4, so as to collect a plurality of footprint images of a plurality of test objects, and carrying out corresponding processing, thereby forming a footprint data set D;
step 1.6: dividing the footprint data set D into a training set X and a test set Y, and subdividing the test set into a test query set Y_1 and a test gallery set Y_2; the training set X contains A ID classes, and the test query set and the test gallery set both contain the same B ID classes;
step 2: establishing a footprint image retrieval model based on the mixed attention dense network, wherein the model consists of a preprocessing layer, an initial feature extraction module, a mixed attention dense network module and a final feature output module;
step 2.1: the preprocessing layer normalizes the training set X to obtain a footprint sample set X′ and feeds X′ into the initial feature extraction module, where X′ = {x′_s | s = 1, 2, …, S}, x′_s is the s-th footprint sample in X′ and S is the total number of footprint samples;
step 2.2: constructing an initial feature extraction module consisting of M convolutional neural network layers, where the m-th layer comprises, in order: the m-th convolution layer, the m-th activation layer and the m-th pooling layer;
step 2.2.1: initializing the weights of all convolution layers in the initial feature extraction module with a Gaussian random initialization method;
step 2.2.2: obtaining the output Z_m of the m-th convolution layer by formula (1):
Z_m = W_m · X_m + B_m    (1)
In formula (1), X_m is the input image region to be convolved by the m-th convolution layer, B_m is the bias of the m-th convolution layer under stride S_m, and W_m is the shared weight of the m-th convolution layer;
step 2.2.3: passing the output Z_m of the m-th convolution layer through the m-th activation layer and the m-th pooling layer to obtain the output Z′_m of the m-th convolutional neural network layer, where Z′_m = {z′_ms | s = 1, 2, …, S} and z′_ms is the output of the s-th footprint sample in Z′_m;
step 2.2.4: taking the output Z′_m of the m-th layer as the input of the (m+1)-th layer, so that after the M convolutional neural network layers the initial footprint feature F is output and fed into the mixed attention dense network module, where F = {f_s | s = 1, 2, …, S} and f_s is the initial feature of the s-th footprint sample in F;
step 2.3: establishing a mixed attention dense network module consisting of K dense blocks, where adjacent dense blocks are connected by a convolution layer;
each dense block is composed of N mixed attention dense layers, and the n-th mixed attention dense layer in a dense block comprises, in order: the n-th convolution layer, the n-th mixed attention layer and the n-th concatenation layer;
step 2.3.1: initializing the weights of all convolution layers in the mixed attention dense network module with a Gaussian random initialization method;
step 2.3.2: the n-th convolution layer in a dense block obtains its output ZZ_n by formula (1), where ZZ_n = {zz_ns | s = 1, 2, …, S} and zz_ns is the output of the s-th footprint sample in ZZ_n;
according to the spatial position i and channel position c of the convolution layer output features, ZZ_n can also be written as ZZ_n = {ZZ_n^{i,c} | i = 1, 2, …, I; c = 1, 2, …, C}, where ZZ_n^{i,c} is the value at the i-th spatial position and the c-th channel of the n-th convolution layer output, I is the number of spatial positions and C is the number of channels;
step 2.3.3: obtaining the output ZZ′_n of the n-th mixed attention layer of the n-th mixed attention dense layer in a dense block by formula (2), where ZZ′_n = {zz′_ns | s = 1, 2, …, S} and zz′_ns is the output of the s-th footprint sample in ZZ′_n:
ZZ′_n^{i,c} = (1 + M_{i,c}) · ZZ_n^{i,c}    (2)
In formula (2), M_{i,c} is the weight assigned by the n-th mixed attention layer to the i-th spatial position and the c-th channel of the output ZZ_n, and is obtained by formula (3):
[Formula (3), defining the mixed attention weight M_{i,c}, appears only as an image (BDA0002596492340000031) in the original document.]
step 2.3.4: obtaining the output ZZ″_n of the n-th mixed attention dense layer by formula (4), where ZZ″_n = {zz″_ns | s = 1, 2, …, S} and zz″_ns is the output of the s-th footprint sample in ZZ″_n:
ZZ″_n = concat(ZZ′_1, ZZ′_2, …, ZZ′_n)    (4)
In formula (4), concat(·) denotes the concatenation operation;
step 2.3.5: the output of the N-th mixed attention dense layer in each dense block is processed by one convolution layer and then fed into the next dense block, so that after the K dense blocks and their corresponding convolution layers the intermediate feature F′ is output, where F′ = {f′_s | s = 1, 2, …, S} and f′_s is the intermediate feature of the s-th footprint sample in F′;
step 2.4: establishing a final feature output module consisting of a final convolution layer, a final pooling layer, two fully connected layers FC_1 and FC_2, and two parallel sub-networks;
step 2.4.1: initializing the weights of the convolution layer and the fully connected layers in the final feature output module with a Gaussian random initialization method;
step 2.4.2: the final convolution layer and the final pooling layer apply convolution and pooling to the intermediate feature F′ in turn, and the result then passes through the fully connected layers FC_1 and FC_2 to obtain the final feature F″, where F″ = {f″_s | s = 1, 2, …, S} and f″_s is the final feature of the s-th footprint sample in F″;
step 2.4.3: feeding the final feature F″ into the two parallel sub-networks;
step 2.4.3.1: the first sub-network feeds the final feature F″ into a fully connected output layer FC_3 whose number of outputs equals the number A of ID classes in the sample set, and the result is passed through a SoftMax function to output a probability set P = {p_s | s = 1, 2, …, S}, where p_s is the output probability vector of the s-th footprint sample and p_s = {p_s0, p_s1, …, p_sa, …, p_s(A-1)}, with p_sa the probability that the s-th footprint sample belongs to the a-th ID class; the index of the maximum value in {p_s0, p_s1, …, p_sa, …, p_s(A-1)} is taken as the ID identified for the s-th footprint sample;
step 2.4.3.2: the second sub-network averages the output final features F″ over each ID class to obtain the center feature set F″_c of all ID classes, where F″_c = {F″_c0, F″_c1, …, F″_ca, …, F″_c(A-1)} and F″_ca is the center feature of the a-th ID class;
step 2.4.4: back-propagating the probability set P output by the first sub-network and the center feature set F″_c output by the second sub-network through the footprint image retrieval model to adaptively update the corresponding network parameters, thereby obtaining the trained footprint image retrieval model;
step 3: identifying the ID information of a new sample;
step 3.1: feeding the test query set Y_1 and the test gallery set Y_2, which contain the same ID classes, into the footprint image retrieval model, and outputting a query feature set and a gallery feature set, each covering several ID classes;
step 3.2: computing the Euclidean distances between the features of each ID class in the query feature set and the features of each ID class in the gallery feature set, sorting them in ascending order, and setting a retrieval threshold according to the sorting result;
step 3.3: inputting any footprint sample to be identified into the footprint image retrieval model and outputting its final feature to be identified; computing the Euclidean distances between this feature and the gallery feature set output for the test gallery set Y_2, and taking the ID class whose distance is smaller than the retrieval threshold as the ID information of the footprint sample to be identified.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention combines image processing, deep learning and footprint image retrieval into a complete footprint image retrieval framework. In terms of image processing, a full preprocessing pipeline is applied to the footprint images so that the footprint samples are optimized; in terms of network structure, the footprint image retrieval model consists of a preprocessing layer, an initial feature extraction module, a mixed attention dense network module and a final feature output module.
2. The image processing stage cleans the footprint samples by removing background noise while preserving the information of the original footprint image as completely as possible.
3. The preprocessing layer resizes the footprint images in the footprint sample set to a uniform size so that they can all be fed into the neural network for training, and normalizes the images, which shortens training time and makes the resulting model better suited to practical conditions.
4. The mixed attention dense network module extracts feature information more effectively. The module combines a densely connected convolutional network with a hybrid attention mechanism. Each layer in the densely connected convolutional network is directly connected to the layers before it, so feature information is reused; at the same time, each layer is kept narrow, which reduces redundant information. The hybrid attention mechanism combines the strengths of spatial attention and channel attention and extracts more representative feature information.
Drawings
FIG. 1 is an overall flow chart of the footprint image retrieval in the present invention;
FIG. 2 is a structural diagram of the mixed attention dense network in the present invention;
FIG. 3 is a diagram of a dense block of the mixed attention dense network in the present invention.
Detailed Description
In this embodiment, a footprint image retrieval method based on a mixed attention dense network mainly uses the mixed attention dense network to extract feature information from footprint images. Through training, the neural network learns to extract detailed feature information from the footprint images, which are then retrieved, so that retrieval speed is increased and retrieval accuracy is greatly improved.
The data set used by the invention contains 3,500 walking-footprint records, which yield 35,000 single-footprint images after preprocessing; it covers 100 persons in total, each with at least 35 footprint records, and every image carries a person ID label. As shown in FIG. 1, the whole process can be divided into the following steps:
step 1, taking a continuous footprint image of any test object in a walking state, and carrying out preprocessing operations of denoising and normalization to obtain a processed footprint image sample.
Step 2: cutting the footprint sample from step 1, which contains several footprints, and extracting the footprint outlines with a Canny operator to obtain a footprint sample set containing several single-footprint outlines. The invention designs a cutting algorithm: the pixel values of each column of a footprint image sample are counted, columns whose average pixel value is less than ten are treated as gaps in the footprint image, and when a run of consecutive gap columns exceeds a set threshold, the column at the center of that run is taken as the cutting column. This algorithm divides one walking-footprint image into a set of separate single-footprint images.
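For illustration, a minimal Python/OpenCV sketch of this cutting step is given here, assuming a grayscale walking-footprint image in which footprints are bright on a dark background; the column threshold of ten follows the description above, while the minimum gap-run length, the Canny thresholds and the function names are illustrative assumptions rather than values from the patent.

```python
import cv2

def split_footprint_strip(strip, gap_mean_thresh=10, min_gap_run=30):
    """Split a walking-footprint strip (2-D grayscale array) into single-footprint crops.

    Columns whose average pixel value is below gap_mean_thresh are treated as gaps;
    every run of consecutive gap columns longer than min_gap_run is cut at its
    center column, as described in step 2.
    """
    col_mean = strip.mean(axis=0)              # average pixel value of each column
    is_gap = col_mean < gap_mean_thresh
    cut_cols, run_start = [], None
    for x, gap in enumerate(is_gap):
        if gap and run_start is None:
            run_start = x
        elif not gap and run_start is not None:
            if x - run_start >= min_gap_run:
                cut_cols.append((run_start + x) // 2)   # center of the gap run
            run_start = None
    bounds = [0] + cut_cols + [strip.shape[1]]
    return [strip[:, a:b] for a, b in zip(bounds[:-1], bounds[1:]) if b - a > 1]

def footprint_outline(single_footprint):
    """Extract the footprint outline with the Canny operator."""
    blurred = cv2.GaussianBlur(single_footprint, (5, 5), 0)
    return cv2.Canny(blurred, 50, 150)
```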
Step 3: assigning each footprint sample in the footprint sample set from step 2 a label that distinguishes its ID information;
Step 4: repeating steps 1 to 3 to collect continuous walking footprint images of multiple test objects and process them accordingly, thereby forming a footprint data set D;
Step 5: dividing the data set D into three parts in the ratio 9:4:2, where the first part is the training set X, the second part is the test gallery set Y_2, and the third part is the test query set Y_1. No person in the first part appears in the second or third part, while the second and third parts contain different samples of the same persons.
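A sketch of this 9:4:2 split under the stated constraints (training persons disjoint from test persons; gallery and query holding different samples of the same persons). Whether the ratio is applied over persons or over images is not spelled out here, so the person-level split below is an assumption, and all names are illustrative.

```python
import random
from collections import defaultdict

def split_dataset(samples, ratios=(9, 4, 2), seed=0):
    """samples: list of (image_path, person_id) pairs.
    Returns (train, gallery, query) with training persons disjoint from test persons."""
    rng = random.Random(seed)
    by_id = defaultdict(list)
    for path, pid in samples:
        by_id[pid].append(path)
    ids = sorted(by_id)
    rng.shuffle(ids)

    train_share, gallery_share, query_share = ratios
    n_train_ids = len(ids) * train_share // sum(ratios)
    train_ids, test_ids = ids[:n_train_ids], ids[n_train_ids:]

    train = [(p, pid) for pid in train_ids for p in by_id[pid]]
    gallery, query = [], []
    for pid in test_ids:                        # same persons, different images
        paths = by_id[pid]
        rng.shuffle(paths)
        n_gal = len(paths) * gallery_share // (gallery_share + query_share)
        gallery += [(p, pid) for p in paths[:n_gal]]
        query += [(p, pid) for p in paths[n_gal:]]
    return train, gallery, query
```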
Step 6: feeding the footprint data set into the footprint image retrieval model based on the mixed attention dense network for training; through the preprocessing layer, the initial feature extraction module, the mixed attention dense network module and the final feature output module, a pre-trained footprint image retrieval model based on the mixed attention dense network is obtained. The initial feature extraction module extracts the initial feature information of the footprint images, the mixed attention dense network module processes this initial feature information, and the final feature output module integrates the features and outputs rich yet compact final feature information, so that the precision and speed of footprint image retrieval are greatly improved.
As shown in FIG. 2, the footprint image retrieval model is composed of a preprocessing layer, an initial feature extraction module, a mixed attention dense network module and a final feature output module:
step 6.1: the preprocessing layer normalizes the footprint image training set X to obtain a footprint sample set X′ and feeds X′ into the initial feature extraction module, where X′ = {x′_s | s = 1, 2, …, S}, x′_s is the s-th footprint sample in X′ and S is the total number of footprint samples;
step 6.2: constructing an initial feature extraction module consisting of M convolutional neural network layers, where the m-th layer comprises, in order: the m-th convolution layer, the m-th activation layer and the m-th pooling layer;
step 6.2.1: initializing the weights of all convolution layers in the initial feature extraction module with a Gaussian random initialization method;
step 6.2.2: obtaining the output Z_m of the m-th convolution layer by formula (1):
Z_m = W_m · X_m + B_m    (1)
In formula (1), X_m is the input image region to be convolved by the m-th convolution layer, B_m is the bias of the m-th convolution layer under stride S_m, and W_m is the shared weight of the m-th convolution layer;
step 6.2.3: passing the output Z_m of the m-th convolution layer through the m-th activation layer and the m-th pooling layer to obtain the output Z′_m of the m-th convolutional neural network layer, where Z′_m = {z′_ms | s = 1, 2, …, S} and z′_ms is the output of the s-th footprint sample in Z′_m;
step 6.2.4: taking the output Z′_m of the m-th layer as the input of the (m+1)-th layer, so that after the M convolutional neural network layers the initial footprint feature F is output and fed into the mixed attention dense network module, where F = {f_s | s = 1, 2, …, S} and f_s is the initial feature of the s-th footprint sample in F;
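A minimal PyTorch sketch of the initial feature extraction module of steps 6.2 to 6.2.4, assuming ReLU activations and max pooling; the number of layers M, the channel widths and the kernel sizes are placeholders rather than values given in the patent, while the Gaussian weight initialization follows step 6.2.1.

```python
import torch.nn as nn

class InitialFeatureExtractor(nn.Module):
    """M stacked (convolution -> activation -> pooling) layers, as in step 6.2."""
    def __init__(self, channels=(3, 32, 64)):      # placeholder widths; M = len(channels) - 1
        super().__init__()
        layers = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            layers += [
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),  # Z_m = W_m . X_m + B_m, formula (1)
                nn.ReLU(inplace=True),                             # m-th activation layer
                nn.MaxPool2d(kernel_size=2),                       # m-th pooling layer
            ]
        self.body = nn.Sequential(*layers)
        for mod in self.modules():                  # Gaussian random initialization (step 6.2.1)
            if isinstance(mod, nn.Conv2d):
                nn.init.normal_(mod.weight, mean=0.0, std=0.01)
                nn.init.zeros_(mod.bias)

    def forward(self, x):       # x: normalized footprint samples X'
        return self.body(x)     # initial footprint features F
```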
step 6.3: establishing a mixed attention dense network module consisting of K dense blocks, where adjacent dense blocks are connected by a convolution layer;
each dense block is composed of N mixed attention dense layers, and the n-th mixed attention dense layer in a dense block comprises, in order: the n-th convolution layer, the n-th mixed attention layer and the n-th concatenation layer;
step 6.3.1: initializing the weights of all convolution layers in the mixed attention dense network module with a Gaussian random initialization method;
step 6.3.2: the n-th convolution layer in a dense block obtains its output ZZ_n by formula (1), where ZZ_n = {zz_ns | s = 1, 2, …, S} and zz_ns is the output of the s-th footprint sample in ZZ_n;
according to the spatial position i and channel position c of the convolution layer output features, ZZ_n can also be written as ZZ_n = {ZZ_n^{i,c} | i = 1, 2, …, I; c = 1, 2, …, C}, where ZZ_n^{i,c} is the value at the i-th spatial position and the c-th channel of the n-th convolution layer output, I is the number of spatial positions and C is the number of channels;
step 6.3.3: obtaining the output ZZ′_n of the n-th mixed attention layer of the n-th mixed attention dense layer in a dense block by formula (2), where ZZ′_n = {zz′_ns | s = 1, 2, …, S} and zz′_ns is the output of the s-th footprint sample in ZZ′_n:
ZZ′_n^{i,c} = (1 + M_{i,c}) · ZZ_n^{i,c}    (2)
In formula (2), M_{i,c} is the weight assigned by the n-th mixed attention layer to the i-th spatial position and the c-th channel of the output ZZ_n, and is obtained by formula (3):
[Formula (3), defining the mixed attention weight M_{i,c}, appears only as an image (BDA0002596492340000071) in the original document.]
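Because formula (3) for the attention weight M_{i,c} is only available as an image in this text, the sketch below substitutes a common hybrid design that combines channel attention (global pooling plus a small bottleneck) with spatial attention (a 1x1 convolution); it is an assumption standing in for formula (3), not the patent's exact formula, but it applies the resulting weight exactly as in formula (2).

```python
import torch
import torch.nn as nn

class HybridAttention(nn.Module):
    """Mixed (channel + spatial) attention producing a weight M per position and channel."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.channel_branch = nn.Sequential(          # channel attention branch
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1),
        )
        self.spatial_branch = nn.Conv2d(channels, 1, kernel_size=1)  # spatial attention branch

    def forward(self, zz_n):
        # M broadcasts over (batch, C, H, W); the sigmoid keeps it in (0, 1)
        m = torch.sigmoid(self.channel_branch(zz_n) + self.spatial_branch(zz_n))
        return (1.0 + m) * zz_n        # formula (2): ZZ'_n = (1 + M) . ZZ_n
```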
step 6.3.4: obtaining the output ZZ″_n of the n-th mixed attention dense layer by formula (4), where ZZ″_n = {zz″_ns | s = 1, 2, …, S} and zz″_ns is the output of the s-th footprint sample in ZZ″_n:
ZZ″_n = concat(ZZ′_1, ZZ′_2, …, ZZ′_n)    (4)
In formula (4), concat(·) denotes the concatenation operation;
step 6.3.5: the output of the N-th mixed attention dense layer in each dense block is processed by one convolution layer and then fed into the next dense block, so that after the K dense blocks and their corresponding convolution layers the intermediate feature F′ is output, where F′ = {f′_s | s = 1, 2, …, S} and f′_s is the intermediate feature of the s-th footprint sample in F′;
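A sketch of one dense block built from N mixed-attention dense layers (steps 6.3.2 to 6.3.5), reusing the HybridAttention module sketched above: each layer convolves its input, applies the attention of formula (2), and concatenates its output with the outputs of all earlier layers as in formula (4). The growth rate and layer count are placeholders; in the full module, K such blocks are chained, each followed by a transition convolution as described in step 6.3.5.

```python
import torch
import torch.nn as nn

class MixedAttentionDenseBlock(nn.Module):
    """One dense block of N (convolution -> mixed attention -> concatenation) layers."""
    def __init__(self, in_channels, growth_rate=16, num_layers=4):   # placeholder sizes
        super().__init__()
        self.convs, self.attns = nn.ModuleList(), nn.ModuleList()
        channels = in_channels
        for n in range(num_layers):
            self.convs.append(nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1))
            self.attns.append(HybridAttention(growth_rate))
            channels = growth_rate * (n + 1)      # width of concat(ZZ'_1, ..., ZZ'_{n+1})
        self.out_channels = channels

    def forward(self, x):
        attn_outputs, layer_input = [], x
        for conv, attn in zip(self.convs, self.attns):
            zz_n = conv(layer_input)                       # n-th convolution layer
            attn_outputs.append(attn(zz_n))                # ZZ'_n, formula (2)
            layer_input = torch.cat(attn_outputs, dim=1)   # ZZ''_n, formula (4)
        return layer_input                                 # output of the N-th layer
```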
step 6.4: establishing a final feature output module consisting of a final convolution layer, a final pooling layer, two fully connected layers FC_1 and FC_2, and two parallel sub-networks;
step 6.4.1: initializing the weights of the convolution layer and the fully connected layers in the final feature output module with a Gaussian random initialization method;
step 6.4.2: the final convolution layer and the final pooling layer apply convolution and pooling to the intermediate feature F′ in turn, and the result then passes through the fully connected layers FC_1 and FC_2 to obtain the final feature F″, where F″ = {f″_s | s = 1, 2, …, S} and f″_s is the final feature of the s-th footprint sample in F″;
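A sketch of the feature head described in steps 6.4.1 and 6.4.2 (final convolution, final pooling, FC_1 and FC_2), with placeholder dimensions and an assumed ReLU between the two fully connected layers; the Gaussian initialization follows step 6.4.1.

```python
import torch
import torch.nn as nn

class FinalFeatureHead(nn.Module):
    """Final convolution + pooling + FC_1 + FC_2 producing the final feature F'' (step 6.4.2)."""
    def __init__(self, in_channels, feat_dim=256):          # placeholder dimensions
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1)  # final convolution layer
        self.pool = nn.AdaptiveAvgPool2d(1)                 # final pooling layer
        self.fc1 = nn.Linear(in_channels, 512)              # FC_1
        self.fc2 = nn.Linear(512, feat_dim)                 # FC_2 -> final feature f''_s
        for mod in self.modules():                          # Gaussian initialization (step 6.4.1)
            if isinstance(mod, (nn.Conv2d, nn.Linear)):
                nn.init.normal_(mod.weight, mean=0.0, std=0.01)
                nn.init.zeros_(mod.bias)

    def forward(self, f_prime):                              # f_prime: intermediate features F'
        x = self.pool(self.conv(f_prime)).flatten(1)
        return self.fc2(torch.relu(self.fc1(x)))             # final feature F''
```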
step 6.4.3: feeding the final feature F″ into the two parallel sub-networks;
step 6.4.3.1: the first sub-network feeds the final feature F″ into a fully connected output layer FC_3 whose number of outputs equals the number A of ID classes in the sample set, and the result is passed through a SoftMax function to output a probability set P = {p_s | s = 1, 2, …, S}, where p_s is the output probability vector of the s-th footprint sample and p_s = {p_s0, p_s1, …, p_sa, …, p_s(A-1)}, with p_sa the probability that the s-th footprint sample belongs to the a-th ID class; the index of the maximum value in {p_s0, p_s1, …, p_sa, …, p_s(A-1)} is taken as the ID identified for the s-th footprint sample;
step 6.4.3.2: the second sub-network averages the output final features F″ over each ID class to obtain the center feature set F″_c of all ID classes, where F″_c = {F″_c0, F″_c1, …, F″_ca, …, F″_c(A-1)} and F″_ca is the center feature of the a-th ID class;
step 6.4.4: the probability set P output by the first sub-network is matched with a cross-entropy loss function and the center feature set F″_c output by the second sub-network is matched with a center loss function; both losses are back-propagated through the footprint image retrieval model to adaptively update the corresponding network parameters, so that each footprint sample is assigned to the correct ID class as far as possible and the Euclidean distances between final features of footprint samples with the same ID are as small as possible, thereby obtaining the trained footprint image retrieval model;
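A sketch of the two parallel sub-networks and the joint objective of steps 6.4.3 to 6.4.4: an FC_3 + SoftMax classification branch trained with cross-entropy, and a center loss that pulls the final features of samples sharing an ID toward that ID's center feature. The patent obtains the center features by averaging per ID; the learnable centers and the loss weight below are common substitutes and are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IDHead(nn.Module):
    """FC_3 classification branch plus per-ID center features for the center loss."""
    def __init__(self, feat_dim, num_ids):                     # num_ids = A ID classes
        super().__init__()
        self.fc3 = nn.Linear(feat_dim, num_ids)                # FC_3
        self.centers = nn.Parameter(torch.zeros(num_ids, feat_dim))  # F''_c, one center per ID

    def forward(self, final_feat, labels, center_weight=0.005):      # placeholder loss weight
        logits = self.fc3(final_feat)
        ce_loss = F.cross_entropy(logits, labels)              # SoftMax + cross-entropy (first sub-network)
        center_loss = ((final_feat - self.centers[labels]) ** 2).sum(dim=1).mean()  # second sub-network
        pred_id = logits.argmax(dim=1)                         # index of the largest probability in p_s
        return ce_loss + center_weight * center_loss, pred_id
```

Back-propagating the summed loss trains the model so that samples are classified into the correct ID while features of the same ID stay close, as step 6.4.4 requires.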
Step 7: identifying the ID information of a new sample;
step 7.1: feeding the test query set Y_1 and the test gallery set Y_2, which contain the same ID classes, into the footprint image retrieval model, and outputting a final query feature set and a final gallery feature set, each covering several ID classes;
step 7.2: computing the Euclidean distances between the features of each ID class in the query feature set and the features of each ID class in the gallery feature set, sorting them in ascending order, and setting a suitable retrieval threshold according to the sorting result;
step 7.3: inputting any footprint sample to be identified into the footprint image retrieval model and outputting its final feature to be identified; computing the Euclidean distances between this feature and the gallery feature set output for the test gallery set Y_2, and taking the ID class whose distance is smaller than the retrieval threshold as the ID information of the footprint sample to be identified.
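A sketch of the retrieval stage of steps 7.1 to 7.3, assuming that the gallery and query features have already been extracted by the trained model; torch.cdist gives pairwise Euclidean distances, the ascending sort mirrors step 7.2, and the threshold and variable names are illustrative.

```python
import torch

def rank_distances(query_feats, gallery_feats):
    """Pairwise Euclidean distances between query and gallery features, sorted ascending (step 7.2)."""
    dists = torch.cdist(query_feats, gallery_feats)        # shape: (num_query, num_gallery)
    return dists.sort(dim=1)                               # used to choose the retrieval threshold

def identify(sample_feat, gallery_feats, gallery_ids, threshold):
    """Return the gallery IDs whose features lie within the retrieval threshold (step 7.3)."""
    dists = torch.cdist(sample_feat.unsqueeze(0), gallery_feats).squeeze(0)
    matched = torch.nonzero(dists < threshold).flatten().tolist()
    return sorted({gallery_ids[i] for i in matched})
```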

Claims (1)

1. A footprint image retrieval method based on a mixed attention dense network, characterized by comprising the following steps:
step 1: constructing a training set and a test set;
step 1.1: acquiring a footprint image containing a plurality of footprints of a test object in a walking state;
step 1.2: denoising the footprint image to obtain a processed footprint image sample;
step 1.3: cutting the footprint image sample and extracting footprint outlines with a Canny operator to obtain a footprint sample set containing several single-footprint outlines;
step 1.4: assigning each single-footprint outline in the footprint sample set a label that distinguishes its ID information;
step 1.5: repeating the steps 1.1-1.4, so as to collect a plurality of footprint images of a plurality of test objects, and carrying out corresponding processing, thereby forming a footprint data set D;
step 1.6: dividing the footprint data set D into a training set X and a test set Y, and subdividing the test set into a test query set Y_1 and a test gallery set Y_2; the training set X contains A ID classes, and the test query set and the test gallery set both contain the same B ID classes;
step 2: establishing a footprint image retrieval model based on the mixed attention dense network, wherein the model consists of a preprocessing layer, an initial feature extraction module, a mixed attention dense network module and a final feature output module;
step 2.1: the preprocessing layer normalizes the training set X to obtain a footprint sample set X′ and feeds X′ into the initial feature extraction module, where X′ = {x′_s | s = 1, 2, …, S}, x′_s is the s-th footprint sample in X′ and S is the total number of footprint samples;
step 2.2: constructing an initial feature extraction module consisting of M convolutional neural network layers, where the m-th layer comprises, in order: the m-th convolution layer, the m-th activation layer and the m-th pooling layer;
step 2.2.1: initializing the weights of all convolution layers in the initial feature extraction module with a Gaussian random initialization method;
step 2.2.2: obtaining the output Z_m of the m-th convolution layer by formula (1):
Z_m = W_m · X_m + B_m    (1)
In formula (1), X_m is the input image region to be convolved by the m-th convolution layer, B_m is the bias of the m-th convolution layer under stride S_m, and W_m is the shared weight of the m-th convolution layer;
step 2.2.3: passing the output Z_m of the m-th convolution layer through the m-th activation layer and the m-th pooling layer to obtain the output Z′_m of the m-th convolutional neural network layer, where Z′_m = {z′_ms | s = 1, 2, …, S} and z′_ms is the output of the s-th footprint sample in Z′_m;
step 2.2.4: taking the output Z′_m of the m-th layer as the input of the (m+1)-th layer, so that after the M convolutional neural network layers the initial footprint feature F is output and fed into the mixed attention dense network module, where F = {f_s | s = 1, 2, …, S} and f_s is the initial feature of the s-th footprint sample in F;
step 2.3: establishing a mixed attention dense network module consisting of K dense blocks, where adjacent dense blocks are connected by a convolution layer;
each dense block is composed of N mixed attention dense layers, and the n-th mixed attention dense layer in a dense block comprises, in order: the n-th convolution layer, the n-th mixed attention layer and the n-th concatenation layer;
step 2.3.1: initializing the weights of all convolution layers in the mixed attention dense network module with a Gaussian random initialization method;
step 2.3.2: the n-th convolution layer in a dense block obtains its output ZZ_n by formula (1), where ZZ_n = {zz_ns | s = 1, 2, …, S} and zz_ns is the output of the s-th footprint sample in ZZ_n;
according to the spatial position i and channel position c of the convolution layer output features, ZZ_n can also be written as ZZ_n = {ZZ_n^{i,c} | i = 1, 2, …, I; c = 1, 2, …, C}, where ZZ_n^{i,c} is the value at the i-th spatial position and the c-th channel of the n-th convolution layer output, I is the number of spatial positions and C is the number of channels;
step 2.3.3: obtaining the output ZZ′_n of the n-th mixed attention layer of the n-th mixed attention dense layer in a dense block by formula (2), where ZZ′_n = {zz′_ns | s = 1, 2, …, S} and zz′_ns is the output of the s-th footprint sample in ZZ′_n:
ZZ′_n^{i,c} = (1 + M_{i,c}) · ZZ_n^{i,c}    (2)
In formula (2), M_{i,c} is the weight assigned by the n-th mixed attention layer to the i-th spatial position and the c-th channel of the output ZZ_n, and is obtained by formula (3):
[Formula (3), defining the mixed attention weight M_{i,c}, appears only as an image (FDA0002596492330000021) in the original document.]
step 2.3.4: obtaining the output ZZ″_n of the n-th mixed attention dense layer by formula (4), where ZZ″_n = {zz″_ns | s = 1, 2, …, S} and zz″_ns is the output of the s-th footprint sample in ZZ″_n:
ZZ″_n = concat(ZZ′_1, ZZ′_2, …, ZZ′_n)    (4)
In formula (4), concat(·) denotes the concatenation operation;
step 2.3.5: the output of the N-th mixed attention dense layer in each dense block is processed by one convolution layer and then fed into the next dense block, so that after the K dense blocks and their corresponding convolution layers the intermediate feature F′ is output, where F′ = {f′_s | s = 1, 2, …, S} and f′_s is the intermediate feature of the s-th footprint sample in F′;
step 2.4: establishing a final feature output module consisting of a final convolution layer, a final pooling layer, two fully connected layers FC_1 and FC_2, and two parallel sub-networks;
step 2.4.1: initializing the weights of the convolution layer and the fully connected layers in the final feature output module with a Gaussian random initialization method;
step 2.4.2: the final convolution layer and the final pooling layer apply convolution and pooling to the intermediate feature F′ in turn, and the result then passes through the fully connected layers FC_1 and FC_2 to obtain the final feature F″, where F″ = {f″_s | s = 1, 2, …, S} and f″_s is the final feature of the s-th footprint sample in F″;
step 2.4.3: feeding the final feature F″ into the two parallel sub-networks;
step 2.4.3.1: the first sub-network feeds the final feature F″ into a fully connected output layer FC_3 whose number of outputs equals the number A of ID classes in the sample set, and the result is passed through a SoftMax function to output a probability set P = {p_s | s = 1, 2, …, S}, where p_s is the output probability vector of the s-th footprint sample and p_s = {p_s0, p_s1, …, p_sa, …, p_s(A-1)}, with p_sa the probability that the s-th footprint sample belongs to the a-th ID class; the index of the maximum value in {p_s0, p_s1, …, p_sa, …, p_s(A-1)} is taken as the ID identified for the s-th footprint sample;
step 2.4.3.2: the second sub-network averages the output final features F″ over each ID class to obtain the center feature set F″_c of all ID classes, where F″_c = {F″_c0, F″_c1, …, F″_ca, …, F″_c(A-1)} and F″_ca is the center feature of the a-th ID class;
step 2.4.4: back-propagating the probability set P output by the first sub-network and the center feature set F″_c output by the second sub-network through the footprint image retrieval model to adaptively update the corresponding network parameters, thereby obtaining the trained footprint image retrieval model;
step 3: identifying the ID information of a new sample;
step 3.1: feeding the test query set Y_1 and the test gallery set Y_2, which contain the same ID classes, into the footprint image retrieval model, and outputting a query feature set and a gallery feature set, each covering several ID classes;
step 3.2: computing the Euclidean distances between the features of each ID class in the query feature set and the features of each ID class in the gallery feature set, sorting them in ascending order, and setting a retrieval threshold according to the sorting result;
step 3.3: inputting any footprint sample to be identified into the footprint image retrieval model and outputting its final feature to be identified; computing the Euclidean distances between this feature and the gallery feature set output for the test gallery set Y_2, and taking the ID class whose distance is smaller than the retrieval threshold as the ID information of the footprint sample to be identified.
CN202010710865.2A 2020-07-22 2020-07-22 Footprint image retrieval method based on mixed attention-dense network Active CN111782857B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010710865.2A CN111782857B (en) 2020-07-22 2020-07-22 Footprint image retrieval method based on mixed attention-dense network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010710865.2A CN111782857B (en) 2020-07-22 2020-07-22 Footprint image retrieval method based on mixed attention-dense network

Publications (2)

Publication Number Publication Date
CN111782857A true CN111782857A (en) 2020-10-16
CN111782857B CN111782857B (en) 2023-11-03

Family

ID=72763921

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010710865.2A Active CN111782857B (en) 2020-07-22 2020-07-22 Footprint image retrieval method based on mixed attention-dense network

Country Status (1)

Country Link
CN (1) CN111782857B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257662A (en) * 2020-11-12 2021-01-22 安徽大学 Pressure footprint image retrieval system based on deep learning
CN113656623A (en) * 2021-08-17 2021-11-16 安徽大学 Time sequence shift and multi-branch space-time enhancement network-based stepping footprint image retrieval method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150023471A1 (en) * 2012-03-06 2015-01-22 Koninklijke Philips N.V. Stereo x-ray tube based suppression of outside body high contrast objects
WO2019237567A1 (en) * 2018-06-14 2019-12-19 江南大学 Convolutional neural network based tumble detection method
CN111177446A (en) * 2019-12-12 2020-05-19 苏州科技大学 Method for searching footprint image

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150023471A1 (en) * 2012-03-06 2015-01-22 Koninklijke Philips N.V. Stereo x-ray tube based suppression of outside body high contrast objects
WO2019237567A1 (en) * 2018-06-14 2019-12-19 江南大学 Convolutional neural network based tumble detection method
CN111177446A (en) * 2019-12-12 2020-05-19 苏州科技大学 Method for searching footprint image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈扬; 曾诚; 程成; 邹恩岑; 顾建伟; 陆悠; 奚雪峰: "A footprint image retrieval and matching method based on CNN", Journal of Nanjing Normal University (Engineering and Technology Edition), no. 03 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112257662A (en) * 2020-11-12 2021-01-22 安徽大学 Pressure footprint image retrieval system based on deep learning
CN113656623A (en) * 2021-08-17 2021-11-16 安徽大学 Time sequence shift and multi-branch space-time enhancement network-based stepping footprint image retrieval method

Also Published As

Publication number Publication date
CN111782857B (en) 2023-11-03

Similar Documents

Publication Publication Date Title
CN108615010B (en) Facial expression recognition method based on parallel convolution neural network feature map fusion
CN106326886B (en) Finger vein image quality appraisal procedure based on convolutional neural networks
CN108830157B (en) Human behavior identification method based on attention mechanism and 3D convolutional neural network
CN106203395B (en) Face attribute recognition method based on multitask deep learning
CN111814661B (en) Human body behavior recognition method based on residual error-circulating neural network
CN108764072B (en) Blood cell subtype image classification method based on multi-scale fusion
CN112801040B (en) Lightweight unconstrained facial expression recognition method and system embedded with high-order information
CN110082821B (en) Label-frame-free microseism signal detection method and device
CN109902615B (en) Multi-age-group image generation method based on countermeasure network
CN112818764A (en) Low-resolution image facial expression recognition method based on feature reconstruction model
CN113221694B (en) Action recognition method
CN112347908B (en) Surgical instrument image identification method based on space grouping attention model
CN111782857A Footprint image retrieval method based on mixed attention dense network
CN106503616A (en) A kind of Mental imagery Method of EEG signals classification of the learning machine that transfinited based on layering
CN111368734B (en) Micro expression recognition method based on normal expression assistance
CN115966010A (en) Expression recognition method based on attention and multi-scale feature fusion
CN116012653A (en) Method and system for classifying hyperspectral images of attention residual unit neural network
CN114170657A (en) Facial emotion recognition method integrating attention mechanism and high-order feature representation
CN110826534B (en) Face key point detection method and system based on local principal component analysis
CN115965864A (en) Lightweight attention mechanism network for crop disease identification
CN114863572A (en) Myoelectric gesture recognition method of multi-channel heterogeneous sensor
CN113705713B (en) Text recognition method based on global and local attention mechanisms
CN114495163A (en) Pedestrian re-identification generation learning method based on category activation mapping
CN113111797A (en) Cross-view gait recognition method combining self-encoder and view transformation model
CN113255543A (en) Facial expression recognition method based on graph convolution network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant