CN111782857A - Footprint image retrieval method based on mixed attention dense network - Google Patents
Footprint image retrieval method based on mixed attention dense network
- Publication number: CN111782857A (application CN202010710865.2A)
- Authority: CN (China)
- Prior art keywords: footprint, layer, output, sample, information
- Prior art date: 2020-07-22
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F16/583: Information retrieval of still image data; retrieval characterised by using metadata automatically derived from the content
- G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/045: Neural networks; combinations of networks
- G06N3/084: Neural network learning methods; backpropagation, e.g. using gradient descent
- G06V10/30: Image or video recognition or understanding; image preprocessing; noise filtering
- G06V10/44: Extraction of image or video features; local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
- Y02D10/00: Climate change mitigation technologies in ICT; energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a footprint image retrieval method based on a mixed attention dense network, which comprises the following steps: 1. preparing a footprint image data set; 2. establishing a footprint image preprocessing module; 3. establishing an initial feature extraction module; 4. establishing a mixed attention dense network module; 5. establishing a final feature output module; 6. initializing the weights; 7. training, testing and optimizing the network. The method can acquire richer feature information from the footprint image and extract feature information that discriminates between different individuals as far as possible, thereby improving the precision and speed of footprint image retrieval.
Description
Technical Field
The invention relates to the field of image processing, and in particular to a footprint image retrieval method based on a mixed attention dense network.
Background
Due to differences in human skeletons and walking postures, every person's footprints are unique and are more difficult to disguise than characteristics such as fingerprints. Research on footprints therefore has great scientific value and can be applied to criminal investigation, security protection and the like.
With the rapid development of deep learning, neural networks are widely applied in the field of computer vision, and footprint image retrieval has also advanced with the introduction of deep learning methods. In the past, footprint image retrieval was mostly performed by footprint experts, a practice that is easily influenced by personal subjectivity and is also slow.
Disclosure of Invention
The invention provides a footprint image retrieval method based on a mixed attention dense network to overcome the shortcomings of the prior art, so that richer feature information of the footprint image can be acquired and feature information that discriminates between different individuals can be extracted as far as possible, thereby improving the precision and speed of footprint image retrieval.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention relates to a footprint image retrieval method based on a mixed attention dense network, which is characterized by comprising the following steps:
step 1: constructing a training set and a test set;
step 1.1: acquiring a footprint image containing a plurality of footprints of a test object in a walking state;
step 1.2: denoising the footprint image to obtain a processed footprint image sample;
step 1.3: cutting the footprint image sample, and extracting the outline of the footprint image with the Canny operator to obtain a footprint sample set containing a plurality of single footprint outlines (a preprocessing sketch is given after step 1.6);
step 1.4: defining a tag for each individual footprint outline in the footprint sample set that can distinguish different ID information;
step 1.5: repeating the steps 1.1-1.4, so as to collect a plurality of footprint images of a plurality of test objects, and carrying out corresponding processing, thereby forming a footprint data set D;
step 1.6: dividing the footprint data set D into a training set X and a test set Y, and subdividing the test set into a test query set Y1 and a test gallery set Y2; the training set X contains A kinds of ID information, and the test query set and the test gallery set both contain the same B kinds of ID information;
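A minimal preprocessing sketch covering steps 1.1 to 1.4 follows; it assumes OpenCV 4, and the file path, denoising strength, Canny thresholds and minimum contour area are illustrative values rather than values fixed by the patent.

```python
import cv2


def preprocess_footprint_image(path, person_id, min_area=500):
    """Steps 1.1-1.4 sketch: denoise a multi-footprint image, extract
    contours with the Canny operator, and return single-footprint crops
    labelled with the person's ID. Thresholds are illustrative only."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Step 1.2: denoise the acquired footprint image
    denoised = cv2.fastNlMeansDenoising(img, None, 10)
    # Step 1.3: Canny contour extraction (OpenCV 4 returns contours, hierarchy)
    edges = cv2.Canny(denoised, 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    samples = []
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)
        if w * h < min_area:          # skip tiny noise contours
            continue
        crop = denoised[y:y + h, x:x + w]
        # Step 1.4: attach the label that distinguishes different ID information
        samples.append({"image": crop, "id": person_id})
    return samples
```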
step 2: establishing a footprint image retrieval model based on a mixed attention dense network, wherein the footprint image retrieval model consists of a preprocessing layer, an initial feature extraction module, a mixed attention dense network module and a final feature output module;
step 2.1: the preprocessing layer normalizes the training set X to obtain a footprint sample set X' and inputs it into the initial feature extraction module, where X' = {x'_s | s = 1, 2, ..., S}, x'_s is the s-th footprint sample in X', and S is the total number of footprint samples;
step 2.2: constructing an initial feature extraction module consisting of M layers of convolutional neural networks, where the m-th layer sequentially comprises the m-th convolution layer, the m-th activation layer and the m-th pooling layer;
step 2.2.1: initializing the weights of all convolution layers in the initial feature extraction module with a Gaussian random initialization method;
step 2.2.2: obtaining the output Z_m of the m-th convolution layer by formula (1):
Z_m = W_m · X_m + B_m    (1)
in formula (1), X_m is the input image of the region to be convolved by the m-th convolution layer, B_m is the bias of the m-th convolution layer under stride S_m, and W_m is the shared weight of the m-th convolution layer;
step 2.2.3: the output Z_m of the m-th convolution layer passes through the m-th activation layer and the m-th pooling layer to obtain the output Z'_m of the m-th layer of the convolutional neural network, where Z'_m = {z'_ms | s = 1, 2, ..., S} and z'_ms is the output of the s-th footprint sample in Z'_m;
step 2.2.4: the output Z'_m of the m-th layer is used as the input of the (m+1)-th layer; after processing by all M layers of the convolutional neural network, the initial footprint feature F is output and fed into the mixed attention dense network module, where F = {f_s | s = 1, 2, ..., S} and f_s is the initial feature of the s-th footprint sample in F;
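A minimal PyTorch sketch of the initial feature extraction module of step 2.2 is given below; the channel widths and the choice M = 2 are illustrative assumptions, not values fixed by the patent.

```python
import torch
import torch.nn as nn


class InitialFeatureExtraction(nn.Module):
    """Step 2.2: M stacked layers of (convolution -> activation -> pooling).
    Channel widths and M (here M = 2) are illustrative assumptions."""

    def __init__(self, in_channels=1, channels=(32, 64)):
        super().__init__()
        layers, prev = [], in_channels
        for out in channels:                               # one block per layer m
            layers += [
                nn.Conv2d(prev, out, kernel_size=3, padding=1),  # formula (1)
                nn.ReLU(inplace=True),                     # m-th activation layer
                nn.MaxPool2d(kernel_size=2),               # m-th pooling layer
            ]
            prev = out
        self.net = nn.Sequential(*layers)
        # Step 2.2.1: Gaussian random initialization of all convolution weights
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.normal_(m.weight, mean=0.0, std=0.01)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        return self.net(x)            # initial footprint feature F
```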
step 2.3: establishing a mixed attention dense network module consisting of K dense blocks, where adjacent dense blocks are connected by a convolution layer;
any dense block is composed of N layers of a mixed attention dense network, where the n-th layer in any dense block sequentially comprises the n-th convolution layer, the n-th mixed attention layer and the n-th concatenation layer;
step 2.3.1: initializing the weights of all convolution layers in the mixed attention dense network module with a Gaussian random initialization method;
step 2.3.2: the n-th convolution layer in any dense block obtains its output ZZ_n by formula (1), where ZZ_n = {zz_ns | s = 1, 2, ..., S} and zz_ns is the output of the s-th footprint sample in ZZ_n;
according to the spatial position i and the channel position c of the convolution output, ZZ_n can also be written as ZZ_n = {ZZ_n^(i,c) | i = 1, 2, ..., I; c = 1, 2, ..., C}, where ZZ_n^(i,c) is the value at the i-th spatial position and the c-th channel position of the n-th convolution layer output, I is the number of spatial positions and C is the number of channels;
step 2.3.3: the output ZZ'_n of the n-th mixed attention layer of the n-th layer of the mixed attention dense network in any dense block is obtained by formula (2), where ZZ'_n = {zz'_ns | s = 1, 2, ..., S} and zz'_ns is the output of the s-th footprint sample in ZZ'_n:
ZZ'_n^(i,c) = (1 + M_(i,c)) · ZZ_n^(i,c)    (2)
in formula (2), M_(i,c) is the mixed-attention weight of the i-th spatial position and the c-th channel position of the output ZZ_n, and is given by formula (3);
step 2.3.4: the output ZZ''_n of the n-th layer of the mixed attention dense network is obtained by formula (4), where ZZ''_n = {zz''_ns | s = 1, 2, ..., S} and zz''_ns is the output of the s-th footprint sample in ZZ''_n:
ZZ''_n = concat(ZZ'_1, ZZ'_2, ..., ZZ'_n)    (4)
in formula (4), concat(·) represents the concatenation operation;
step 2.3.5: the output of the N-th layer of the mixed attention dense network in each dense block is processed by one convolution layer and then fed into the next dense block, so that after processing by the K dense blocks and their corresponding convolution layers the intermediate feature F' is output, where F' = {f'_s | s = 1, 2, ..., S} and f'_s is the intermediate feature of the s-th footprint sample in F';
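The PyTorch sketch below shows one way steps 2.3.2 to 2.3.5 could be realized. Formula (3) for the weight M_(i,c) is not reproduced in the text, so a CBAM-style channel-plus-spatial attention is assumed here purely for illustration; the growth rate and the number of layers per block are likewise assumptions.

```python
import torch
import torch.nn as nn


class MixedAttention(nn.Module):
    """Produces the weight map M of formula (2). Formula (3) is not reproduced
    in the text, so a CBAM-style channel + spatial attention is assumed here."""

    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, zz):
        m = self.channel(zz) * self.spatial(zz)     # assumed form of M_(i,c)
        return (1.0 + m) * zz                       # formula (2): (1 + M) * ZZ


class MixedAttentionDenseBlock(nn.Module):
    """Step 2.3: N layers of (convolution -> mixed attention -> concatenation);
    each layer also sees the block input, a DenseNet convention assumed here."""

    def __init__(self, in_channels, growth=32, num_layers=4):
        super().__init__()
        self.convs, self.attns = nn.ModuleList(), nn.ModuleList()
        for n in range(num_layers):
            self.convs.append(nn.Conv2d(in_channels + n * growth, growth,
                                        kernel_size=3, padding=1))
            self.attns.append(MixedAttention(growth))

    def forward(self, x):
        features = [x]
        for conv, attn in zip(self.convs, self.attns):
            zz = conv(torch.cat(features, dim=1))   # n-th convolution layer
            features.append(attn(zz))               # n-th mixed attention layer
        # formula (4): concatenation of the attended outputs ZZ'_1 ... ZZ'_N
        return torch.cat(features[1:], dim=1)
```

Stacking K such blocks with a transition convolution between them corresponds to step 2.3.5.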
step 2.4: establishing a final feature output module consisting of a final convolution layer, a final pooling layer, two fully connected layers FC_1 and FC_2, and two parallel sub-networks;
step 2.4.1: initializing the weights of the convolution layer and the fully connected layers in the final feature output module with a Gaussian random initialization method;
step 2.4.2: the final convolution layer and the final pooling layer perform convolution and pooling on the intermediate feature F' in turn, and the result then passes through the fully connected layers FC_1 and FC_2 to obtain the final feature F'', where F'' = {f''_s | s = 1, 2, ..., S} and f''_s is the final feature of the s-th footprint sample in F'';
step 2.4.3: feeding the final feature F'' into the two parallel sub-networks;
step 2.4.3.1: the first sub-network feeds the final feature F'' into a fully connected output layer FC_3 whose number of classes equals the number A of kinds of ID information in the sample set, and the result is processed by a SoftMax function to output a probability set P, where P = {p_s | s = 1, 2, ..., S} and p_s is the output probability set of the s-th footprint sample in P, with p_s = {p_s0, p_s1, ..., p_sa, ..., p_s(A-1)}, where p_sa is the probability that the s-th footprint sample belongs to the a-th kind of ID information; the subscript corresponding to the maximum value in {p_s0, p_s1, ..., p_sa, ..., p_s(A-1)} is selected as the ID information recognized for the s-th footprint sample;
step 2.4.3.2: the second sub-network averages the output final features F'' per kind of ID information to obtain the center feature set F''_c of each kind of ID information, where F''_c = {F''_c0, F''_c1, ..., F''_ca, ..., F''_c(A-1)} and F''_ca is the center feature of the a-th kind of ID information;
step 2.4.4: the probability set P output by the first sub-network and the center feature set F''_c output by the second sub-network are back-propagated to the footprint image retrieval model to adaptively update the corresponding network parameters, thereby obtaining the trained footprint image retrieval model;
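A PyTorch sketch of the final feature output module and its two sub-networks is shown below. The feature dimensions, the learnable center parameters and the 0.01 weighting of the center loss are assumptions for illustration; the patent itself only specifies FC_1, FC_2, FC_3, the SoftMax head and the per-ID center features.

```python
import torch
import torch.nn as nn


class FinalFeatureOutput(nn.Module):
    """Step 2.4: final convolution + pooling + FC_1/FC_2 produce the final
    feature F''; the first sub-network classifies over the A identities
    (FC_3 + SoftMax via cross-entropy), the second keeps one center feature
    per identity. Feature dimensions and the 0.01 loss weight are assumed."""

    def __init__(self, in_channels, num_ids, feat_dim=256):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc1 = nn.Linear(in_channels, 512)
        self.fc2 = nn.Linear(512, feat_dim)
        self.fc3 = nn.Linear(feat_dim, num_ids)           # first sub-network
        # second sub-network: one learnable center per kind of ID information
        self.centers = nn.Parameter(torch.zeros(num_ids, feat_dim))

    def forward(self, x, labels=None):
        f = self.pool(self.conv(x)).flatten(1)
        f = self.fc2(torch.relu(self.fc1(f)))             # final feature F''
        logits = self.fc3(f)                              # SoftMax probabilities p_s
        if labels is None:
            return f, logits
        # center loss: pull each sample towards the center of its identity
        center_loss = ((f - self.centers[labels]) ** 2).sum(dim=1).mean()
        ce_loss = nn.functional.cross_entropy(logits, labels)
        return f, logits, ce_loss + 0.01 * center_loss
```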
step 3: identifying the ID information of a new sample;
step 3.1: the test query set Y1 and the test gallery set Y2, which contain the same ID information, are input into the footprint image retrieval model, which outputs one final feature set for the test query set and one for the test gallery set, each containing several kinds of ID information;
step 3.2: computing the Euclidean distances between the features of each kind of ID information extracted from the test query set and those extracted from the test gallery set, sorting them in ascending order of Euclidean distance, and setting a retrieval threshold according to the sorting result;
step 3.3: any footprint sample to be identified is input into the footprint image retrieval model, which outputs its final feature to be identified; the Euclidean distances between this feature and the feature set output for the test gallery set Y2 are computed, and the ID information whose distance is smaller than the retrieval threshold is taken as the ID information of the footprint sample to be identified.
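Step 3.3 reduces retrieval to a nearest-neighbour search in Euclidean distance; the NumPy sketch below illustrates this, with the argument layout and function name being assumptions.

```python
import numpy as np


def retrieve_id(query_feat, gallery_feats, gallery_ids, threshold):
    """Step 3.3 sketch: rank gallery features by ascending Euclidean distance
    to the query feature and return the ID whose distance falls below the
    retrieval threshold of step 3.2, or None. Argument layout is assumed."""
    dists = np.linalg.norm(gallery_feats - query_feat, axis=1)
    order = np.argsort(dists)             # ascending Euclidean distance
    best = order[0]
    return gallery_ids[best] if dists[best] < threshold else None
```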
Compared with the prior art, the invention has the beneficial effects that:
1. The invention combines image processing, deep learning and footprint image retrieval into a complete footprint image retrieval framework. In terms of image processing, a full preprocessing pipeline is applied to the footprint images, so that the footprint image samples are optimized; in terms of network structure, the footprint image retrieval model consists of a preprocessing layer, an initial feature extraction module, a mixed attention dense network module and a final feature output module.
2. The image processing part of the invention cleans the footprint image samples by removing background noise while preserving the original footprint image information as completely as possible.
3. The preprocessing layer of the invention resizes the footprint images in the footprint sample set to the same size so that they can be fed uniformly into the neural network for training; normalizing the images shortens the training time and makes the resulting model better suited to practical situations.
4. The mixed attention dense network module extracts feature information more effectively. This module combines a densely connected convolutional network with a mixed attention mechanism. Each layer in the densely connected convolutional network is directly connected to the preceding layers, so feature information can be reused; at the same time, each layer of the network is kept narrow, which reduces redundant information. The mixed attention mechanism combines the advantages of the spatial attention mechanism and the channel attention mechanism, and can extract more representative feature information.
Drawings
FIG. 1 is an overall flow chart of footprint image retrieval in the present invention;
FIG. 2 is a diagram of the mixed attention dense network architecture in the present invention;
FIG. 3 is a diagram of a dense block of the mixed attention dense network in the present invention.
Detailed Description
In this embodiment, a footprint image retrieval method based on a mixed attention dense network mainly extracts the feature information of footprint images with the mixed attention dense network. Through training, the neural network can extract detailed feature information from the footprint images and then retrieve them, which increases the retrieval speed and greatly improves the retrieval accuracy.
The data set adopted by the invention comprises 3,500 footprint images, which yield 35,000 single-footprint images after preprocessing; it covers 100 persons in total, each person having at least 35 footprint images, and each image carries a person ID information label. As shown in fig. 1, the whole process can be divided into the following steps:
step 1, taking a continuous footprint image of any test object in a walking state, and carrying out preprocessing operations of denoising and normalization to obtain a processed footprint image sample.
And 2, cutting the footprint sample containing the plurality of footprint images in the step 1, and extracting the outline of the footprint image by using a canny operator to obtain a footprint sample set containing a plurality of single footprint outlines. The invention designs an algorithm, the pixel information of each column of a footprint image sample is counted, the temporary average pixel less than ten is taken as a gap in a footprint image, and when the length of continuous gaps exceeds a certain set threshold value, the column in which the centers of the continuous gaps are located is taken as the cut column. This algorithm can divide one footprint image into separate footprint image sample sets.
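A minimal NumPy sketch of this column-statistics splitting is given below; the gap value of ten comes from the text, while the minimum gap length and the function name are assumptions.

```python
import numpy as np


def split_footprints(image, gap_value=10, min_gap_len=30):
    """Column-statistics splitting of step 2: columns whose average pixel
    value is below gap_value are gaps; a run of gaps longer than min_gap_len
    is cut at its center column. Only the value ten is fixed by the text;
    min_gap_len and the function name are assumptions."""
    col_means = image.mean(axis=0)            # per-column pixel statistics
    is_gap = col_means < gap_value
    cuts, start = [], None
    for col, gap in enumerate(is_gap):
        if gap and start is None:
            start = col
        elif not gap and start is not None:
            if col - start > min_gap_len:
                cuts.append((start + col) // 2)   # center of the gap run
            start = None
    pieces, prev = [], 0
    for c in cuts + [image.shape[1]]:
        pieces.append(image[:, prev:c])
        prev = c
    return pieces                              # separate single-footprint images
```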
Step 3: each footprint sample in the footprint sample set from step 2 is given a label that distinguishes different ID information.
Step 4: steps 1 to 3 are repeated to collect a plurality of continuous footprint images of a plurality of test objects in a walking state and process them accordingly, forming the footprint data set D.
Step 5: the data set D is divided into three parts in a ratio of 9:4:2, where the first part is the training set X, the second part is the test gallery set Y2 and the third part is the test query set Y1 (a split sketch follows this step). The persons in the first part do not overlap with those in the second and third parts, while the second and third parts contain different data of the same persons.
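The 9:4:2 split of step 5 could be implemented roughly as below; applying the ratio to identity counts (rather than to exact sample counts) and the data layout are simplifying assumptions.

```python
import random


def split_by_identity(dataset, ratios=(9, 4, 2), seed=0):
    """Step 5 sketch: split D so that training identities are disjoint from
    test identities, while gallery and query hold different samples of the
    same identities. Applying the 9:4:2 ratio to identity counts is a
    simplifying assumption."""
    ids = sorted(dataset)                      # dataset: person_id -> samples
    random.Random(seed).shuffle(ids)
    n_train = len(ids) * ratios[0] // sum(ratios)
    train_ids, test_ids = ids[:n_train], ids[n_train:]
    train = [(i, s) for i in train_ids for s in dataset[i]]
    gallery, query = [], []
    for i in test_ids:
        samples = dataset[i]
        half = len(samples) * ratios[1] // (ratios[1] + ratios[2])
        gallery += [(i, s) for s in samples[:half]]
        query += [(i, s) for s in samples[half:]]
    return train, gallery, query
```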
Step 6: the footprint data set is fed into the footprint image retrieval model based on the mixed attention dense network for training, and a pre-trained footprint image retrieval model based on the mixed attention dense network is obtained through the preprocessing layer, the initial feature extraction module, the mixed attention dense network module and the final feature output module. The initial feature extraction module extracts the initial feature information of the footprint images, the mixed attention dense network module processes this initial feature information, and the final feature output module then integrates the features and outputs rich yet compact final feature information, which greatly improves the precision and speed of footprint image retrieval.
As shown in FIG. 2, the footprint image retrieval model is composed of a preprocessing layer, an initial feature extraction module, a mixed attention dense network module and a final feature output module:
step 6.1: the preprocessing layer normalizes the footprint image training set X to obtain a footprint sample set X' and inputs it into the initial feature extraction module, where X' = {x'_s | s = 1, 2, ..., S}, x'_s is the s-th footprint sample in X', and S is the total number of footprint samples;
step 6.2: constructing an initial feature extraction module consisting of M layers of convolutional neural networks, where the m-th layer sequentially comprises the m-th convolution layer, the m-th activation layer and the m-th pooling layer;
step 6.2.1: initializing the weights of all convolution layers in the initial feature extraction module with a Gaussian random initialization method;
step 6.2.2: obtaining the output Z_m of the m-th convolution layer by formula (1):
Z_m = W_m · X_m + B_m    (1)
in formula (1), X_m is the input image of the region to be convolved by the m-th convolution layer, B_m is the bias of the m-th convolution layer under stride S_m, and W_m is the shared weight of the m-th convolution layer;
step 6.2.3: the output Z_m of the m-th convolution layer passes through the m-th activation layer and the m-th pooling layer to obtain the output Z'_m of the m-th layer of the convolutional neural network, where Z'_m = {z'_ms | s = 1, 2, ..., S} and z'_ms is the output of the s-th footprint sample in Z'_m;
step 6.2.4: the output Z'_m of the m-th layer is used as the input of the (m+1)-th layer; after processing by all M layers of the convolutional neural network, the initial footprint feature F is output and fed into the mixed attention dense network module, where F = {f_s | s = 1, 2, ..., S} and f_s is the initial feature of the s-th footprint sample in F;
step 6.3: establishing a mixed attention dense network module consisting of K dense blocks, where adjacent dense blocks are connected by a convolution layer;
any dense block is composed of N layers of a mixed attention dense network, where the n-th layer in any dense block sequentially comprises the n-th convolution layer, the n-th mixed attention layer and the n-th concatenation layer;
step 6.3.1: initializing the weights of all convolution layers in the mixed attention dense network module with a Gaussian random initialization method;
step 6.3.2: the n-th convolution layer in any dense block obtains its output ZZ_n by formula (1), where ZZ_n = {zz_ns | s = 1, 2, ..., S} and zz_ns is the output of the s-th footprint sample in ZZ_n;
according to the spatial position i and the channel position c of the convolution output, ZZ_n can also be written as ZZ_n = {ZZ_n^(i,c) | i = 1, 2, ..., I; c = 1, 2, ..., C}, where ZZ_n^(i,c) is the value at the i-th spatial position and the c-th channel position of the n-th convolution layer output, I is the number of spatial positions and C is the number of channels;
step 6.3.3: the output ZZ'_n of the n-th mixed attention layer of the n-th layer of the mixed attention dense network in any dense block is obtained by formula (2), where ZZ'_n = {zz'_ns | s = 1, 2, ..., S} and zz'_ns is the output of the s-th footprint sample in ZZ'_n:
ZZ'_n^(i,c) = (1 + M_(i,c)) · ZZ_n^(i,c)    (2)
in formula (2), M_(i,c) is the mixed-attention weight of the i-th spatial position and the c-th channel position of the output ZZ_n, and is given by formula (3);
step 6.3.4: the output ZZ''_n of the n-th layer of the mixed attention dense network is obtained by formula (4), where ZZ''_n = {zz''_ns | s = 1, 2, ..., S} and zz''_ns is the output of the s-th footprint sample in ZZ''_n:
ZZ''_n = concat(ZZ'_1, ZZ'_2, ..., ZZ'_n)    (4)
in formula (4), concat(·) represents the concatenation operation;
step 6.3.5: the output of the N-th layer of the mixed attention dense network in each dense block is processed by one convolution layer and then fed into the next dense block, so that after processing by the K dense blocks and their corresponding convolution layers the intermediate feature F' is output, where F' = {f'_s | s = 1, 2, ..., S} and f'_s is the intermediate feature of the s-th footprint sample in F';
step 6.4: establishing a final feature output module consisting of a final convolution layer, a final pooling layer, two fully connected layers FC_1 and FC_2, and two parallel sub-networks;
step 6.4.1: initializing the weights of the convolution layer and the fully connected layers in the final feature output module with a Gaussian random initialization method;
step 6.4.2: the final convolution layer and the final pooling layer perform convolution and pooling on the intermediate feature F' in turn, and the result then passes through the fully connected layers FC_1 and FC_2 to obtain the final feature F'', where F'' = {f''_s | s = 1, 2, ..., S} and f''_s is the final feature of the s-th footprint sample in F'';
step 6.4.3: feeding the final feature F'' into the two parallel sub-networks;
step 6.4.3.1: the first sub-network feeds the final feature F'' into a fully connected output layer FC_3 whose number of classes equals the number A of kinds of ID information in the sample set, and the result is processed by a SoftMax function to output a probability set P, where P = {p_s | s = 1, 2, ..., S} and p_s is the output probability set of the s-th footprint sample in P, with p_s = {p_s0, p_s1, ..., p_sa, ..., p_s(A-1)}, where p_sa is the probability that the s-th footprint sample belongs to the a-th kind of ID information; the subscript corresponding to the maximum value in {p_s0, p_s1, ..., p_sa, ..., p_s(A-1)} is selected as the ID information recognized for the s-th footprint sample;
step 6.4.3.2: the second sub-network averages the output final features F'' per kind of ID information to obtain the center feature set F''_c of each kind of ID information, where F''_c = {F''_c0, F''_c1, ..., F''_ca, ..., F''_c(A-1)} and F''_ca is the center feature of the a-th kind of ID information;
step 6.4.4: the probability set P output by the first sub-network is matched with a cross-entropy loss function and the center feature set F''_c output by the second sub-network with a center loss function, and both losses are back-propagated to the footprint image retrieval model to adaptively update the corresponding network parameters, so that each footprint sample is assigned the correct ID information as far as possible and the Euclidean distances between the final features of footprint samples assigned the same ID information are made as small as possible, thereby obtaining the trained footprint image retrieval model (a training-loop sketch follows);
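One epoch of the joint training in step 6.4.4 might look as follows, assuming a model whose forward pass returns the combined cross-entropy and center loss as in the earlier module sketch; the loader and optimizer are standard PyTorch objects.

```python
import torch


def train_epoch(model, loader, optimizer, device="cpu"):
    """Step 6.4.4 sketch: back-propagate the combined cross-entropy and
    center loss so samples are assigned the correct ID and same-ID features
    stay close; assumes a model interface like the module sketched earlier."""
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        _, _, loss = model(images, labels)   # combined CE + center loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```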
Step 7: identifying the ID information of a new sample;
step 7.1: the test query set Y1 and the test gallery set Y2, which contain the same ID information, are input into the footprint image retrieval model, which outputs one final feature set for the test query set and one for the test gallery set, each containing several kinds of ID information;
step 7.2: computing the Euclidean distances between the features of each kind of ID information extracted from the test query set and those extracted from the test gallery set, sorting them in ascending order of Euclidean distance, and setting a suitable retrieval threshold according to the sorting result;
step 7.3: any footprint sample to be identified is input into the footprint image retrieval model, which outputs its final feature to be identified; the Euclidean distances between this feature and the feature set output for the test gallery set Y2 are computed, and the ID information whose distance is smaller than the retrieval threshold is taken as the ID information of the footprint sample to be identified.
Claims (1)
1. A footprint image retrieval method based on a mixed attention dense network, characterized by comprising the following steps:
step 1: constructing a training set and a test set;
step 1.1: acquiring a footprint image containing a plurality of footprints of a test object in a walking state;
step 1.2: denoising the footprint image to obtain a processed footprint image sample;
step 1.3: cutting the footprint image sample, and extracting the outline of the footprint image with the Canny operator to obtain a footprint sample set containing a plurality of single footprint outlines;
step 1.4: defining a tag for each individual footprint outline in the footprint sample set that can distinguish different ID information;
step 1.5: repeating the steps 1.1-1.4, so as to collect a plurality of footprint images of a plurality of test objects, and carrying out corresponding processing, thereby forming a footprint data set D;
step 1.6: dividing the footprint data set D into a training set X and a test set Y, and subdividing the test set into a test query set Y1 and a test gallery set Y2; the training set X contains A kinds of ID information, and the test query set and the test gallery set both contain the same B kinds of ID information;
step 2: establishing a footprint image retrieval model based on a mixed attention dense network, wherein the footprint image retrieval model consists of a preprocessing layer, an initial feature extraction module, a mixed attention dense network module and a final feature output module;
step 2.1: the preprocessing layer normalizes the training set X to obtain a footprint sample set X' and inputs it into the initial feature extraction module, where X' = {x'_s | s = 1, 2, ..., S}, x'_s is the s-th footprint sample in X', and S is the total number of footprint samples;
step 2.2: constructing an initial feature extraction module consisting of M layers of convolutional neural networks, where the m-th layer sequentially comprises the m-th convolution layer, the m-th activation layer and the m-th pooling layer;
step 2.2.1: initializing the weights of all convolution layers in the initial feature extraction module with a Gaussian random initialization method;
step 2.2.2: obtaining the output Z_m of the m-th convolution layer by formula (1):
Z_m = W_m · X_m + B_m    (1)
in formula (1), X_m is the input image of the region to be convolved by the m-th convolution layer, B_m is the bias of the m-th convolution layer under stride S_m, and W_m is the shared weight of the m-th convolution layer;
step 2.2.3: the output Z_m of the m-th convolution layer passes through the m-th activation layer and the m-th pooling layer to obtain the output Z'_m of the m-th layer of the convolutional neural network, where Z'_m = {z'_ms | s = 1, 2, ..., S} and z'_ms is the output of the s-th footprint sample in Z'_m;
step 2.2.4: the output Z'_m of the m-th layer is used as the input of the (m+1)-th layer; after processing by all M layers of the convolutional neural network, the initial footprint feature F is output and fed into the mixed attention dense network module, where F = {f_s | s = 1, 2, ..., S} and f_s is the initial feature of the s-th footprint sample in F;
step 2.3: establishing a mixed attention dense network module consisting of K dense blocks, where adjacent dense blocks are connected by a convolution layer;
any dense block is composed of N layers of a mixed attention dense network, where the n-th layer in any dense block sequentially comprises the n-th convolution layer, the n-th mixed attention layer and the n-th concatenation layer;
step 2.3.1: initializing the weights of all convolution layers in the mixed attention dense network module with a Gaussian random initialization method;
step 2.3.2: the n-th convolution layer in any dense block obtains its output ZZ_n by formula (1), where ZZ_n = {zz_ns | s = 1, 2, ..., S} and zz_ns is the output of the s-th footprint sample in ZZ_n;
according to the spatial position i and the channel position c of the convolution output, ZZ_n can also be written as ZZ_n = {ZZ_n^(i,c) | i = 1, 2, ..., I; c = 1, 2, ..., C}, where ZZ_n^(i,c) is the value at the i-th spatial position and the c-th channel position of the n-th convolution layer output, I is the number of spatial positions and C is the number of channels;
step 2.3.3: the output ZZ'_n of the n-th mixed attention layer of the n-th layer of the mixed attention dense network in any dense block is obtained by formula (2), where ZZ'_n = {zz'_ns | s = 1, 2, ..., S} and zz'_ns is the output of the s-th footprint sample in ZZ'_n:
ZZ'_n^(i,c) = (1 + M_(i,c)) · ZZ_n^(i,c)    (2)
in formula (2), M_(i,c) is the mixed-attention weight of the i-th spatial position and the c-th channel position of the output ZZ_n, and is given by formula (3);
step 2.3.4: the output ZZ''_n of the n-th layer of the mixed attention dense network is obtained by formula (4), where ZZ''_n = {zz''_ns | s = 1, 2, ..., S} and zz''_ns is the output of the s-th footprint sample in ZZ''_n:
ZZ''_n = concat(ZZ'_1, ZZ'_2, ..., ZZ'_n)    (4)
in formula (4), concat(·) represents the concatenation operation;
step 2.3.5: the output of the N-th layer of the mixed attention dense network in each dense block is processed by one convolution layer and then fed into the next dense block, so that after processing by the K dense blocks and their corresponding convolution layers the intermediate feature F' is output, where F' = {f'_s | s = 1, 2, ..., S} and f'_s is the intermediate feature of the s-th footprint sample in F';
step 2.4: establishing a final feature output module consisting of a final convolution layer, a final pooling layer, two fully connected layers FC_1 and FC_2, and two parallel sub-networks;
step 2.4.1: initializing the weights of the convolution layer and the fully connected layers in the final feature output module with a Gaussian random initialization method;
step 2.4.2: the final convolution layer and the final pooling layer perform convolution and pooling on the intermediate feature F' in turn, and the result then passes through the fully connected layers FC_1 and FC_2 to obtain the final feature F'', where F'' = {f''_s | s = 1, 2, ..., S} and f''_s is the final feature of the s-th footprint sample in F'';
step 2.4.3: feeding the final feature F'' into the two parallel sub-networks;
step 2.4.3.1: the first sub-network feeds the final feature F'' into a fully connected output layer FC_3 whose number of classes equals the number A of kinds of ID information in the sample set, and the result is processed by a SoftMax function to output a probability set P, where P = {p_s | s = 1, 2, ..., S} and p_s is the output probability set of the s-th footprint sample in P, with p_s = {p_s0, p_s1, ..., p_sa, ..., p_s(A-1)}, where p_sa is the probability that the s-th footprint sample belongs to the a-th kind of ID information; the subscript corresponding to the maximum value in {p_s0, p_s1, ..., p_sa, ..., p_s(A-1)} is selected as the ID information recognized for the s-th footprint sample;
step 2.4.3.2: the second sub-network averages the output final features F'' per kind of ID information to obtain the center feature set F''_c of each kind of ID information, where F''_c = {F''_c0, F''_c1, ..., F''_ca, ..., F''_c(A-1)} and F''_ca is the center feature of the a-th kind of ID information;
step 2.4.4: the probability set P output by the first sub-network and the center feature set F''_c output by the second sub-network are back-propagated to the footprint image retrieval model to adaptively update the corresponding network parameters, thereby obtaining the trained footprint image retrieval model;
step 3: identifying the ID information of a new sample;
step 3.1: the test query set Y1 and the test gallery set Y2, which contain the same ID information, are input into the footprint image retrieval model, which outputs one final feature set for the test query set and one for the test gallery set, each containing several kinds of ID information;
step 3.2: computing the Euclidean distances between the features of each kind of ID information extracted from the test query set and those extracted from the test gallery set, sorting them in ascending order of Euclidean distance, and setting a retrieval threshold according to the sorting result;
step 3.3: any footprint sample to be identified is input into the footprint image retrieval model, which outputs its final feature to be identified; the Euclidean distances between this feature and the feature set output for the test gallery set Y2 are computed, and the ID information whose distance is smaller than the retrieval threshold is taken as the ID information of the footprint sample to be identified.
Priority Applications (1)
- CN202010710865.2A (granted as CN111782857B): Footprint image retrieval method based on mixed attention-dense network; priority date and filing date 2020-07-22
Publications (2)
- CN111782857A (application), published 2020-10-16
- CN111782857B (grant), published 2023-11-03
Family
- ID=72763921
Family Applications (1)
- CN202010710865.2A, filed 2020-07-22, status: Active, granted as CN111782857B
Country Status (1)
- CN: CN111782857B, granted
Cited By (2)
- CN112257662A (priority 2020-11-12, published 2021-01-22, 安徽大学 / Anhui University): Pressure footprint image retrieval system based on deep learning
- CN113656623A (priority 2021-08-17, published 2021-11-16, 安徽大学 / Anhui University): Time sequence shift and multi-branch space-time enhancement network-based stepping footprint image retrieval method
Patent Citations (3)
- US20150023471A1 (priority 2012-03-06, published 2015-01-22, Koninklijke Philips N.V.): Stereo x-ray tube based suppression of outside body high contrast objects
- WO2019237567A1 (priority 2018-06-14, published 2019-12-19, 江南大学 / Jiangnan University): Convolutional neural network based tumble detection method
- CN111177446A (priority 2019-12-12, published 2020-05-19, 苏州科技大学 / Suzhou University of Science and Technology): Method for searching footprint image
Non-Patent Citations (1)
- 陈扬, 曾诚, 程成, 邹恩岑, 顾建伟, 陆悠, 奚雪峰: "A CNN-based footprint image retrieval and matching method" (一种基于CNN的足迹图像检索与匹配方法), Journal of Nanjing Normal University (Engineering and Technology Edition), no. 03
Also Published As
- CN111782857A, published 2020-10-16
- CN111782857B, published 2023-11-03
Similar Documents
- CN108615010B: Facial expression recognition method based on parallel convolutional neural network feature map fusion
- CN106326886B: Finger vein image quality assessment method based on convolutional neural networks
- CN108830157B: Human behavior identification method based on attention mechanism and 3D convolutional neural network
- CN106203395B: Face attribute recognition method based on multitask deep learning
- CN111814661B: Human body behavior recognition method based on residual-recurrent neural network
- CN108764072B: Blood cell subtype image classification method based on multi-scale fusion
- CN112801040B: Lightweight unconstrained facial expression recognition method and system embedded with high-order information
- CN110082821B: Label-frame-free microseism signal detection method and device
- CN109902615B: Multi-age-group image generation method based on adversarial networks
- CN112818764A: Low-resolution image facial expression recognition method based on feature reconstruction model
- CN113221694B: Action recognition method
- CN112347908B: Surgical instrument image identification method based on spatial grouping attention model
- CN111782857A: Footprint image retrieval method based on mixed attention dense network
- CN106503616A: Motor imagery EEG signal classification method based on hierarchical extreme learning machine
- CN111368734B: Micro-expression recognition method based on normal expression assistance
- CN115966010A: Expression recognition method based on attention and multi-scale feature fusion
- CN116012653A: Method and system for classifying hyperspectral images with an attention residual unit neural network
- CN114170657A: Facial emotion recognition method integrating attention mechanism and high-order feature representation
- CN110826534B: Face key point detection method and system based on local principal component analysis
- CN115965864A: Lightweight attention mechanism network for crop disease identification
- CN114863572A: Myoelectric gesture recognition method for multi-channel heterogeneous sensors
- CN113705713B: Text recognition method based on global and local attention mechanisms
- CN114495163A: Pedestrian re-identification generative learning method based on class activation mapping
- CN113111797A: Cross-view gait recognition method combining autoencoder and view transformation model
- CN113255543A: Facial expression recognition method based on graph convolution network
Legal Events
- PB01: Publication
- SE01: Entry into force of request for substantive examination
- GR01: Patent grant