CN112966716A - Sketch-guided shoe print image retrieval method


Info

Publication number: CN112966716A
Application number: CN202110152568.5A
Authority: CN (China)
Prior art keywords: convolution; shoe print; sketch; shoe; print image
Legal status: Granted (Active)
Other languages: Chinese (zh)
Other versions: CN112966716B (en)
Inventors: 王新年 (Wang Xinnian), 姜浩 (Jiang Hao), 王琳 (Wang Lin)
Current Assignee: Dalian Maritime University
Original Assignee: Dalian Maritime University
Priority and filing date: 2021-02-03, application filed by Dalian Maritime University
Publication of CN112966716A: 2021-06-15
Grant and publication of CN112966716B: 2023-10-27

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformation in the plane of the image
    • G06T 3/40 Scaling the whole image or part thereof
    • G06T 3/4007 Interpolation-based scaling, e.g. bilinear interpolation
    • G06T 7/00 Image analysis
    • G06T 7/40 Analysis of texture

Abstract

The invention provides a sketch-guided shoe print image retrieval method comprising the following steps: constructing a shoe print pattern content and semantic information collaborative model; and retrieving the shoe print image online based on the cooperation of pattern content features and semantic information. By introducing sketch guidance in the missing region, the method can generate a complete shoe print image that matches human visual perception whether the defect is random, covers a large area, or even an entire half sole; the completed image is then used for retrieval. This overcomes the problems of insufficient information and the inability to incorporate human recognition of patterns, and thereby the low retrieval precision and inaccurate matching of shoe print images with large-area defects. In addition, the method can retrieve shoe print images from a fully hand-drawn sketch, solving retrieval in cases where the background is complex and the crime-scene shoe print image cannot be extracted.

Description

Sketch-guided shoe print image retrieval method
Technical Field
The invention relates to the technical field of image recognition, in particular to a sketch-guided shoe print image retrieval method.
Background
Current shoe print image retrieval algorithms fall into two main categories: those based on traditional image processing and those based on deep learning. (1) Traditional algorithms extract hand-crafted features, such as Fourier spectral features, histograms, or SIFT features, and then compute and rank similarity measures to obtain the retrieval result. For example, a neighborhood-similarity-estimation shoe print retrieval algorithm extracts three kinds of features from the query shoe print image (regional features, global features, and Gabor features) and then exploits information contained in neighboring images to improve retrieval performance. (2) Deep-learning-based algorithms extract deep features with a convolutional neural network and rank the matches with various similarity matching algorithms. For example, an automatic shoe print retrieval algorithm based on the VGG16 network first applies rotation compensation to the shoe print image to eliminate the influence of rotation, and then scores each match by a weighted sum of the cosine similarities between the neural codes of two regions of the test and reference images.
Both traditional and deep-learning-based shoe print retrieval algorithms work reasonably well for shoe print images with small missing areas, but neither performs well when a large area is missing. The remaining region of such an image is too small, its content and texture information are severely insufficient, and retrieval relies solely on the designed features without incorporating human recognition of the pattern. The result is low retrieval precision and inaccurate matching for shoe print images with large-area defects.
Disclosure of Invention
To address the technical problem that shoe print images with large-area defects cannot be retrieved well, the invention provides a sketch-guided shoe print image retrieval method comprising the following steps:
step S1: constructing a shoe print pattern content and semantic information collaborative model;
step S2: retrieving the shoe print image online based on the cooperation of pattern content features and semantic information.
Further, the step S1 further includes the following steps:
step S11: constructing the hole full-convolution span-interpolation fusion shoe print image generation network;
step S12: introducing dilated convolutions to extract features and constructing the hole-fusion full-convolution shoe print image discrimination network;
step S13: constructing the loss functions. The loss function of the hole full-convolution span-interpolation fusion shoe print image generation network comprises an adversarial loss, a perceptual loss and a content loss, combined as

L_G = α·L_adv + β·L_mse + γ·L_L1 + δ·L_p-mse + λ·L_p-L1

where α, β, γ, δ and λ are weighting coefficients. The adversarial loss takes the WGAN form

L_adv = -E_i[ D(S_i, G(S_i)) ], i = 1, 2, …, n.

The content loss measures the difference between R_i, i = 1, 2, …, n and F_i, i = 1, 2, …, n as the mean square error and the L1 distance:

L_mse = (1 / (c·W·H)) · E_i[ ||R_i - G(S_i)||_2^2 ]
L_L1 = (1 / (c·W·H)) · E_i[ ||R_i - G(S_i)||_1 ]

where c, W and H are the number of channels, the width and the height of the image, and G denotes the generation network.

The perceptual loss measures the difference between R_i, i = 1, 2, …, n and F_i, i = 1, 2, …, n as the mean square error and the L1 distance between the deep feature maps extracted by VGG19:

L_p-mse = (1 / (c_φ·W_φ·H_φ)) · E_i[ ||φ(R_i) - φ(G(S_i))||_2^2 ]
L_p-L1 = (1 / (c_φ·W_φ·H_φ)) · E_i[ ||φ(R_i) - φ(G(S_i))||_1 ]

where φ denotes the deep feature map extracted by VGG19 and c_φ, W_φ, H_φ its dimensions.

The loss function of the hole-fusion full-convolution shoe print image discrimination network adopts the WGAN-GP loss, comprising an adversarial term and a gradient penalty:

L_D = E_i[ D(S_i, F_i) ] - E_i[ D(S_i, R_i) ] + λ_gp · E[ (||∇D(x1, x2)||_2 - 1)^2 ]

where x1 = ε·S_i + (1 - ε)·S_i, x2 = ε·F_i + (1 - ε)·R_i, i = 1, 2, …, n, ε ~ uniform[0, 1], G denotes the generation network, D the discrimination network, and λ_gp the gradient penalty parameter.
Further, the hole full-convolution span-interpolation fusion shoe print image generation network of step S11 comprises: 7 hole-span convolution modules, 7 deconvolution fusion modules and 5 bilinear interpolation upsampling modules;
each hole-span convolution module comprises a dilated convolution with dilation rate 2 and stride 1 and a span convolution with dilation rate 1 and stride 2; each deconvolution fusion module comprises a deconvolution with stride 2 and a concatenation of three feature maps of the same size; each bilinear interpolation upsampling module applies bilinear interpolation to the target feature map, doubling its size, and, to match channel counts, the first three bilinear interpolation upsampling modules additionally apply a 1 × 1 convolution.
The hole-fusion full-convolution shoe print image discrimination network of step S12 comprises: 3 hole-convolution feature fusion modules and 3 span-convolution downsampling modules;
each hole-convolution feature fusion module comprises an ordinary convolution with stride 1 and dilation rate 1, a dilated convolution with stride 1 and dilation rate 2, and a dilated convolution with stride 1 and dilation rate 3, the feature maps extracted by the three convolutions being concatenated with the un-convolved feature map; each span-convolution downsampling module uses a stride-2 convolution to downsample the feature map.
Further, the online shoe print image retrieval based on the cooperation of pattern content features and semantic information comprises the following steps:
step S21: drawing pattern semantic information. For an actually photographed incomplete shoe print image, supplement the missing part with patterns drawn as a sketch, forming a mixed image of real-scene pattern and sketch pattern.
If no actually photographed shoe print image is available and only the pattern form is known, then either:
a. manually draw a sketch of the shoe print on blank paper and scan it to form the shoe print sketch; or
b. directly draw a sketch of the shoe print on a blank image with an input device such as a stylus or mouse.
Step S22: retrieving the sole pattern image under sketch guidance.
Input the mixed image of real-scene pattern and sketch pattern, or the sketch, into the constructed shoe print pattern content and semantic information collaborative model to generate a virtual sole pattern image.
Using an existing sole pattern image retrieval algorithm, compute the similarity score between the virtual sole pattern image and each shoe print image in the sole pattern image data set, sort the data set by similarity score according to a preset rule, and output the results in that order.
Compared with the prior art, the invention has the following advantages:
the invention introduces sketch guide in the defect area, can generate complete shoe print images which accord with the subjective feeling of human eyes no matter the random defect or the defect of a large area or even a half sole, and then uses the complete shoe print images for retrieval, thus solving the problems of insufficient information, unavailable cognition of patterns and the like, and further solving the problems of low retrieval precision and inaccurate matching of the shoe print images with large-area defect. In addition, the invention can also search the shoe print image in a complete hand-drawing sketch mode, thereby solving the search problem under the conditions that the background is complex and the on-site shoe print image can not be extracted.
(1) The method adds sketch guidance to the missing parts of shoe print images (small-range, random, or large-area missing) according to the complete sole pattern texture, uses the sketch-guided image to generate a complete virtual shoe print image that matches human visual perception, and then performs retrieval with the completed image. This overcomes the problems of insufficient information and the inability to incorporate human recognition of patterns, and solves the low retrieval precision and inaccurate matching of residual and large-area-missing shoe print images.
(2) The method solves the retrieval problem when a complex background prevents the crime-scene shoe print image from being extracted, as well as the problem of introducing pattern semantic information into the retrieval process, helping criminal investigators exploit as much crime-scene information as possible and improving the efficiency of case solving.
(3) The method improves generative adversarial networks at the algorithmic level, proposing a hole full-convolution span-interpolation fusion shoe print image generation network and a hole-fusion full-convolution shoe print image discrimination network. These improve the quality of the generated images while saving a large amount of computational overhead, and the improved loss function lets the network minimize the difference between the generated shoe print image and the real complete shoe print image at both the pixel level and the feature level.
For the above reasons, the present invention is well suited to adoption in the criminal investigation field.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is an overall flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of dilated convolution in the present invention; (a) before dilation; (b) after dilation.
FIG. 3 is a schematic diagram of the hole full-convolution span-interpolation fusion shoe print image generation network of the present invention.
FIG. 4 is a schematic diagram of the hole-fusion full-convolution shoe print image discrimination network of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
As shown in the figures, the invention provides a sketch-guided shoe print image retrieval method comprising the following steps:
step S1: constructing a shoe print pattern content and semantic information collaborative model.
As a preferred embodiment, the process of constructing the collaborative model of shoe print pattern content and semantic information comprises the following steps:
step S11: constructing the hole full-convolution span-interpolation fusion shoe print image generation network. The network comprises 7 hole-span convolution modules, 7 deconvolution fusion modules and 5 bilinear interpolation upsampling modules.
Each hole-span convolution module comprises a dilated convolution with dilation rate 2 and stride 1 followed by a span convolution with dilation rate 1 and stride 2. Each deconvolution fusion module comprises a deconvolution with stride 2 and a concatenation of three feature maps of the same size. Each bilinear interpolation upsampling module applies bilinear interpolation to the target feature map, doubling its size; to match channel counts, the first three such modules additionally apply a 1 × 1 convolution.
Bilinear interpolation is introduced into the generation network both to perform upsampling and to support feature fusion: at each fusion step the network concatenates the feature map before downsampling, the feature map upsampled by bilinear interpolation, and the feature map upsampled by deconvolution. Fusing the pre-downsampling map with the bilinearly interpolated map of the following stage and with the map deconvolved back up from the bottom layer injects multi-scale information and strengthens the fitting capacity of the network.
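The three generator building blocks can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions (module names, channel arguments and the ReLU activations are ours, not from the patent), not the patent's implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HoleSpanConv(nn.Module):
    """Hole-span convolution module: a 3x3 dilated conv (dilation 2, stride 1)
    followed by a 3x3 span (strided) conv (dilation 1, stride 2)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.dilated = nn.Conv2d(in_ch, out_ch, 3, stride=1, padding=2, dilation=2)
        self.strided = nn.Conv2d(out_ch, out_ch, 3, stride=2, padding=1, dilation=1)

    def forward(self, x):
        return F.relu(self.strided(F.relu(self.dilated(x))))

class BilinearUp(nn.Module):
    """Bilinear interpolation upsampling module: doubles the spatial size;
    a 1x1 conv matches channel counts (used in the first three modules)."""
    def __init__(self, in_ch, out_ch, match_channels=True):
        super().__init__()
        self.match = nn.Conv2d(in_ch, out_ch, 1) if match_channels else nn.Identity()

    def forward(self, x):
        x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
        return self.match(x)

class DeconvFusion(nn.Module):
    """Deconvolution fusion module: a stride-2 deconvolution upsamples the
    bottom map, then three same-size feature maps are concatenated."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.deconv = nn.ConvTranspose2d(in_ch, out_ch, 3, stride=2,
                                         padding=1, output_padding=1)

    def forward(self, bottom, skip, interp):
        up = F.relu(self.deconv(bottom))             # deconv-upsampled bottom map
        return torch.cat([up, skip, interp], dim=1)  # fuse three same-size maps
```

Here `skip` stands for the pre-downsampling feature map and `interp` for the bilinearly interpolated map, matching the fusion described above.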
step S12: introducing dilated convolutions to extract features and constructing the hole-fusion full-convolution shoe print image discrimination network. The network comprises 3 hole-convolution feature fusion modules and 3 span-convolution downsampling modules.
Each hole-convolution feature fusion module comprises an ordinary convolution with stride 1 and dilation rate 1, a dilated convolution with stride 1 and dilation rate 2, and a dilated convolution with stride 1 and dilation rate 3; the three resulting feature maps are concatenated with the un-convolved feature map. Each span-convolution downsampling module uses a stride-2 convolution to downsample the feature map.
In the discrimination network, stride-2 convolutions replace max pooling for downsampling, but before each downsampling step features are extracted with convolutions of dilation rates 1, 2 and 3, and the three feature maps are fused with the un-convolved map. This extracts features at different scales while saving computational cost, improves the discrimination capability of the network, and thus better guides the generation network toward higher-quality images.
Further, the output feature map is judged real or fake directly. The discrimination network finally outputs a feature map of size 32 × 16 × 1 that is judged as is, without a fully connected layer mapping it to a vector; this is fast and computationally cheap, and since each point of the feature map corresponds to a small region of the picture, it amounts to region-wise discrimination.
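A corresponding sketch of the two discriminator modules, again illustrative rather than the patent's code (the branch channel split and the leaky-ReLU activations are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HoleConvFusion(nn.Module):
    """Hole-convolution feature fusion module: parallel 3x3 convs with
    dilation rates 1, 2 and 3 (all stride 1); their outputs are concatenated
    with the un-convolved input map."""
    def __init__(self, in_ch, branch_ch):
        super().__init__()
        self.d1 = nn.Conv2d(in_ch, branch_ch, 3, padding=1, dilation=1)
        self.d2 = nn.Conv2d(in_ch, branch_ch, 3, padding=2, dilation=2)
        self.d3 = nn.Conv2d(in_ch, branch_ch, 3, padding=3, dilation=3)

    def forward(self, x):
        return torch.cat([F.leaky_relu(self.d1(x), 0.2),
                          F.leaky_relu(self.d2(x), 0.2),
                          F.leaky_relu(self.d3(x), 0.2),
                          x], dim=1)                  # fuse with the raw input map

class SpanConvDown(nn.Module):
    """Span-convolution downsampling module: a stride-2 conv replaces
    max pooling to halve the spatial size."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1)

    def forward(self, x):
        return F.leaky_relu(self.conv(x), 0.2)
```

Stacking three fusion/downsampling pairs followed by a final single-channel convolution yields the 32 × 16 × 1 output map, so each output point scores one region of the input, as the patent describes.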
Step S13: constructing a loss function; the loss functions of the void full-convolution span interpolation fusion shoe printing image generation network comprise countermeasure loss, perception loss and content loss;
the calculation formula is as follows:
Figure BDA0002932153080000071
wherein the alpha, beta, gamma,
Figure BDA0002932153080000072
lambda is a weighting coefficient and is a coefficient,
Figure BDA0002932153080000073
the formula for the penalty is as follows:
Figure BDA0002932153080000074
Figure BDA0002932153080000075
is RiI is 1,2, …, n and FiI-1, 2, …, n, measured as RiI is 1,2, …, n and FiI is 1,2, …, the mean square error between n and the L1 distance, the formula is as follows:
Figure BDA0002932153080000076
Figure BDA0002932153080000077
wherein c, W and H are the number of channels, width values and height values of the image, and G represents the generated network.
Figure BDA0002932153080000078
Is RiI is 1,2, …, n and FiI-1, 2, …, n, measured as RiR is 1,2, …, n and FiAnd i is 1,2, …, n is the mean square error and the L1 distance between deep feature maps extracted by the VGG19, and the formula is as follows:
Figure BDA0002932153080000079
Figure BDA0002932153080000081
where φ represents the deep profile extracted through VGG 19.
Meanwhile, the loss function of the hole fusion full convolution shoe print image discrimination network adopts the loss function of WGAN-GP, including the countermeasure loss and the gradient punishment. The calculation formula is as follows:
Figure BDA0002932153080000082
wherein x1=εSi+(1-ε)Si,i=1,2,…,n,x2=εFi+(1-ε)Ri,i=1,2,…,n,ε~uniform[0,1]G represents the lambda gradient penalty parameter passing through the generation network, and D represents the lambda gradient penalty parameter passing through the discrimination network.
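The losses above can be sketched in PyTorch as follows. This is a hedged illustration: the discriminator is assumed to take the channel-wise concatenation of the (sketch, image) pair, `vgg_features` stands for a frozen VGG19 feature extractor, and the weight dictionary `w` is hypothetical:

```python
import torch
import torch.nn.functional as F

def generator_loss(D, vgg_features, S, R, Fk, w):
    """Generator loss: WGAN adversarial term plus content (MSE + L1) and
    perceptual (MSE + L1 on VGG19 deep features) terms, weighted by w."""
    adv = -D(torch.cat([S, Fk], dim=1)).mean()        # adversarial: fool D
    mse = F.mse_loss(Fk, R)                           # pixel-level MSE (1/cWH built in)
    l1 = F.l1_loss(Fk, R)                             # pixel-level L1
    phi_r, phi_f = vgg_features(R), vgg_features(Fk)  # VGG19 deep feature maps
    p_mse = F.mse_loss(phi_f, phi_r)                  # perceptual MSE
    p_l1 = F.l1_loss(phi_f, phi_r)                    # perceptual L1
    return (w["adv"] * adv + w["mse"] * mse + w["l1"] * l1
            + w["p_mse"] * p_mse + w["p_l1"] * p_l1)

def discriminator_loss(D, S, R, Fk, gp_lambda=10.0):
    """WGAN-GP discriminator loss on (sketch, image) pairs; the gradient
    penalty is taken at x2 = eps*F + (1-eps)*R (x1 = S interpolates to itself)."""
    d_fake = D(torch.cat([S, Fk], dim=1)).mean()
    d_real = D(torch.cat([S, R], dim=1)).mean()
    eps = torch.rand(R.size(0), 1, 1, 1, device=R.device)
    x2 = (eps * Fk + (1.0 - eps) * R).requires_grad_(True)
    out = D(torch.cat([S, x2], dim=1))
    grads = torch.autograd.grad(out.sum(), x2, create_graph=True)[0]
    gp = ((grads.flatten(1).norm(2, dim=1) - 1.0) ** 2).mean()
    return d_fake - d_real + gp_lambda * gp
```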
The data augmentation and training comprises the following steps:
1. Image expansion and enhancement, as shown in FIG. 1:
(1) Take the collected complete shoe print images as set A, the target image set.
(2) Record the set of incomplete shoe print images corresponding to each complete shoe print in A as set B, the incomplete image set; if an image in A has no corresponding incomplete image, randomly erase pattern content from it to form one.
(3) Draw the missing part of each incomplete shoe print image in B with an input device such as a stylus or mouse, forming a mixed image of real-scene pattern and sketch pattern; the set of such mixed images formed from all images in B is denoted C, the mixed real-scene/sketch image set.
(4) Manually draw a paper sketch of each image in A on blank paper and scan it to form a shoe print sketch; the set of these sketches is denoted D, the digitized paper sketch set.
(5) Directly draw a sketch of each image in A with an input device such as a stylus or mouse; the set of these sketches is denoted E, the digital sketch set.
(6) Horizontally flip the left-foot and right-foot shoe print images in C, D and E, taking the image obtained by flipping a right-foot print as an augmented left-foot print, and vice versa, so that left and right prints complement and enhance each other. Denote C, D, E together with their augmented images as S = {S_i | i = 1, 2, …, n}, and take from A the shoe print images corresponding to S as the target shoe print image set R = {R_i | i = 1, 2, …, n}.
2. Input S_i, i = 1, 2, …, n into the generation network, which generates complete virtual shoe print images denoted F_i, i = 1, 2, …, n. Input R_i and F_i into VGG19 to obtain the deep feature maps. Input the image pair (S_i, F_i) into the discrimination network, and likewise the image pair (S_i, R_i).
3. When the pair (S_i, F_i) is input, the discrimination network tries to judge it fake while the generation network tries to have it judged real; when the pair (S_i, R_i) is input, the discrimination network tries to judge it real. The two networks compete and improve each other throughout training until the discrimination network can no longer tell whether its input is real or fake, i.e., a Nash equilibrium is reached and the generation network performs at its best.
4. Train and save the hole full-convolution span-interpolation fusion shoe print image generation network and the hole-fusion full-convolution shoe print image discrimination network.
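A compact training loop tying these steps together might look like the sketch below. It reuses the loss functions sketched above; the optimizer settings, the number of critic steps per generator step, and the checkpoint file names are assumptions, not values from the patent:

```python
import torch

def train(G, D, vgg_features, loader, epochs, w, d_steps=5, lr=1e-4):
    """Adversarial training: alternate WGAN-GP critic updates and generator
    updates until neither network can improve against the other."""
    opt_g = torch.optim.Adam(G.parameters(), lr=lr, betas=(0.5, 0.9))
    opt_d = torch.optim.Adam(D.parameters(), lr=lr, betas=(0.5, 0.9))
    for _ in range(epochs):
        for S, R in loader:                  # sketch-guided inputs S_i, targets R_i
            for _ in range(d_steps):         # several critic steps per G step
                Fk = G(S).detach()           # fake prints, detached for the critic
                loss_d = discriminator_loss(D, S, R, Fk)
                opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            Fk = G(S)                        # regenerate with gradients for G
            loss_g = generator_loss(D, vgg_features, S, R, Fk, w)
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    torch.save(G.state_dict(), "generator.pth")      # illustrative file names
    torch.save(D.state_dict(), "discriminator.pth")
```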
Step S2: and on-line retrieval of the shoe print image based on the cooperation of the pattern content characteristics and the semantic information. Further, the on-line shoe print image retrieval based on the cooperation of the pattern content features and the semantic information further comprises the following steps:
step S21: drawing pattern semantic information; supplementing the incomplete part with patterns in a sketch mode for the actually shot incomplete shoe printing image to form a mixed image of real-scene patterns and sketch patterns;
if the actually shot shoe print image is not available, and only the pattern form is known, then:
a. manually drawing a sketch of the shoe print on the blank paper, and then scanning to form the sketch of the shoe print;
b. and (4) directly drawing a sketch of the shoe print on the blank image through an input device such as a stylus and a mouse.
Step S22: retrieving the sole pattern image guided by sketch;
inputting a mixed picture of real-scene patterns and sketch patterns or a sketch map into the constructed shoe-print pattern content and semantic information collaborative model to generate a virtual sole pattern image;
and calculating the similarity score of the virtual sole pattern image and each shoe print image in the sole pattern image data set by adopting the conventional sole pattern image retrieval algorithm, sequencing the sole pattern images in the sole pattern image data set according to the similarity score by a preset rule, and outputting according to the sequencing mode.
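The online stage thus reduces to completing the query and ranking the database. The sketch below uses cosine similarity as a stand-in for whichever existing sole pattern retrieval algorithm is plugged in; `extract_features` is a hypothetical feature extractor returning a 1-D vector:

```python
import numpy as np

def retrieve(G, extract_features, query, database):
    """Sketch-guided retrieval: complete the query (mixed real/sketch image
    or pure sketch) with the trained generator, then rank the database by
    descending similarity score."""
    virtual = G(query)                    # virtual complete sole pattern image
    q = extract_features(virtual)         # query feature vector (assumed 1-D)
    ranked = []
    for name, image in database:          # (identifier, stored shoe print)
        v = extract_features(image)
        score = float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-12))
        ranked.append((name, score))
    ranked.sort(key=lambda t: t[1], reverse=True)  # preset rule: highest score first
    return ranked
```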
Example 1
The flow of dilated convolution is shown in FIG. 2, where the fully black large square represents a 3 × 3 convolution kernel consisting of 9 small black squares, each small square representing one weight. The size of the dilated kernel is f = d(f0 - 1) + 1, where f0 is the original kernel size and d the dilation rate. When d = 2, f = 5, as shown by the 5 × 5 kernel in the figure, but the number of weights is unchanged; the remaining small white squares represent 0. Dilated convolution therefore enlarges the receptive field of the kernel without introducing extra parameters.
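The kernel-size formula can be checked in a few lines of Python; the three cases below cover the dilation rates used in this patent:

```python
def dilated_kernel_size(f0: int, d: int) -> int:
    """Effective size of a dilated convolution kernel: f = d*(f0 - 1) + 1."""
    return d * (f0 - 1) + 1

assert dilated_kernel_size(3, 1) == 3   # ordinary 3x3 convolution
assert dilated_kernel_size(3, 2) == 5   # the 5x5 field shown in FIG. 2
assert dilated_kernel_size(3, 3) == 7   # dilation rate 3 in the discriminator
```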
FIG. 3 shows the hole full-convolution span-interpolation fusion shoe print image generation network. The solid-line boxes are hole-span convolution modules, the dashed-line boxes are deconvolution fusion modules, and the dot-dashed boxes are bilinear interpolation upsampling modules. A solid connecting line denotes convolution, S denotes stride, d denotes dilation rate, Conv3 denotes a 3 × 3 convolution kernel, Conv1 a 1 × 1 kernel, and the number following Conv3 denotes the number of kernels. For example, "Conv3, 512, S = 1, d = 2" denotes a convolution layer with 3 × 3 kernels, 512 kernels, stride 1 and dilation rate 2. Dotted lines denote concatenation (Concat), dashed lines denote deconvolution, ConvT3 denotes a deconvolution with 3 × 3 kernels, and thick connecting lines denote bilinear interpolation.
FIG. 4 shows the hole-fusion full-convolution shoe print image discrimination network. The solid-line boxes are hole-convolution feature fusion modules and the dashed-line boxes are span-convolution downsampling modules. A solid connecting line denotes convolution, S denotes stride, d denotes dilation rate, Conv3 denotes a 3 × 3 convolution kernel, and the number following Conv3 denotes the number of kernels. For example, "Conv3, 256, S = 1, d = 2" denotes a convolution layer with 3 × 3 kernels, 256 kernels, stride 1 and dilation rate 2. Dotted connecting lines denote concatenation (Concat). The final output of the network is a 32 × 16 × 1 feature map; the fully connected layer of a traditional discriminator is removed, and the feature map is judged real or fake directly.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments. In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (4)

1. A sketch-guided shoe print image retrieval method, comprising the following steps:
S1: constructing a shoe print pattern content and semantic information collaborative model;
S2: retrieving the shoe print image online based on the cooperation of pattern content features and semantic information.
2. The sketch-guided shoe print image retrieval method according to claim 1, wherein said step S1 further comprises the steps of:
s11: fusing a shoe print image to generate a network through the full convolution span interpolation of the cavity;
s12: introducing expansion convolution extraction features, and fusing a full convolution shoe print image discrimination network through a cavity;
s13: constructing a loss function; the loss functions of the void full-convolution span interpolation fusion shoe printing image generation network comprise countermeasure loss, perception loss and content loss;
the calculation formula is as follows:
Figure FDA0002932153070000011
wherein the alpha, beta, gamma,
Figure FDA0002932153070000012
lambda is a weighting coefficient and is a coefficient,
Figure FDA0002932153070000013
the formula for the penalty is as follows:
Figure FDA0002932153070000014
Figure FDA0002932153070000015
is RiI is 1,2, …, n and FiI-1, 2, …, n, measured as RiI is 1,2, …, n and FiI is 1,2, …, the mean square error between n and the L1 distance, the formula is as follows:
Figure FDA0002932153070000016
Figure FDA0002932153070000017
wherein c, W and H are the number of channels, width values and height values of the image, and G represents the generated network.
Figure FDA0002932153070000018
Is RiI is 1,2, …, n and FiI-1, 2, …, n, measured as RiR is 1,2, …, n and FiAnd i is 1,2, …, n is the mean square error and the L1 distance between deep feature maps extracted by the VGG19, and the formula is as follows:
Figure FDA0002932153070000019
Figure FDA0002932153070000021
wherein phi represents a deep level feature map extracted by VGG 19;
the loss function of the hole fusion full-convolution shoe print image discrimination network adopts the loss function of WGAN-GP, including countermeasure loss and gradient punishment; the calculation formula is as follows:
Figure FDA0002932153070000022
wherein x1=εSi+(1-ε)Si,i=1,2,…,n,x2=εFi+(1-ε)Ri,i=1,2,…,n,ε~uniform[0,1]G represents the lambda gradient penalty parameter passing through the generation network, and D represents the lambda gradient penalty parameter passing through the discrimination network.
3. The sketch-guided shoe print image retrieval method according to claim 1, wherein the hole full-convolution span-interpolation fusion shoe print image generation network of step S11 comprises: 7 hole-span convolution modules, 7 deconvolution fusion modules and 5 bilinear interpolation upsampling modules;
each hole-span convolution module comprises a dilated convolution with dilation rate 2 and stride 1 and a span convolution with dilation rate 1 and stride 2; each deconvolution fusion module comprises a deconvolution with stride 2 and a concatenation of three feature maps of the same size; each bilinear interpolation upsampling module applies bilinear interpolation to the target feature map, doubling its size, and, to match channel counts, the first three bilinear interpolation upsampling modules additionally apply a 1 × 1 convolution;
the hole-fusion full-convolution shoe print image discrimination network of step S12 comprises: 3 hole-convolution feature fusion modules and 3 span-convolution downsampling modules;
each hole-convolution feature fusion module comprises an ordinary convolution with stride 1 and dilation rate 1, a dilated convolution with stride 1 and dilation rate 2, and a dilated convolution with stride 1 and dilation rate 3, the feature maps extracted by the three convolutions being concatenated with the un-convolved feature map; each span-convolution downsampling module uses a stride-2 convolution to downsample the feature map.
4. The sketch-guided shoe print image retrieval method according to claim 1, wherein the online shoe print image retrieval based on the cooperation of pattern content features and semantic information further comprises the following steps:
S21: drawing pattern semantic information; for an actually photographed incomplete shoe print image, supplementing the missing part with patterns drawn as a sketch to form a mixed image of real-scene pattern and sketch pattern;
if no actually photographed shoe print image is available and only the pattern form is known, then either:
a. manually drawing a sketch of the shoe print on blank paper and scanning it to form the shoe print sketch; or
b. directly drawing a sketch of the shoe print on a blank image with an input device such as a stylus or mouse;
S22: retrieving the sole pattern image under sketch guidance;
inputting the mixed image of real-scene pattern and sketch pattern, or the sketch, into the constructed shoe print pattern content and semantic information collaborative model to generate a virtual sole pattern image;
computing, with an existing sole pattern image retrieval algorithm, the similarity score between the virtual sole pattern image and each shoe print image in the sole pattern image data set, sorting the data set by similarity score according to a preset rule, and outputting the results in that order.
Priority Applications (1)

Application Number: CN202110152568.5A | Priority Date: 2021-02-03 | Filing Date: 2021-02-03 | Title: Sketch-guided shoe print image retrieval method | Status: Active | Granted publication: CN112966716B

Publications (2)

CN112966716A, published 2021-06-15
CN112966716B, granted 2023-10-27

Family ID: 76275057
Family Applications (1): CN202110152568.5A, filed 2021-02-03, granted (Active)
Country Status (1): CN, CN112966716B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CN106776950A * | 2016-12-02 | 2017-05-31 | Dalian Maritime University | Crime-scene shoe print pattern image retrieval method based on expert knowledge guidance
US20190244061A1 * | 2018-02-05 | 2019-08-08 | The Regents of the University of California | Local binary pattern networks methods and systems
KR20190119261A * | 2018-04-12 | 2019-10-22 | Gachon University Industry-Academic Cooperation Foundation | Apparatus and method for semantic image segmentation using a fully convolutional neural network based on multi-scale images and multi-scale dilated convolution
WO2020215236A1 * | 2019-04-24 | 2020-10-29 | Harbin Institute of Technology (Shenzhen) | Image semantic segmentation method and system
CN110689092A * | 2019-10-18 | 2020-01-14 | Dalian Maritime University | Data-guided deep clustering method for sole pattern images
CN112287940A * | 2020-10-30 | 2021-01-29 | Xi'an Polytechnic University | Semantic segmentation method with attention mechanism based on deep learning


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
XINNIAN WANG ET AL.: "A manifold ranking based method using hybrid features for crime scene shoeprint retrieval", Springer Link, pp. 21629-21649.
Shi Wentao et al.: "Crime scene shoeprint retrieval algorithm based on fine-tuned VGG-16", Journal of People's Public Security University of China, vol. 26, no. 03, pp. 22-29.
Zhai Pengbo; Yang Hao; Song Tingting; Yu Kang; Ma Longxiang; Huang Xiangsheng: "Dual-path semantic segmentation with attention mechanism", Journal of Image and Graphics, no. 08, pp. 119-128.
Qing Chen; Yu Jing; Xiao Chuangbai; Duan Juan: "Research progress on image semantic segmentation based on deep convolutional neural networks", Journal of Image and Graphics, no. 06, pp. 5-26.

Also Published As

Publication number | Publication date
CN112966716B | 2023-10-27


Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant