CN110674334B - Near-repetitive image retrieval method based on consistency region deep learning features - Google Patents

Near-repetitive image retrieval method based on consistency region deep learning features

Info

Publication number
CN110674334B
CN110674334B
Authority
CN
China
Prior art keywords
image
sift
feature
region
pairs
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910869635.8A
Other languages
Chinese (zh)
Other versions
CN110674334A (en)
Inventor
周志立
孙文迪
周煜
孙星明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Maidian Media Technology Co.,Ltd.
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN201910869635.8A priority Critical patent/CN110674334B/en
Publication of CN110674334A publication Critical patent/CN110674334A/en
Application granted granted Critical
Publication of CN110674334B publication Critical patent/CN110674334B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 - Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06F18/232 - Non-hierarchical techniques
    • G06F18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 - Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Library & Information Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a near-duplicate image retrieval method based on deep learning features of consistent regions, which specifically comprises the following steps: extracting the SIFT features of all images in an image library, quantizing the SIFT features into visual words, and building an inverted index file for all SIFT features; retaining k target regions for each image and computing the CNN feature C(R_C) of each target region; extracting the SIFT features of the query image and quantizing them into visual words; finding candidate images with the inverted index file; locating, in the query image, a near-duplicate region that approximately repeats each target region of each candidate image; extracting the CNN feature C(R_Q) of each near-duplicate region; computing the cosine similarity between each C(R_C) and its corresponding C(R_Q) as the similarity score of that region pair; and, for each candidate image, selecting the highest cosine-similarity score as the similarity score between the candidate image and the query image. The invention greatly improves the accuracy of image retrieval while also improving retrieval efficiency.

Description

Near-repetitive image retrieval method based on consistency region deep learning features
Technical Field
The invention belongs to the field of information security, and particularly relates to a near-duplicate image retrieval method based on deep learning features of consistent regions.
Background
Due to the widespread use of powerful image processing tools and the rapid development of Internet technology, digital image data is increasingly being illegally copied, tampered with and transmitted over networks. In fact, these illegal images are near-duplicate images that share a small copied region and have undergone various image modifications such as rescaling, occlusion, noise addition, and brightness and color changes. To prevent unauthorized use of image content and violations of privacy, detecting illegal partially copied versions of copyrighted images has become an urgent problem. Therefore, as a branch of content-based image retrieval, near-duplicate image retrieval plays a very important role in the field of copyright and privacy protection. It is also applied in other emerging fields, such as information hiding, image annotation and near-duplicate image redundancy removal.
In recent years, deep learning features have been successfully used in content-based image retrieval tasks, and they provide superior performance compared with traditional hand-crafted features. According to the way features are extracted, existing CNN-feature-based image retrieval methods fall mainly into two categories: image-based CNN features and region-based CNN features. Generally, image-based CNN features directly take the activation values of a convolutional layer or a fully-connected layer as the CNN feature. The most representative approach is to feed the image into a pre-trained or fine-tuned convolutional neural network and extract CNN features from its fully-connected layers (Krizhevsky A, Sutskever I, and Hinton G E, ImageNet classification with deep convolutional neural networks [C], Advances in Neural Information Processing Systems, 2012: 1097-1105.). However, CNN features extracted from fully-connected layers tend to lack spatial location information, so their discriminative power is limited. To improve the discriminative power of CNN features, researchers began extracting CNN features from convolutional layers instead of fully-connected layers, mainly because convolutional-layer features consist of the activation values of the convolutional filters and contain rich local spatial information (Babenko A and Lempitsky V, Aggregating deep convolutional features for image retrieval [J], Computer Science, 2015; Kalantidis Y, Mellina C, and Osindero S, Cross-dimensional weighting for aggregated deep convolutional features [C], European Conference on Computer Vision Workshops, 2016: 685-.). Since image-based CNN features mainly describe the visual pattern or semantic meaning of the entire image, such methods are intuitively not suitable for retrieving near-duplicate images that share only small partial regions. Unlike image-based CNN features, region-based methods generally extract CNN features from image regions, using the region as the basic unit. Notably, most such methods obtain image regions either by simply dividing the image into a series of image blocks or by directly using existing region detection methods, such as Selective Search (Uijlings J R R, van de Sande K E A, et al., Selective search for object recognition [J], International Journal of Computer Vision, 2013, 104(2): 154-171.), EdgeBox (Zitnick C L and Dollár P, Edge boxes: locating object proposals from edges [C], European Conference on Computer Vision, 2014: 391-405.) and the region proposal network (RPN) (Salvador A, Giró-i-Nieto X, Marqués F, and Satoh S, Faster R-CNN features for instance search [C], IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016.). Although these algorithms can meet the requirement of generating image regions to some extent, if the same region detection method is applied to both the candidate image and the query image, inconsistent region pairs are detected between near-duplicate images once the images undergo a series of image attacks, which seriously affects the accuracy of image retrieval.
Although research on near-duplicate image retrieval has advanced greatly, existing near-duplicate image retrieval methods mainly have the following technical problems:
1) Most existing near-duplicate image retrieval methods are based on feature extraction and matching of the whole image, and are not suitable for retrieving near-duplicate images that share only a small copied region.
2) Existing near-duplicate image retrieval methods use the same region detection method for the candidate image and the query image, so when the images undergo a series of image attacks, the region pairs detected between near-duplicate images are inconsistent.
3) Existing near-duplicate image retrieval methods generally take the activation values extracted from a convolutional layer or a fully-connected layer directly as the CNN feature; the excessively high dimensionality reduces the efficiency of feature extraction and matching.
4) Existing near-duplicate image retrieval methods generally perform region detection and feature extraction directly on all images in the image library; the irrelevant images among them consume considerable processing time, which reduces image retrieval efficiency.
Disclosure of Invention
The purpose of the invention is as follows: to solve the problems that existing retrieval techniques are not suitable for near-duplicate images that share only a small copied region and that retrieval efficiency is low, the invention provides a near-duplicate image retrieval method based on deep learning features of consistent regions.
The technical scheme is as follows: the invention provides a near-duplicate image retrieval method based on deep learning features of consistent regions; the method specifically comprises the following steps:
step 1: extracting SIFT characteristics of all images in an image library;
step 2: quantizing each SIFT feature into a visual word by using a K-means clustering method, and considering any two SIFT features which come from different images and have the same visual word as mutually matched; establishing an inverted index file for all SIFT features based on the visual words;
step 3: compute the target regions of each image using the EdgeBox algorithm and delete the target regions whose area is smaller than M/5 × N/5, where M and N are the width and height of the image, respectively; among the remaining target regions, keep k target regions and delete the others; compute the CNN feature C(R_C) of each kept target region using the improved CNN feature extraction method;
And 4, step 4: extracting SIFT characteristics of the query image; the SIFT features of the query image are quantized into visual words by using a K-means clustering method; finding out candidate images by utilizing the inverted index file; the candidate image is an SIFT feature pair with more than 5 pairs between the candidate image and the query image in the image library; the pair of SIFT feature pairs consists of two mutually matched SIFT features;
step 5: according to the SIFT feature pairs existing between the query image and each target region of each candidate image, find in the query image the near-duplicate region that approximately repeats each target region; the near-duplicate region and the target region form a near-duplicate region pair;
step 6: for any near-duplicate region pair, extract the CNN feature C(R_Q) of its near-duplicate region using the improved CNN feature extraction method; take the cosine similarity between C(R_Q) and C(R_C) of that pair as the similarity score of the pair; in each candidate image, select the highest cosine-similarity score as the similarity score between the candidate image and the query image.
Further, in step 2 or step 4, quantizing each SIFT feature into a visual word specifically comprises: performing K-means clustering on all extracted SIFT features, thereby dividing all SIFT features into E categories, each category being represented by one visual word.
Furthermore, in step 3, the target regions with an area greater than or equal to M/5 × N/5 are ranked in descending order of the number of SIFT features they contain, and the first k target regions are selected.
Further, the specific method of step 5 is as follows:
step 5.1: finding out n pairs of SIFT feature pairs between the query image and a certain target region in a certain candidate image by using the inverted index file;
step 5.2: randomly select n_s SIFT feature pairs from the n pairs, where
p_T = Y / n
P(n_s) ≈ 1 - (1 - Y/n)^(n_s)
where Y is the number of truly matched SIFT feature pairs among the n SIFT feature pairs (Y ≤ n); a truly matched SIFT feature pair consists of two SIFT features that come from different images and describe consistent image content; P(n_s) is the probability that at least one truly matched SIFT feature pair is contained among the n_s selected feature pairs;
step 5.3: for any one of the n_s feature pairs, f_Q = [σ_Q, θ_Q, (x_Q, y_Q)^T] and f_C = [σ_C, θ_C, (x_C, y_C)^T], where f_Q is the SIFT feature in the query image and σ_Q, θ_Q, (x_Q, y_Q) are its scale, dominant orientation and coordinates, and f_C is the SIFT feature in the target region and σ_C, θ_C, (x_C, y_C) are its scale, dominant orientation and coordinates, determine a near-duplicate region using the following formula, so that n_s pairs of near-duplicate regions exist between the query image and the target region:
(u_Q, v_Q)^T = (x_Q, y_Q)^T + s · R(Δθ) · [(u_C, v_C)^T - (x_C, y_C)^T],   w_Q = s · w_C,   h_Q = s · h_C
where (u_Q, v_Q)^T, w_Q and h_Q are the center coordinates, width and height of the near-duplicate region R_Q in the query image, and (u_C, v_C)^T, w_C and h_C are the center coordinates, width and height of the target region R_C;
s = σ_Q / σ_C,   Δθ = θ_Q - θ_C,   R(Δθ) = [cos Δθ, -sin Δθ; sin Δθ, cos Δθ]
further, the method for extracting the CNN feature in step 3 or step 6 specifically includes: taking any one target region/near-repetitive region as an input image of the AlexNet model, and outputting 256 feature maps with the size of W multiplied by H by the model to obtain a feature vector with the dimension of W multiplied by H multiplied by 256; w and H are the width and height of the feature map respectively and are in direct proportion to the width and height of the input image; compressing the size W × H of each feature map to m × m using a summing pooling aggregation operation; merging and summing pooling aggregation operations for every 256/d feature maps with the size of m × m, thereby obtaining feature vectors with dimensions of m × m × d, wherein d is more than 0 and less than 256, and d is a multiple of 256; finally, the generated m × m × d-dimensional feature vector is normalized by L2, and the normalized m × m × d-dimensional feature vector is used as the CNN feature of the input image.
Further, in step 6, the method for calculating the cosine similarity includes:
sim(R_Q, R_C) = (C(R_Q) · C(R_C)) / (||C(R_Q)|| × ||C(R_C)||)
has the advantages that:
(1) The method adopts SIFT feature matching based on the BOW model and filters out irrelevant images according to the SIFT matching results, greatly reducing the number of candidate images, so that near-duplicate image retrieval can be performed more quickly.
(2) Because SIFT features are robust to common attacks, the invention uses the properties of SIFT features to detect visually consistent region pairs, so that visually consistent region pairs are still detected between near-duplicate images under common image attacks.
(3) The invention adopts a two-stage sum-pooling strategy, generating compact CNN features while fully encoding the spatial layout of the region.
(4) The computed CNN features have strong discriminative power and capture the semantic characteristics of the image, improving the accuracy of image retrieval.
Drawings
FIG. 1 is a general framework schematic of the present invention;
FIG. 2 is a schematic diagram of the structure of the inverted index in the present invention.
Detailed Description
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate an embodiment of the invention and, together with the description, serve to explain the invention and not to limit the invention.
As shown in fig. 1, the present embodiment provides a near-duplicate image retrieval method based on deep learning features of consistent regions. In the offline stage, SIFT features are extracted from all images in the image library, each SIFT feature is quantized into a visual word using the K-means clustering method, and the visual words are stored in a constructed inverted index file. In the online stage, the same feature extraction and quantization methods are applied to the input query image, the similarity between the quantized SIFT features and the features in the index file is calculated, the similarity results are ranked, and the images related to the query image, i.e., the candidate images, are output in order. The above process performs image retrieval with the Bag-of-Visual-Words (BOW) model. In addition, to reduce the computational complexity of detecting target regions in images and extracting features, the method uses the existing EdgeBox region detection algorithm to extract target regions from all images in the image library in the offline stage, and extracts CNN features from all target regions of the candidate images. To ensure that visually consistent region pairs are detected between near-duplicate images, in the online stage the properties of SIFT features are fully exploited to locate, in the query image, the near-duplicate regions consistent with the target regions, forming near-duplicate region pairs, and compact CNN features are extracted from these near-duplicate region pairs, thereby improving the accuracy and efficiency of near-duplicate image retrieval. The specific steps are as follows:
step 1: and extracting 128-dimensional SIFT features from all the images in the image library.
Step 2: BOW quantization is carried out on the extracted SIFT features: K-means clustering is performed on all extracted SIFT features, dividing them into E categories, each category represented by one visual word; SIFT features quantized to the same visual word are classified into the same category. The set of all visual word labels constitutes the visual dictionary. Thus, each image can be described by several visual words.
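For illustration only, the quantization in step 2 might be sketched as follows. This is a minimal example, not the patented implementation; it assumes OpenCV's SIFT detector and scikit-learn's K-means, and the vocabulary size E is an arbitrary illustrative value.

```python
# Minimal sketch of SIFT extraction and BOW quantization (illustrative only).
import cv2
import numpy as np
from sklearn.cluster import KMeans

E = 10000  # vocabulary size (number of visual words); illustrative value

def extract_sift(image_path):
    """Return the keypoints and 128-D SIFT descriptors of one image."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(img, None)
    return keypoints, descriptors

def train_vocabulary(all_descriptors, n_words=E):
    """Cluster all SIFT descriptors of the image library into n_words visual words."""
    kmeans = KMeans(n_clusters=n_words, n_init=4, random_state=0)
    kmeans.fit(np.vstack(all_descriptors))
    return kmeans

def quantize(descriptors, kmeans):
    """Map each SIFT descriptor to the ID of its nearest cluster centre (visual word)."""
    return kmeans.predict(descriptors)
```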
Step 3: To improve the efficiency of image retrieval, an inverted index is built for all SIFT features. Each indexed feature records not only the ID of the image it belongs to, but also its orientation, scale, coordinates and other related information. This information is further used to generate potential near-duplicate region pairs. The inverted index is shown in fig. 2.
Step 4: Using the inverted index structure, SIFT features from any two different images that are quantized to the same visual word are considered matched, and the similarity between images is measured by counting the number of SIFT feature pairs shared between two images. When an image in the image library shares 5 or more SIFT feature pairs with the input query image, it is taken as a candidate image. In this way a large number of irrelevant images are filtered out, reducing the time complexity of region detection and feature extraction.
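As a rough sketch of how the inverted index of step 3 and the candidate filter of step 4 could be organized (the data layout and function names are illustrative; only the threshold of 5 shared pairs comes from the description above):

```python
# Sketch of an inverted index over visual words and of candidate filtering (illustrative).
from collections import defaultdict, Counter

def build_inverted_index(library_features):
    """library_features: dict image_id -> list of (word_id, scale, orientation, x, y).
    Each posting stores the image ID plus the scale, orientation and coordinates
    of the indexed SIFT feature, as needed for later region localization."""
    index = defaultdict(list)
    for image_id, feats in library_features.items():
        for word_id, scale, orientation, x, y in feats:
            index[word_id].append((image_id, scale, orientation, x, y))
    return index

def find_candidates(query_words, index, min_pairs=5):
    """Count SIFT feature pairs (same visual word, different images) shared with the
    query; images sharing at least min_pairs pairs are kept as candidate images."""
    votes = Counter()
    for word_id in query_words:
        for image_id, *_ in index.get(word_id, []):
            votes[image_id] += 1
    return [image_id for image_id, n in votes.items() if n >= min_pairs]
```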
Step 5: The EdgeBox algorithm achieves high recall by computing an informative edge map, so it detects from the image the meaningful target regions that are most likely to be copied and propagated between near-duplicate images. Furthermore, its edge computation is efficient and the computed edge map is sparse, giving low computational complexity. Most importantly, the algorithm detects target regions directly from the edge information of the image, without a learning process based on a deep network, so it is highly flexible. The specific steps are as follows:
step 5-1: a set of target regions is detected for each candidate image using the EdgeBox algorithm.
Step 5-2: in order to avoid that small regions negatively affect the image retrieval, the present embodiment will delete regions with an area smaller than M/5 × N/5, where M and N are the width and height of the image, respectively.
Step 5-3: theoretically, for a detected target region, the number of SIFT features may reflect the texture complexity thereof to some extent, because the number of SIFT features extracted from a region with good texture is much larger than the SIFT features extracted from a flat region. Therefore, in order to save computing resources, all target regions detected in the candidate image are sorted in a descending order according to the SIFT feature quantity contained in each region, and the first k target regions (detected regions) are reserved; other target areas are deleted.
Step 6: according to the SIFT feature pair between the query image and any one target region in any one candidate image, finding out a near-repetition region approximately repeated with the target region in the query image; and forming a group of near-repeated region pairs by the near-repeated region and the target region. The details are as follows:
step 6-1: the method comprises the steps of utilizing an inverted index file to find n pairs of SIFT feature pairs existing between a query image and a target region in a candidate image, wherein the number n of the SIFT feature pairs can be as high as hundreds, and if all SIFT feature pairs are directly matched to position corresponding potential near-repetition region pairs in the query image, although many correct near-repetition region pairs can be positioned, the calculation consumption is very large. In practice, the accuracy of near-duplicate image detection can be ensured only by ensuring that the positioned near-duplicate region pair at least comprises a pair of truly matched SIFT feature pairs; the true matched SIFT feature pairThe image processing method comprises the following steps of (1) forming two SIFT features which come from different graphs and are consistent in description of image content; therefore, to reduce the amount of computation, assume that the probability of a true match is pT
p_T = Y / n
where Y is the number of truly matched feature pairs among the n feature pairs. When n_s SIFT feature pairs are randomly selected, the probability that they contain at least one truly matched SIFT feature pair is approximated as:
P(n_s) ≈ 1 - (1 - p_T)^(n_s)
therefore, pick nSThe near-repetitive region pairs are positioned for the SIFT feature matching pairs, so that at least one pair of SIFT feature matching pairs can be guaranteed to be real matching, and at least one pair of correct near-repetitive region pairs can be positioned.
Step 6-2: the detection of the SIFT features is based on the content of the image, so the scale, principal direction and coordinates of the feature points of the local features are changed together with scaling, rotation and translation transformations, respectively. Therefore, the parameters of the transformation can be estimated from the property variation between two matching local features.
Assume the two matched SIFT features f_Q and f_C are [σ_Q, θ_Q, (x_Q, y_Q)^T] and [σ_C, θ_C, (x_C, y_C)^T], respectively, where f_Q is the SIFT feature in the query image and σ_Q, θ_Q, (x_Q, y_Q) are its scale, dominant orientation and coordinates, and f_C is the SIFT feature in the target region and σ_C, θ_C, (x_C, y_C) are its scale, dominant orientation and coordinates. A near-duplicate region (located region) is determined using the following formula, so that n_s pairs of near-duplicate regions exist between the query image and the target region:
(u_Q, v_Q)^T = (x_Q, y_Q)^T + s · R(Δθ) · [(u_C, v_C)^T - (x_C, y_C)^T],   w_Q = s · w_C,   h_Q = s · h_C
where (u_Q, v_Q)^T, w_Q and h_Q are the center coordinates, width and height of the near-duplicate region R_Q in the query image, and (u_C, v_C)^T, w_C and h_C are the center coordinates, width and height of the target region R_C;
s = σ_Q / σ_C,   Δθ = θ_Q - θ_C,   R(Δθ) = [cos Δθ, -sin Δθ; sin Δθ, cos Δθ]
intuitively, if the two features are a true match, then RCAnd RQIt is likely that the correct near-duplicate region pair.
Step 7: After the potential near-duplicate region pairs are detected, compact CNN features are extracted for these near-duplicate region pairs through the following steps:
step 7-1: when any target region/near-repetitive region is used as an input image of the AlexNet model, the model outputs 256 feature maps with the size of W × H, and a feature vector with the dimension of W × H × 256 can be obtained.
Step 7-2: entering the first sum-posing stage, for an input area of any size, applying spatial sum-posing of size m × m to activation of the area to obtain a feature map of dimension m × m × 256.
And 7-3: entering a second sum-firing stage, compressing the features by summarizing the activation values of the m × m × 256 dimensional feature map and concatenating the results to generate a feature vector of m × m × d dimensions. Where 256 is a multiple of d. Finally, the generated m × m × d-dimensional feature vector is normalized by L2, and the normalized m × m × d-dimensional feature is regarded as a CNN feature.
And 8: in the online retrieval stage, the CNN characteristics between the near-repeat region and the target region of the candidate image are compared to measure the similarity between the two images so as to achieve the purpose of retrieving the near-repeat image version. For a given repeating region pair RQ(near repeat region) and RC(target region) and their corresponding CNN features are C (R) respectivelyQ) And C (R)C) It calculates the cosine similarity:
sim(R_Q, R_C) = (C(R_Q) · C(R_C)) / (||C(R_Q)|| × ||C(R_C)||)
and step 9: and selecting the scores of a group of near-repeated region pairs with the highest cosine similarity score between the query image and a candidate image as the similarity score between the query image and the candidate image.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. The invention is not described in detail in order to avoid unnecessary repetition.

Claims (5)

1. The near-repetitive image retrieval method based on the consistency region deep learning features is characterized by comprising the following steps:
step 1: extracting SIFT characteristics of all images in an image library;
step 2: quantizing each SIFT feature into a visual word by using a K-means clustering method, and considering any two SIFT features which come from different images and have the same visual word as mutually matched; establishing an inverted index file for all SIFT features based on the visual words;
step 3: compute the target regions of each image using the EdgeBox algorithm and delete the target regions whose area is smaller than M/5 × N/5, where M and N are the width and height of the image, respectively; among the remaining target regions, keep k target regions and delete the others; compute the CNN feature C(R_C) of each kept target region using the improved CNN feature extraction method;
And 4, step 4: extracting SIFT characteristics of the query image; the SIFT features of the query image are quantized into visual words by using a K-means clustering method; finding out candidate images by utilizing the inverted index file; the candidate image is an SIFT feature pair with more than 5 pairs between the candidate image and the query image in the image library; the pair of SIFT feature pairs consists of two mutually matched SIFT features;
step 5: according to the SIFT feature pairs existing between the query image and each target region of each candidate image, find in the query image the near-duplicate region that approximately repeats each target region; the near-duplicate region and the target region form a near-duplicate region pair;
step 6: for any near-duplicate region pair, extract the CNN feature C(R_Q) of its near-duplicate region using the improved CNN feature extraction method; take the cosine similarity between C(R_Q) and C(R_C) of that pair as the similarity score of the pair; in each candidate image, select the highest cosine-similarity score as the similarity score between the candidate image and the query image;
wherein the CNN feature extraction method in step 3 or step 6 specifically comprises: taking any target region or near-duplicate region as the input image of the AlexNet model, the model outputs 256 feature maps of size W × H, giving a feature vector of dimension W × H × 256, where W and H are the width and height of the feature maps and are proportional to the width and height of the input image; the size W × H of each feature map is compressed to m × m using a sum-pooling aggregation operation; every 256/d feature maps of size m × m are then merged by a further sum-pooling aggregation operation, yielding a feature vector of dimension m × m × d, where 0 < d < 256 and 256 is a multiple of d; finally, the generated m × m × d-dimensional feature vector is L2-normalized, and the normalized m × m × d-dimensional feature vector is used as the CNN feature of the input image.
2. The method according to claim 1, wherein quantizing each SIFT feature into a visual word in step 2 or step 4 specifically comprises: performing K-means clustering on all extracted SIFT features, thereby dividing all SIFT features into E categories, each category being represented by one visual word.
3. The method according to claim 1, wherein in step 3, the target regions having an area greater than or equal to M/5 × N/5 are ranked in descending order of the number of SIFT features they contain, and the first k target regions are selected.
4. The method according to claim 1, wherein the specific method of the step 5 is as follows:
step 5.1: finding out n pairs of SIFT feature pairs between the query image and a certain target region in a certain candidate image by using the inverted index file;
step 5.2: randomly select n_s SIFT feature pairs from the n pairs, where
p_T = Y / n
P(n_s) ≈ 1 - (1 - Y/n)^(n_s)
where Y is the number of truly matched SIFT feature pairs among the n SIFT feature pairs (Y ≤ n); a truly matched SIFT feature pair consists of two SIFT features that come from different images and describe consistent image content; P(n_s) is the probability that at least one truly matched SIFT feature pair is contained among the n_s selected feature pairs;
step 5.3: for any one of the n_s feature pairs, f_Q = [σ_Q, θ_Q, (x_Q, y_Q)^T] and f_C = [σ_C, θ_C, (x_C, y_C)^T], where f_Q is the SIFT feature in the query image and σ_Q, θ_Q, (x_Q, y_Q) are its scale, dominant orientation and coordinates, and f_C is the SIFT feature in the target region and σ_C, θ_C, (x_C, y_C) are its scale, dominant orientation and coordinates, determine a near-duplicate region using the following formula, so that n_s pairs of near-duplicate regions exist between the query image and the target region:
(u_Q, v_Q)^T = (x_Q, y_Q)^T + s · R(Δθ) · [(u_C, v_C)^T - (x_C, y_C)^T],   w_Q = s · w_C,   h_Q = s · h_C
where (u_Q, v_Q)^T, w_Q and h_Q are the center coordinates, width and height of the near-duplicate region R_Q in the query image, and (u_C, v_C)^T, w_C and h_C are the center coordinates, width and height of the target region R_C;
s = σ_Q / σ_C,   Δθ = θ_Q - θ_C,   R(Δθ) = [cos Δθ, -sin Δθ; sin Δθ, cos Δθ]
5. the method of claim 1, wherein in step 6, the cosine similarity is calculated by:
sim(R_Q, R_C) = (C(R_Q) · C(R_C)) / (||C(R_Q)|| × ||C(R_C)||)
CN201910869635.8A 2019-09-16 2019-09-16 Near-repetitive image retrieval method based on consistency region deep learning features Active CN110674334B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910869635.8A CN110674334B (en) 2019-09-16 2019-09-16 Near-repetitive image retrieval method based on consistency region deep learning features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910869635.8A CN110674334B (en) 2019-09-16 2019-09-16 Near-repetitive image retrieval method based on consistency region deep learning features

Publications (2)

Publication Number Publication Date
CN110674334A CN110674334A (en) 2020-01-10
CN110674334B true CN110674334B (en) 2020-08-11

Family

ID=69078297

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910869635.8A Active CN110674334B (en) 2019-09-16 2019-09-16 Near-repetitive image retrieval method based on consistency region deep learning features

Country Status (1)

Country Link
CN (1) CN110674334B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859004A (en) * 2020-07-29 2020-10-30 书行科技(北京)有限公司 Retrieval image acquisition method, device, equipment and readable storage medium
CN113688261B (en) * 2021-08-25 2023-10-13 山东极视角科技股份有限公司 Image data cleaning method and device, electronic equipment and readable storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108226889A (en) * 2018-01-19 2018-06-29 中国人民解放军陆军装甲兵学院 A kind of sorter model training method of radar target recognition
CN108765338A (en) * 2018-05-28 2018-11-06 西华大学 Spatial target images restored method based on convolution own coding convolutional neural networks

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105912611B (en) * 2016-04-05 2019-04-26 中国科学技术大学 A kind of fast image retrieval method based on CNN
US10303979B2 (en) * 2016-11-16 2019-05-28 Phenomic Ai Inc. System and method for classifying and segmenting microscopy images with deep multiple instance learning
GB201713977D0 (en) * 2017-08-31 2017-10-18 Calipsa Ltd Anomaly detection
CN107908646B (en) * 2017-10-10 2019-12-17 西安电子科技大学 Image retrieval method based on hierarchical convolutional neural network
CN109977286B (en) * 2019-03-21 2022-10-28 中国科学技术大学 Information retrieval method based on content

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108226889A (en) * 2018-01-19 2018-06-29 中国人民解放军陆军装甲兵学院 A kind of sorter model training method of radar target recognition
CN108765338A (en) * 2018-05-28 2018-11-06 西华大学 Spatial target images restored method based on convolution own coding convolutional neural networks

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Encoding multiple contextual clues for partial-duplicate image retrieval; Zhou Zhili; Pattern Recognition Letters; 2018-07-15; Vol. 109; pp. 18-26 *

Also Published As

Publication number Publication date
CN110674334A (en) 2020-01-10

Similar Documents

Publication Publication Date Title
CN107679250B (en) Multi-task layered image retrieval method based on deep self-coding convolutional neural network
CN106126581B (en) Cartographical sketching image search method based on deep learning
CN106682233B (en) Hash image retrieval method based on deep learning and local feature fusion
Pun et al. A two-stage localization for copy-move forgery detection
Tarawneh et al. Detailed investigation of deep features with sparse representation and dimensionality reduction in cbir: A comparative study
Sotoodeh et al. A novel adaptive LBP-based descriptor for color image retrieval
Kadam et al. Detection and localization of multiple image splicing using MobileNet V1
Xia et al. Exploiting deep features for remote sensing image retrieval: A systematic investigation
Alsmadi et al. Fish recognition based on robust features extraction from color texture measurements using back-propagation classifier
Kadam et al. [Retracted] Efficient Approach towards Detection and Identification of Copy Move and Image Splicing Forgeries Using Mask R‐CNN with MobileNet V1
CN104036012A (en) Dictionary learning method, visual word bag characteristic extracting method and retrieval system
Zeng et al. Curvature bag of words model for shape recognition
Sugamya et al. A CBIR classification using support vector machines
Wang et al. S 3 D: Scalable pedestrian detection via score scale surface discrimination
CN110674334B (en) Near-repetitive image retrieval method based on consistency region deep learning features
Chen et al. Instance retrieval using region of interest based CNN features
Lin et al. Scene recognition using multiple representation network
Tang et al. Geometrically robust video hashing based on ST-PCT for video copy detection
Dong et al. Multilayer convolutional feature aggregation algorithm for image retrieval
Unar et al. New strategy for CBIR by combining low‐level visual features with a colour descriptor
Chen et al. Image retrieval based on quadtree classified vector quantization
CN110110120B (en) Image retrieval method and device based on deep learning
CN107526772A (en) Image indexing system based on SURF BIT algorithms under Spark platforms
Song et al. Hierarchical deep hashing for image retrieval
Yousaf et al. Patch-CNN: deep learning for logo detection and brand recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210621

Address after: 518052 1111, building 2, aerospace building, 53 Gaoxin South 9th Road, high tech Zone community, Yuehai street, Nanshan District, Shenzhen City, Guangdong Province

Patentee after: Shenzhen Maidian Media Technology Co.,Ltd.

Address before: No.219, ningliu Road, Jiangbei new district, Nanjing, Jiangsu Province, 210032

Patentee before: NANJING University OF INFORMATION SCIENCE & TECHNOLOGY

TR01 Transfer of patent right