CN109409388B - Dual-mode deep learning descriptor construction method based on graphic primitives - Google Patents

Dual-mode deep learning descriptor construction method based on graphic primitives

Info

Publication number
CN109409388B
CN109409388B (application CN201811317282.2A)
Authority
CN
China
Prior art keywords
image
patch
primitive
training
class
Prior art date
Legal status
Active
Application number
CN201811317282.2A
Other languages
Chinese (zh)
Other versions
CN109409388A (en)
Inventor
丁新涛
左开中
汪金宝
接标
俞庆英
Current Assignee
Anhui Normal University
Original Assignee
Anhui Normal University
Priority date
Filing date
Publication date
Application filed by Anhui Normal University
Priority to CN201811317282.2A
Publication of CN109409388A
Application granted
Publication of CN109409388B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration

Abstract

The invention falls within the technical field of image registration and provides a dual-mode deep learning descriptor construction method based on graphic primitives. The method learns the attribute category of a patch image from labeled samples, learns the geometric characteristics of the patch image using graphic primitives, and fuses the attribute category and the geometric characteristics to obtain the feature vector of a local patch image, that is, a descriptor based on graphic primitives. Registration between patches is completed through the similarity of descriptor vectors, realizing classification characterization based on machine-learned descriptors. The method mainly comprises establishing a descriptor training set, constructing a multi-mode convolutional network, and training the category and geometric modes on a GPU, thereby achieving classification and registration of local patch images. The method addresses both the classification description of descriptors and their realization on a GPU.

Description

Dual-mode deep learning descriptor construction method based on graphic primitives
Technical Field
The invention belongs to the technical field of image registration and provides a dual-mode deep learning descriptor construction method based on graphic primitives.
Background
Feature matching is the mainstream approach to image registration. A classical image registration method adopts a local feature descriptor, which takes the image local region centered on a keypoint as its object and describes the region's features from the gray-scale information of the interior pixels, yielding a feature vector that expresses the local information around the image keypoint. However, classical descriptors are computationally expensive, difficult to apply in real-time systems, and unsuitable for mobile devices.
Disclosure of Invention
The embodiment of the invention provides a dual-mode deep learning descriptor construction method based on graphic primitives, aiming to solve the problems that classical descriptors are computationally expensive and difficult to apply in real-time systems.
In order to achieve the above object, the present invention provides a method for constructing a dual-mode deep learning descriptor based on graphic primitives, the method comprising the following steps:
S1, extracting keypoints p1i and p2i from image I1 and image I2 to form a keypoint set P1 and a keypoint set P2 respectively;
S2, intercepting patch images of all keypoints in the keypoint set P1 and the keypoint set P2, and forming the scale space of each keypoint's local region based on the patch images, wherein the patch images refer to N images of different sizes intercepted with a given keypoint as the center;
S3, scaling the patch images corresponding to the keypoints into the set size respectively to obtain normalized-scale patch images, and executing steps S4 and S5 simultaneously;
S4, inputting the normalized-scale patch images of the keypoints into a class detection model respectively, and outputting the classes of the normalized-scale patch images;
S5, marginalizing the normalized-scale patch images to obtain patch edge images, and inputting the patch edge images into a geometric detection model to obtain the geometric feature vectors of the patch edge images;
and S6, combining the class and the geometric feature vector of the same keypoint on patch images of different sizes to form the descriptor vector of each keypoint on the patch images of different sizes.
Further, after step S6, the method further includes:
S7, registering keypoints between image I1 and image I2, wherein for any pair of keypoints p1i and p2i in image I1 and image I2 the registration is specifically:

if there exist j1 and j2 such that

||D_j1(p1i) - D_j2(p2i)|| ≤ T,

then the keypoint p1i of image I1 and the keypoint p2i of image I2 match;

wherein T is a set distance threshold, D_j1(p1i) is the descriptor vector of the ith keypoint of image I1 at the j1-th scale, and D_j2(p2i) is the descriptor vector of the ith keypoint of image I2 at the j2-th scale.
Further, the method for constructing the class detection model used in step S4 is specifically as follows:
S31, constructing a class detection training set and a class detection verification set, both formed on the basis of classified labeled data;
S32, constructing a class classifier;
S33, training the class classifier on the class training set;
and S34, when the number of training iterations reaches a set threshold, verifying the trained class classifier on the class verification set, and when the error of the trained class classifier on the class verification set is within the allowable range or the number of iterations reaches the upper-limit threshold, stopping training, thereby forming the class detection model.
Further, the method for constructing the geometric detection model used in step S5 is specifically as follows:
S41, constructing an image primitive training set and an image primitive verification set, both formed on the basis of randomly generated combined images containing straight lines and circles;
S42, constructing a multi-dimensional primitive classifier for the image primitive training set;
S43, training the multi-dimensional primitive classifier on the image primitive training set;
and S44, when the number of training iterations reaches the set threshold, verifying the trained multi-dimensional primitive classifier on the image primitive verification set, and when the error of the trained multi-dimensional primitive classifier on the image primitive verification set is within the allowable range or the number of iterations reaches the upper-limit threshold, stopping training, thereby forming the geometric detection model.
Further, the method for constructing the class detection training set and the class detection verification set is specifically:
S311, downloading images from the target databases;
S312, selecting patch images of set sizes within the target region of each image according to the segmentation classification labels, wherein the number of selected patch images is one quarter of the product of the target region's row and column dimensions, and the frequency of the patch center coordinates obeys a two-dimensional Gaussian distribution centered at the region center;
and S314, placing the classification-labeled patch images into the class detection training set and the class detection verification set respectively according to a set proportion.
Further, the method for constructing the image primitive training set and the image primitive verification set is specifically:
S411, randomly composing combined images of a set size from straight lines and circles as primitives, and recording the number of straight lines n1, the number of circles n2, the number of intersections of lines and circles n3, the number of intersections between lines n4, and the acute angle θ at intersections between lines, on which basis the classifier vector v = (n1, n2, n3, n4, θ) is constructed;
S412, randomly adding noise to the combined image to form a primitive sample;
and S413, respectively dividing the primitive samples into an image primitive training set and an image primitive verification set based on the set proportion.
Further, patch images of different sizes are enlarged to a set size by a nearest neighbor interpolation method.
Further, the normalized-scale patch image is marginalized by a Sobel edge detection method to obtain the patch edge image.
Furthermore, a parallel network is built from dual ResNets, the normalized-scale patch image is input into the class detection model while the patch edge image is input into the geometric detection model, and the two models are multiplied at the fully connected layer to build the multiplicative detection model output.
The dual-mode deep learning descriptor construction method based on graphic primitives provided by the invention explores a GPU-computed descriptor classification method, addressing the large CPU computation burden of classical image registration methods. The method mainly comprises establishing a descriptor training set, constructing a multi-mode convolutional network, and training the category and geometric modes on a GPU, thereby achieving classification and registration of local patch images. The method addresses both the classification description of descriptors and their realization on the GPU.
Drawings
FIG. 1 is a flowchart illustrating a dual-mode deep learning descriptor construction method based on graphics primitives according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of the dual-mode construction method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention learns the attribute category of the patch image from labeled samples, learns the geometric characteristics of the patch image using graphic primitives, and fuses the attribute category and the geometric characteristics to obtain the feature vector of the local patch image, that is, a descriptor based on graphic primitives. Registration between patches is completed through the similarity of descriptor vectors, realizing classification characterization based on machine-learned descriptors.
Fig. 1 is a schematic flowchart of a method for constructing a dual-mode deep learning descriptor based on image primitives according to an embodiment of the present invention, and as can be seen from fig. 1, the method includes the following steps:
S1, extracting keypoints from image I1 and image I2 to respectively constitute a keypoint set P1 and a keypoint set P2;
keypoints of image I1 and image I2 are extracted by the OD-SIFT method. Suppose image I1 has n1 keypoints; the keypoint set P1 is expressed as P1 = {p1i | p1i = (x1i, y1i), i = 1, 2, …, n1}, where (x1i, y1i) denotes the coordinates of the ith keypoint p1i in the keypoint set P1. Suppose image I2 has n2 keypoints; the keypoint set P2 is expressed as P2 = {p2i | p2i = (x2i, y2i), i = 1, 2, …, n2}, where (x2i, y2i) denotes the coordinates of the ith keypoint p2i in the keypoint set P2.
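For concreteness, a minimal sketch of S1 follows. OD-SIFT, the patent's detector, has no public implementation, so OpenCV's standard SIFT is substituted as a stand-in; the detector choice and the n_features cap are assumptions, not the patent's prescription.

```python
import cv2

def extract_keypoint_sets(img1_path, img2_path, n_features=500):
    """Build keypoint sets P1 and P2 for images I1 and I2 (step S1).

    Standard SIFT stands in for the patent's OD-SIFT detector.
    """
    sift = cv2.SIFT_create(nfeatures=n_features)
    keypoint_sets = []
    for path in (img1_path, img2_path):
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        kps = sift.detect(img, None)
        # P = {p_i = (x_i, y_i)}: integer pixel coordinates of each keypoint
        keypoint_sets.append([(int(k.pt[0]), int(k.pt[1])) for k in kps])
    return keypoint_sets  # [P1, P2]
```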
S2, intercepting patch images of all keypoints in the keypoint set P1 and the keypoint set P2, and forming the scale space of each keypoint's local region based on the patch images;
taking a given keypoint as the center, N images of different scales are intercepted; these N images of different scales constitute the patch images of the keypoint.
The embodiment of the invention takes N = 7 as an example to illustrate the patch images and the local-region scale space of the keypoint p1i. With the keypoint p1i as the center, rectangular regions of 32×32, 28×28, 24×24, 20×20, 16×16, 12×12, and 8×8 pixels are intercepted in turn to form 7 patch images of different sizes, and the 7 patch images together constitute the scale space of the local region of the keypoint p1i. For p1i ∈ I1, the corresponding scale patch images are S(p1i) = {S_j(p1i) | j = 1, 2, …, 7}, wherein S_1(p1i) is the 32×32 rectangular patch region, and so on, down to S_7(p1i), the 8×8 rectangular patch region. Based on this method, the patch images and the local-region scale spaces of all keypoints in the keypoint set P1 and the keypoint set P2 can be constructed.
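A sketch of the S2 cropping follows, assuming the 7 scales of the embodiment; replicate-padding at the image border is an assumption, since the patent does not specify border handling.

```python
import cv2

PATCH_SIZES = (32, 28, 24, 20, 16, 12, 8)  # the N = 7 scales of the embodiment

def crop_scale_space(img, keypoint, sizes=PATCH_SIZES):
    """Crop the centered square patches S(p) = {S_1(p), ..., S_7(p)} (step S2)."""
    pad = max(sizes) // 2
    padded = cv2.copyMakeBorder(img, pad, pad, pad, pad, cv2.BORDER_REPLICATE)
    x, y = keypoint[0] + pad, keypoint[1] + pad  # shift into padded coordinates
    patches = []
    for s in sizes:
        half = s // 2
        patches.append(padded[y - half:y - half + s, x - half:x - half + s])
    return patches
```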
S3, scaling the patch images to the set size to obtain normalized-scale patch images, and executing steps S4 and S5;
S4, inputting the normalized-scale patch images into the class detection model respectively, and outputting the class of each normalized-scale patch image;
The patch images of different sizes are scaled to the set size by nearest-neighbor interpolation, which is specified as follows: to interpolate at a point p, let the value at p equal the value of the point closest to it. Taking 32×32 as the set size, for p1i ∈ I1 the corresponding normalized-scale patch images are N(p1i) = {N_j(p1i) | j = 1, 2, …, 7}, wherein N_1(p1i) is the 32×32 patch image, used without stretching, N_2(p1i) is the 28×28 rectangular patch region stretched into a 32×32 patch image, and so on, down to N_7(p1i), the 8×8 rectangular patch region stretched into a 32×32 patch image. N(P1) and N(P2) respectively denote all normalized-scale patch images obtained from image I1 and image I2, where N(P1) = {N(p1i) | i = 1, 2, …, n1} and N(P2) = {N(p2i) | i = 1, 2, …, n2}. Inputting N(P1) and N(P2) into the class detection model respectively yields the outputs C(P1) = {C(p1i) | i = 1, 2, …, n1} and C(P2) = {C(p2i) | i = 1, 2, …, n2}, wherein C(p1i) = {C_j1(p1i) | j1 = 1, 2, …, 7} and C_j1(p1i) denotes the class obtained on N_j1(p1i), the patch at the j1-th scale of the ith keypoint of image I1; C(p2i) is obtained based on a similar method.
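The normalization of S3 reduces to a nearest-neighbor resize; a minimal sketch:

```python
import cv2

SET_SIZE = 32  # the set size used in the embodiment

def normalize_scale_space(patches, size=SET_SIZE):
    """Stretch each S_j(p) to the set size by nearest-neighbor interpolation,
    giving N(p) = {N_1(p), ..., N_7(p)} (step S3)."""
    return [cv2.resize(p, (size, size), interpolation=cv2.INTER_NEAREST)
            for p in patches]
```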
S5, performing marginalization standardization scale patch image to obtain patch marginalized image, and inputting the patch marginalized image into a geometric detection model to obtain a geometric feature vector of the patch marginalized image;
Let E(P1) = {E(p1i) | i = 1, 2, …, n1} and E(P2) = {E(p2i) | i = 1, 2, …, n2} respectively be all patch edge images obtained from image I1 and image I2. For N(p1i), the corresponding edge images are E(p1i) = {E_j(p1i) | j = 1, 2, …, 7}, wherein E_1(p1i) is the image obtained by applying Sobel edge detection to N_1(p1i), and so on, down to E_7(p1i), the image obtained by applying Sobel edge detection to N_7(p1i). Inputting E(P1) and E(P2) into the geometric detection model respectively yields the outputs G(P1) = {G(p1i) | i = 1, 2, …, n1} and G(P2) = {G(p2i) | i = 1, 2, …, n2}, wherein G(p1i) = {G_j1(p1i) | j1 = 1, 2, …, 7} and G_j1(p1i) is the geometric feature vector obtained on E_j1(p1i), the edge image of the patch N_j1(p1i) at the j1-th scale of the ith keypoint of image I1:

G_j1(p1i) = (n1, n2, n3, n4, θ),

wherein n1 is the number of straight lines in the edge image E_j1(p1i), n2 is the number of circles, n3 is the number of intersections of lines and circles, n4 is the number of intersections between lines, and θ is the acute angle at intersections between lines; G(p2i) is obtained based on a similar method.
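A sketch of the marginalization in S5 follows; the patent names only the Sobel method, so the kernel size and the binarization threshold are assumptions.

```python
import cv2
import numpy as np

def marginalize(normalized_patches, threshold=128):
    """Apply Sobel edge detection to each N_j(p) to obtain E_j(p) (step S5)."""
    edge_patches = []
    for patch in normalized_patches:
        gx = cv2.Sobel(patch, cv2.CV_64F, 1, 0, ksize=3)  # horizontal gradient
        gy = cv2.Sobel(patch, cv2.CV_64F, 0, 1, ksize=3)  # vertical gradient
        mag = cv2.magnitude(gx, gy)
        edge_patches.append((mag > threshold).astype(np.uint8) * 255)
    return edge_patches
```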
A parallel network is built from dual ResNets: the normalized-scale patch image is input into the class detection model while the patch edge image is input into the geometric detection model, and the two models are multiplied at the fully connected layer to build the multiplicative detection model output, as shown in Fig. 2.
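A PyTorch sketch of the dual-ResNet parallel network follows. The patent fixes only the dual-ResNet structure and the multiplicative fusion at the fully connected layer; the branch depth (ResNet-18), the fused feature width, the head dimensions, and the single-channel stems are all assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class DualModeNet(nn.Module):
    """Parallel dual-ResNet sketch: the class branch reads the normalized
    patch N_j(p), the geometric branch reads the edge patch E_j(p), and the
    two fully connected outputs are fused by element-wise multiplication."""

    def __init__(self, num_classes=100, geo_dim=5, fused_dim=128):
        super().__init__()
        self.class_branch = models.resnet18(weights=None)
        self.geo_branch = models.resnet18(weights=None)
        for branch in (self.class_branch, self.geo_branch):
            # single-channel stems, since patches and edge maps are gray-scale
            branch.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2,
                                     padding=3, bias=False)
            branch.fc = nn.Linear(branch.fc.in_features, fused_dim)
        self.class_head = nn.Linear(fused_dim, num_classes)  # patch class
        self.geo_head = nn.Linear(fused_dim, geo_dim)  # (n1, n2, n3, n4, theta)

    def forward(self, patch, edge_patch):
        c = self.class_branch(patch)      # class-mode features
        g = self.geo_branch(edge_patch)   # geometric-mode features
        fused = c * g                     # multiplicative fusion at the FC layer
        return self.class_head(fused), self.geo_head(fused)

# example: logits, geo = DualModeNet()(torch.randn(8, 1, 32, 32),
#                                      torch.randn(8, 1, 32, 32))
```

Element-wise multiplication forces the two modes to agree: a feature survives into the fused vector only when both the class branch and the geometric branch respond to it.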
S6, combining the class and the geometric feature vector of the same keypoint on patch images of different sizes to form the descriptor vector of each keypoint on the patch images of different sizes;
the descriptor vectors are D(P1) = {D(p1i) | i = 1, 2, …, n1}, wherein D(p1i) = {D_j1(p1i) | j1 = 1, 2, …, 7} and D_j1(p1i), formed by combining C_j1(p1i) and G_j1(p1i), is the descriptor vector obtained at the j1-th scale of the ith keypoint of image I1; D(P2) is acquired by a similar method, with D(P2) = {D(p2i) | i = 1, 2, …, n2}, wherein D(p2i) = {D_j2(p2i) | j2 = 1, 2, …, 7} and D_j2(p2i) is the descriptor vector obtained at the j2-th scale of the ith keypoint of image I2. Thus, the keypoint p1i or the keypoint p2i generates 7 descriptors over its scale space.
In the embodiment of the present invention, the registration of the images is determined by the similarity of the descriptor vectors; after step S6, the method further includes:
S7, registering keypoints between image I1 and image I2, wherein for any pair of keypoints p1i and p2i in image I1 and image I2 the registration is specifically: if there exist j1 and j2 such that

||D_j1(p1i) - D_j2(p2i)|| ≤ T,

then the keypoint p1i of image I1 and the keypoint p2i of image I2 match, wherein T is a set distance threshold, D_j1(p1i) is the descriptor vector of the ith keypoint of image I1 at the j1-th scale, and D_j2(p2i) is the descriptor vector of the ith keypoint of image I2 at the j2-th scale. In the embodiment of the invention, if the descriptor vectors D_j1(p1i) and D_j2(p2i) have different lengths, the shorter vector is extended before the distance comparison.
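The matching criterion of S7 can be sketched as follows; Euclidean distance, zero-padding as the extension of the shorter vector, the default threshold T, and one-to-one matching are all assumptions where the patent only states that the shorter vector is extended.

```python
import numpy as np

def _descriptor_distance(d1, d2):
    """Pad the shorter vector with zeros, then compare by Euclidean distance."""
    n = max(len(d1), len(d2))
    a = np.pad(np.asarray(d1, float), (0, n - len(d1)))
    b = np.pad(np.asarray(d2, float), (0, n - len(d2)))
    return float(np.linalg.norm(a - b))

def match_keypoints(D1, D2, T=0.5):
    """Match keypoints of I1 and I2 (step S7): p1i and p2i match if some
    scale pair (j1, j2) satisfies ||D_j1(p1i) - D_j2(p2i)|| <= T.

    D1, D2: per-keypoint lists of per-scale descriptor vectors.
    """
    matches = []
    for i, scales1 in enumerate(D1):
        for k, scales2 in enumerate(D2):
            if any(_descriptor_distance(d1, d2) <= T
                   for d1 in scales1 for d2 in scales2):
                matches.append((i, k))
                break  # keep the first match for keypoint i
    return matches
```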
In the embodiment of the present invention, the method for constructing the class detection model used in step S4 specifically includes:
s31, constructing a class detection training set and a class detection verification set;
in the embodiment of the present invention, the method for constructing the class detection training set and the class detection verification set specifically includes the following steps:
S311, downloading images from the target databases;
the target databases involved in the present invention include: COCO, Pascal VOC, Indoor Scene Recognition, Cifar-100, Downsampled ImageNet, and the Tiny Images database. These are standard databases for deep learning on which many researchers have reported strong accuracy, so results obtained on them can be compared with those of other researchers.
S312, merging the same categories of images across the different target databases, namely fusing the target categories according to the classification labels of the images in all the databases.
S313, selecting patch images of set sizes within the target region of each image according to the segmentation classification labels, wherein the number of selected patch images is one quarter of the product of the target region's row and column dimensions, and the frequency of the patch center coordinates follows a two-dimensional Gaussian distribution centered at the region center;
and S314, placing the classification-labeled patch images into the class detection training set and the class detection verification set respectively according to a set proportion.
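A sketch of the Gaussian patch-center sampling in S313 follows; the patent fixes the count (rows x cols / 4) and the distribution's center, while the standard deviation is an assumption.

```python
import numpy as np

def sample_patch_centers(region_rows, region_cols, sigma_frac=0.25, seed=None):
    """Sample (rows * cols) // 4 patch centers from a 2-D Gaussian centered
    on the target region's center (step S313)."""
    rng = np.random.default_rng(seed)
    n = (region_rows * region_cols) // 4
    centers = rng.normal(
        loc=(region_rows / 2.0, region_cols / 2.0),
        scale=(region_rows * sigma_frac, region_cols * sigma_frac),
        size=(n, 2))
    # clamp so every sampled center stays inside the region
    centers[:, 0] = np.clip(centers[:, 0], 0, region_rows - 1)
    centers[:, 1] = np.clip(centers[:, 1], 0, region_cols - 1)
    return centers.astype(int)
```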
S32, constructing a category classifier;
s33, training a class classifier through a class training set;
and S34, when the number of training iterations reaches a set threshold (for deep training, typically on the order of 450 thousand), verifying the trained class classifier on the class verification set; if the error of the trained class classifier on the class verification set exceeds the allowable range, returning to step S33; if the error is within the allowable range or the number of iterations reaches the upper-limit threshold, stopping training, thereby forming the class detection model.
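A training-loop sketch for S33 and S34 follows, assuming a single-input classifier and standard PyTorch data loaders; the iteration threshold, upper limit, and error tolerance are illustrative values.

```python
import torch

def validation_error(model, val_loader):
    """Fraction of misclassified patches on the verification set."""
    model.eval()
    wrong = total = 0
    with torch.no_grad():
        for patches, labels in val_loader:
            wrong += (model(patches).argmax(dim=1) != labels).sum().item()
            total += labels.numel()
    model.train()
    return wrong / total

def train_class_classifier(model, train_loader, val_loader, loss_fn, optimizer,
                           check_every=450_000, max_iters=900_000, tol=0.05):
    """Steps S33-S34: train, validate every `check_every` iterations, and stop
    when the verification error is within tolerance or `max_iters` is hit."""
    iters = 0
    while iters < max_iters:
        for patches, labels in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(patches), labels)
            loss.backward()
            optimizer.step()
            iters += 1
            if (iters % check_every == 0
                    and validation_error(model, val_loader) <= tol):
                return model  # error within the allowable range
            if iters >= max_iters:
                break
    return model
```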
In the embodiment of the present invention, the method for constructing the geometric detection model used in step S5 specifically includes the following steps:
S41, constructing an image primitive training set and an image primitive verification set;
in the embodiment of the present invention, the method for constructing the image primitive training set and the image primitive verification set specifically includes the following steps:
S411, randomly composing combined images of a set size (for example, 32×32) from straight lines and circles as primitives, each combined image being generated under the control of 11 random parameters (k_l, p_start, p_end, k_c, p_c, r_c), wherein k_l and k_c respectively control the number of lines and circles in the image, p_start, p_end, p_c ∈ R² respectively control the start point and end point of a line and the center position of a circle, and r_c ∈ R controls the radius of a circle, with p_start = (x_start, y_start), p_end = (x_end, y_end), p_c = (x_c, y_c);
S412, adding random noise to the combined images to form primitive samples, the noise being divided into 6 types: Gaussian noise, Rayleigh noise, gamma noise, exponentially distributed noise, uniformly distributed noise, and salt-and-pepper noise, the parameters of each noise type being generated randomly;
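A sketch of the S412 noise corruption follows; the six noise families match the text, while all parameter ranges are illustrative assumptions.

```python
import numpy as np

def add_random_noise(img, rng=None):
    """Corrupt a combined image with one of the 6 noise types of step S412,
    drawing the noise parameters at random."""
    rng = rng or np.random.default_rng()
    f = img.astype(np.float64)
    kind = int(rng.integers(0, 6))
    if kind == 0:    # Gaussian
        f += rng.normal(0.0, rng.uniform(5, 25), img.shape)
    elif kind == 1:  # Rayleigh
        f += rng.rayleigh(rng.uniform(5, 20), img.shape)
    elif kind == 2:  # gamma (Erlang)
        f += rng.gamma(rng.uniform(1, 4), rng.uniform(2, 10), img.shape)
    elif kind == 3:  # exponential
        f += rng.exponential(rng.uniform(5, 20), img.shape)
    elif kind == 4:  # uniform
        f += rng.uniform(-30, 30, img.shape)
    else:            # salt and pepper
        mask = rng.random(img.shape)
        p = rng.uniform(0.01, 0.05)
        f[mask < p] = 0
        f[mask > 1 - p] = 255
    return np.clip(f, 0, 255).astype(np.uint8)
```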
and S413, respectively dividing the primitive samples into an image primitive training set and an image primitive verification set according to a set proportion (such as a proportion of 3: 2).
S42, constructing the multi-dimensional primitive classifier v = (n1, n2, n3, n4, θ) for the image primitive training set;
S43, training the multi-dimensional primitive classifier through an image primitive training set;
S44, when the number of training iterations reaches the set threshold, verifying the trained multi-dimensional primitive classifier on the image primitive verification set; if the error of the trained multi-dimensional primitive classifier on the image primitive verification set exceeds the allowable range, returning to step S43; if the error is within the allowable range or the number of iterations reaches the upper-limit threshold, stopping training, thereby forming the geometric detection model.
The invention provides a dual-mode deep learning descriptor construction method based on graphic primitives, which explores a GPU-computed descriptor classification method to address the large CPU computation burden of classical image registration methods. The method mainly comprises establishing a descriptor training set, constructing a multi-mode convolutional network, and training the category and geometric modes on a GPU, thereby achieving classification and registration of local patch images. The method addresses both the classification description of descriptors and their realization on the GPU.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (9)

1. A dual-mode deep learning descriptor construction method based on graphic primitives, which is characterized by comprising the following steps:
S1, extracting keypoints p1i and p2i from image I1 and image I2 to form a keypoint set P1 and a keypoint set P2 respectively;
S2, intercepting patch images of all keypoints in the keypoint set P1 and the keypoint set P2, and forming the scale space of each keypoint's local region based on the patch images, wherein the patch images refer to N images of different sizes intercepted with a given keypoint as the center;
S3, scaling the patch images corresponding to the keypoints into the set size respectively to obtain normalized-scale patch images, and executing steps S4 and S5 simultaneously;
S4, inputting the normalized-scale patch images of the keypoints into a class detection model respectively, and outputting the classes of the normalized-scale patch images;
S5, marginalizing the normalized-scale patch images to obtain patch edge images, and inputting the patch edge images into a geometric detection model to obtain the geometric feature vectors of the patch edge images;
and S6, combining the class and the geometric feature vector of the same keypoint on patch images of different sizes to form the descriptor vector of each keypoint on the patch images of different sizes.
2. A dual-mode graphics primitive-based deep learning descriptor construction method as claimed in claim 1, further comprising after step S6:
S7, registering keypoints between image I1 and image I2, wherein for any pair of keypoints p1i and p2i in image I1 and image I2 the registration is specifically:

if there exist j1 and j2 such that

||D_j1(p1i) - D_j2(p2i)|| ≤ T,

then the keypoint p1i of image I1 and the keypoint p2i of image I2 match;

wherein T is a set distance threshold, D_j1(p1i) is the descriptor vector of the ith keypoint of image I1 at the j1-th scale, and D_j2(p2i) is the descriptor vector of the ith keypoint of image I2 at the j2-th scale.
3. The dual-mode deep learning descriptor construction method based on graphic primitives as claimed in claim 1, wherein the class detection model construction method in step S4 is specifically as follows:
S31, constructing a class detection training set and a class detection verification set, both formed on the basis of classified labeled data;
S32, constructing a class classifier;
S33, training the class classifier on the class training set;
and S34, when the number of training iterations reaches a set threshold, verifying the trained class classifier on the class verification set, and when the error of the trained class classifier on the class verification set is within the allowable range or the number of iterations reaches the upper-limit threshold, stopping training, thereby forming the class detection model.
4. The method for constructing dual-mode deep learning descriptor based on graphic primitives as claimed in claim 1, wherein the geometric detection model construction method in step S5 is specifically as follows:
S41, constructing an image primitive training set and an image primitive verification set, both formed on the basis of randomly generated combined images containing straight lines and circles;
S42, constructing a multi-dimensional primitive classifier for the image primitive training set;
S43, training the multi-dimensional primitive classifier on the image primitive training set;
and S44, when the number of training iterations reaches the set threshold, verifying the trained multi-dimensional primitive classifier on the image primitive verification set, and when the error of the trained multi-dimensional primitive classifier on the image primitive verification set is within the allowable range or the number of iterations reaches the upper-limit threshold, stopping training, thereby forming the geometric detection model.
5. The method for constructing dual-mode deep learning descriptor based on graphic primitives as claimed in claim 3, wherein the method for constructing said class detection training set and said class detection verification set is as follows:
S311, downloading images from the target databases;
S312, selecting patch images of set sizes within the target region of each image according to the segmentation classification labels, wherein the number of selected patch images is one quarter of the product of the target region's row and column dimensions, and the frequency of the patch center coordinates obeys a two-dimensional Gaussian distribution centered at the region center;
and S314, placing the classification-labeled patch images into the class detection training set and the class detection verification set respectively according to a set proportion.
6. The method as claimed in claim 4, wherein the method for constructing the training set and the verification set of image primitives comprises the following steps:
S411, randomly composing combined images of a set size from straight lines and circles as primitives, and recording the number of straight lines n1, the number of circles n2, the number of intersections of lines and circles n3, the number of intersections between lines n4, and the acute angle θ at intersections between lines;
S412, randomly adding noise to the combined image to form a primitive sample;
and S413, respectively dividing the primitive samples into an image primitive training set and an image primitive verification set based on the set proportion.
7. A dual-mode graphics primitive-based deep learning descriptor construction method as claimed in claim 1 wherein patch images of different sizes are enlarged to a set size by nearest neighbor interpolation.
8. The dual-mode deep learning descriptor construction method based on graphic primitives as claimed in claim 1, wherein the normalized-scale patch image is marginalized by a Sobel edge detection method to obtain the patch edge image.
9. The dual-mode deep learning descriptor construction method based on graphic primitives as claimed in claim 1, wherein a parallel network is constructed from dual ResNets, the normalized-scale patch image is input into the class detection model, the patch edge image is input into the geometric detection model, and the class detection model and the geometric detection model are multiplied at the fully connected layer to construct the multiplicative detection model output.
CN201811317282.2A 2018-11-07 2018-11-07 Dual-mode deep learning descriptor construction method based on graphic primitives Active CN109409388B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811317282.2A CN109409388B (en) 2018-11-07 2018-11-07 Dual-mode deep learning descriptor construction method based on graphic primitives


Publications (2)

Publication Number Publication Date
CN109409388A CN109409388A (en) 2019-03-01
CN109409388B true CN109409388B (en) 2021-08-27

Family

ID=65471796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811317282.2A Active CN109409388B (en) 2018-11-07 2018-11-07 Dual-mode deep learning descriptor construction method based on graphic primitives

Country Status (1)

Country Link
CN (1) CN109409388B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110084150B (en) * 2019-04-09 2021-05-11 山东师范大学 Automatic white blood cell classification method and system based on deep learning
CN110441329B (en) * 2019-08-12 2022-02-15 广东工业大学 Laser welding defect identification method, device and equipment based on deep learning
CN111049125B (en) * 2019-09-24 2021-07-30 安徽师范大学 Electric vehicle intelligent access control method based on machine learning
CN113537371B (en) * 2021-07-22 2023-03-17 苏州大学 Epithelial cell classification method and system integrating two stages of edge features


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6018349A (en) * 1997-08-01 2000-01-25 Microsoft Corporation Patch-based alignment method and apparatus for construction of image mosaics
CN101216932A (en) * 2008-01-03 2008-07-09 威盛电子股份有限公司 Methods of graphic processing arrangement, unit and execution triangle arrangement and attribute arrangement
CN102254303A (en) * 2011-06-13 2011-11-23 河海大学 Methods for segmenting and searching remote sensing image
WO2015060897A1 (en) * 2013-10-22 2015-04-30 Eyenuk, Inc. Systems and methods for automated analysis of retinal images
CN103839074A (en) * 2014-02-24 2014-06-04 西安电子科技大学 Image classification method based on matching of sketch line segment information and space pyramid
CN108022270A (en) * 2016-11-03 2018-05-11 奥多比公司 The image patch sampled using the probability based on prophesy is matched
CN108230268A (en) * 2016-12-21 2018-06-29 达索系统公司 Completion is carried out to image

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
惠国保, "Research on Image Feature Matching Technology in CNC Vision Systems and Its Applications," Master's thesis repository, Dec. 31, 2014, pp. 1-154.
Jinbao Wang et al., "Single Image Dehazing Based on the Physical Model and MSRCR Algorithm," IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 9, Sep. 30, 2018, pp. 2190-2199.
靳峰 et al., "A Fast and Accurate Image Registration Algorithm Using Spatial Sequence Descriptors," Journal of Xi'an Jiaotong University, vol. 48, no. 6, Jun. 30, 2014, pp. 19-24.

Also Published As

Publication number Publication date
CN109409388A (en) 2019-03-01

Similar Documents

Publication Publication Date Title
CN109409388B (en) Dual-mode deep learning descriptor construction method based on graphic primitives
Ouyang et al. Copy-move forgery detection based on deep learning
CN103729885B (en) Various visual angles projection registers united Freehandhand-drawing scene three-dimensional modeling method with three-dimensional
JP2019514123A (en) Remote determination of the quantity stored in containers in geographical areas
CN109960742B (en) Local information searching method and device
US9619733B2 (en) Method for generating a hierarchical structured pattern based descriptor and method and device for recognizing object using the same
CN109558902A (en) A kind of fast target detection method
US10902053B2 (en) Shape-based graphics search
CN106778852A (en) A kind of picture material recognition methods for correcting erroneous judgement
CN103745201B (en) A kind of program identification method and device
CN111709980A (en) Multi-scale image registration method and device based on deep learning
CN104217459A (en) Spherical feature extraction method
Cai et al. A novel saliency detection algorithm based on adversarial learning model
Montserrat et al. Logo detection and recognition with synthetic images
CN112085835A (en) Three-dimensional cartoon face generation method and device, electronic equipment and storage medium
Álvarez et al. Junction assisted 3d pose retrieval of untextured 3d models in monocular images
Bui et al. A texture-based local soft voting method for vanishing point detection from a single road image
US20220092448A1 (en) Method and system for providing annotation information for target data through hint-based machine learning model
CN115358981A (en) Glue defect determining method, device, equipment and storage medium
CN103617616A (en) Affine invariant image matching method
CN114140551A (en) Expressway bifurcation merging point conjecture method and system based on track image
He et al. A computational fresco sketch generation framework
CN113011506A (en) Texture image classification method based on depth re-fractal spectrum network
Wang et al. SO-PERM: Pose Estimation and Robust Measurement for Small Objects
Minster et al. Geolocation for printed maps using line segment-based SIFT-like feature matching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant