CN114647754A - Hand-drawn image real-time retrieval method fusing image label information - Google Patents
- Publication number
- CN114647754A CN114647754A CN202210396360.2A CN202210396360A CN114647754A CN 114647754 A CN114647754 A CN 114647754A CN 202210396360 A CN202210396360 A CN 202210396360A CN 114647754 A CN114647754 A CN 114647754A
- Authority
- CN
- China
- Prior art keywords
- label
- image
- sketch
- sample
- distance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/53—Querying
- G06F16/535—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention belongs to the field of image retrieval, and particularly relates to a hand-drawn image real-time retrieval method fusing image label information, which comprises the following steps: extracting feature maps of the hand-drawn sketch and feature vectors of the real object images by adopting an improved neural network model; when the feature vector of the hand-drawn image is generated for retrieval, calculating the Euclidean distance vector D between the sketch branch and all images and taking the average value d_m of D as the label distance reference value; according to the input label and the pseudo labels P_c (the Softmax-processed probability values of the corresponding input label category stored in the database), weighting the distance d_m per database sample category to obtain the label-weighted distance value D_l; finally sorting the images in the database according to the sum of D and D_l and returning the top-k retrieved images. When the method is used to retrieve an early-stage sketch, information such as the color and features of the target image can also be used in the query, which greatly improves retrieval efficiency when stroke information is scarce.
Description
Technical Field
The invention belongs to the field of dynamic sketch retrieval, and particularly relates to a hand-drawn image real-time retrieval method fusing image label information.
Background
With the development of touch screens in recent years, sketch-based image retrieval, which can flexibly query natural images with unconstrained hand-drawn sketches, has received wide attention. Sketch retrieval can be classified into coarse-level sketch retrieval (CBIR) and fine-grained sketch retrieval (FG-SBIR) according to the retrieval granularity. Fine-grained sketch retrieval (FG-SBIR) matches images against the details of a hand-drawn sketch, aiming to retrieve a specific photo in the gallery. Much progress has been made in FG-SBIR research, but three problems in the sketching process prevent FG-SBIR from being widely used in practice: (1) users' drawing skills are uneven, so the drawn sketches vary greatly and retrieval efficiency is low; (2) the time required to draw a complete sketch must be considered, and sketch retrieval time should be reduced by retrieving the target image with as few strokes as possible; (3) sketches are abstract: during retrieval the target image is usually described only by simple lines, so the sample sketch contains only sparse black-and-white line information. Moreover, sketches are diverse: target images with small style differences (such as women's high-heeled shoes) have extremely similar contours, so their sketches are also extremely similar and cannot be distinguished from one another, which is another reason early-stage sketch retrieval is inefficient. In the traditional method, when a user searches for a commodity, the target image can be retrieved only from the lines; if the user wants, for example, a red chair, the desired content can be found only once the later-stage sketch is complete, because the line information contains no color information.
In summary, the prior art has the technical problems that: how to improve retrieval efficiency when stroke information is few.
Disclosure of Invention
To solve the above technical problem, the invention provides a hand-drawn image real-time retrieval method fusing image label information, which adds label information to the sketch retrieval framework fusing the sketch style, thereby increasing the information carried by a sketch and enhancing the early-stage efficiency of sketch retrieval. When retrieving an early-stage sketch, the user simultaneously queries with information such as the color and features of the target image, so retrieval efficiency can be greatly improved when stroke information is scarce.
A hand-drawn image real-time retrieval method fusing image label information comprises the following steps:
inputting a hand-drawn sketch and the label information of the target image into an improved neural network model trained on a training set, and retrieving in real time to obtain a retrieval result;
the improved neural network model comprises f1, f2, f3 and fex, wherein f1 is a pre-trained network, f2 is an attention layer, f3 is a dimension-reduction layer, and fex is a label extraction layer;
the training process of the improved neural network model comprises the following steps:
s1: constructing a training set, wherein the training set comprises an image set consisting of a plurality of images with their corresponding complete sketches, and an extended label set corresponding to the images, the extended label set consisting of all label information of the images;
s2: in each training step, selecting one image in the image set as the target image, and training the f1, f2 and f3 branches of the neural network model with the hand-drawn sketch corresponding to that image; fixing the f1 and f2 parameters after training; after training is completed, extracting the embedded vectors of all target images through f1, f2 and f3;
s3: inputting the target images in the image set into the trained f1 to obtain the feature map of each target image, and inputting the feature map into fex to predict the label of the image; training fex with a cross-entropy loss function according to the label information in the extended label set, and fixing its parameters after training;
s4: rendering the complete sketch of each image in the image set into a plurality of sketches according to the stroke order of drawing; after each sketch is rendered, the sketches form the sketch branch set of the image set, and the embedded vector of a sketch branch is extracted through f1, f2 and f3;
s5: calculating the error between the embedded vector of each picture in the sketch branch and the embedded vector of the target image with a triplet loss function, back-propagating the error with the objective of approaching the target image and moving away from non-target images, and adjusting the parameters of f3 in the model;
s6: obtaining the sketch branch of the next target image, and repeating steps S4-S6 until the model reaches the upper limit of training iterations.
Furthermore, the complete sketch of an image is rendered into N pictures according to the stroke order of drawing, and the N pictures form a sketch branch. Each picture in the sketch branch comprises the first through nth strokes of the complete sketch (1 ≤ n ≤ N), so the pictures differ in stroke count. Arranged in ascending order of the number of strokes, a sketch branch is S = {s1, s2, …, sn, …, sN}, where sn denotes the picture containing the first through nth strokes.
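The cumulative-stroke construction of a sketch branch described above can be illustrated with a minimal sketch (illustrative only, not part of the patent text; the function name and the list-of-strokes representation are assumptions):

```python
def sketch_branch(strokes):
    """Build a sketch branch S = {s1, ..., sN} from an ordered list of strokes.

    Picture s_n is the cumulative prefix containing strokes 1..n, so the
    pictures are naturally arranged in ascending order of stroke count.
    """
    return [strokes[:n] for n in range(1, len(strokes) + 1)]
```

For a three-stroke sketch, the branch contains three pictures holding one, two and three strokes respectively, matching the example given for fig. 3(a) below.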
Further, the images in the training set are given labels L = {l1, l2, …, ln, …, lN} for training the label extraction layer fex; the cross-entropy Loss expression is as follows:
Loss = -(1/N)·Σ_{n=1}^{N} Σ_{c=1}^{K} l_nc·log(p_nc)
wherein K represents the total number of categories contained in the label; N represents the total number of samples; n denotes the nth sample; p_nc represents the probability that sample n belongs to class c; l_nc is the correct probability label of class c for sample n.
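The cross-entropy loss over the extended label set can be sketched numerically as follows (illustrative only, not part of the claims; assumes one-hot labels l_nc and NumPy arrays):

```python
import numpy as np

def cross_entropy_loss(p, l, eps=1e-12):
    """Loss = -(1/N) * sum_n sum_c l_nc * log(p_nc).

    p: (N, K) array of predicted class probabilities p_nc
    l: (N, K) array of correct probability labels l_nc (one-hot)
    eps guards against log(0).
    """
    return float(-np.mean(np.sum(l * np.log(p + eps), axis=1)))
```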
further, calculating the error of the embedded vector of each picture in the sketch branch and the embedded vector of the target image by adopting a triple Loss function, wherein the expression of the triple Loss is as follows:
Loss=max(d(VSi,Vp)-d(VSi,Vn)+α,0)
wherein, VSiAn embedded vector representing the ith picture in the sketch branch; vpAn embedded vector representing the target image; vnAn embedded vector representing a random one of the images in the image set other than the target image; α is a constant; d is the Euclidean distance calculation.
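The triplet loss above can be sketched as follows (illustrative only, not part of the claims; the variable names mirror the expression, and the default margin α = 0.2 is an assumed value, not one the patent specifies):

```python
import numpy as np

def triplet_loss(v_si, v_p, v_n, alpha=0.2):
    """Loss = max(d(V_Si, V_p) - d(V_Si, V_n) + alpha, 0) with Euclidean d."""
    d_pos = float(np.linalg.norm(v_si - v_p))  # distance to the target image
    d_neg = float(np.linalg.norm(v_si - v_n))  # distance to a non-target image
    return max(d_pos - d_neg + alpha, 0.0)
```

Minimising this loss pulls the sketch embedding toward the target image and pushes it away from the random non-target image, which is exactly the back-propagation objective of step S5.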
Further, the user inputs the hand-drawn sketch and the label information of the target image, and real-time retrieval yields the final retrieval result through the following steps:
Step one: the sketch input by the user passes through the image distance network f1, f2, f3 to obtain the sketch embedding vector V_Si at step i;
Step two: V_Si is compared with the embedded vector V_p of each image in the database by Euclidean distance, yielding the distance vector D = {d1, d2, …, dn, …, dN};
Step three: the average d_m of the elements in the distance vector is calculated, the feature map output by f1 is input into fex to predict the label probability of the input sketch, and the label probability is processed with Softmax to obtain a pseudo label;
Step four: according to the relation between the pseudo label and the input label, the average value d_m of the elements in the distance vector is weighted to obtain the label-weighted distance value D_l;
Step five: an attenuation coefficient is assigned to the label-weighted distance, and the images in the database are sorted according to the sum of D and D_l to obtain the retrieval result.
Further, a convolutional neural network predicts the label probability of an image, and Softmax processing yields the set P_c = {p_1c, p_2c, …, p_nc, …, p_Nc} of probabilities that the N samples respectively belong to class c. Taking P_c as the pseudo label, the probability p_nc that sample n belongs to class c is expressed as:
p_nc = exp(V_nc) / Σ_{k=1}^{K} exp(V_nk)
wherein V_nc denotes the component of the probability vector for sample n belonging to class c; V_nk denotes the kth component of the label probability vector of sample n; K represents the total number of categories contained in the label; N represents the total number of samples; n denotes the nth sample; p_nc represents the probability that sample n belongs to class c.
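The Softmax computation of the pseudo label can be sketched as follows (illustrative only, not part of the claims; `logits` stands for the raw scores V_nk output by the label extraction layer):

```python
import numpy as np

def pseudo_label(logits):
    """p_nc = exp(V_nc) / sum_k exp(V_nk), computed row-wise per sample.

    Subtracting the row maximum stabilises the exponentials without
    changing the result.
    """
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)
```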
Further, the average value d_m of the elements in the distance vector is weighted according to the relation between the pseudo label and the input label to obtain the label-weighted distance value D_l. The maximum value Max(p_n) of the pseudo label of sample n gives the label class to which the sample belongs: if Max(p_n) > 0.8, sample n is marked as a trusted sample, otherwise as an untrusted sample. If Max(p_n) > 0.8 and the class is the same as the input label, sample n is a trusted positive sample; if Max(p_n) > 0.8 and the class differs from the input label, sample n is a trusted negative sample; otherwise it is an untrusted sample and its distance is not weighted. The label-weighted distance value D_l is calculated as follows:
wherein d_m represents the average of the elements in the distance vector; d_n represents the Euclidean distance between sample n and the sketch vector; N represents the total number of samples; ω_p < 0 is the trusted negative sample label weight; ω_n > 0 is the trusted positive sample label weight; p_n is the pseudo-label probability value of sample n.
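The trusted/untrusted classification rule above amounts to a threshold check on the pseudo-label maximum (a minimal sketch, not part of the claims; the function and return-value names are assumptions):

```python
import numpy as np

def classify_sample(p_n, input_label, threshold=0.8):
    """Classify sample n by its pseudo-label vector p_n against the input label.

    Max(p_n) > threshold marks the sample as trusted; its argmax class is then
    compared with the input label to decide positive vs negative. Untrusted
    samples are left unweighted.
    """
    c = int(np.argmax(p_n))  # label class to which the sample belongs
    if p_n[c] <= threshold:
        return "untrusted"
    return "trusted_positive" if c == input_label else "trusted_negative"
```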
Further, an attenuation coefficient is assigned to the label-weighted distance, and the images in the database are ranked according to the sum of D and D_l; the expression is as follows:
D_final = D + ω·D_l
wherein D is the distance vector between the sketch branch and all images; D_l is the label-weighted distance; D_final is the distance according to which the final sorting is performed; ω is the label-weighted distance weight, and ω gradually decreases as i increases, i.e., as the input sketch becomes more complete.
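The final ranking according to D_final = D + ω·D_l reduces to a weighted sum and a sort (illustrative only, not part of the claims; the top-k selection and function name are assumptions):

```python
import numpy as np

def rank_top_k(D, D_l, omega, k):
    """Sort database images by D_final = D + omega * D_l and return the
    indices of the k closest ones (smallest final distance first)."""
    d_final = np.asarray(D, dtype=float) + omega * np.asarray(D_l, dtype=float)
    return np.argsort(d_final)[:k].tolist()
```

As the sketch gains strokes, ω is decayed toward zero, so ranking is driven less by the label branch and more by the sketch distances themselves.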
The invention fuses image label information into early-stage sketch retrieval: images are retrieved with few strokes in the early stage according to the extended label set of the target image, and the sketch style is fused into the sketch retrieval framework, so the target image can be retrieved with the fewest sketch strokes, thereby reducing the early-stage retrieval time of hand-drawn sketches and improving retrieval efficiency.
Drawings
FIG. 1 is a diagram of a baseline model of the present invention;
FIG. 2 is a diagram of a deep neural network search framework model according to the present invention;
FIG. 3 is a schematic diagram of a sketch branch rendering process and a picture tag classification;
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A hand-drawn image real-time retrieval method fusing image label information, as shown in fig. 1-2, comprising:
acquiring the complete sketch of each target image and the label information of all target images, wherein the label information of all target images forms the extended label set; then rendering a complete sketch into N pictures according to the stroke order of drawing, the N rendered pictures forming a sketch branch, wherein each picture in the sketch branch comprises the first through nth strokes of the complete sketch (1 ≤ n ≤ N), so the pictures differ in stroke count; arranged in ascending order of stroke count, a sketch branch is S = {s1, s2, …, sn, …, sN}, where sn denotes the picture containing the first through nth strokes.
As shown in fig. 3, the QMUL-Shoe-V2 and QMUL-Chair-V2 data sets are selected from the image retrieval data sets for fine-grained sketch retrieval (FG-SBIR) as the training data sets of the current model. An image is selected from each of the QMUL-Shoe-V2 and QMUL-Chair-V2 data sets as a target image, the complete sketch and the label information of the image are obtained to form the extended label set, and the sketch branch of the target image is obtained by rendering according to the stroke order of the picture.
Specifically, as shown in the label information of fig. 3(b), volunteers with different drawing skills were recruited and asked to manually draw complete sketches according to the target images. According to the picture information in the two data sets, corresponding label information is attached to each picture, and the content of the label information is classified manually.
Specifically, as shown in the hand-drawn sketching process of fig. 3(a), a complete sketch is rendered into N pictures according to the completeness of the sketch; the N rendered pictures form the sketch branch, and each picture in the sketch branch comprises the first through nth strokes of the complete sketch. For example, the first picture in the sketch branch contains only the first stroke of the complete sketch, the second picture contains the first and second strokes, the third picture contains the first, second and third strokes, and so on.
The target images acquired from the QMUL-Shoe-V2 and QMUL-Chair-V2 data sets, together with the label information of the complete sketch image sets corresponding to the target images forming the extended label set, constitute the training set. During model training, the label information is used to train the label branch of the model; during retrieval, the label predicted by the label branch for an image is compared with the input label to assist retrieval. The hand-drawn sketch and the label information of the target image are input into the trained improved neural network model, and the target image is retrieved in real time;
the training process of the improved neural network model comprises the following steps:
s1: constructing a training set, wherein the training set comprises an image set consisting of a plurality of images with their corresponding complete sketches, and an extended label set corresponding to the images, the extended label set consisting of all label information of the images;
s2: in each training step, selecting one image in the image set as the target image, and training the f1, f2 and f3 branches of the neural network model with the hand-drawn sketch corresponding to that image; fixing the f1 and f2 parameters after training; after training is completed, extracting the embedded vectors of all target images through f1, f2 and f3;
s3: inputting the target images in the image set into the trained f1 to obtain the feature map of each target image, and inputting the feature map into fex to predict the label of the target image; training fex with a cross-entropy loss function according to the label information in the extended label set, and fixing its parameters after training;
s4: rendering the complete sketch of each image in the image set into a plurality of sketches according to the stroke order of drawing; after each sketch is rendered, the sketches form the sketch branch set of the image set, and the embedded vector of a sketch branch is extracted through f1, f2 and f3;
s5: calculating the error between the embedded vector of each picture in the sketch branch and the embedded vector of the target image with a triplet loss function, back-propagating the error with the objective of approaching the target image and moving away from non-target images, and adjusting the parameters of f3 in the model;
s6: obtaining the sketch branch of the next target image, and repeating steps S4-S6 until the model reaches the upper limit of training iterations.
In step S3 of the training process of the improved neural network model, training fex with a cross-entropy loss function according to the label information in the extended label set comprises: setting labels L = {l1, l2, …, ln, …, lN} for the images in the training set to train the label extraction layer fex; the cross-entropy Loss expression is as follows:
Loss = -(1/N)·Σ_{n=1}^{N} Σ_{c=1}^{K} l_nc·log(p_nc)
wherein K represents the total number of categories contained in the label; N represents the total number of samples; p_nc represents the probability that sample n belongs to class c; l_nc is the correct probability label of class c for sample n.
In step S5 of the training process of the improved neural network model, a triplet loss function is used to calculate the error between the embedded vector of each picture in the sketch branch and the embedded vector of the target image; the expression of the triplet Loss is:
Loss = max(d(V_Si, V_p) - d(V_Si, V_n) + α, 0)
wherein V_Si denotes the embedded vector of the ith picture in the sketch branch; V_p denotes the embedded vector of the target image; V_n denotes the embedded vector of a random image in the image set other than the target image; α is a constant; d is the Euclidean distance.
Preferably, each target image is input into f1, the label probability of the target image is predicted through fex, the label probability is processed with Softmax to obtain a pseudo label, and the pseudo label is stored in the database;
further, label probability of a convolutional neural network prediction image is adopted, and a probability vector set P of N samples respectively belonging to the category c is obtained through Softmax processingc={p1c,p2c,…,pnc,…,pNcH, mixing PcAs pseudo label, the probability p that a sample n belongs to class cncExpressed as:
wherein, VncA probability vector representing that sample n belongs to class c; vnkLabel class representing sample nA probability vector of the total; k represents the total number of categories contained in the label; n represents the total number of samples; n represents the nth sample; p is a radical ofncRepresenting the probability that sample n belongs to class c;
further, the input sketch passes through an image distance network f1、f2、f3Obtaining a sketch embedding vector V of the step iSiCalculating VSiWith the embedded vector V of each image in the databasepThe Euclidean distance of (D), obtain distance vector D ═ D1,d2,…,dn,…,dN}; taking the average value D of DmSelecting a probability value pseudo label P corresponding to the input label category processed by Softmax stored in a database as a label distance reference value according to the input labelc={p1c,p2c,…,pnc,…,pNcH, for distance dmWeighting to obtain a label weighted distance value DlMax (p), the maximum value of the pseudo label of the sample nn) For the label class to which the sample belongs, if Max (p)n)>0.8, marking the sample n as a credible sample, otherwise, marking the sample n as an untrusted sample; if the pseudo tag Max (p)n)>0.8 and the same as the input label, wherein the sample n is a credible positive sample; if the pseudo label Max (p)n)>0.8 and different from the input label, the sample n is a credible negative sample; otherwise, the distance is an untrusted sample, and the distance is not weighted; meanwhile, an attenuation coefficient is given to the weighted distance, so that the influence of the label on the retrieval result is reduced along with the increase of the steps, and finally the label is obtained according to D and DlThe sum sequences the images in the database, compares the label information of the pseudo label in the database with the label information of the target image, and obtains a retrieval result; the expression is as follows:
D_final = D + ω·D_l
wherein ω is the label-weighted distance weight, which gradually decreases as i increases, i.e., as the input sketch becomes more complete; ω_p < 0 is the trusted negative sample label weight; ω_n > 0 is the trusted positive sample label weight; d_m represents the average of the elements in the distance vector; D is the distance vector between the sketch branch and all images; D_l is the label-weighted distance; D_final is the distance according to which the final sorting is performed.
When no picture of a commodity is available and the commodity is difficult to describe in words, a user can hand-draw a sketch of the commodity on a touch-screen device from its mental image, while simultaneously inputting the features of the commodity to be retrieved (color, height, shape, etc.). The commodity sketch is rendered into sketch branches and input into the trained neural network model, which returns the k images most similar to the commodity sketch through the retrieval of the sketch branch and the label branch, improving retrieval efficiency when stroke information is scarce.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (7)
1. A hand-drawn image real-time retrieval method fusing image label information is characterized by comprising the following steps:
inputting a hand-drawn sketch and label information of a target image into the trained improved neural network model, and retrieving in real time to obtain a retrieval result;
the improved neural network model comprises1、f2、f3And fex,f1To pre-train the network, f2For the layer of attention, f3To lower the dimension layer, fexA label extraction layer;
the training process of the improved neural network model comprises the following steps:
s1: constructing a training set, wherein the training set comprises an image set consisting of a plurality of images with their corresponding complete sketches, and an extended label set corresponding to the images, the extended label set consisting of all label information of the images;
s2: selecting one image in the image set as the target image in each training step, and training the f1, f2 and f3 branches of the neural network model with the hand-drawn sketch corresponding to that image; fixing the f1 and f2 parameters after training; after training is completed, extracting the embedded vectors of all target images through f1, f2 and f3;
s3: inputting the target images in the image set into the trained f1 to obtain the feature map of each target image, inputting the feature map into fex to predict the label of the target image, training fex with a cross-entropy loss function according to the label information in the extended label set, and fixing its parameters after training;
s4: rendering the complete sketch of each image in the image set into a plurality of sketches according to the stroke order of drawing; after each sketch is rendered, the sketches form the sketch branch set of the image set, and the embedded vector of a sketch branch is extracted through f1, f2 and f3;
s5: calculating the error between the embedded vector of each picture in the sketch branch and the embedded vector of the target image with a triplet loss function, back-propagating the error with the objective of approaching the target image and moving away from non-target images, and adjusting the parameters of f3 in the model;
s6: obtaining the sketch branch of the next target image, and repeating steps S4-S6 until the model reaches the upper limit of training iterations.
2. The hand-drawn image real-time retrieval method fusing image label information according to claim 1, characterized in that labels L = {l1, l2, …, ln, …, lN} are set for the images in the training set to train the label extraction layer fex; the cross-entropy Loss expression is as follows:
Loss = -(1/N)·Σ_{n=1}^{N} Σ_{c=1}^{K} l_nc·log(p_nc)
wherein K represents the total number of categories contained in the label; N represents the total number of samples; n denotes the nth sample; p_nc represents the probability that sample n belongs to class c; l_nc is the correct probability label of class c for sample n.
3. The hand-drawn image real-time retrieval method fusing image label information according to claim 1, characterized in that a triplet loss function is used to calculate the error between the embedded vector of each picture in the sketch branch and the embedded vector of the target image; the expression of the triplet Loss is:
Loss = max(d(V_Si, V_p) - d(V_Si, V_n) + α, 0)
wherein V_Si denotes the embedded vector of the ith picture in the sketch branch; V_p denotes the embedded vector of the target image; V_n denotes the embedded vector of a random image in the image set other than the target image; α is a constant; d is the Euclidean distance.
4. The hand-drawn image real-time retrieval method fusing image label information according to claim 1, characterized in that inputting the hand-drawn sketch and the label information of the target image and retrieving in real time to obtain the final retrieval result comprises the following steps:
Step one: the sketch input by the user passes through the image distance network f1, f2, f3 to obtain the sketch embedding vector V_Si at step i;
Step two: V_Si is compared with the embedded vector V_p of each image in the database by Euclidean distance, yielding the distance vector D = {d1, d2, …, dn, …, dN};
Step three: the average of the elements in the distance vector is calculated, the feature map output by f1 is input into fex to predict the label probability of the input sketch, and the label probability is processed with Softmax to obtain a pseudo label;
Step four: according to the relation between the pseudo label and the input label, the average value d_m of the elements in the distance vector is weighted to obtain the label-weighted distance value D_l;
Step five: an attenuation coefficient is assigned to the label-weighted distance, and the images in the database are sorted according to the sum of D and D_l to obtain the retrieval result.
5. The method for retrieving the hand-drawn image fused with the image tag information in real time as claimed in claim 4, wherein tag probability of the image is predicted by using a convolutional neural network, and probability vector sets P of N samples respectively belonging to the category c are obtained by Softmax processingc={p1c,p2c,...,pnc,...,pNcH, mixing PcProbability p that a sample n belongs to class c as a pseudo labelncExpressed as:
wherein V_nc represents the network output of sample n for class c; V_nk represents the network output of sample n for class k; K represents the total number of categories contained in the label; N represents the total number of samples; n represents the n-th sample; p_nc represents the probability that sample n belongs to class c.
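The Softmax of claim 5 can be written directly; the row-wise max shift below is a standard numerical-stability trick and not part of the claim:

```python
import numpy as np

def pseudo_label_probs(V):
    """p_nc = exp(V_nc) / sum_{k=1..K} exp(V_nk) for an (N, K) score
    matrix V: N samples, K label classes."""
    e = np.exp(V - V.max(axis=1, keepdims=True))  # shift each row for stability
    return e / e.sum(axis=1, keepdims=True)
```

Each row of the result sums to one, so the maximum entry of a row can be compared directly against a credibility threshold such as the 0.8 used in claim 6.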
6. The method as claimed in claim 4, wherein, in weighting the average value d_m of the elements in the distance vector according to the relation between the pseudo label and the input label to obtain the label-weighted distance value D_l, the class with the maximum pseudo-label value Max(p_n) is taken as the label class to which sample n belongs; if Max(p_n) > 0.8, sample n is marked as a credible sample, otherwise it is marked as a non-credible sample; if Max(p_n) > 0.8 and the class is the same as the input label, sample n is a credible positive sample; if Max(p_n) > 0.8 and the class is different from the input label, sample n is a credible negative sample; otherwise, sample n is a non-credible sample and its distance is not weighted; the expression for calculating the label-weighted distance value D_l is:
D_l(n) = ω_p · d_m, if sample n is a credible positive sample;
D_l(n) = ω_n · d_m, if sample n is a credible negative sample;
D_l(n) = 0, otherwise
wherein d_m represents the average value of the elements in the distance vector; d_n represents the Euclidean distance between sample n and the sketch vector; N represents the total number of samples; D_l represents the label-weighted distance value; ω_p < 0 is the weight applied to credible positive samples; ω_n > 0 is the weight applied to credible negative samples; p_n represents the pseudo-label probability vector of sample n.
7. The method as claimed in claim 4, wherein an attenuation coefficient is assigned to the label-weighted distance and the images in the database are sorted according to the sum of D and D_l, the expression being:
D_final = D + ω · D_l
wherein D is the distance vector between the sketch branch and all the images; D_l is the label-weighted distance; D_final is the distance used for the final sorting; ω is the label-weighted distance weight, which gradually decreases as i increases, i.e., as the input sketch becomes more complete.
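One simple attenuation schedule consistent with claim 7 is a geometric decay; the rate 0.9 and the function name are assumptions, since the claim only requires that ω decrease as i (the completeness of the sketch) grows:

```python
def omega_schedule(i, omega0=1.0, decay=0.9):
    """Label-weight attenuation: omega shrinks as the stroke/step index i
    grows, so a more complete sketch relies more on the raw distance D."""
    return omega0 * decay ** i
```

Early in the drawing the label term dominates the ranking; as strokes accumulate, the embedding distance takes over.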
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210396360.2A CN114647754A (en) | 2022-04-15 | 2022-04-15 | Hand-drawn image real-time retrieval method fusing image label information |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114647754A true CN114647754A (en) | 2022-06-21 |
Family
ID=81996817
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210396360.2A Pending CN114647754A (en) | 2022-04-15 | 2022-04-15 | Hand-drawn image real-time retrieval method fusing image label information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114647754A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116310425A (en) * | 2023-05-24 | 2023-06-23 | Shandong University | Fine-grained image retrieval method, system, equipment and storage medium |
CN116310425B (en) * | 2023-05-24 | 2023-09-26 | Shandong University | Fine-grained image retrieval method, system, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109919108B (en) | Remote sensing image rapid target detection method based on deep hash auxiliary network | |
CN111191732B (en) | Target detection method based on full-automatic learning | |
CN110598029B (en) | Fine-grained image classification method based on attention transfer mechanism | |
US20220415027A1 (en) | Method for re-recognizing object image based on multi-feature information capture and correlation analysis | |
CN105701502B (en) | Automatic image annotation method based on Monte Carlo data equalization | |
CN107683469A (en) | A kind of product classification method and device based on deep learning | |
CN110633708A (en) | Deep network significance detection method based on global model and local optimization | |
CN112348036A (en) | Self-adaptive target detection method based on lightweight residual learning and deconvolution cascade | |
CN114841257B (en) | Small sample target detection method based on self-supervision comparison constraint | |
CN112668579A (en) | Weak supervision semantic segmentation method based on self-adaptive affinity and class distribution | |
CN111753828A (en) | Natural scene horizontal character detection method based on deep convolutional neural network | |
CN112115291B (en) | Three-dimensional indoor model retrieval method based on deep learning | |
Rad et al. | Image annotation using multi-view non-negative matrix factorization with different number of basis vectors | |
CN111061904A (en) | Local picture rapid detection method based on image content identification | |
CN111738113A (en) | Road extraction method of high-resolution remote sensing image based on double-attention machine system and semantic constraint | |
CN110287952A (en) | A kind of recognition methods and system for tieing up sonagram piece character | |
CN112347284A (en) | Combined trademark image retrieval method | |
CN110008899B (en) | Method for extracting and classifying candidate targets of visible light remote sensing image | |
CN111340034A (en) | Text detection and identification method and system for natural scene | |
CN110929746A (en) | Electronic file title positioning, extracting and classifying method based on deep neural network | |
CN115292532B (en) | Remote sensing image domain adaptive retrieval method based on pseudo tag consistency learning | |
CN116610778A (en) | Bidirectional image-text matching method based on cross-modal global and local attention mechanism | |
CN114510594A (en) | Traditional pattern subgraph retrieval method based on self-attention mechanism | |
CN112819837A (en) | Semantic segmentation method based on multi-source heterogeneous remote sensing image | |
CN115457332A (en) | Image multi-label classification method based on graph convolution neural network and class activation mapping |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||