CN104346456B - Digital-image multi-semantic annotation method based on spatial-dependence measurement - Google Patents
Digital-image multi-semantic annotation method based on spatial-dependence measurement
- Publication number
- CN104346456B CN104346456B CN201410599268.1A CN201410599268A CN104346456B CN 104346456 B CN104346456 B CN 104346456B CN 201410599268 A CN201410599268 A CN 201410599268A CN 104346456 B CN104346456 B CN 104346456B
- Authority
- CN
- China
- Prior art keywords
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5862—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using texture
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/26—Techniques for post-processing, e.g. correcting the recognition result
- G06V30/262—Techniques for post-processing, e.g. correcting the recognition result using context analysis, e.g. lexical, syntactic or semantic context
- G06V30/274—Syntactic or semantic context, e.g. balancing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Computational Linguistics (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
- Image Analysis (AREA)
Abstract
The invention belongs to the field of digital-image multi-semantic annotation methods, characterised by the following steps in order: (1) input some digital images with known semantics and all digital images to be annotated into a computer; (2) obtain the feature-vector set of all images by feature extraction; (3) construct the label vectors of the annotated images and the final label vectors of all images; (4) compute the Gram matrix of the feature-vector set; (5) use a spatial-dependence measure to obtain a measurement of the degree of dependence between the feature-vector set and the label-vector set; (6) step the dependence measure up to its maximum in an iterative process, obtaining the confidence that each image to be annotated belongs to each semantic class; (7) set a threshold and decide the semantics of the images to be annotated. The invention has the following advantages: (1) annotation quality is improved by exploiting large numbers of semantically unannotated images; (2) the method applies to the multi-semantic annotation case; (3) it runs fast.
Description
Technical field
The present invention relates to a semi-supervised multi-semantic annotation method for digital images based on spatial-dependence measurement, and belongs to the field of electronic information technology.
Background technology
Image semantic annotation aims to represent the semantic content of an image with semantic keywords; it is of great importance for image analysis, image understanding and image retrieval. Early image annotation required professionals to assign keywords manually according to the semantics of each image, which is time-consuming and subjective. To overcome these drawbacks of manual annotation, researchers have in recent years proposed many methods for automatically annotating image semantics, including generative-model approaches such as the translation model and the cross-media relevance model, and discriminative-model approaches such as asymmetric support vector machines and hierarchical classification. In essence, these methods can be regarded as a machine-learning process: a statistical classification model is learned and constructed on a data set of annotated images, and that model is then used to obtain the semantic classes of the images to be annotated.
Although the many automatic semantic-annotation techniques proposed so far provide a useful basis and premise for analysing and understanding massive image data, the technology still faces many bottleneck problems urgently awaiting solution. Among them, two classes of problems, image multi-semantics and the scarcity of annotated images, have attracted increasing attention from researchers. Image multi-semantics refers to the fact that a single image usually carries several different semantics: in landscape photography, one image may simultaneously contain themes such as "sky", "white clouds" and "grassland"; in medical imaging, one image may simultaneously contain information related to diseases such as "tumour" and "calculus". Conventional machine-learning methods, including nearest neighbours, decision trees, neural networks and support vector machines, are single-label learning methods and cannot be used directly for image semantic annotation in the multi-semantic case. The situation in which a single sample belongs to many classes is known in the machine-learning field as multi-label learning. At present, solutions to the multi-label learning problem include Binary Relevance, Classifier Chains, MLKNN and Rank-SVM. These methods are obtained from single-label methods by problem transformation or algorithm adaptation, and each has its strengths and weaknesses in practice.
Besides the multi-semantics problem, automatic semantic annotation also suffers from the scarcity of annotated images. The main cause of this problem is that acquiring annotated images generally consumes a great deal of manpower and material resources. In the multi-semantic case in particular, the number of annotated images per class decreases relatively as the number of semantic classes grows, so the problem becomes especially acute. Scarcity of annotated samples degrades the generalisation ability of the classification model and thereby the accuracy of semantic annotation. An effective way to solve this problem is to develop semi-supervised semantic-annotation methods. Although semi-supervised learning has developed considerably, with a variety of methods proposed including TSVM and graph-based semi-supervised learning, semi-supervised learning methods applicable to the multi-semantic (multi-label) learning problem are still rare.
Addressing the two problems above, the invention discloses a semi-supervised multi-semantic annotation method for digital images based on spatial-dependence measurement. Its theoretical foundation is the spatial-dependence measure: the dependence between the feature set and the semantic-class set is estimated using all samples, annotated and unannotated, with the annotated image samples serving as boundary constraints; the estimate is then stepped up to its maximum by an iterative technique, yielding all semantic classes of each image to be annotated. The invention achieves good technical effects. First, because the dependence measure is grounded in statistical theory, the accuracy of its estimation can be improved by increasing the number of samples, including unannotated samples; the invention is therefore a semi-supervised annotation method that uses unannotated images to improve annotation accuracy. Second, no matter how many semantic classes an image carries simultaneously, the invention treats the image's combination of semantics as a single point in the semantic set and maps it into a reproducing-kernel Hilbert space, so it is also a multi-semantic image-annotation method. Finally, the invention completes image annotation by iteration on the basis of the feasible-direction method, achieving computing speed comparable to the prior art.
The content of the invention
The object of the invention is to provide an accurate and efficient semi-supervised multi-semantic annotation method for digital images.
The technical scheme is as follows: receive some digital images with known semantics and all digital images to be annotated, extract image features to obtain a feature-vector set, construct the label vectors of the annotated images and the final label-vector set of all images, compute the Gram matrix of the feature-vector set, and obtain from the spatial-dependence measure the confidence that each image belongs to each semantic class together with the final semantics of each image. The method comprises the following steps:
Step 1: input some digital images with known semantics and all digital images requiring semantic annotation into a computer; convert all images to RGB format and normalise their sizes;
Step 2: extract the global texture features of the images with the Gist descriptor, converting all the digital images above into vectors, one column vector per image, and assemble these feature vectors into the set X = [x_1, x_2, ..., x_v, x_{v+1}, ..., x_{v+u}], where x_i (1 ≤ i ≤ v) corresponds to an annotated image and the remaining columns correspond to images to be annotated;
Step 3: let m be the total number of possible semantic classes; construct the initial label vector y_i^0 of each annotated image x_i (1 ≤ i ≤ v), an m-dimensional vector whose entries indicate which semantic classes x_i possesses. Let the m-dimensional vector y_i represent the final label vector of image x_i (1 ≤ i ≤ u+v), and construct the label-vector set Y = [y_1, y_2, ..., y_v, y_{v+1}, ..., y_{v+u}];
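As an illustration of step 3, the label vectors can be assembled into a matrix with one m-dimensional column per image. The sketch below is a non-authoritative aid: it assumes a binary 0/1 encoding of the initial label vectors and a hypothetical input format (a list of class-index lists for the v annotated images), neither of which is fixed by the text above.

```python
import numpy as np

def build_label_matrix(labels, m, u):
    """Assemble Y = [Y_V, Y_U], one m-dimensional column per image.

    labels : list of class-index lists for the v annotated images
             (hypothetical input format).
    m      : total number of semantic classes.
    u      : number of images to be annotated; their columns start at 0.
    """
    v = len(labels)
    Y = np.zeros((m, v + u))
    for i, classes in enumerate(labels):
        Y[classes, i] = 1.0  # assumed binary 0/1 encoding
    return Y

# Two annotated images (classes {0, 2} and {1}) plus two to annotate.
Y = build_label_matrix([[0, 2], [1]], m=3, u=2)
```

The zero columns for the images to be annotated are the part Y_U that step 6 later fills with confidence values.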
Step 4: select a kernel function k(x_i, x_j) on the feature-vector set X and compute the Gram matrix of X through the kernel function, denoted K;
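Step 4 can be sketched with the radial-basis kernel, one of the kernels the specification allows; `gamma` is a hypothetical bandwidth parameter, since the text names the kernel family without fixing its parameters.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    """Gram matrix K with K[i, j] = exp(-gamma * ||x_i - x_j||^2).

    X holds one column vector per image, as in step 2.
    gamma is an assumed bandwidth parameter, not specified by the patent.
    """
    sq = np.sum(X ** 2, axis=0)                      # ||x_i||^2 per column
    d2 = sq[:, None] + sq[None, :] - 2.0 * X.T @ X   # pairwise squared distances
    return np.exp(-gamma * np.maximum(d2, 0.0))      # clamp tiny negatives

# Three images with 2-dimensional feature vectors (columns).
X = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
K = rbf_gram(X)
```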
Step 5: use the spatial-dependence measure to obtain the measurement of the degree of dependence between the feature-vector set and the label-vector set as follows:
Q(Y) = Tr[Y H K H Yᵀ] / Tr[Y H Yᵀ]
where Tr[·] denotes the trace, H = I − (1/n)eeᵀ, I is the identity matrix, e is the n-dimensional vector whose elements are all 1, and n = v + u is the total number of image samples;
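A minimal sketch of the step-5 measure, transcribing the ratio Q(Y) = Tr[YHKHYᵀ]/Tr[YHYᵀ] that appears in the claims, assuming the usual centring matrix H = I − (1/n)eeᵀ; the toy K and Y are illustrative only.

```python
import numpy as np

def dependence_measure(K, Y):
    """Spatial-dependence ratio Q(Y) = Tr[Y H K H Y^T] / Tr[Y H Y^T].

    K : n x n Gram matrix of the feature vectors (step 4).
    Y : m x n label matrix (step 3).
    H is the centring matrix, assumed to be I - (1/n) e e^T.
    """
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    num = np.trace(Y @ H @ K @ H @ Y.T)
    den = np.trace(Y @ H @ Y.T)
    return num / den

K = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
Y = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
q = dependence_measure(K, Y)
```

Both traces are invariant to the constant component of the labels, which is what the centring by H removes.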
Step 6: ensuring that Y satisfies the conditions that Y_V remains equal to the known labels and ||Y_U||_F² ≤ τ, update the value of Y_U by an iterative technique, stepping Q(Y) up to its maximum, so as to obtain the confidence values Y_U with which the images to be annotated belong to each semantic class. Here Y = [Y_V, Y_U], where Y_V and Y_U are the first v and the last u columns of Y respectively, corresponding to the semantically known and unknown parts; ||·||_F is the Frobenius norm, and τ > 0 is a small constant given in advance, used to prevent an oversized Y_U from diminishing the contribution of Y_V to the dependence degree;
Step 7: for each image x_j to be annotated (v+1 ≤ j ≤ v+u), set the confidence threshold ε_j of that image to the average of all its semantic confidence values, i.e.:
ε_j = (1/m) Σ_i Y_U(i, j)
For any image x_j to be annotated (v+1 ≤ j ≤ v+u) and any given semantic class i (1 ≤ i ≤ m), if Y_U(i, j) > ε_j, the sample is judged to have the i-th semantic class; otherwise it is judged not to have the i-th semantic class.
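The per-image mean threshold of step 7 is a one-liner over the confidence matrix; the sketch below uses a toy 3-class, 2-image Y_U.

```python
import numpy as np

def label_by_mean_threshold(Y_U):
    """Step 7: image j receives class i iff Y_U[i, j] > eps_j,
    where eps_j is the mean of column j's confidence values."""
    eps = Y_U.mean(axis=0)            # one threshold per image to annotate
    return Y_U > eps[None, :]         # boolean m x u label matrix

Y_U = np.array([[0.9, 0.2],
                [0.5, 0.3],
                [0.1, 0.4]])
labels = label_by_mean_threshold(Y_U)
```

Because the threshold is a strict inequality against the column mean, at least one class is always rejected and at least one accepted, so every image receives a non-trivial subset of the m semantics.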
The kernel function in step 4 includes the radial-basis kernel, the linear kernel, the polynomial kernel and the sigmoid kernel.
The specific steps of step 6 include:
Step 6.1: write A = HKH and partition A and H into four blocks according to the annotated and to-be-annotated parts:
A = [ A_V  A_VU ; A_UV  A_U ],   H = [ H_V  H_VU ; H_UV  H_U ]
where A_V and H_V correspond to the annotated part of the images, and A_U and H_U to the part to be annotated, with A_UV = A_VUᵀ and H_UV = H_VUᵀ. Let f(Y_U) = Tr[Y A Yᵀ] and g(Y_U) = Tr[Y H Yᵀ], so that the trace ratio Q(Y) is converted into the function f(Y_U)/g(Y_U) of Y_U.
Step 6.2: set a threshold κ > 0, a very small number; randomly initialise Y_U^0 so that ||Y_U^0||_F² ≤ τ; let λ_b = f(Y_U^0)/g(Y_U^0).
Step 6.3: let F(Y_U) = f(Y_U) − λ_b g(Y_U) and solve for a new Y_U.
Step 6.4: let λ_a = λ_b and λ_b = f(Y_U)/g(Y_U).
Step 6.5: when λ_b − λ_a < κ, output Y_U; in each column Y_U(:, j) (j = 1, ..., u), the i-th number Y_U(i, j) represents the confidence that the j-th sample belongs to the i-th class. Otherwise jump to step 6.3, continuing to cycle through steps 6.3 to 6.5.
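Steps 6.2 to 6.5 have the shape of a Dinkelbach-style fractional-programming loop: fix λ, maximise f − λg, reset λ to the new ratio, and stop when λ grows by less than κ. The generic sketch below substitutes a toy one-dimensional grid search for the feasible-direction subproblem of step 6.3, purely to show the control flow; it is not the patent's solver.

```python
def dinkelbach_maximize_ratio(f, g, solve_parametric, y0, kappa=1e-3, max_iter=100):
    """Maximise f(y)/g(y) by repeatedly maximising F(y) = f(y) - lam*g(y)
    and updating lam to the new ratio, mirroring steps 6.2-6.5.

    solve_parametric(lam, y) returns a maximiser of f - lam*g; here it
    stands in for the feasible-direction subproblem of step 6.3.
    """
    y = y0
    lam_b = f(y) / g(y)                  # step 6.2
    for _ in range(max_iter):
        lam_a = lam_b
        y = solve_parametric(lam_b, y)   # step 6.3
        lam_b = f(y) / g(y)              # step 6.4
        if lam_b - lam_a < kappa:        # step 6.5: ratio has stopped growing
            break
    return y, lam_b

# Toy 1-D ratio (x^2 + 1)/(x + 2) on [0, 3]; a grid search plays the subproblem.
f = lambda x: x * x + 1.0
g = lambda x: x + 2.0
grid = [i * 0.01 for i in range(301)]
solve = lambda lam, _y: max(grid, key=lambda x: f(x) - lam * g(x))
x_star, ratio = dinkelbach_maximize_ratio(f, g, solve, y0=0.0)
```

On this toy problem the ratio is maximised at the right endpoint x = 3, where it equals 2, and the loop reaches it in two passes.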
The specific steps of step 6.3 include:
Step 6.3.1: specify a threshold δ > 0, a very small number, and let
M = A_U − λ_b H_U
N = 2 Y_V (A_VU − λ_b H_VU)
Construct F(Y_U) = Tr[Y_U M Y_Uᵀ] + Tr[N Y_Uᵀ]; the equivalent optimisation problem to solve is:
max_{Y_U} F(Y_U) = Tr[Y_U M Y_Uᵀ] + Tr[N Y_Uᵀ]   s.t.  ||Y_U||_F² ≤ τ
Step 6.3.2: let Y_U* = −(1/2) N M⁻¹ be the stationary point of F; by the KKT conditions, if ||Y_U*||_F² ≤ τ and Y_U* maximises F, then Y_U* is the optimal solution and is output as the new Y_U; otherwise the optimal solution lies on the boundary, and the method proceeds to the next step;
Step 6.3.3: initialise a point Y_U^c as the new starting point: if ||Y_U^0||_F² = τ, let Y_U^c = Y_U^0; otherwise randomly initialise Y_U^c so that it satisfies ||Y_U^c||_F² = τ;
Step 6.3.4: initialise w_2 as a constant and let w_1 = −w_2/2, where w_2 is used to represent the Frobenius norm of the next feasible direction;
Step 6.3.5: compute the feasible direction d at the current point Y_U^c; the direction d must keep the Frobenius norm of the next iterate consistent with that of Y_U^c while increasing the objective value fastest in that direction, and a feasible direction d meeting these two conditions can be computed as:
d = ( [∇F(Y_U^c)]ᵀ − ξ Y_U ) / η
where
∇F(Y_U^c) = 2 M (Y_U^c)ᵀ + Nᵀ
ξ = ( Tr[∇F(Y_U^c) Y_U^c] − w_1 η ) / Tr[Y_U^c (Y_U^c)ᵀ]
η = sqrt( ( ||Y_U^c||_F² ||∇F(Y_U^c)||_F² − (Tr[∇F(Y_U^c) Y_U^c])² ) / ( w_2 ||Y_U^c||_F² − w_1² ) )
Step 6.3.6: let w_1 = α w_1 and w_2 = α w_2, where α < 1 is a given positive constant;
Step 6.3.7: when F(Y_U^c + d) > F(Y_U^c), let Y_U^c = Y_U^c + d; otherwise jump to step 6.3.5, continuing to cycle through steps 6.3.5 to 6.3.7;
Step 6.3.8: when the increase of F(Y_U^c) is smaller than δ, output Y_U^c as the new Y_U; otherwise jump to step 6.3.4, continuing to cycle through steps 6.3.4 to 6.3.8.
The basic principle of the invention is that a strong dependence exists between the feature space and the semantic space of images; on the basis of a quantitative estimate of this dependence, and with the semantic classes of the annotated images as constraints, the estimate is stepped up to its maximum by an iterative technique, thereby yielding all semantic classes of the images to be annotated.
Compared with the prior art, the invention has the following clear advantages and beneficial effects:
First, because it adopts spatial dependence as its theoretical foundation, it is a new technique for solving the image multi-semantic annotation problem. Second, it is a semi-supervised annotation method: it can learn from large numbers of cheaply obtained unannotated images, and therefore often achieves higher annotation accuracy than the prior art, with especially clear gains when annotated images are scarce. Finally, the invention completes image annotation by an iterative technique based on the feasible-direction method, achieving computing speed comparable to the prior art.
Brief description of the drawings
Fig. 1 is the structural block diagram of the embodiment of the invention.
Fig. 2 is the flow chart by which the embodiment of the invention obtains the confidence value of each sample through iteration.
Fig. 3 is the flow chart by which the embodiment of the invention solves the sub-optimisation problem within the iterative process.
Fig. 4 is the ROC-curve comparison chart for the embodiment of the invention.
Embodiment
An embodiment of the invention is deployed according to Fig. 1 and comprises the following concrete steps:
Step 1: input 200 digital images with known semantics and the remaining 1800 digital images requiring semantic annotation into a computer, covering the 5 classes desert, mountain, sea, sunset and trees; convert all images to RGB format and normalise all images to 512 × 512. All the images here come from the image database published by the machine learning and data mining research institute of Nanjing University and can be downloaded from http://lamda.nju.edu.cn/data_MIMLimage.ashx;
Step 2: extract the global texture features of the images with the Gist descriptor: convert each image to grayscale, apply Gabor filtering at 4 scales and 8 orientations, and partition each filtered image into 4 × 4 blocks, obtaining a 512-dimensional Gist feature column vector for each image. Assemble these feature vectors into the set X = [x_1, x_2, ..., x_v, x_{v+1}, ..., x_{v+u}], where v = 200, u = 1800, x_i (1 ≤ i ≤ v) corresponds to an annotated image and the remaining columns correspond to images to be annotated;
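A simplified, non-authoritative sketch of such a Gist-style descriptor follows; the log-Gabor-like transfer functions, centre frequencies and bandwidths are assumptions, but the 4 scales × 8 orientations × 4 × 4 blocks layout reproduces the 512-dimensional output described above.

```python
import numpy as np

def gist_descriptor(img, n_scales=4, n_orients=8, n_blocks=4):
    """Gist-style descriptor sketch: filter a grayscale image with a bank
    of Gabor-like filters in the frequency domain, then average each
    response magnitude over an n_blocks x n_blocks grid.
    Defaults give 4 * 8 * 16 = 512 numbers, matching step 2.
    Filter parameters below are hypothetical, not the patent's."""
    h, w = img.shape
    fy, fx = np.meshgrid(np.fft.fftfreq(h), np.fft.fftfreq(w), indexing="ij")
    rad = np.hypot(fx, fy)            # radial frequency
    ang = np.arctan2(fy, fx)          # orientation of each frequency bin
    F = np.fft.fft2(img)
    feats = []
    for s in range(n_scales):
        f0 = 0.25 / (2 ** s)          # assumed centre frequency per scale
        for o in range(n_orients):
            th = np.pi * o / n_orients
            # Radial and angular Gaussian windows (assumed bandwidths).
            gauss = np.exp(-((rad - f0) ** 2) / (2 * (0.5 * f0) ** 2))
            diff = (ang - th + np.pi) % (2 * np.pi) - np.pi
            orient = np.exp(-(diff ** 2) / 0.5)
            resp = np.abs(np.fft.ifft2(F * gauss * orient))
            # Average the response over the block grid.
            for rows in np.array_split(resp, n_blocks, axis=0):
                for blk in np.array_split(rows, n_blocks, axis=1):
                    feats.append(blk.mean())
    return np.asarray(feats)

vec = gist_descriptor(np.random.default_rng(0).random((32, 32)))
```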
Step 3: let m = 5 be the total number of semantic classes; construct the initial label vector y_i^0 of each annotated image x_i (1 ≤ i ≤ v), an m-dimensional vector whose entries indicate which semantic classes x_i possesses. Let the m-dimensional vector y_i represent the final label vector of image x_i (1 ≤ i ≤ u+v), and construct the label-vector set Y = [y_1, y_2, ..., y_v, y_{v+1}, ..., y_{v+u}];
Step 4: select the kernel function k(x_i, x_j) on the feature-vector set X to be the radial-basis kernel, and compute the Gram matrix of X through this kernel function, denoted K;
Step 5: use the spatial-dependence measure to obtain the measurement of the degree of dependence between the feature-vector set and the label-vector set as follows:
Q(Y) = Tr[Y H K H Yᵀ] / Tr[Y H Yᵀ]
where Tr[·] denotes the trace, H = I − (1/n)eeᵀ, I is the identity matrix, e is the n-dimensional vector whose elements are all 1, and n = v + u is the total number of image samples;
Step 6: ensuring that Y satisfies the conditions that Y_V remains equal to the known labels and ||Y_U||_F² ≤ τ, update the value of Y_U by an iterative technique, stepping Q(Y) up to its maximum, so as to obtain the confidence values Y_U with which the images to be annotated belong to each semantic class. Here Y = [Y_V, Y_U], where Y_V and Y_U are the first v and the last u columns of Y respectively, corresponding to the semantically known and unknown parts; ||·||_F is the Frobenius norm, and τ is set in advance to 0.1, to prevent an oversized Y_U from diminishing the contribution of Y_V to the dependence degree. Fig. 2 is the flow chart of step 6, which comprises the following steps:
Step 6.1: write A = HKH and partition A and H into four blocks according to the annotated and to-be-annotated parts, where A_V and H_V correspond to the annotated part of the images and A_U and H_U to the part to be annotated; let f(Y_U) = Tr[Y A Yᵀ] and g(Y_U) = Tr[Y H Yᵀ], so that the trace ratio Q(Y) is converted into the function f(Y_U)/g(Y_U) of Y_U.
Step 6.2: set the threshold κ = 0.001; randomly initialise Y_U^0 so that ||Y_U^0||_F² ≤ τ; let λ_b = f(Y_U^0)/g(Y_U^0).
Step 6.3: let F(Y_U) = f(Y_U) − λ_b g(Y_U) and solve the optimisation subproblem; the flow chart is shown in Fig. 3, and the concrete steps are as follows:
Step 6.3.1: specify the threshold δ = 0.001, a very small number, and let
M = A_U − λ_b H_U
N = 2 Y_V (A_VU − λ_b H_VU)
Construct F(Y_U) = Tr[Y_U M Y_Uᵀ] + Tr[N Y_Uᵀ]; the equivalent problem to solve is to maximise F(Y_U) subject to ||Y_U||_F² ≤ τ;
Step 6.3.2: let Y_U* = −(1/2) N M⁻¹ be the stationary point of F; if ||Y_U*||_F² ≤ τ and Y_U* maximises F, then Y_U* is the optimal solution and is output as the new Y_U; otherwise the optimal solution lies on the boundary, and the method proceeds to the next step;
Step 6.3.3: initialise a point Y_U^c as the new starting point: if ||Y_U^0||_F² = τ, let Y_U^c = Y_U^0; otherwise randomly initialise Y_U^c so that it satisfies ||Y_U^c||_F² = τ;
Step 6.3.4: initialise the constant w_2 = 1 and let w_1 = −w_2/2, where w_2 is used to represent the Frobenius norm of the next feasible direction;
Step 6.3.5: compute the feasible direction d at the current point Y_U^c, namely d = ( [∇F(Y_U^c)]ᵀ − ξ Y_U ) / η with ∇F(Y_U^c) = 2 M (Y_U^c)ᵀ + Nᵀ;
Step 6.3.6: let w_1 = α w_1 and w_2 = α w_2, with the given constant α = 0.5;
Step 6.3.7: if F(Y_U^c + d) > F(Y_U^c), let Y_U^c = Y_U^c + d; otherwise jump to step 6.3.5, continuing to cycle through steps 6.3.5 to 6.3.7;
Step 6.3.8: when the increase of F(Y_U^c) is smaller than δ, output Y_U^c as the new Y_U; otherwise jump to step 6.3.4, continuing to cycle through steps 6.3.4 to 6.3.8;
Step 6.4: let λ_a = λ_b and λ_b = f(Y_U)/g(Y_U).
Step 6.5: when λ_b − λ_a < κ, output Y_U; in each column Y_U(:, j) (j = 1, ..., u), the i-th number Y_U(i, j) represents the confidence that the j-th sample belongs to the i-th class. Otherwise jump to step 6.3, continuing to cycle through steps 6.3 to 6.5;
Step 7: for each image x_j to be annotated (v+1 ≤ j ≤ v+u), set the confidence threshold ε_j of that image to the average of all its semantic confidence values, i.e. ε_j = (1/m) Σ_i Y_U(i, j). For any image x_j to be annotated (v+1 ≤ j ≤ v+u) and any given semantic class i (1 ≤ i ≤ m), if Y_U(i, j) > ε_j, the sample is judged to have the i-th semantic class; otherwise it is judged not to have the i-th semantic class.
The ROC curves (receiver operating characteristic curves) of the embodiment of the invention and of two classic annotation methods, MLKNN (Zhang M. L. et al., "A k-nearest neighbor based algorithm for multi-label classification") and Binary Relevance (Boutell M. R. et al., "Learning multi-label scene classification"), all with only 200 semantically annotated images, are shown in Fig. 4. In Fig. 4, the embodiment of the invention achieves the best AUC (area under the ROC curve) in all five categories, desert, mountain, sea, sunset and trees, which fully demonstrates that the invention has a good multi-semantic annotation effect.
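The AUC figure of merit used in this comparison can be computed without plotting the curve, via the rank-sum identity: AUC equals the probability that a randomly chosen positive sample outscores a randomly chosen negative one. A small sketch with toy scores:

```python
import numpy as np

def roc_auc(scores, truth):
    """Area under the ROC curve via the rank-sum identity:
    the fraction of (positive, negative) pairs in which the
    positive outscores the negative, with ties counted as half."""
    scores = np.asarray(scores, dtype=float)
    truth = np.asarray(truth, dtype=bool)
    pos, neg = scores[truth], scores[~truth]
    wins = (pos[:, None] > neg[None, :]).sum() \
        + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

# Perfectly ranked toy scores for one semantic class.
auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 1, 0, 0])
```

Applied per class to one row of Y_U against the ground-truth labels, this yields the per-category AUC values compared in Fig. 4.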
Finally, it should be noted that the above embodiment merely illustrates, and does not limit, the technical scheme described in the invention. Therefore, although this specification has described the invention in detail with reference to the above embodiment, those skilled in the art should appreciate that the invention may still be modified or equivalently substituted; all technical schemes and improvements that do not depart from the spirit and scope of the invention shall be covered by the scope of the claims of the invention.
Claims (3)
1. A digital-image multi-semantic annotation method based on spatial-dependence measurement, characterised by comprising the following steps in order:
Step 1: input some digital images with known semantics and all digital images requiring semantic annotation into a computer; convert all images to RGB format and normalise their sizes;
Step 2: extract the global texture features of the images with the Gist descriptor, converting all the digital images above into vectors, one column vector per image, and assemble these feature vectors into the set X = [x_1, x_2, ..., x_v, x_{v+1}, ..., x_{v+u}], where x_i (1 ≤ i ≤ v) corresponds to an annotated image and the remaining columns correspond to images to be annotated;
Step 3: let m be the total number of possible semantic classes; construct the initial label vector y_i^0 of each annotated image x_i (1 ≤ i ≤ v), an m-dimensional vector whose entries indicate which semantic classes x_i possesses; let the m-dimensional vector y_i represent the final label vector of image x_i (1 ≤ i ≤ u+v), and construct the label-vector set Y = [y_1, y_2, ..., y_v, y_{v+1}, ..., y_{v+u}];
Step 4: select a kernel function k(x_i, x_j) on the feature-vector set X and compute the Gram matrix of X through the kernel function, denoted K;
Step 5: use the spatial-dependence measure to obtain the measurement of the degree of dependence between the feature-vector set and the label-vector set as follows:
Q(Y) = Tr[Y H K H Yᵀ] / Tr[Y H Yᵀ]
where Tr[·] denotes the trace, H = I − (1/n)eeᵀ, I is the identity matrix, e is the n-dimensional vector whose elements are all 1, and n = v + u is the total number of image samples;
Step 6: ensuring that Y satisfies the conditions that Y_V remains equal to the known labels and ||Y_U||_F² ≤ τ, update the value of Y_U by an iterative technique, stepping Q(Y) up to its maximum, so as to obtain the confidence values Y_U with which the images to be annotated belong to each semantic class; here Y = [Y_V, Y_U], where Y_V and Y_U are the first v and the last u columns of Y respectively, corresponding to the semantically known and unknown parts; ||·||_F is the Frobenius norm, and τ > 0 is a constant given in advance, used to prevent an oversized Y_U from diminishing the contribution of Y_V to the dependence degree; the specific steps include:
Step 6.1: write A = HKH and partition A and H into four blocks according to the annotated and to-be-annotated parts:
A = [ A_V  A_VU ; A_UV  A_U ],   H = [ H_V  H_VU ; H_UV  H_U ]
where A_V and H_V correspond to the annotated part of the images, and A_U and H_U to the part to be annotated, with A_UV = A_VUᵀ and H_UV = H_VUᵀ; let
f(Y_U) = Tr[Y A Yᵀ] = Tr[ Y_V A_V Y_Vᵀ + 2 Y_V A_VU Y_Uᵀ + Y_U A_U Y_Uᵀ ]
g(Y_U) = Tr[Y H Yᵀ] = Tr[ Y_V H_V Y_Vᵀ + 2 Y_V H_VU Y_Uᵀ + Y_U H_U Y_Uᵀ ]
so that the trace ratio Q(Y) is converted into the function f(Y_U)/g(Y_U) of Y_U;
Step 6.2: set a constant threshold κ > 0; randomly initialise Y_U^0 so that ||Y_U^0||_F² ≤ τ; let λ_b = f(Y_U^0)/g(Y_U^0);
Step 6.3: let F(Y_U) = f(Y_U) − λ_b g(Y_U) and solve for a new Y_U;
Step 6.4: let λ_a = λ_b and λ_b = f(Y_U)/g(Y_U);
Step 6.5: when λ_b − λ_a < κ, output Y_U; in each column Y_U(:, j) (j = 1, ..., u), the i-th number Y_U(i, j) represents the confidence that the j-th sample belongs to the i-th class; otherwise jump to step 6.3, continuing to cycle through steps 6.3 to 6.5;
Step 7: for each image x_j to be annotated (v+1 ≤ j ≤ v+u), set the confidence threshold ε_j of that image to the average of all its semantic confidence values, i.e.:
ε_j = ( Σ_i Y_U(i, j) ) / m
for any image x_j to be annotated (v+1 ≤ j ≤ v+u) and any given semantic class i (1 ≤ i ≤ m), if Y_U(i, j) > ε_j, the sample is judged to have the i-th semantic class; otherwise it is judged not to have the i-th semantic class.
2. The digital-image multi-semantic annotation method based on spatial-dependence measurement according to claim 1, characterised in that the kernel function in step 4 includes the radial-basis kernel, the linear kernel, the polynomial kernel and the sigmoid kernel.
3. The digital-image multi-semantic annotation method based on spatial-dependence measurement according to claim 1, characterised in that the specific steps of step 6.3 include:
Step 6.3.1: specify a constant threshold δ > 0, and let
M = A_U − λ_b H_U
N = 2 Y_V (A_VU − λ_b H_VU)
Construct F(Y_U); the equivalent optimisation problem to solve is as follows:
max_{Y_U} F(Y_U) = Tr[ Y_U M Y_Uᵀ ] + Tr[ N Y_Uᵀ ]   s.t.  ||Y_U||_F² ≤ τ
Step 6.3.2: let Y_U* = −(1/2) N M⁻¹ be the stationary point of F; if ||Y_U*||_F² ≤ τ and Y_U* maximises F, then Y_U* is the optimal solution and is output as the new Y_U; otherwise the optimal solution lies on the boundary, and the method proceeds to the next step;
Step 6.3.3: initialise a point Y_U^c as the new starting point: if ||Y_U^0||_F² = τ, let Y_U^c = Y_U^0; otherwise randomly initialise Y_U^c so that it satisfies ||Y_U^c||_F² = τ;
Step 6.3.4: initialise w_2 as a given constant and let w_1 = −w_2/2, where w_2 is used to represent the Frobenius norm of the next feasible direction;
Step 6.3.5: compute the feasible direction d at the current point Y_U^c as follows:
d = ( [∇F(Y_U^c)]ᵀ − ξ Y_U ) / η
where
∇F(Y_U^c) = 2 M (Y_U^c)ᵀ + Nᵀ
ξ = ( Tr[ ∇F(Y_U^c) Y_U^c ] − w_1 η ) / Tr[ Y_U^c (Y_U^c)ᵀ ]
η = sqrt( ( ||Y_U^c||_F² ||∇F(Y_U^c)||_F² − ( Tr[ ∇F(Y_U^c) Y_U^c ] )² ) / ( w_2 ||Y_U^c||_F² − w_1² ) )
Step 6.3.6: set w1 = αw1 and w2 = αw2, where α < 1 is a given positive constant;
Step 6.3.7: when … holds, let …; otherwise jump back to Step 6.3.5 and repeat Steps 6.3.5 through 6.3.7;
Step 6.3.8: when … holds, output … as the new …; otherwise jump back to Step 6.3.4 and repeat Steps 6.3.4 through 6.3.8.
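The feasible-direction computation of Steps 6.3.4–6.3.5 can be sketched numerically as follows. This is a minimal illustration, not the patented implementation: the function name `feasible_direction` and the matrix shapes of `Y`, `M`, and `N` are assumptions, since the inline symbols are only partially recoverable from the source. It evaluates the gradient, η, ξ, and d from the formulas above.

```python
import numpy as np

def feasible_direction(Y, M, N, w1, w2):
    """Compute the feasible direction d of Step 6.3.5.

    Assumed shapes: Y is k x n, M is n x n, N is k x n, so that
    the gradient 2*M*Y^T + N^T is n x k and d has the shape of Y.
    """
    G = 2.0 * M @ Y.T + N.T                # gradient: 2 M (Y_U^c)^T + N^T
    t = np.trace(G @ Y)                    # Tr[grad(Y) Y]
    a = np.linalg.norm(Y, 'fro') ** 2      # ||Y||_F^2  (equals Tr[Y Y^T])
    g = np.linalg.norm(G, 'fro') ** 2      # ||grad(Y)||_F^2
    # eta from Step 6.3.5: Cauchy-Schwarz guarantees a*g - t^2 >= 0;
    # w2*a - w1^2 must be positive for the square root to exist.
    eta = np.sqrt((a * g - t ** 2) / (w2 * a - w1 ** 2))
    xi = (t - w1 * eta) / a                # (Tr[grad(Y) Y] - w1*eta) / Tr[Y Y^T]
    d = (G.T - xi * Y) / eta               # ([grad(Y)]^T - xi*Y) / eta
    return d
```

With w1 = -w2/2 as in Step 6.3.4, one can check algebraically that this construction yields Tr[d Yᵀ] = w1 and ‖d‖_F² = w2, i.e. the step length is fixed while its component along Y is controlled, which is consistent with w2 governing the Frobenius norm of the next feasible direction.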
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410599268.1A CN104346456B (en) | 2014-10-31 | 2014-10-31 | The digital picture multi-semantic meaning mask method measured based on spatial dependence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104346456A CN104346456A (en) | 2015-02-11 |
CN104346456B true CN104346456B (en) | 2017-09-08 |
Family
ID=52502047
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410599268.1A Expired - Fee Related CN104346456B (en) | 2014-10-31 | 2014-10-31 | The digital picture multi-semantic meaning mask method measured based on spatial dependence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104346456B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105701509B (en) * | 2016-01-13 | 2019-03-12 | 清华大学 | A kind of image classification method based on across classification migration Active Learning |
CN107391599B (en) * | 2017-06-30 | 2021-01-12 | 中原智慧城市设计研究院有限公司 | Image retrieval method based on style characteristics |
CN109190060B (en) * | 2018-07-10 | 2021-05-14 | 天津大学 | Service annotation quality optimization method based on effective human-computer interaction |
CN111428733B (en) * | 2020-03-12 | 2023-05-23 | 山东大学 | Zero sample target detection method and system based on semantic feature space conversion |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7814040B1 (en) * | 2006-01-31 | 2010-10-12 | The Research Foundation Of State University Of New York | System and method for image annotation and multi-modal image retrieval using probabilistic semantic models |
CN103336969A (en) * | 2013-05-31 | 2013-10-02 | 中国科学院自动化研究所 | Image meaning parsing method based on soft glance learning |
CN103605667A (en) * | 2013-10-28 | 2014-02-26 | 中国计量学院 | Automatic image annotation algorithm |
CN103955462A (en) * | 2014-03-21 | 2014-07-30 | 南京邮电大学 | Image marking method based on multi-view and semi-supervised learning mechanism |
Non-Patent Citations (2)
Title |
---|
Trace Ratio vs. Ratio Trace for Dimensionality Reduction; Huan Wang et al.; Computer Vision & Pattern Recognition; 2007-07-16; pp. 1-8 * |
Multi-label semi-supervised learning from Hilbert-Schmidt independence; Zhang Chenguang et al.; China Sciencepaper; 2013-10; Vol. 8, No. 10; pp. 998-1002 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105184303B (en) | A kind of image labeling method based on multi-modal deep learning | |
CN101916376B (en) | Local spline embedding-based orthogonal semi-monitoring subspace image classification method | |
Wang et al. | Remote sensing image retrieval by scene semantic matching | |
US20210089827A1 (en) | Feature representation device, feature representation method, and program | |
CN105117429A (en) | Scenario image annotation method based on active learning and multi-label multi-instance learning | |
CN105808752B (en) | A kind of automatic image marking method based on CCA and 2PKNN | |
CN109063112B (en) | Rapid image retrieval method, model and model construction method based on multitask learning deep semantic hash | |
Bui et al. | Scalable sketch-based image retrieval using color gradient features | |
CN111125411B (en) | Large-scale image retrieval method for deep strong correlation hash learning | |
CN104112018B (en) | A kind of large-scale image search method | |
CN107943856A (en) | A kind of file classification method and system based on expansion marker samples | |
EP3166020A1 (en) | Method and apparatus for image classification based on dictionary learning | |
CN101710334A (en) | Large-scale image library retrieving method based on image Hash | |
CN104346456B (en) | The digital picture multi-semantic meaning mask method measured based on spatial dependence | |
CN104834693A (en) | Depth-search-based visual image searching method and system thereof | |
CN114299362A (en) | Small sample image classification method based on k-means clustering | |
CN104036021A (en) | Method for semantically annotating images on basis of hybrid generative and discriminative learning models | |
CN115439715A (en) | Semi-supervised few-sample image classification learning method and system based on anti-label learning | |
CN114510594A (en) | Traditional pattern subgraph retrieval method based on self-attention mechanism | |
Lu et al. | Image categorization via robust pLSA | |
Dimitrovski et al. | Detection of visual concepts and annotation of images using ensembles of trees for hierarchical multi-label classification | |
CN108549915A (en) | Image hash code training pattern algorithm based on two-value weight and classification learning method | |
CN111914108A (en) | Discrete supervision cross-modal Hash retrieval method based on semantic preservation | |
Xiong et al. | A confounder-free fusion network for aerial image scene feature representation | |
CN114219047B (en) | Heterogeneous domain self-adaption method, device and equipment based on pseudo label screening |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20170908 Termination date: 20211031 |