CN104346456B - Multi-semantic digital image annotation method based on spatial dependence measurement - Google Patents

Multi-semantic digital image annotation method based on spatial dependence measurement Download PDF

Info

Publication number
CN104346456B
CN104346456B (application CN201410599268.1A)
Authority
CN
China
Prior art keywords
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410599268.1A
Other languages
Chinese (zh)
Other versions
CN104346456A (en)
Inventor
张晨光
张燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hainan University
Original Assignee
Hainan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hainan University
Priority to CN201410599268.1A
Publication of CN104346456A
Application granted
Publication of CN104346456B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; database structures therefor; file system structures therefor
    • G06F16/50: Information retrieval of still image data
    • G06F16/58: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583: Retrieval using metadata automatically derived from the content
    • G06F16/5862: Retrieval using texture
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00: Character recognition; recognising digital ink; document-oriented image-based pattern recognition
    • G06V30/10: Character recognition
    • G06V30/26: Techniques for post-processing, e.g. correcting the recognition result
    • G06V30/262: Post-processing using context analysis, e.g. lexical, syntactic or semantic context
    • G06V30/274: Syntactic or semantic context, e.g. balancing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
  • Image Analysis (AREA)

Abstract

The invention is a multi-semantic annotation method for digital images, characterised by the following steps in order: (1) input some digital images with known semantics, together with all digital images to be annotated, into a computer; (2) extract features to obtain the feature vector set of all images; (3) construct the label vectors of the annotated images and the final label vectors of all images; (4) compute the Gram matrix of the feature vector set; (5) use a spatial dependence measure to obtain a measurement of the degree of dependence between the feature vector set and the label vector set; (6) iteratively raise the dependence measurement towards its maximum, obtaining the confidence values with which each image to be annotated belongs to each semantic class; (7) set thresholds and decide the semantics of the images to be annotated. The invention has the following advantages: 1) a large number of images without semantic annotation are used to improve the annotation effect; 2) it is applicable to multi-semantic annotation; 3) it runs fast.

Description

Multi-semantic digital image annotation method based on spatial dependence measurement
Technical field
The present invention relates to a semi-supervised multi-semantic annotation method for digital images based on spatial dependence measurement, and belongs to the field of electronic information technology.
Background technology
Image semantic annotation aims to represent the semantic content of an image with semantic keywords, and is of great significance for image analysis, image understanding and image retrieval. Early image semantic annotation required professionals to assign keywords manually according to the semantics of each image, which is time-consuming and subjective. To overcome these drawbacks of manual annotation, researchers have in recent years proposed many methods for automatically annotating image semantics, including generative-model methods such as the translation model and the cross-media relevance model, and discriminative-model methods such as asymmetric support vector machines and hierarchical classification. In essence, these methods can all be regarded as a machine-learning process: a statistical classification model is learned and constructed on a data set of annotated images, and the model is then used to obtain the semantic classes of the images to be annotated.
Although the many automatic semantic annotation techniques proposed so far provide a useful basis and premise for the analysis and understanding of massive image data, a number of bottleneck problems remain to be solved. Among them, two classes of problems have drawn growing attention from researchers: the multi-semantic nature of images, and the scarcity of annotated images. Multi-semantic means that one image usually carries several different semantics; for example, a landscape photograph may simultaneously contain the themes "sky", "white clouds" and "grassland", and a medical image may simultaneously contain information related to diseases such as "tumour" and "calculus". Conventional machine-learning methods, including nearest neighbours, decision trees, neural networks and support vector machines, are single-label learning methods and cannot be applied directly to image semantic annotation in the multi-semantic case. The situation in which a single sample possesses many classes is called multi-label learning in the machine-learning field. At present, solutions to the multi-label learning problem include Binary Relevance, Classifier Chains, MLKNN and Rank-SVM. These methods are obtained from single-label methods by problem transformation or algorithm adaptation, and each has its strengths and weaknesses in practice.
Besides the multi-semantic problem, automatic semantic annotation also suffers from the scarcity of annotated images. The main cause of this problem is that acquiring annotated images generally consumes a great deal of manpower and material resources. In particular, as the number of semantic classes grows, the number of annotated images per class decreases relatively, so the problem is especially acute in the multi-semantic case. Scarcity of annotated samples degrades the generalisation ability of the classification model, and in turn the accuracy of semantic annotation. An effective way to address this is to develop semi-supervised semantic annotation methods. Although semi-supervised learning has made great progress and a variety of methods, including TSVM and graph-based semi-supervised learning, have been proposed, semi-supervised learning methods applicable to the multi-semantic (multi-label) learning problem remain rare.
Aiming at the above two problems, the invention discloses a semi-supervised multi-semantic annotation method for digital images based on spatial dependence measurement. Its theoretical foundation is the spatial dependence measure: using all samples, both annotated and unannotated, the dependence between the feature set and the semantic class set is estimated; the annotated image samples serve as boundary constraints, and an iterative technique gradually raises the estimate to its maximum, thereby obtaining all semantic classes of the images to be annotated. The invention achieves good technical effects. First, because the dependence measure is grounded in statistical theory, its estimation accuracy improves as the number of samples, including unannotated samples, increases; the method is therefore a semi-supervised annotation method that uses unannotated images to raise annotation accuracy. Second, no matter how many semantic classes an image has simultaneously, the invention treats the semantic combination of the image as a single point in the semantic set and maps it into a reproducing kernel Hilbert space, so it is also a multi-semantic image annotation method. Finally, the invention completes image annotation by iteration on the basis of the feasible direction method, achieving a computation speed comparable to the prior art.
Summary of the invention
The object of the invention is to provide an accurate and efficient semi-supervised multi-semantic annotation method for digital images.
The technical scheme of the invention is as follows: receive some digital images with known semantics together with all digital images to be annotated, extract image features to obtain the feature vector set, construct the label vectors of the annotated images and the final label vectors of all images, compute the Gram matrix of the feature vector set, and obtain, from the spatial dependence measure, the confidence values with which each image belongs to each semantic class and finally the image semantics. The method comprises the following steps:
Step 1. Input some digital images with known semantics and all digital images that need semantic annotation into a computer; convert all images to RGB format and normalise their sizes.
Step 2. Extract the global texture features of the images with the Gist descriptor, converting each of the above digital images into a vector, one column vector per image, and assemble these feature vectors into the set X = [x_1, x_2, ..., x_v, x_{v+1}, ..., x_{v+u}], where x_i (1 ≤ i ≤ v) corresponds to the annotated images and the rest to the images to be annotated.
Step 3. Let m be the total number of possible semantic classes of a sample. Construct the initial label vector of each annotated image x_i (1 ≤ i ≤ v), an m-dimensional vector, wherein:
Let the m-dimensional vector y_i denote the final label vector of image x_i (1 ≤ i ≤ u+v), and construct the label vector set Y = [y_1, y_2, ..., y_v, y_{v+1}, ..., y_{v+u}].
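As an illustrative sketch (not part of the patent), the data layout of steps 2 and 3 can be mirrored in numpy. The toy dimensions, the random vectors standing in for real Gist features, and the 0/1 indicator convention for label entries (the source's formula for the initial label vector is lost) are all assumptions of this example:

```python
import numpy as np

# Assumed toy dimensions: v annotated images, u unannotated, d-dim features, m classes.
v, u, d, m = 4, 6, 512, 5
rng = np.random.default_rng(0)

# Feature set X: one column vector per image, annotated columns first
# (random vectors stand in for the 512-dim Gist features of step 2).
X = rng.standard_normal((d, v + u))

# Label set Y: column i is the m-dimensional label vector of image x_i.
# Annotated columns hold 0/1 class indicators; unannotated columns start at 0.
Y = np.zeros((m, v + u))
known_classes = [[0], [1, 2], [4], [0, 3]]   # hypothetical semantics of the v images
for i, classes in enumerate(known_classes):
    Y[classes, i] = 1.0

Y_V, Y_U = Y[:, :v], Y[:, v:]                # known part and part to be estimated
```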
Step 4. Select a kernel function k(x_i, x_j) on the feature vector set X and use it to compute the Gram matrix of X, denoted K.
Step 5. Use the spatial dependence measure to obtain the measurement of the degree of dependence between the feature vector set and the label vector set as follows:
Q(Y) = Tr[Y H K H Y^T] / Tr[Y H Y^T]
where Tr[·] denotes the trace, H = I − (1/n) e e^T, I is the identity matrix, e is the n-dimensional vector whose elements are all 1, and n = v + u is the total number of image samples.
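The measure of step 5 is a one-liner once K and Y are in hand. This minimal numpy sketch (not from the patent) assumes the centering matrix has the standard form H = I − (1/n) e e^T:

```python
import numpy as np

def dependence_measure(K, Y):
    """Trace-ratio dependence Q(Y) = Tr[Y H K H Y^T] / Tr[Y H Y^T],
    with H = I - (1/n) e e^T (assumed centering matrix)."""
    n = K.shape[0]
    H = np.eye(n) - np.full((n, n), 1.0 / n)
    return np.trace(Y @ H @ K @ H @ Y.T) / np.trace(Y @ H @ Y.T)
```

Because H centers each row of Y, Q is unchanged when a constant is added to every label entry, which is one way to sanity-check an implementation.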
Step 6. Keeping Y_V fixed to the known label vectors and subject to the constraint ||Y_U||_F^2 ≤ τ, update the value of Y_U with an iterative technique, gradually raising Q(Y) to its maximum, thereby obtaining the confidence values Y_U with which the images to be annotated belong to each semantic class. Here Y = [Y_V, Y_U], where Y_V and Y_U are the first v and last u columns of Y respectively, corresponding to the semantically known and unknown parts; || · ||_F is the Frobenius norm; and τ > 0 is a small constant given in advance, which prevents the scale of Y_U from growing so large that it drowns out the contribution of Y_V to the dependence measurement.
Step 7. For any image x_j to be annotated (v+1 ≤ j ≤ v+u), set the confidence threshold ε_j of that image to the average of all its semantic confidence values, i.e.:
ε_j = (Σ_i Y_U(i, j)) / m
For any image x_j to be annotated (v+1 ≤ j ≤ v+u) and any given semantic class i (1 ≤ i ≤ m), if Y_U(i, j) > ε_j the sample is judged to have the i-th semantic class; otherwise it is judged not to have it.
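The decision rule of step 7 amounts to comparing each column of Y_U against its own mean. A small sketch, under the assumption that Y_U is an m x u numpy array of confidences:

```python
import numpy as np

def annotate(Y_U):
    """Assign class i to image j iff Y_U[i, j] exceeds that image's
    threshold eps_j, the mean of its m confidence values (step 7)."""
    eps = Y_U.mean(axis=0)       # one threshold per column (image)
    return Y_U > eps             # boolean m x u assignment matrix
```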
The kernel function in step 4 may be a radial basis function kernel, a linear kernel, a polynomial kernel or a sigmoid kernel.
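For illustration, the Gram matrix K of step 4 can be built from any of the four kernel families listed; the parameter names gamma, degree and coef0 are this sketch's conventions, not the patent's:

```python
import numpy as np

def gram_matrix(X, kernel="rbf", gamma=1.0, degree=3, coef0=1.0):
    """Gram matrix K[i, j] = k(x_i, x_j) over the columns x_i of X."""
    G = X.T @ X                                   # pairwise inner products
    if kernel == "linear":
        return G
    if kernel == "poly":
        return (gamma * G + coef0) ** degree
    if kernel == "sigmoid":
        return np.tanh(gamma * G + coef0)
    if kernel == "rbf":
        sq = np.sum(X ** 2, axis=0)
        d2 = sq[:, None] + sq[None, :] - 2.0 * G  # squared pairwise distances
        return np.exp(-gamma * np.maximum(d2, 0.0))
    raise ValueError(f"unknown kernel: {kernel}")
```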
The specific sub-steps of step 6 are:
Step 6.1. Write A = HKH, and partition A and H into four blocks according to the annotated and to-be-annotated parts:
A = [ A_V, A_{VU} ; A_{UV}, A_U ],   H = [ H_V, H_{VU} ; H_{UV}, H_U ]
where A_V and H_V correspond to the annotated part of the images, and A_U and H_U to the part to be annotated. Let
f(Y_U) = Tr[Y A Y^T] = Tr[Y_V A_V Y_V^T + 2 Y_V A_{VU} Y_U^T + Y_U A_U Y_U^T]
g(Y_U) = Tr[Y H Y^T] = Tr[Y_V H_V Y_V^T + 2 Y_V H_{VU} Y_U^T + Y_U H_U Y_U^T]
so the trace ratio Q(Y) is converted into the function f(Y_U)/g(Y_U) of Y_U.
Step 6.2. Give a threshold κ > 0, a very small number. Randomly initialise Y_U^a so that it satisfies the constraint, and let λ_b = f(Y_U^a)/g(Y_U^a).
Step 6.3. Let F(Y_U) = f(Y_U) − λ_b g(Y_U), and solve to obtain a new Y_U^b.
Step 6.4. Let λ_a = λ_b and λ_b = f(Y_U^b)/g(Y_U^b).
Step 6.5. When λ_b − λ_a < κ, output Y_U = Y_U^b; the i-th entry Y_U(i, j) of each column Y_U(:, j) (j = 1, ..., u) expresses the confidence with which the j-th sample belongs to the i-th class. Otherwise jump to step 6.3 and continue looping through steps 6.3 to 6.5.
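Steps 6.2 to 6.5 are an instance of the classical scheme for maximising a ratio f/g by repeatedly maximising f − λ·g (a Dinkelbach-style iteration). The sketch below is an illustration only, with the step 6.3 subproblem abstracted into a caller-supplied solver, and shows the outer loop:

```python
def maximize_ratio(f, g, solve_subproblem, Y0, kappa=1e-3, max_iter=100):
    """Outer loop of step 6: raise f(Y)/g(Y) by repeatedly maximising
    F(Y) = f(Y) - lam * g(Y) at the current ratio lam (steps 6.2-6.5)."""
    Y = Y0
    lam = f(Y) / g(Y)                       # step 6.2: initial ratio
    for _ in range(max_iter):
        Y_new = solve_subproblem(lam)       # step 6.3: inner maximisation
        lam_new = f(Y_new) / g(Y_new)       # step 6.4: updated ratio
        if lam_new - lam < kappa:           # step 6.5: stopping test
            return Y_new, lam_new
        Y, lam = Y_new, lam_new
    return Y, lam
```

As a check: with f(y) = y and g(y) = y^2 + 1 on y > 0, the inner problem has the closed form y = 1/(2λ), and the loop converges to the true maximum ratio 1/2.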
The specific sub-steps of step 6.3 are:
Step 6.3.1. Give a threshold δ > 0, a very small number, and let
M = (A_U − λ_b H_U)
N = 2 Y_V (A_{VU} − λ_b H_{VU})
and construct the following optimisation problem, whose solution is the new Y_U^b:
max_{Y_U} F(Y_U) = Tr[Y_U M Y_U^T] + Tr[N Y_U^T]
s.t. ||Y_U||_F^2 ≤ τ
Step 6.3.2. Let Y_U^* be the stationary point of F(Y_U). By the KKT conditions, if Y_U^* is feasible and satisfies those conditions, it is the optimal solution; output Y_U^* as the new Y_U^b. Otherwise the optimal solution lies on the boundary of the constraint; go to the next step.
Step 6.3.3. Initialise a point Y_U^c as the new starting point: if Y_U^b lies on the boundary, let Y_U^c = Y_U^b; otherwise randomly initialise Y_U^c so that it lies on the boundary ||Y_U^c||_F^2 = τ.
Step 6.3.4. Initialise w_2 to a constant and let w_1 = −w_2/2; here w_2 represents the Frobenius norm of the next feasible direction.
Step 6.3.5. Compute a feasible direction d at the current point Y_U^c. The direction d must ensure that the next iteration point Y_U^c + d has the same Frobenius norm as Y_U^c, and must be the direction in which the objective value increases fastest; a feasible direction d meeting these two conditions can be computed as follows:
d = ([∇F(Y_U^c)]^T − ξ Y_U^c) / η
where
∇F(Y_U^c) = 2 M (Y_U^c)^T + N^T
ξ = (Tr[∇F(Y_U^c) Y_U^c] − w_1 η) / Tr[Y_U^c (Y_U^c)^T]
η = sqrt( ( ||Y_U^c||_F^2 ||∇F(Y_U^c)||_F^2 − (Tr[∇F(Y_U^c) Y_U^c])^2 ) / ( w_2 ||Y_U^c||_F^2 − w_1^2 ) )
Step 6.3.6. Let w_1 = α w_1 and w_2 = α w_2, where α < 1 is a given positive constant.
Step 6.3.7. When the step along d increases the objective value, let Y_U^c = Y_U^c + d; otherwise jump to step 6.3.5 and continue looping through steps 6.3.5 to 6.3.7.
Step 6.3.8. When the change in the objective value falls below δ, output Y_U^c as the new Y_U^b; otherwise jump to step 6.3.4 and continue looping through steps 6.3.4 to 6.3.8.
The basic principle of the invention is that a strong dependence exists between the feature space and the semantic space of images. On the basis of a quantitative estimate of this dependence, the semantic classes of the annotated images serve as constraints, and an iterative technique gradually raises the estimate to its maximum, thereby obtaining all semantic classes of the images to be annotated.
Compared with the prior art, the invention has the following obvious advantages and beneficial effects:
First, by adopting the spatial dependence measure as its theoretical foundation, the invention is a new technique for solving the multi-semantic image annotation problem. Second, the invention is a semi-supervised annotation method: it can learn from the large number of unannotated images that are cheap and easy to obtain, and can therefore often reach a higher annotation accuracy than the prior art, with an especially marked improvement when annotated images are scarce. Finally, the invention completes image annotation by an iterative technique on the basis of the feasible direction method, achieving a computation speed comparable to the prior art.
Brief description of the drawings
Fig. 1 is the structural block diagram of the embodiment of the invention.
Fig. 2 is the flow chart by which the embodiment obtains the confidence values of each sample through iteration.
Fig. 3 is the flow chart by which the embodiment solves the optimisation subproblem within the iterative process.
Fig. 4 is the ROC curve comparison chart of the embodiment.
Embodiment
An embodiment of the invention is deployed according to Fig. 1 and comprises the following specific steps:
Step 1. Input 200 digital images with known semantics and the remaining 1800 digital images that need semantic annotation into a computer; the images cover 5 classes: desert, mountains, sea, sunset and trees. Convert all images to RGB format and normalise their sizes to 512 × 512. All the images here come from the image database published by the machine learning and data mining institute of Nanjing University and can be downloaded from http://lamda.nju.edu.cn/data_MIMLimage.ashx.
Step 2. Extract the global texture features of the images with the Gist descriptor: convert each image to a grey-scale map, apply Gabor filtering at 4 scales and 8 orientations, and partition each filtered image into 4 × 4 blocks, obtaining a 512-dimensional Gist feature column vector for each image. Assemble these feature vectors into the set X = [x_1, x_2, ..., x_v, x_{v+1}, ..., x_{v+u}], where v = 200, u = 1800, x_i (1 ≤ i ≤ v) corresponds to the annotated images and the rest to the images to be annotated.
Step 3. Let m = 5 be the total number of semantic classes. Construct the initial label vector of each annotated image x_i (1 ≤ i ≤ v), an m-dimensional vector, wherein:
Let the m-dimensional vector y_i denote the final label vector of image x_i (1 ≤ i ≤ u+v), and construct the label vector set Y = [y_1, y_2, ..., y_v, y_{v+1}, ..., y_{v+u}].
Step 4. Select the kernel function k(x_i, x_j) on the feature vector set X to be a radial basis function kernel, and use it to compute the Gram matrix of X, denoted K.
Step 5. Use the spatial dependence measure to obtain the measurement of the degree of dependence between the feature vector set and the label vector set as follows:
Q(Y) = Tr[Y H K H Y^T] / Tr[Y H Y^T]
where Tr[·] denotes the trace, H = I − (1/n) e e^T, I is the identity matrix, e is the n-dimensional vector whose elements are all 1, and n = v + u is the total number of image samples.
Step 6. Keeping Y_V fixed to the known label vectors and subject to the constraint ||Y_U||_F^2 ≤ τ, update the value of Y_U with an iterative technique, gradually raising Q(Y) to its maximum, thereby obtaining the confidence values Y_U with which the images to be annotated belong to each semantic class. Here Y = [Y_V, Y_U], where Y_V and Y_U are the first v and last u columns of Y respectively, corresponding to the semantically known and unknown parts; || · ||_F is the Frobenius norm; and τ is set in advance to 0.1, which prevents the scale of Y_U from growing so large that it drowns out the contribution of Y_V to the dependence measurement. Fig. 2 is the flow chart of step 6, which comprises the following specific steps:
Step 6.1. Write A = HKH, and partition A and H into four blocks according to the annotated and to-be-annotated parts:
A = [ A_V, A_{VU} ; A_{UV}, A_U ],   H = [ H_V, H_{VU} ; H_{UV}, H_U ]
where A_V and H_V correspond to the annotated part of the images, and A_U and H_U to the part to be annotated. Let
f(Y_U) = Tr[Y A Y^T] = Tr[Y_V A_V Y_V^T + 2 Y_V A_{VU} Y_U^T + Y_U A_U Y_U^T]
g(Y_U) = Tr[Y H Y^T] = Tr[Y_V H_V Y_V^T + 2 Y_V H_{VU} Y_U^T + Y_U H_U Y_U^T]
so the trace ratio Q(Y) is converted into the function f(Y_U)/g(Y_U) of Y_U.
Step 6.2. Give the threshold κ = 0.001. Randomly initialise Y_U^a so that it satisfies the constraint, and let λ_b = f(Y_U^a)/g(Y_U^a).
Step 6.3. Let F(Y_U) = f(Y_U) − λ_b g(Y_U) and solve the optimisation subproblem; its flow chart is shown in Fig. 3, with the following specific steps:
Step 6.3.1. Give the threshold δ = 0.001, a very small number, and let
M = (A_U − λ_b H_U)
N = 2 Y_V (A_{VU} − λ_b H_{VU})
and construct the following optimisation problem, whose solution is the new Y_U^b:
max_{Y_U} F(Y_U) = Tr[Y_U M Y_U^T] + Tr[N Y_U^T]
s.t. ||Y_U||_F^2 ≤ τ
Step 6.3.2. Let Y_U^* be the stationary point of F(Y_U). If Y_U^* is feasible and satisfies the KKT conditions, it is the optimal solution; output Y_U^* as the new Y_U^b. Otherwise the optimal solution lies on the boundary of the constraint; go to the next step.
Step 6.3.3. Initialise a point Y_U^c as the new starting point: if Y_U^b lies on the boundary, let Y_U^c = Y_U^b; otherwise randomly initialise Y_U^c so that it lies on the boundary ||Y_U^c||_F^2 = τ.
Step 6.3.4. Initialise the constant w_2 = 1 and let w_1 = −w_2/2; here w_2 represents the Frobenius norm of the next feasible direction.
Step 6.3.5. Compute the feasible direction d at the current point Y_U^c:
d = ([∇F(Y_U^c)]^T − ξ Y_U^c) / η
where
∇F(Y_U^c) = 2 M (Y_U^c)^T + N^T
ξ = (Tr[∇F(Y_U^c) Y_U^c] − w_1 η) / Tr[Y_U^c (Y_U^c)^T]
η = sqrt( ( ||Y_U^c||_F^2 ||∇F(Y_U^c)||_F^2 − (Tr[∇F(Y_U^c) Y_U^c])^2 ) / ( w_2 ||Y_U^c||_F^2 − w_1^2 ) )
Step 6.3.6. Let w_1 = α w_1 and w_2 = α w_2, with the given constant α = 0.5.
Step 6.3.7. When the step along d increases the objective value, let Y_U^c = Y_U^c + d; otherwise jump to step 6.3.5 and continue looping through steps 6.3.5 to 6.3.7.
Step 6.3.8. When the change in the objective value falls below δ, output Y_U^c as the new Y_U^b; otherwise jump to step 6.3.4 and continue looping through steps 6.3.4 to 6.3.8.
Step 6.4. Let λ_a = λ_b and λ_b = f(Y_U^b)/g(Y_U^b).
Step 6.5. When λ_b − λ_a < κ, output Y_U = Y_U^b; the i-th entry Y_U(i, j) of each column Y_U(:, j) (j = 1, ..., u) expresses the confidence with which the j-th sample belongs to the i-th class. Otherwise jump to step 6.3 and continue looping through steps 6.3 to 6.5.
Step 7. For any image x_j to be annotated (v+1 ≤ j ≤ v+u), set the confidence threshold ε_j of that image to the average of all its semantic confidence values, i.e.:
ε_j = (Σ_i Y_U(i, j)) / m
For any image x_j to be annotated (v+1 ≤ j ≤ v+u) and any given semantic class i (1 ≤ i ≤ m), if Y_U(i, j) > ε_j the sample is judged to have the i-th semantic class; otherwise it is judged not to have it.
Fig. 4 shows the ROC curves (receiver operating characteristic curves) of the embodiment of the invention and of the two classical annotation methods MLKNN (Zhang M-L, et al., "A k-nearest neighbor based algorithm for multi-label classification") and Binary Relevance (Boutell M R, et al., "Learning multi-label scene classification") when only 200 semantically annotated images are available. In Fig. 4, the embodiment of the invention achieves the best AUC (area under the ROC curve) in all five classes, desert, mountains, sea, sunset and trees, which fully demonstrates that the invention has a good multi-semantic annotation effect.
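For readers reproducing the comparison, per-class AUC can be computed from confidence scores and ground-truth labels with the rank-sum identity. This helper is independent of the patent and, for brevity, does not average ranks over tied scores:

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank-sum identity."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)  # rank 1 = lowest score
    pos = np.asarray(labels).astype(bool)
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```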
Finally it should be noted that the above embodiment only illustrates, and does not limit, the technical scheme described in the invention. Therefore, although this specification has described the invention in detail with reference to the above embodiment, those skilled in the art should appreciate that the invention can still be modified or equivalently substituted; all technical schemes and improvements that do not depart from the spirit and scope of the invention shall be covered by the scope of the claims of the invention.

Claims (3)

1. A multi-semantic annotation method for digital images based on spatial dependence measurement, characterised in that it comprises the following steps in order:
Step 1. Input some digital images with known semantics and all digital images that need semantic annotation into a computer; convert all images to RGB format and normalise their sizes.
Step 2. Extract the global texture features of the images with the Gist descriptor, converting each of the above digital images into a vector, one column vector per image, and assemble these feature vectors into the set X = [x_1, x_2, ..., x_v, x_{v+1}, ..., x_{v+u}], where x_i (1 ≤ i ≤ v) corresponds to the annotated images and the rest to the images to be annotated.
Step 3. Let m be the total number of possible semantic classes of a sample. Construct the initial label vector of each annotated image x_i (1 ≤ i ≤ v), an m-dimensional vector, wherein:
Let the m-dimensional vector y_i denote the final label vector of image x_i (1 ≤ i ≤ u+v), and construct the label vector set Y = [y_1, y_2, ..., y_v, y_{v+1}, ..., y_{v+u}].
Step 4. Select a kernel function k(x_i, x_j) on the feature vector set X and use it to compute the Gram matrix of X, denoted K.
Step 5. Use the spatial dependence measure to obtain the measurement of the degree of dependence between the feature vector set and the label vector set as follows:
Q(Y) = Tr[Y H K H Y^T] / Tr[Y H Y^T]
where Tr[·] denotes the trace, H = I − (1/n) e e^T, I is the identity matrix, e is the n-dimensional vector whose elements are all 1, and n = v + u is the total number of image samples.
Step 6. Keeping Y_V fixed to the known label vectors and subject to the constraint ||Y_U||_F^2 ≤ τ, update the value of Y_U with an iterative technique, gradually raising Q(Y) to its maximum, thereby obtaining the confidence values Y_U with which the images to be annotated belong to each semantic class. Here Y = [Y_V, Y_U], where Y_V and Y_U are the first v and last u columns of Y respectively, corresponding to the semantically known and unknown parts; || · ||_F is the Frobenius norm; and τ > 0 is a constant given in advance, which prevents the scale of Y_U from growing so large that it drowns out the contribution of Y_V to the dependence measurement. The specific sub-steps are:
Step 6.1. Write A = HKH, and partition A and H into four blocks according to the annotated and to-be-annotated parts:
A = [ A_V, A_{VU} ; A_{UV}, A_U ],   H = [ H_V, H_{VU} ; H_{UV}, H_U ]
where A_V and H_V correspond to the annotated part of the images, and A_U and H_U to the part to be annotated. Let
f(Y_U) = Tr[Y A Y^T] = Tr[Y_V A_V Y_V^T + 2 Y_V A_{VU} Y_U^T + Y_U A_U Y_U^T]
g(Y_U) = Tr[Y H Y^T] = Tr[Y_V H_V Y_V^T + 2 Y_V H_{VU} Y_U^T + Y_U H_U Y_U^T]
so the trace ratio Q(Y) is converted into the function f(Y_U)/g(Y_U) of Y_U.
Step 6.2. Give a threshold constant κ > 0. Randomly initialise Y_U^a so that it satisfies the constraint, and let λ_b = f(Y_U^a)/g(Y_U^a).
Step 6.3. Let F(Y_U) = f(Y_U) − λ_b g(Y_U), and solve to obtain a new Y_U^b.
Step 6.4. Let λ_a = λ_b and λ_b = f(Y_U^b)/g(Y_U^b).
Step 6.5. When λ_b − λ_a < κ, output Y_U = Y_U^b; the i-th entry Y_U(i, j) of each column Y_U(:, j) (j = 1, ..., u) expresses the confidence with which the j-th sample belongs to the i-th class. Otherwise jump to step 6.3 and continue looping through steps 6.3 to 6.5.
Step 7. For any image x_j to be annotated (v+1 ≤ j ≤ v+u), set the confidence threshold ε_j of that image to the average of all its semantic confidence values, i.e.:
ε_j = (Σ_i Y_U(i, j)) / m
For any image x_j to be annotated (v+1 ≤ j ≤ v+u) and any given semantic class i (1 ≤ i ≤ m), if Y_U(i, j) > ε_j the sample is judged to have the i-th semantic class; otherwise it is judged not to have it.
2. The multi-semantic annotation method for digital images based on spatial dependence measurement according to claim 1, characterised in that the kernel function in step 4 includes a radial basis function kernel, a linear kernel, a polynomial kernel and a sigmoid kernel.
3. The multi-semantic annotation method for digital images based on spatial dependence measurement according to claim 1, characterised in that the specific sub-steps of step 6.3 are:
Step 6.3.1. Give a threshold constant δ > 0, and let
M = (A_U − λ_b H_U)
N = 2 Y_V (A_{VU} − λ_b H_{VU})
and construct the following optimisation problem, whose solution is the new Y_U^b:
max_{Y_U} F(Y_U) = Tr[Y_U M Y_U^T] + Tr[N Y_U^T]
s.t. ||Y_U||_F^2 ≤ τ
Step 6.3.2. Let Y_U^* be the stationary point of F(Y_U). If Y_U^* is feasible and satisfies the KKT conditions, it is the optimal solution; output Y_U^* as the new Y_U^b. Otherwise the optimal solution lies on the boundary of the constraint; go to the next step.
Step 6.3.3. Initialise a point Y_U^c as the new starting point: if Y_U^b lies on the boundary, let Y_U^c = Y_U^b; otherwise randomly initialise Y_U^c so that it lies on the boundary ||Y_U^c||_F^2 = τ.
Step 6.3.4. Initialise w_2 to a given constant and let w_1 = −w_2/2; here w_2 represents the Frobenius norm of the next feasible direction.
Step 6.3.5. Compute the feasible direction d at the current point Y_U^c as follows:
d = ([∇F(Y_U^c)]^T − ξ Y_U^c) / η
where
∇F(Y_U^c) = 2 M (Y_U^c)^T + N^T
ξ = (Tr[∇F(Y_U^c) Y_U^c] − w_1 η) / Tr[Y_U^c (Y_U^c)^T]
η = sqrt( ( ||Y_U^c||_F^2 ||∇F(Y_U^c)||_F^2 − (Tr[∇F(Y_U^c) Y_U^c])^2 ) / ( w_2 ||Y_U^c||_F^2 − w_1^2 ) )
Step 6.3.6: Set w_1 = αw_1 and w_2 = αw_2, where α < 1 is a given positive constant.
Step 6.3.7: When the new point Y_U^c + d increases the objective, i.e. F(Y_U^c + d) > F(Y_U^c), set Y_U^c = Y_U^c + d; otherwise, jump back to step 6.3.5 and continue cycling through steps 6.3.5 to 6.3.7.
Step 6.3.8: When the stopping criterion given by the threshold δ is met, output Y_U^c as the new Y_U; otherwise, jump back to step 6.3.4 and continue cycling through steps 6.3.4 to 6.3.8.
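Steps 6.3.1 to 6.3.8 above describe a feasible-direction ascent that maximizes the trace objective F over the Frobenius-norm ball ||Y_U||_F^2 ≤ τ. The NumPy sketch below follows those steps; the function name, the numeric defaults (the threshold delta, shrink factor alpha, initial w2), and the inner-iteration cap are illustrative assumptions, not values specified in the claim:

```python
import numpy as np

def maximize_on_frobenius_ball(M, N, tau, delta=1e-8, alpha=0.5,
                               w2_init=1e-2, max_outer=500):
    """Maximize F(Y) = Tr[Y M Y^T] + Tr[N Y^T] s.t. ||Y||_F^2 <= tau.

    Sketch of steps 6.3.1-6.3.8; delta (stopping threshold), alpha
    (shrink factor) and w2_init are illustrative choices.
    """
    rng = np.random.default_rng(0)

    def F(Y):
        return np.trace(Y @ M @ Y.T) + np.trace(N @ Y.T)

    def grad(Y):
        # Gradient as written in the claim: an n-by-k matrix, 2 M Y^T + N^T.
        return 2.0 * M @ Y.T + N.T

    # Step 6.3.2: unconstrained stationary point (solves grad(Y) = 0);
    # if it lies inside the ball, it is output directly.
    Y0 = np.linalg.solve(2.0 * M, -N.T).T
    n0 = np.linalg.norm(Y0, "fro")
    if n0 ** 2 <= tau:
        return Y0

    # Step 6.3.3: otherwise start on the boundary ||Y||_F^2 = tau.
    if n0 > 0:
        Yc = np.sqrt(tau) * Y0 / n0
    else:
        Yc = rng.standard_normal(N.shape)
        Yc *= np.sqrt(tau) / np.linalg.norm(Yc, "fro")

    for _ in range(max_outer):
        w2 = w2_init                 # step 6.3.4: ||d||_F^2 of the next move
        w1 = -w2 / 2.0               # keeps Yc + d on the boundary
        for _ in range(200):         # inner cap is an added safeguard
            # Step 6.3.5: feasible direction d (by construction
            # ||d||_F^2 = w2 and ||Yc + d||_F^2 = tau).
            G = grad(Yc)
            tr_gy = np.trace(G @ Yc)
            nY2 = np.linalg.norm(Yc, "fro") ** 2
            nG2 = np.linalg.norm(G, "fro") ** 2
            num = nY2 * nG2 - tr_gy ** 2
            eta = np.sqrt(max(num, 0.0) / (w2 * nY2 - w1 ** 2))
            if eta == 0.0:           # gradient parallel to Yc: stationary
                return Yc
            xi = (tr_gy - w1 * eta) / nY2
            d = (G.T - xi * Yc) / eta
            # Step 6.3.6: shrink the step parameters.
            w1, w2 = alpha * w1, alpha * w2
            # Step 6.3.7: accept the move only if the objective improves.
            if F(Yc + d) > F(Yc):
                Yc = Yc + d
                break
        else:                        # no improving step found: stop
            return Yc
        # Step 6.3.8: stop once the accepted step falls below the threshold.
        if np.linalg.norm(d, "fro") ** 2 < delta:
            break
    return Yc
```

Because w_1 = -w_2/2 and ||d||_F^2 = w_2 by construction of ξ and η, each accepted step satisfies ||Y_U^c + d||_F^2 = τ, so every iterate stays on the boundary of the feasible ball.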
CN201410599268.1A 2014-10-31 2014-10-31 Multi-semantic digital image annotation method based on spatial dependence measurement Expired - Fee Related CN104346456B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410599268.1A CN104346456B (en) 2014-10-31 2014-10-31 Multi-semantic digital image annotation method based on spatial dependence measurement


Publications (2)

Publication Number Publication Date
CN104346456A CN104346456A (en) 2015-02-11
CN104346456B true CN104346456B (en) 2017-09-08

Family

ID=52502047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410599268.1A Expired - Fee Related CN104346456B (en) 2014-10-31 2014-10-31 Multi-semantic digital image annotation method based on spatial dependence measurement

Country Status (1)

Country Link
CN (1) CN104346456B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105701509B (en) * 2016-01-13 2019-03-12 Tsinghua University Image classification method based on cross-category transfer active learning
CN107391599B (en) * 2017-06-30 2021-01-12 Zhongyuan Smart City Design Research Institute Co., Ltd. Image retrieval method based on style characteristics
CN109190060B (en) * 2018-07-10 2021-05-14 Tianjin University Service annotation quality optimization method based on effective human-computer interaction
CN111428733B (en) * 2020-03-12 2023-05-23 Shandong University Zero sample target detection method and system based on semantic feature space conversion

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7814040B1 (en) * 2006-01-31 2010-10-12 The Research Foundation Of State University Of New York System and method for image annotation and multi-modal image retrieval using probabilistic semantic models
CN103336969A (en) * 2013-05-31 2013-10-02 Institute of Automation, Chinese Academy of Sciences Image semantic parsing method based on soft glance learning
CN103605667A (en) * 2013-10-28 2014-02-26 China Jiliang University Automatic image annotation algorithm
CN103955462A (en) * 2014-03-21 2014-07-30 Nanjing University of Posts and Telecommunications Image annotation method based on multi-view and semi-supervised learning mechanism


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Trace Ratio vs. Ratio Trace for Dimensionality Reduction; Huan Wang et al.; IEEE Conference on Computer Vision and Pattern Recognition; 2007-07-16; pp. 1-8 *
Multi-label semi-supervised learning via learning from Hilbert-Schmidt independence; Zhang Chenguang et al.; China Sciencepaper; 2013-10-31; Vol. 8, No. 10; pp. 998-1002 *

Also Published As

Publication number Publication date
CN104346456A (en) 2015-02-11

Similar Documents

Publication Publication Date Title
CN105184303B Image annotation method based on multi-modal deep learning
CN101916376B Orthogonal semi-supervised subspace image classification method based on local spline embedding
Wang et al. Remote sensing image retrieval by scene semantic matching
US20210089827A1 (en) Feature representation device, feature representation method, and program
CN105117429A (en) Scenario image annotation method based on active learning and multi-label multi-instance learning
CN105808752B Automatic image annotation method based on CCA and 2PKNN
CN109063112B (en) Rapid image retrieval method, model and model construction method based on multitask learning deep semantic hash
Bui et al. Scalable sketch-based image retrieval using color gradient features
CN111125411B (en) Large-scale image retrieval method for deep strong correlation hash learning
CN104112018B Large-scale image search method
CN107943856A Text classification method and system based on expanded labeled samples
EP3166020A1 (en) Method and apparatus for image classification based on dictionary learning
CN101710334A Large-scale image library retrieval method based on image hashing
CN104346456B Multi-semantic digital image annotation method based on spatial dependence measurement
CN104834693A Visual image search method and system based on deep search
CN114299362A (en) Small sample image classification method based on k-means clustering
CN104036021A Method for semantically annotating images on the basis of hybrid generative and discriminative learning models
CN115439715A (en) Semi-supervised few-sample image classification learning method and system based on anti-label learning
CN114510594A (en) Traditional pattern subgraph retrieval method based on self-attention mechanism
Lu et al. Image categorization via robust pLSA
Dimitrovski et al. Detection of visual concepts and annotation of images using ensembles of trees for hierarchical multi-label classification
CN108549915A Image hash code training model algorithm based on binary weights, and classification learning method
CN111914108A (en) Discrete supervision cross-modal Hash retrieval method based on semantic preservation
Xiong et al. A confounder-free fusion network for aerial image scene feature representation
CN114219047B (en) Heterogeneous domain self-adaption method, device and equipment based on pseudo label screening

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170908

Termination date: 20211031
