CN103116893B

CN103116893B - Digital image annotation method based on multi-instance multi-label learning

Info

Publication number: CN103116893B
Application number: CN201310084956.XA
Authority: CN
Inventors: 周志华; 黄圣君
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2013-03-15
Filing date: 2013-03-15
Publication date: 2015-07-01
Anticipated expiration: 2033-03-15
Also published as: CN103116893A

Abstract

The invention discloses a digital image annotation method based on multi-instance multi-label learning. It addresses the technical problem that digital images often carry complex semantics that single-instance techniques cannot represent or learn effectively. The method includes: initializing a labeling model; randomly selecting an image and one of its relevant labels from a data set, and determining the representative instance of that label; obtaining, by random sampling, an irrelevant label that is ranked ahead of the relevant label, and determining the representative instance of that irrelevant label; and performing gradient descent on the triplet formed by the image, the relevant label, and the irrelevant label to update the model. Because the method learns online with a stochastic gradient descent algorithm, time and memory costs are greatly reduced, so annotation accuracy is maintained while annotation efficiency improves.

Description

Digital image annotation method based on multi-instance multi-label learning

Technical field

The present invention relates to the field of digital image annotation, and in particular to a digital image annotation method based on multi-instance multi-label learning.

Background art

With the spread of digital products and social networking sites, a massive number of digital images are produced and shared every day. To provide services over image data at this scale, the most central and also the most difficult task is to let computers understand the semantics of images, and image annotation is a key technology for doing so.

The task of an automatic image annotation device is to predict the semantic labels of images. Specifically, the device first extracts visual features from digital images to represent them, and then, based on these feature representations, trains a labeling model from an image data set that already has semantic labels. When the feature representation of an unlabeled digital image is input to the labeling model, the model predicts its semantic labels.

Current automatic image annotation techniques usually represent an image as a single instance. However, images often have complex semantics and contain multiple objects, so a single-instance representation loses information, cannot describe the semantics of the image accurately, and therefore cannot predict its labels accurately. A more effective approach is the input representation of multi-instance multi-label (MIML) learning, which represents an image as a set of feature instances, where each instance usually corresponds to one relatively simple entity and meaning. A few automatic image annotation techniques based on the MIML input representation exist, but their model complexity grows sharply as the representation space expands, which makes them very inefficient and unusable for large-scale image annotation tasks. An efficient automatic image annotation technique based on the MIML input representation is therefore urgently needed.

Summary of the invention

Technical problem: digital images often have complex semantics that single-instance techniques cannot represent or learn effectively, and existing annotation techniques are inefficient. To address these problems, the present invention proposes a digital image annotation method based on multi-instance multi-label learning.

Technical solution: a digital image annotation method based on multi-instance multi-label learning, comprising the following steps:

(1) initializing a labeling model;

(2) randomly selecting an image and one relevant label of the image from a data set, and determining the representative instance of that label;

(3) obtaining, by random sampling, an irrelevant label that is ranked ahead of the relevant label, and determining the representative instance of that irrelevant label;

(4) updating the model by gradient descent on the triplet formed by the image, the relevant label, and the irrelevant label;

(5) judging whether the model meets the requirement; if so, terminating and outputting the labeling model; otherwise returning to step (2).
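The control flow of steps (1) to (5) can be sketched as a simple loop. This is only an illustration in Python and not the patented implementation: the function name `train` and the four callables passed to it are hypothetical placeholders, and the loop stops once the requirement is met, as stated in claim 1 and in the description of Fig. 2.

```python
def train(model, dataset, sample_pair, sample_violator, update, is_done):
    """Repeat steps (2)-(4) until is_done reports that the model meets the requirement.

    All four callables are hypothetical stand-ins for the procedures in the text.
    """
    rounds = 0
    while True:
        # Step (2): pick an image, one relevant label, and that label's representative instance.
        image, label, rep = sample_pair(model, dataset)
        # Step (3): try to sample an irrelevant label ranked ahead of the relevant one.
        violator = sample_violator(model, image, label, rep)
        # Step (4): update only when such a label was found.
        if violator is not None:
            update(model, image, label, rep, violator)
        rounds += 1
        # Step (5): stop and return the model once the requirement is met.
        if is_done(model, rounds):
            return model
```

Passing the procedures in as arguments keeps the loop independent of how sampling and updating are actually carried out.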

The present invention adopts the above technical solution and has the following beneficial effects. It is based on the multi-instance multi-label learning framework: for each label of an image, the representative instance that the current model rates as most relevant is selected from the image's MIML input representation to stand for the image, so the additional information provided by MIML learning is fully used. At the same time, online learning with a stochastic gradient descent algorithm greatly reduces time and memory costs, which maintains annotation accuracy while improving annotation efficiency.

Brief description of the drawings

Fig. 1 is a flowchart of training the labeling model according to an embodiment of the present invention;

Fig. 2 is a flowchart of the method according to an embodiment of the present invention;

Fig. 3 is a flowchart of sampling an image and determining the representative instance of its relevant label according to an embodiment of the present invention;

Fig. 4 is a flowchart of sampling an irrelevant label and determining its representative instance according to an embodiment of the present invention;

Fig. 5 is a flowchart of updating the labeling model according to an embodiment of the present invention.

Detailed description

The present invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments are only intended to illustrate the present invention and not to limit its scope; after reading the present invention, modifications of various equivalent forms made by those skilled in the art all fall within the scope defined by the claims appended to this application.

Fig. 1 is a flowchart of how the automatic digital image annotation device of the embodiment trains the labeling model. Suppose the training image data set consists of N images, each of which has been labeled. The device extracts from each image in the data set features that conform to the multi-instance multi-label learning input: every image is represented by a set of feature vectors, and each feature vector is called an instance. Preferably, feature extraction uses classical methods from machine learning textbooks to generate suitable image features, for example by first segmenting the image and then extracting color, texture, and shape features from each image block. After the labeling model has been trained on this data set, the device applies the same feature extraction to an input unlabeled image, makes a prediction with the labeling model, and outputs the relevant labels as the annotation result.
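To make the "one image, many instances" representation concrete, the following Python sketch turns an image into a bag of feature vectors. It rests on simplifying assumptions that are not in the text: a fixed grid of square tiles stands in for image segmentation, the mean RGB color of each tile stands in for the color, texture, and shape features, and the function name `extract_bag` is hypothetical.

```python
def extract_bag(image, block):
    """Split `image` (rows of (r, g, b) pixels) into block x block tiles and
    return one mean-colour feature vector (one instance) per tile."""
    height, width = len(image), len(image[0])
    bag = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Collect the pixels of this tile; edge tiles may be smaller.
            pixels = [image[r][c]
                      for r in range(top, min(top + block, height))
                      for c in range(left, min(left + block, width))]
            n = len(pixels)
            bag.append([sum(p[ch] for p in pixels) / n for ch in range(3)])
    return bag
```

For an image whose left half is pure red and whose right half is pure blue, a tile size equal to half the width yields a bag of two instances, one per color region.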

Fig. 2 is a flowchart of the method of the embodiment. Step S20 initializes the labeling model, which mainly means assigning initial values to two matrices, W and V. Suppose the extracted image features have dimension d and there are L possible labels in total; then W is an L × 100 matrix and V is a d × 100 matrix. The value 100 can be changed as the user requires: a larger value usually improves annotation accuracy, while a smaller value speeds up computation. W and V are assigned random values such that each of their columns has mean 0 and a fixed standard deviation (the value of the standard deviation is missing from this text). Step S21 randomly samples an image X from the training data set; suppose its relevant label set is Y and its irrelevant label set is \bar{Y}. The device randomly selects a label from Y, denoted y, and selects a representative instance x for y from the set X; the detailed process is shown in Fig. 3. Step S22 randomly samples labels one by one from the image's irrelevant label set \bar{Y} until it encounters an irrelevant label \bar{y} that is ranked ahead of y, and determines a representative instance \bar{x} for it; the detailed process is shown in Fig. 4. Step S23 updates the model by gradient descent; the detailed process is shown in Fig. 5. Step S24 judges whether the model meets the requirement: if so, training ends; otherwise the process returns to step S21. The stopping criterion can be any method commonly used in machine learning or pattern recognition textbooks, for example that the number of iterations reaches a number specified by the user.
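The initialization in step S20 can be sketched as follows. This is a minimal illustration under stated assumptions: the standard deviation is not given in this text, so `std` is a hypothetical parameter with an arbitrary default, and the names `init_matrix` and `init_model` are likewise hypothetical. Matrices are stored as plain lists of rows to keep the sketch free of dependencies.

```python
import random

def init_matrix(rows, cols, std, rng):
    """Random rows x cols matrix whose columns each have mean exactly 0."""
    values = [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]
    for c in range(cols):
        mean = sum(row[c] for row in values) / rows
        for row in values:
            row[c] -= mean  # centre the column on zero
    return values

def init_model(num_labels, feat_dim, embed_dim=100, std=0.1, seed=0):
    """Return W (L x embed_dim) and V (d x embed_dim).

    std=0.1 is an assumption; the source does not state the value.
    """
    rng = random.Random(seed)
    W = init_matrix(num_labels, embed_dim, std, rng)
    V = init_matrix(feat_dim, embed_dim, std, rng)
    return W, V
```

The `embed_dim` argument plays the role of the value 100 in the text, so the trade-off between accuracy and speed can be explored by changing a single number.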

Fig. 3 shows the detailed process of step S21 in Fig. 2. Step S210 is the start. In step S211, an image X is drawn at random from the training image data set; suppose it contains n instances in total, {x_1, ..., x_n}, and that the relevant label set Y of the image contains m labels in total, {y_1, ..., y_m}. Step S212 draws a label y at random from Y. Step S213 initializes i to 1 to start counting the instances in X. Step S214 judges whether i is greater than n; if so, all instances in X have been traversed and the process jumps to step S217; otherwise it proceeds to step S215. Step S215 computes, for instance x_i, the value given by the following formula:

f_{y}(x_{i}) = W_{y} V^{T} x_{i} \quad (1)

where W_y denotes the y-th row of W, and f_y(x_i) can be understood as the relevance of instance x_i to label y. Step S216 increments the counter i by 1 and returns to step S214. Step S217 compares all the computed values f_y(x_i) and selects the instance with the largest value as the representative instance, denoted x; the corresponding f_y(x) can be understood as the relevance of this image to label y. The whole process ends at step S218.
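The selection of a representative instance in steps S213 to S217 amounts to an argmax of formula (1) over the instances of the image. The Python sketch below illustrates this with plain lists; the function names `embed`, `score`, and `representative` are hypothetical, and V is stored as d rows of k values so that `embed` computes V^T x.

```python
def embed(V, x):
    """Compute V^T x for a d x k matrix V and a length-d instance x."""
    return [sum(V[j][c] * x[j] for j in range(len(x))) for c in range(len(V[0]))]

def score(W, V, label, x):
    """Formula (1): f_y(x) = W_y V^T x."""
    return sum(w * e for w, e in zip(W[label], embed(V, x)))

def representative(W, V, label, bag):
    """Return (index, score) of the instance in `bag` with the largest f_y."""
    scores = [score(W, V, label, x) for x in bag]
    best = max(range(len(bag)), key=scores.__getitem__)
    return best, scores[best]
```

Because the argmax is taken per label, different labels of the same image can be represented by different instances, which is the point of the multi-instance representation.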

Fig. 4 shows the detailed process of step S22 in Fig. 2. Suppose the irrelevant label set \bar{Y} of the image contains t labels in total, {\bar{y}_1, ..., \bar{y}_t}. Step S230 is the start. Step S231 computes the value f_y(x) for label y according to formula (1), where x is assumed to be the representative instance of y. Step S232 initializes i to 1 to start counting and introduces an indicator variable Q, initialized to 0. Step S233 judges whether all t irrelevant labels have been traversed: if i > t, the process jumps to step S238 and ends; otherwise it proceeds to the next step. Step S234 computes f_{\bar{y}_i}(\bar{x}_i) for label \bar{y}_i according to formula (1), where \bar{x}_i is assumed to be the representative instance of \bar{y}_i. Step S235 compares the ranking of the two labels y and \bar{y}_i. If y is ranked ahead of \bar{y}_i, the process enters step S236, increments the counter i by 1, and returns to step S234. If the irrelevant label \bar{y}_i is ranked ahead of the relevant label y (the inequality that defines each case is missing from this text), the process jumps to step S237, which sets Q to i, indicating that an order-violating irrelevant label was found at the i-th sample, and records the irrelevant label found and its representative instance as \bar{y} and \bar{x} respectively; the process then enters step S238 and ends.
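The sampling procedure of Fig. 4 can be sketched in Python as below. Two points are assumptions rather than statements from the text: the t irrelevant labels are visited in a random order without repetition, and an irrelevant label counts as "ranked ahead" when its score under formula (1) is strictly larger than that of the relevant label. The function names are hypothetical.

```python
import random

def score(W, V, label, x):
    """Formula (1): f_y(x) = W_y V^T x, with V stored as d rows of k values."""
    embedded = [sum(V[j][c] * x[j] for j in range(len(x))) for c in range(len(V[0]))]
    return sum(w * e for w, e in zip(W[label], embedded))

def representative(W, V, label, bag):
    """Return (instance, score) for the instance of `bag` that scores highest."""
    return max(((x, score(W, V, label, x)) for x in bag), key=lambda pair: pair[1])

def sample_violator(W, V, bag, y, irrelevant, rng):
    """Return (Q, y_bar, x_bar); Q == 0 means no violating label was found."""
    _, f_y = representative(W, V, y, bag)
    order = list(irrelevant)
    rng.shuffle(order)  # assumption: visit the t irrelevant labels in random order
    for i, y_bar in enumerate(order, start=1):
        x_bar, f_bar = representative(W, V, y_bar, bag)
        if f_bar > f_y:  # assumption: "ranked ahead" means a strictly larger score
            return i, y_bar, x_bar
    return 0, None, None
```

The returned Q records how many samples were needed before a violating label appeared, which is the quantity the update step later uses to estimate the rank of the relevant label.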

Fig. 5 shows the detailed process of updating the model by gradient descent. Step S240 is the start. Step S241 judges whether an order-violating label was sampled in the process of Fig. 4, that is, whether Q is greater than 0. If Q = 0, no order-violating label was sampled and the process goes directly to step S245 and ends; otherwise it enters step S242, which estimates the rank r of the relevant label y from the value of Q. The formula, numbered (2), is missing from this text; a form consistent with the definitions that follow is:

r = \lfloor t / Q \rfloor \quad (2)

where t is the number of labels in \bar{Y} and \lfloor \cdot \rfloor denotes rounding down. Step S243 records the y-th row of the model variable W as a and the \bar{y}-th row as b. Step S244 updates the y-th row of W according to formula (3), the \bar{y}-th row of W according to formula (4), and V according to formula (5).

W_{y} = a + \gamma \sum_{i=1}^{r} \frac{1}{i} V^{T} x \quad (3)

W_{\bar{y}} = b - \gamma \sum_{i=1}^{r} \frac{1}{i} V^{T} \bar{x} \quad (4)

V = V - \gamma \sum_{i=1}^{r} \frac{1}{i} (\bar{x} b - x a) \quad (5)

where γ is the step size of the stochastic gradient descent algorithm; its value can be set according to methods commonly used in machine learning or pattern recognition textbooks. After the model variables have been updated, the whole process ends at step S245.
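Formulas (3) to (5) can be applied in a few lines of Python. The sketch below is an illustration under assumptions: formula (2) is not reproduced in this text, so the rank estimate is taken here as r = floor(t / Q), and the function names `embed` and `update` are hypothetical. The rows a and b are copied before any change, and V^T x and V^T \bar{x} are computed with the old V, so all three formulas use the values from before the update.

```python
def embed(V, x):
    """Compute V^T x for V stored as d rows of k values."""
    return [sum(V[j][c] * x[j] for j in range(len(x))) for c in range(len(V[0]))]

def update(W, V, y, x, y_bar, x_bar, t, Q, gamma):
    """Apply formulas (3)-(5) in place; do nothing when Q == 0 (step S241)."""
    if Q == 0:
        return
    r = t // Q  # assumption: formula (2) read as r = floor(t / Q)
    step = gamma * sum(1.0 / i for i in range(1, r + 1))
    a, b = list(W[y]), list(W[y_bar])          # step S243: copies of the old rows
    e_x, e_bar = embed(V, x), embed(V, x_bar)  # both use V before it changes
    W[y] = [a_c + step * e for a_c, e in zip(a, e_x)]        # formula (3)
    W[y_bar] = [b_c - step * e for b_c, e in zip(b, e_bar)]  # formula (4)
    for j in range(len(V)):                                   # formula (5)
        for c in range(len(V[0])):
            V[j][c] -= step * (x_bar[j] * b[c] - x[j] * a[c])
```

Read as a gradient step, the three updates are consistent with descending a loss proportional to f_{\bar{y}}(\bar{x}) - f_{y}(x), weighted by the sum of 1/i for i from 1 to r; that loss is an inference from the formulas and is not stated in the text.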

Claims

1. A digital image annotation method based on multi-instance multi-label learning, characterized by comprising the following steps:

(1) initializing a labeling model;

(2) randomly selecting an image and one relevant label of the image from a data set, and determining the representative instance of the label based on the multi-instance representation of the image;

(3) obtaining, by random sampling, an irrelevant label ranked ahead of the relevant label, and determining the representative instance of the irrelevant label based on the multi-instance representation of the selected image;

(4) updating the model for the triplet formed by the selected image, the relevant label, and the irrelevant label;

(5) judging whether the model meets the requirement; if so, terminating and outputting the labeling model; otherwise returning to step (2).

2. The digital image annotation method based on multi-instance multi-label learning according to claim 1, characterized in that the labeling model is trained as follows: for the input labeled images, an automatic digital image annotation device extracts, from the images in the training image data set, features that conform to the multi-instance multi-label learning input, every image being represented by a set of feature vectors and each feature vector being called an instance; after the labeling model has been trained, the device performs the same feature extraction on an input unlabeled image, makes a prediction with the labeling model, and outputs the relevant labels as the annotation result.

3. The digital image annotation method based on multi-instance multi-label learning according to claim 2, characterized in that, under the multi-instance multi-label learning input representation, the representative instance of each label is determined by the current labeling model.

4. The digital image annotation method based on multi-instance multi-label learning according to claim 1, characterized in that a stochastic gradient descent algorithm is used to update the model for the triplet formed by the image, the relevant label, and the irrelevant label.

5. The digital image annotation method based on multi-instance multi-label learning according to claim 1, characterized in that the criterion for judging whether the model meets the requirement includes the number of iterations reaching a number specified by the user.