CN103186538A - Image classification method, image classification device, image retrieval method and image retrieval device - Google Patents

Image classification method, image classification device, image retrieval method and image retrieval device

Info

Publication number
CN103186538A
CN103186538A (application number CN2011104448649A / CN201110444864A)
Authority
CN
China
Prior art keywords
image
semantic
characterization
tuple
semantics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2011104448649A
Other languages
Chinese (zh)
Inventor
贾宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN2011104448649A
Publication of CN103186538A
Legal status: Pending

Abstract

The invention provides an image classification method, an image classification device, an image retrieval method and an image retrieval device. The image classification method comprises the following steps: physical features of an image to be classified are extracted; the image to be classified is subjected to semantic annotation to obtain corresponding annotation words; the annotation words of the image to be classified are matched against semantic words in a semantic network, and a semantic characterization tuple is generated from the semantic characterizations corresponding to the successfully matched semantic words, wherein the semantic network stores semantic words and a number of corresponding semantic characterizations, each semantic characterization being described by physical features; and a feature vector composed of the physical features of the image to be classified and the semantic characterization tuple is input into an image classifier, and a corresponding classification result is output, wherein the image classifier is a classifier obtained by training on image samples under each image category, and the physical features and the arity of the semantic characterization tuple are identical during training and classification. The methods and devices provided by the invention can improve the accuracy of image classification.

Description

Image classification method and device, and image retrieval method and device
Technical field
The present application relates to the technical field of image information processing, and in particular to an image classification method and device, and an image retrieval method and device.
Background technology
At present, with the rapid development of modern computer and network technology, a huge and diverse amount of image information is constantly being produced, and the ever wider use of images in all industries has in turn driven the further development of image information processing. Images are widely distributed on the Internet but lack effective organization, making resource sharing difficult to achieve. How to find the information one needs among so much image information is therefore a major challenge for image information processing technology.
An image classification method is an image processing method that distinguishes targets of different categories according to the different features they exhibit in image information. It allows a computer to quantitatively analyze the large number of images on the Internet and to assign each pixel or region of an image to one of several categories, replacing human visual interpretation. An image classification system assigns an input image to a preset category; for example, when a user submits an image to the system, the system assigns it to a known category and returns the result to the user.
Existing image classification methods mostly classify images based on image content, that is, on physical features of the image such as color, shape, texture, and spatial position. They provide an effective means of classification, but because they rely almost entirely on the physical features of the image, and the image information described by physical features is limited, the classification results inevitably suffer from insufficient accuracy. For example, for an image containing a red horse seen from the side and an image containing a white horse seen from the front, the physical features (color, shape, texture, etc.) of the two differ greatly, so relying on physical features alone easily places the two images in different categories, even though both actually belong to the "horse" class; this leads to inaccurate classification results.
In short, a technical problem that urgently needs to be solved by those skilled in the art is how to improve the accuracy of image classification.
Summary of the invention
The present application provides an image classification method and device to improve the accuracy of image classification.
The present application also provides an image retrieval method and device to improve the accuracy of image retrieval results, so that users can more precisely find the information they need among a large amount of image information.
To solve the above problems, the present application discloses an image classification method, comprising:
extracting physical features of an image to be classified;
performing semantic annotation on the image to be classified to obtain corresponding annotation words;
for the annotation words of the image to be classified, matching them against semantic words in a semantic network, and generating a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words, wherein the semantic network stores semantic words and a number of corresponding semantic characterizations, each semantic characterization being described by physical features; and
inputting a feature vector composed of the physical features of the image to be classified and the semantic characterization tuple into an image classifier and outputting a corresponding classification result, wherein the image classifier is a classifier obtained by training on feature vectors composed of the physical features and semantic characterization tuples of image samples under each image category, and the physical features and the arity of the semantic characterization tuple are identical during training and classification.
Preferably, the image classifier is a classifier obtained by the following steps:
collecting image samples under each image category to build a training set;
extracting physical features of each image sample in the training set;
performing semantic annotation on each image sample in the training set to obtain corresponding annotation words;
for the annotation words of each image sample, matching them against semantic words in the semantic network, and generating a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words, wherein the semantic network stores semantic words and a number of corresponding semantic characterizations, each semantic characterization being described by physical features; and
training on each image sample in the training set according to the feature vector composed of the physical features and the semantic characterization tuple of each image sample, to obtain the corresponding image classifier.
Preferably, the step of generating a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words comprises:
when a match succeeds, querying the semantic network for the semantic characterizations corresponding to the current semantic word, and recording each such semantic characterization and its query count; and
selecting the N semantic characterizations with the highest query counts to form the semantic characterization tuple, where N is a natural number.
Preferably, the semantic characterization tuple comprises a color semantic tuple described by color, a texture semantic tuple described by texture, a shape semantic tuple described by shape, or a spatial semantic tuple described by spatial position.
Preferably, the arity of the color semantic tuple is three, i.e., a color semantic triple.
Preferably, the primary hue represented by the first element of the color semantic triple includes red, orange, brown, yellow, green, blue, purple, pink, beige, magenta, and olive; the secondary hue represented by the second element of the color semantic triple includes red, brown, yellow, green, blue, purple, and pink; and the brightness represented by the third element of the color semantic triple includes black, dark, grey, bright, and white.
Preferably, the feature vector composed of the physical features and the semantic characterization tuple comprises a feature vector composed of the physical features and a string-form semantic characterization tuple, or a feature vector composed of the physical features and a numeric-form semantic characterization tuple, wherein the numeric-form semantic characterization tuple is obtained by quantizing the string-form semantic characterization tuple.
On the other hand, the present application also discloses an image classification device, comprising:
an extraction module, configured to extract physical features of an image to be classified;
a semantic annotation module, configured to perform semantic annotation on the image to be classified to obtain corresponding annotation words;
a matching module, configured to match the annotation words of the image to be classified against semantic words in a semantic network;
a generation module, configured to generate a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words, wherein the semantic network stores semantic words and a number of corresponding semantic characterizations, each semantic characterization being described by physical features; and
a classifier classification module, configured to input a feature vector composed of the physical features of the image to be classified and the semantic characterization tuple into an image classifier and to output a corresponding classification result, wherein the image classifier is a classifier obtained by training on feature vectors composed of the physical features and semantic characterization tuples of image samples under each image category, and the physical features and the arity of the semantic characterization tuple are identical during training and classification.
On the other hand, the present application also discloses an image retrieval method, comprising:
classifying the images in an image database using the method described above, and performing keyword-adding processing or content-feature-extraction processing on the classified images in the image database;
when a text query request is received from a user, performing a keyword matching query in the image database according to the text query request, and returning the matching images to the user; and
when an image query request is received from a user, performing a content-feature matching query in the image database according to the image query request, and returning the matching images to the user.
On the other hand, the present application also discloses an image retrieval device, comprising:
an image classification module, configured to classify the images in an image database using the device described above;
a keyword processing module, configured to perform keyword-adding processing on the classified images in the image database;
a feature extraction processing module, configured to perform content-feature-extraction processing on the classified images in the image database;
a text query processing module, configured to, when a text query request is received from a user, perform a keyword matching query in the image database according to the text query request and return the matching images to the user; and
an image query processing module, configured to, when an image query request is received from a user, perform a content-feature matching query in the image database according to the image query request and return the matching images to the user.
Compared with the prior art, the present application has the following advantages:
The present application adds a semantic characterization tuple on top of the original physical features when building the image classifier. Because the semantic characterization tuple can describe the semantic attributes of an image, the present application enhances the completeness of the image description beyond what the original physical features describe, giving the image classifier more accurate feature descriptive power and better fault tolerance.
The image classification process of the present application classifies the image to be classified in this image classifier based on both its physical features and its semantic characterization tuple. When the physical features of two images differ, their semantic characterization tuples may be the same or similar; conversely, when the physical features of two images are the same or similar, their semantic characterization tuples may differ. The present application can therefore avoid assigning images of different categories to the same category while still assigning images of the same category to the same category, and thus effectively improves the accuracy of image classification.
In addition, when the above image classification method is used in image retrieval to classify the images in an image database, the classification results serve as the basis for retrieval. On the basis of accurate classification results, the results of adding keywords to, or extracting content features from, the images under each category in the image database are also accurate. The present application can therefore also improve the accuracy of image retrieval results, so that users can more precisely find the information they need among a large amount of image information.
Description of the drawings
Fig. 1 is a flowchart of an embodiment of an image classifier training method of the present application;
Fig. 2 is an example of an image sample of the present application;
Fig. 3 is an example of a semantic network of the present application;
Fig. 4 is a flowchart of an embodiment of an image classification method of the present application;
Fig. 5 is an example of an image to be classified of the present application;
Fig. 6 is an example of another image to be classified of the present application;
Fig. 7 is an example of yet another image to be classified of the present application;
Fig. 8 is a structural diagram of an embodiment of an image classification device of the present application;
Fig. 9 is a flowchart of an embodiment of an image retrieval method of the present application;
Fig. 10 is a structural diagram of an embodiment of an image retrieval device of the present application.
Detailed description
To make the above objects, features, and advantages of the present application clearer and easier to understand, the present application is described in further detail below with reference to the drawings and specific embodiments.
One of the core ideas of the embodiments of the present application is to add a semantic characterization tuple on top of the original physical features when building the image classifier. Because the semantic characterization tuple can describe the semantic attributes of an image, the present application enhances the completeness of the image description beyond what the original physical features describe, giving the image classifier more accurate feature descriptive power and better fault tolerance. The image classification process of the present application then classifies the image to be classified in this classifier based on both its physical features and its semantic characterization tuple, and the present application can therefore effectively improve the accuracy of image classification.
Referring to Fig. 1, which shows a flowchart of an embodiment of an image classifier training method of the present application, the method may comprise:
Step 101: collect image samples under each image category and build a training set.
In practice, image samples may be collected according to the application domain of the images to be classified. For example, if the application is e-commerce, product images may be collected from e-commerce websites and grouped into the corresponding product categories to build the training set. As another example, if the application is facial expression recognition, face images may be collected from well-known facial expression databases and grouped into the corresponding expression categories. In the geographic information field, remote sensing images may be collected from well-known remote sensing image databases and grouped into the corresponding terrain categories, and so on. In short, the present application does not limit the application domain of the images to be classified.
In general, the larger the number of image samples, the more accurate the feature descriptive power and fault tolerance of the image classifier. Those skilled in the art can therefore set the number of image samples under each image category (e.g., 100, 1,000, or 10,000) according to actual requirements; the present application does not limit the specific number of image samples.
Step 102: extract physical features of each image sample in the training set.
In practice, the physical features may include color, shape, texture, spatial position, and so on. In specific implementations, methods such as color histograms, color sets, color moments, color coherence vectors, or color correlograms may be used to extract the color features of each image sample in the training set; statistical methods, geometric methods, model-based methods, or signal-processing methods may be used to extract the texture features; and boundary feature methods, Fourier shape descriptor methods, geometric parameter methods, or shape invariant moment methods may be used to extract the shape features, and so on.
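As an illustration only, a minimal color-histogram extraction could look like the sketch below. It assumes RGB input images, the Pillow and NumPy libraries, and 12 bins per channel (an arbitrary illustrative choice that happens to yield the 36 dimensions used in the example later in this description); the patent itself does not prescribe any particular implementation.

```python
# Minimal color-histogram feature sketch (assumption: 12 bins per RGB channel).
import numpy as np
from PIL import Image

def color_histogram_features(path, bins_per_channel=12):
    """Return a normalized 36-dimensional color histogram (12 bins x 3 channels)."""
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32)
    features = []
    for channel in range(3):
        hist, _ = np.histogram(img[..., channel], bins=bins_per_channel, range=(0, 255))
        hist = hist.astype(np.float32)
        features.append(hist / max(hist.sum(), 1.0))  # normalize each channel histogram
    return np.concatenate(features)  # shape (36,)
```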
Step 103: perform semantic annotation on each image sample in the training set to obtain corresponding annotation words.
The semantic annotation here may be performed manually and/or automatically by computer. Manual annotation is more accurate and gives better results but is slow and laborious; automatic annotation by computer is less accurate but fast. In short, the present application does not limit the specific manner of semantic annotation.
Step 104: for the annotation words of each image sample, match them against semantic words in the semantic network, and generate a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words; the semantic network stores semantic words and a number of corresponding semantic characterizations, each semantic characterization being described by physical features.
The present application uses the semantic characterization tuple to describe the semantic attributes of an image, so as to enhance the completeness of the image description beyond what the original physical features describe.
The purpose of image classification is to assign the same target (the target here corresponds to the images under an image category) to the same image category. Because physical features describe the low-level perceptual attributes of an image, the prior art, relying solely on physical features, easily places such images in different categories. For example, for an image containing a red horse seen from the side and an image containing a white horse seen from the front, the physical features (color, shape, texture, etc.) of the two differ greatly, so in the prior art the classification results of the two images would very likely differ.
Because the semantic characterization tuple describes the semantic attributes of an image, the semantic characterization tuples of two images may be the same or similar even when their physical features differ. In the example above, the semantic attribute described by the tuples of both images may be "horse", so the present application helps assign images of the same category to the same category. Conversely, when the physical features of two images are the same or similar, their semantic characterization tuples may still differ, so the present application also avoids assigning images of different categories to the same category. In short, the present application can effectively improve the accuracy of image classification.
In addition, the definition of the semantic attributes described by the semantic characterization tuple of the present application derives from the annotation words, and different images naturally correspond to different annotation words, which guarantees distinguishability. However, even a single image may correspond to several annotation words; for example, for the image sample of the present application shown in Fig. 2, the corresponding annotation words may include [sky, beach, parasol, people, ocean], where [] denotes the set of annotation words corresponding to one image.
The present application quantizes the semantic characterization tuple into a fixed number of elements, which is equivalent to limiting the scope of the annotation words and thus avoids the noise they introduce. The semantic characterization tuple is ordered: the importance of the first element is greater than that of the second, the importance of the second is greater than that of the third, and so on.
When building the image classifier, each image sample in the training set must be trained on the feature vector jointly composed of the physical features and the semantic characterization tuple of each image sample, and, considering the feature descriptive power and fault tolerance of the image classifier, the present application describes the semantic characterization tuple with physical features. The semantic characterization tuple may therefore be a color semantic tuple described by color, a texture semantic tuple described by texture, a shape semantic tuple described by shape, or a spatial semantic tuple described by spatial position.
In the embodiments of the present application, the physical features used by the semantic characterization tuple should preferably be the same as the physical features used during training. For example, if color histogram features are used during training, the semantic characterization tuple should be described by color features; if texture features are used during training, the tuple should be described by texture features; and so on.
Color features derive from visual psychology: when observing an image, the eye first distinguishes between primary and secondary hues, that is, there is an obvious distinction between the color that dominates and the color that is secondary. The primary hue needs as many semantic definitions as possible because it carries the most information; the secondary hue carries less information and is not particularly important, so it is given somewhat fewer definitions; and brightness contributes even less information to visual analysis and plays only an auxiliary role, so it is given the fewest definitions. Therefore, in a preferred embodiment of the present application, the arity of the color semantic tuple is three, i.e., a color semantic triple: the primary hue represented by the first element may include red, orange, brown, yellow, green, blue, purple, pink, beige, magenta, olive, etc.; the secondary hue represented by the second element may include red, brown, yellow, green, blue, purple, pink, etc.; and the brightness represented by the third element may include black, dark, grey, bright, white, etc.
Referring to Fig. 3, which shows an example of a semantic network of the present application, the semantic network stores semantic words and a number of corresponding color semantic characterizations, each described by color features. For example, in the figure the color characterizations corresponding to "flower" are red and blue, "sea" corresponds to blue, "sky" corresponds to blue, "tree" corresponds to green, "grass" corresponds to green, "car" corresponds to red and green, "sun" corresponds to red, and so on.
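Purely as an illustration, such a color semantic network could be held in memory as a mapping from semantic words to color lists. The dict representation below is an assumption (the patent does not prescribe a storage format), and the entries merely echo the Fig. 3 example plus the words used in the worked example further below.

```python
# Assumed in-memory form of a color semantic network (wmap<word, colorlist> in the text).
wmap = {
    "flower": ["red", "blue"],
    "sea":    ["blue"],
    "sky":    ["blue", "black"],   # "black" taken from the worked example below
    "tree":   ["green"],
    "grass":  ["green"],
    "car":    ["red", "green"],
    "sun":    ["red"],
    "beach":  ["white", "yellow"],
    "people": ["yellow", "black"],
    "ocean":  ["blue"],
}
```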
In a preferred embodiment of the present application, the step of generating a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words may further comprise:
when a match succeeds, querying the semantic network for the semantic characterizations corresponding to the current semantic word, and recording each such semantic characterization and its query count; and
selecting the N semantic characterizations with the highest query counts to form the semantic characterization tuple, where N is a natural number corresponding to the arity of the semantic characterization tuple.
To help those skilled in the art better understand the present application, an implementation is described in detail below which, for the annotation words of each image sample, matches them against semantic words in the semantic network and generates a color semantic triple from the semantic characterizations corresponding to the successfully matched semantic words.
This implementation uses a color semantic network database wmap<word, colorlist>, which stores each semantic word word and its corresponding list of color semantic characterizations colorlist; colorlist contains one or more colors color.
Assume that the annotation words of the image samples are stored in an annotation set database in which each row corresponds to the topic word theme and the annotation word list wlist of one image. The implementation may comprise:
Step A1: take the topic word theme and the annotation word list wlist of the image sample from the annotation set database.
Step A2: traverse each annotation word in wlist and match it against the semantic words in wmap; if the match succeeds, obtain the list colorlist of color semantic characterizations corresponding to that semantic word.
Step A3: define a map colormap for counting the occurrences of each color in the colorlist obtained in step A2; the initial values of colormap are 0, and the "unknown" key of colormap represents an unsuccessful match.
Step A4: traverse each color in the colorlist obtained in step A2; if the color is already in colormap, increment its count by 1; otherwise insert it into colormap with a count of 1.
Step A5: if an annotation word in wlist does not exist in wmap, increment colormap[unknown] by 1.
Step A6: if the final number of entries in colormap is less than 3, add "none" entries.
Step A7: finally, sort the entries of colormap in descending order of their counts and take the top three to obtain the final color semantic triple. (A code sketch of steps A1-A7 is given after the worked example below.)
Applying the above implementation to the image shown in Fig. 2, and assuming that the theme of the image in Fig. 2 is "beach" and that its annotation word list wlist is [sky, beach, parasol, people, ocean], the implementation may comprise:
Step B1: for "sky", wmap["sky"] = blue, black is found, so colormap[blue] = 1 and colormap[black] = 1.
Step B2: for "beach", wmap["beach"] = white, yellow is found, so colormap[white] = 1 and colormap[yellow] = 1.
Step B3: for "parasol", wmap["parasol"] = '' is found, i.e., parasol does not exist in the color semantic network, so colormap[unknown] = 1.
Step B4: for "people", wmap["people"] = yellow, black is found, so colormap[black] = 2 and colormap[yellow] = 2.
Step B5: for "ocean", wmap["ocean"] = blue is found, so colormap[blue] = 2.
Step B6: counting colormap, blue, yellow, and black all equal 2, so the resulting color semantic triple is [blue, yellow, black].
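A minimal sketch of steps A1-A7, assuming the dict-style wmap shown earlier and plain Python data structures (an assumption; the patent does not fix an implementation language):

```python
from collections import Counter

def color_semantic_triple(wlist, wmap, arity=3):
    """Generate a color semantic triple from annotation words (sketch of steps A1-A7)."""
    colormap = Counter()
    for word in wlist:                        # step A2: traverse annotation words
        colorlist = wmap.get(word)
        if colorlist:                         # match succeeded
            for color in colorlist:           # step A4: count each color
                colormap[color] += 1
        else:                                 # step A5: word absent from the network
            colormap["unknown"] += 1
    triple = [color for color, _ in colormap.most_common(arity)]  # step A7: top N by count
    while len(triple) < arity:                # step A6: pad missing positions with "none"
        triple.append("none")
    return triple

# Worked example from the text (Fig. 2). Blue, yellow, and black all have count 2, so the
# ordering among them depends on tie-breaking; the text reports [blue, yellow, black].
print(color_semantic_triple(["sky", "beach", "parasol", "people", "ocean"], wmap))
```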
It should be noted that the abnormal case in which an annotation word is not contained in the color semantic network either indicates that the network is not yet well developed and does not contain color information for enough concepts, or indicates that the annotation word may be too obscure to be a common-sense concept. The influence of such annotation words nevertheless cannot be ignored, so the "unknown" keyword is introduced into the map as an auxiliary expression; the "unknown" keyword compensates for the limited scale of the color semantic network and expresses the handling of the unknowable in knowledge processing.
In addition, to avoid missing attributes in the color semantic triple, the present application introduces "none" as an auxiliary element added to the triple; that is, when the number of elements of the color semantic triple is less than 3, "none" is used to fill the vacant positions to keep the color semantic triple complete. Whereas the "unknown" keyword expresses semantic unknowability and uncertainty, "none" expresses semantic unimportance, i.e., regions or parts of the image that common sense regards as unimportant.
The generation of the color semantic triple has been described in detail above. It should be noted that the above is only an example; in fact, the semantic characterization tuple of the present application can be described by other physical features such as texture, shape, or spatial position, and its arity can also be 2 or a number greater than 3; the present application does not limit this.
Step 105: train on each image sample in the training set according to the feature vector composed of the physical features and the semantic characterization tuple of each image sample, to obtain the corresponding image classifier.
In practice, the physical features and the semantic characterization tuple of each image sample in the training set occupy independent vector dimensions. Suppose step 102 obtains 36-dimensional physical features for each image sample in the training set through a feature extraction algorithm; then the semantic characterization tuple obtained in step 104 can serve as the feature of the 37th dimension. If an array is used to represent the 37-dimensional features composed of the physical features and the semantic characterization tuple, each row of the array corresponds to the data of one image: the 36 numbers in columns 1-36 represent the physical features of the image, and column 37 represents its semantic characterization tuple.
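A minimal sketch of assembling one such 37-column row, assuming the histogram and triple helpers sketched above (the file name and annotation words are only illustrative):

```python
def build_feature_row(image_path, wlist, wmap):
    """Assemble one row: 36 physical features followed by the semantic characterization tuple."""
    physical = color_histogram_features(image_path)   # columns 1-36
    triple = color_semantic_triple(wlist, wmap)       # column 37, string form
    return list(physical) + [triple]

# Hypothetical example for the Fig. 2 beach image.
row = build_feature_row("beach.jpg", ["sky", "beach", "parasol", "people", "ocean"], wmap)
```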
It should be noted that classifiers such as Bayes classifiers can support string form (e.g., the [blue, yellow, black] of the example above), in which case the string-form semantic characterization tuple can be appended directly after the numeric physical features and participate in the training of the classifier; classifiers such as the support vector machine (SVM) cannot support string form, in which case the string-form semantic characterization tuple should first be quantized into a numeric-form semantic characterization tuple, which is then appended after the numeric physical features and participates in the training of the classifier. Therefore, in the embodiments of the present application, the feature vector composed of the physical features and the semantic characterization tuple may comprise a feature vector composed of the physical features and a string-form semantic characterization tuple, or a feature vector composed of the physical features and a numeric-form semantic characterization tuple, wherein the numeric-form tuple is obtained by quantizing the string-form tuple.
The present application does not limit the scheme for quantizing string form into numeric form; one example is given here. In this example, values are assigned to the semantic characterizations in the tuple, where the value assigned to the first element is greater than that of the second element, the value of the second element is greater than that of the third, and so on; within one element, values are ordered alphabetically (for example, a word beginning with the letter a is assigned a greater value than a word beginning with the letter b), and the value of "none" is smaller than that of any other characterization in the same element. Of course, the above example is not to be taken as a limitation of the present application.
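One possible quantization consistent with these constraints is sketched below. The vocabulary, the positional weights, and the exact scores are assumptions for illustration; they will not reproduce the 35/32/52 values quoted later in the description, which come from the applicant's own (unspecified) scheme.

```python
# Illustrative quantization of a string-form triple into a single number (assumed scheme).
VOCAB = ["white", "blue", "black", "light", "red", "yellow", "green", "none"]  # assumed vocabulary

def quantize_triple(triple):
    """Map a color semantic triple to one numeric value; earlier positions weigh more."""
    weights = [100, 10, 1]          # first element > second element > third element
    value = 0
    for position, color in enumerate(triple):
        # "none" gets the lowest per-position score; other colors rank higher the earlier
        # they appear alphabetically.
        score = 0 if color == "none" else len(VOCAB) - sorted(VOCAB).index(color)
        value += weights[position] * score
    return value

print(quantize_triple(["white", "blue", "none"]))  # illustrative value only
```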
In addition, when the dimensionality of the feature vector composed of the physical features and the semantic characterization tuple of each image sample in the training set is high, dimensionality reduction may be performed before training the image classifier. The dimensionality reduction may be implemented with methods such as linear discriminant analysis (LDA), principal component analysis (PCA), or intra-class/inter-class classifier methods; alternatively, to avoid affecting the semantic characterization tuple, dimensionality reduction may be applied to the physical features only. In short, the present application does not limit the specific dimensionality reduction scheme.
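A minimal sketch of the latter option, assuming scikit-learn is available, that PCA is the chosen method, and that the number of retained components (16 here) is an arbitrary illustrative choice:

```python
# Reduce only the 36 physical-feature columns, leaving the tuple column untouched (assumed setup).
import numpy as np
from sklearn.decomposition import PCA

def reduce_physical_features(rows, n_components=16):
    physical = np.array([row[:36] for row in rows], dtype=np.float32)
    tuples = [row[36] for row in rows]
    reduced = PCA(n_components=n_components).fit_transform(physical)  # needs >= 16 samples
    return [list(vec) + [tup] for vec, tup in zip(reduced, tuples)]
```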
Referring to Fig. 4, which shows a flowchart of an embodiment of an image classification method of the present application, the method may comprise:
Step 401: extract physical features of the image to be classified.
Step 402: perform semantic annotation on the image to be classified to obtain corresponding annotation words.
Step 403: for the annotation words of the image to be classified, match them against semantic words in the semantic network, and generate a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words; the semantic network stores semantic words and a number of corresponding semantic characterizations, each semantic characterization being described by physical features.
Step 404: input a feature vector composed of the physical features of the image to be classified and the semantic characterization tuple into the image classifier and output the corresponding classification result; the image classifier is a classifier obtained by training on feature vectors composed of the physical features and semantic characterization tuples of image samples under each image category, and the physical features and the arity of the semantic characterization tuple are identical during training and classification.
In specific implementations, the image classifier may be obtained by training through the following steps:
Step C1: collect image samples under each image category and build a training set.
Step C2: extract physical features of each image sample in the training set.
Step C3: perform semantic annotation on each image sample in the training set to obtain corresponding annotation words.
Step C4: for the annotation words of each image sample, match them against semantic words in the semantic network, and generate a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words; the semantic network stores semantic words and a number of corresponding semantic characterizations, each semantic characterization being described by physical features.
Step C5: train on each image sample in the training set according to the feature vector composed of the physical features and the semantic characterization tuple of each image sample, to obtain the corresponding image classifier.
Since the implementation of the above training steps is similar to that of the training method embodiment shown in Fig. 1, it is not described again here. Because the image classifier is obtained by training on feature vectors composed of the physical features and semantic characterization tuples of image samples under each image category, at classification time the same feature extraction method used during training should be used to obtain the same physical features, the same method used during training should be used to obtain the semantic characterization tuple of the image to be classified, and the physical features and the arity of the semantic characterization tuple of the image to be classified should be identical to those used during training. (A sketch of this classification-time pipeline, under stated assumptions, follows.)
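The following minimal sketch ties the earlier helpers together, assuming scikit-learn's SVC as the classifier (SVM is one of the options named in the text) and the quantization sketch above for the numeric form the SVM requires; the training rows, labels, and file names are placeholders.

```python
import numpy as np
from sklearn.svm import SVC

def to_numeric(row):
    """36 physical features + quantized tuple, since an SVM cannot take string features."""
    return row[:36] + [quantize_triple(row[36])]

def train_classifier(training_rows, labels):
    clf = SVC()
    clf.fit(np.array([to_numeric(r) for r in training_rows]), labels)
    return clf

def classify(clf, image_path, wlist, wmap):
    # Same feature extraction and same tuple arity as during training.
    row = build_feature_row(image_path, wlist, wmap)
    return clf.predict(np.array([to_numeric(row)]))[0]
```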
To help those skilled in the art better understand the present application, a concrete example is given below to show that the present application has more accurate classification capability.
This example uses the present application to classify the images to be classified shown in Fig. 5, Fig. 6, and Fig. 7, where:
The feature vector of the image to be classified shown in Fig. 5 has 37 dimensions: the first 36 dimensions are physical features, and the 37th dimension is the semantic characterization triple, which can be described in numeric or string form: 35/[white, blue, none].
The feature vector of the image to be classified shown in Fig. 5 is: 0.295237,0.316467,0.319722,0.319875,0.319885,0.320261,0.478892,0.565979,0.629964,0.679026,0.758321,1.000000,0.230936,0.357381,0.437896,0.498647,0.558766,0.631245,0.711090,0.781545,0.854125,0.916748,0.959238,1.000000,0.043802,0.271158,0.362487,0.448333,0.499094,0.541615,0.590627,0.787628,0.830108,0.904897,0.977121,1.000000,35/[white, blue, none].
The feature vector of the image to be classified shown in Fig. 6 has 37 dimensions: the first 36 dimensions are physical features, and the 37th dimension is the semantic characterization triple, which can be described in numeric or string form: 32/[white, black, light].
The feature vector of the image to be classified shown in Fig. 6 is: 0.097320,0.137593,0.165791,0.174967,0.180440,0.192779,0.413798,0.639963,0.746500,0.764190,0.782908,1.000000,0.089813,0.250640,0.375854,0.465565,0.548604,0.620025,0.676279,0.711466,0.747609,0.786987,0.843607,1.000000,0.118835,0.170725,0.212921,0.257385,0.321716,0.434722,0.533864,0.591461,0.673858,0.733083,0.970560,1.000000,32/[white, black, light].
The feature vector of the image to be classified shown in Fig. 7 has 37 dimensions: the first 36 dimensions are physical features, and the 37th dimension is the semantic characterization triple, which can be described in numeric or string form: 52/[white, none, light].
The feature vector of the image to be classified shown in Fig. 7 is: 0.022725,0.048472,0.056437,0.057230,0.057993,0.062276,0.372833,0.988789,0.995259,0.996266,0.996988,1.000000,0.028452,0.170511,0.335205,0.476460,0.583384,0.663747,0.749145,0.849873,0.961710,0.996948,0.998555,1.000000,0.002919,0.018585,0.038736,0.071156,0.110931,0.165527,0.305023,0.511393,0.729848,0.874715,0.970815,1.000000,52/[white, none, light].
Fig. 5 and Fig. 6 are images of the "bus" class, and Fig. 7 is an image of the "snow mountain" class. In the prior art, because the color feature vectors of Fig. 5 and Fig. 7 are close when distances are computed, while both are far from that of Fig. 6, Fig. 5 and Fig. 7 are often mistakenly assigned to the same class, "snow mountain". The present application adds the 37th-dimension feature, which describes the semantic attributes of an image: when the physical features of two images differ, their semantic characterization tuples may be the same or similar, and conversely, when the physical features of two images are the same or similar, their semantic characterization tuples may differ. The present application can therefore avoid assigning images of different categories to the same category while assigning images of the same category to the same category.
Specifically, the annotation words of Fig. 5 and Fig. 6 are similar, for example words such as bus, sky, and ground, so their quantized semantic characterization triples are close because they express similar semantics; Fig. 7, as a snow mountain image, has annotation words such as jokul, sky, and mountain, so its quantized semantic characterization triple differs semantically from those of Fig. 5 and Fig. 6. As a result, the 37th-dimension features of Fig. 5 and Fig. 6 are close (35 and 32) and both are far from that of Fig. 7 (52).
In short, because the present application can enhance the completeness of the image description beyond what the original physical features describe, the image classifier has more accurate feature descriptive power and better fault tolerance. The image classification process of the present application then classifies the image to be classified in this classifier based on both its physical features and its semantic characterization tuple, and the present application can therefore effectively improve the accuracy of image classification.
For the above classification method embodiment, since its training steps are substantially similar to those of the training method embodiment shown in Fig. 1, the description is relatively brief; for the relevant parts, reference may be made to the description of the training method embodiment.
It should be noted that, for simplicity of description, the method embodiments are all expressed as a series of combined actions, but those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
Referring to Fig. 8, which shows a structural diagram of an embodiment of an image classification device of the present application, the device may comprise:
an extraction module 801, configured to extract physical features of an image to be classified;
a semantic annotation module 802, configured to perform semantic annotation on the image to be classified to obtain corresponding annotation words;
a matching module 803, configured to match the annotation words of the image to be classified against semantic words in a semantic network;
a generation module 804, configured to generate a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words, wherein the semantic network stores semantic words and a number of corresponding semantic characterizations, each semantic characterization being described by physical features; and
a classifier classification module 805, configured to input a feature vector composed of the physical features of the image to be classified and the semantic characterization tuple into an image classifier and to output a corresponding classification result, wherein the image classifier is a classifier obtained by training on feature vectors composed of the physical features and semantic characterization tuples of image samples under each image category, and the physical features and the arity of the semantic characterization tuple are identical during training and classification.
In a preferred embodiment of the present application, the device may further comprise a classifier training module, which may comprise:
a collection submodule, configured to collect image samples under each image category and build a training set;
a training extraction submodule, configured to extract physical features of each image sample in the training set;
a training semantic annotation submodule, configured to perform semantic annotation on each image sample in the training set to obtain corresponding annotation words;
a training matching submodule, configured to match the annotation words of each image sample against semantic words in the semantic network;
a training generation submodule, configured to generate a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words, wherein the semantic network stores semantic words and a number of corresponding semantic characterizations, each semantic characterization being described by physical features; and
a classifier training submodule, configured to train on each image sample in the training set according to the feature vector composed of the physical features and the semantic characterization tuple of each image sample, to obtain the corresponding image classifier.
In another preferred embodiment of the present application, the generation module 804 may further comprise:
a query-and-record submodule, configured to, when a match succeeds, query the semantic network for the semantic characterizations corresponding to the current semantic word and record each such semantic characterization and its query count; and
a selection submodule, configured to select the N semantic characterizations with the highest query counts to form the semantic characterization tuple, where N is a natural number.
In yet another preferred embodiment of the present application, the semantic characterization tuple may comprise a color semantic tuple described by color, a texture semantic tuple described by texture, a shape semantic tuple described by shape, or a spatial semantic tuple described by spatial position.
In the embodiments of the present application, preferably, the arity of the color semantic tuple is three, i.e., a color semantic triple. More preferably, the primary hue represented by the first element of the color semantic triple may include red, orange, brown, yellow, green, blue, purple, pink, beige, magenta, olive, etc.; the secondary hue represented by the second element may include red, brown, yellow, green, blue, purple, pink, etc.; and the brightness represented by the third element may include black, dark, grey, bright, white, etc.
In the embodiments of the present application, preferably, the feature vector composed of the physical features and the semantic characterization tuple may comprise a feature vector composed of the physical features and a string-form semantic characterization tuple, or a feature vector composed of the physical features and a numeric-form semantic characterization tuple, wherein the numeric-form tuple is obtained by quantizing the string-form tuple.
For the device embodiment, since it is substantially similar to the method embodiment, the description is relatively brief; for the relevant parts, reference may be made to the description of the method embodiment.
The above image classification results can be used for image retrieval. Image retrieval is a common higher-level technology, and image classification can be regarded as the underlying technical support of image retrieval. Image retrieval may further comprise text-based image retrieval (TBIR) and content-based image retrieval (CBIR). In general, image retrieval returns a series of related images according to one input (usually a piece of text or an example image). Applied to an image retrieval system, this means that when a user submits a query request (an image or text), the system computes and returns a series of relevant result images to the user. In short, image retrieval helps users find the information they need among a large amount of image information.
Referring to Fig. 9, which shows a flowchart of an embodiment of an image retrieval method of the present application, the method may comprise:
Step 901: classify the images in an image database, where the classification process may comprise:
Sub-step 911: extract physical features of the image to be classified;
Sub-step 912: perform semantic annotation on the image to be classified to obtain corresponding annotation words;
Sub-step 913: for the annotation words of the image to be classified, match them against semantic words in the semantic network, and generate a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words; the semantic network stores semantic words and a number of corresponding semantic characterizations, each semantic characterization being described by physical features;
Sub-step 914: input a feature vector composed of the physical features of the image to be classified and the semantic characterization tuple into the image classifier and output the corresponding classification result; the image classifier is a classifier obtained by training on feature vectors composed of the physical features and semantic characterization tuples of image samples under each image category, and the physical features and the arity of the semantic characterization tuple are identical during training and classification.
Step 902: perform keyword-adding processing or content-feature-extraction processing on the classified images in the image database.
Step 903: when a text query request is received from a user, perform a keyword matching query in the image database according to the text query request, and return the matching images to the user.
Step 904: when an image query request is received from a user, perform a content-feature matching query in the image database according to the image query request, and return the matching images to the user.
The above keyword-adding processing is mainly based on the TBIR technique. It avoids analyzing the visual elements of the image and instead adds keywords to the images under each category in the image database based on aspects such as image name, image size, compression type, author, and date. In this way, when a text query request is received from a user, the keywords of the images under each category in the image database can be matched against the text query request, and the matching images are returned to the user. On the basis of accurate image classification results, the keyword-adding results for the images under each category in the image database are also accurate, which guarantees the accuracy of the image retrieval results.
The above content-feature-extraction processing is mainly based on the CBIR technique. It directly analyzes the visual elements of the image and usually extracts content features such as color, texture, and shape from the images under each category in the image database. In this way, when an image query request is received from a user (for example, a sample image submitted by the user), the same extraction method is used to extract the content features of the submitted sample image, and images whose content is consistent with or similar to that sample image are retrieved from the image database and returned to the user. On the basis of accurate image classification results, the content-feature-extraction results for the images under each category in the image database are also accurate, which guarantees the accuracy of the image retrieval results.
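As a minimal sketch of such a content-feature matching query, assuming Euclidean distance as the similarity measure and a dict mapping image identifiers to previously extracted feature vectors (both assumptions; the patent does not fix a distance measure or a storage layout):

```python
import numpy as np

def retrieve_similar(query_features, database, top_k=10):
    """Rank database images by Euclidean distance between content-feature vectors."""
    scored = []
    for image_id, features in database.items():
        dist = float(np.linalg.norm(np.asarray(query_features) - np.asarray(features)))
        scored.append((dist, image_id))
    return [image_id for _, image_id in sorted(scored)[:top_k]]

# Hypothetical usage: database holds 36-dimensional color features extracted with the earlier sketch.
# results = retrieve_similar(color_histogram_features("query.jpg"), database)
```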
In a word, the application's image search method can find own needed information for the user exactly from numerous image informations.
The application also provides a kind of image retrieving apparatus, with reference to Figure 10, specifically can comprise:
an image classification module 1001, configured to classify the images in an image database, and comprising:
an extraction submodule 1011, configured to extract the physical features of an image to be classified;
a semantic annotation submodule 1012, configured to perform semantic annotation on the image to be classified to obtain corresponding annotation words;
a matching submodule 1013, configured to match the annotation words of the image to be classified against the semantic words in a semantic network;
a generation submodule 1014, configured to generate a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words, the semantic network storing semantic words and several corresponding semantic characterizations, each described in terms of physical features; and
a classifier classification submodule 1015, configured to input the feature vector formed by the physical features and the semantic characterization tuple of the image to be classified into an image classifier and output the corresponding classification result, the image classifier being a classifier trained on feature vectors formed by the physical features and semantic characterization tuples of image samples under each image category, with the physical features and the arity of the semantic characterization tuple being the same in training and classification;
a keyword processing module 1002, configured to perform keyword annotation on the classified images in the image database;
a feature extraction processing module 1003, configured to extract content features from the classified images in the image database;
a text query processing module 1004, configured to, when a text query request is received from a user, perform a keyword matching query in the image database according to the text query request and return the matching images to the user; and
an image query processing module 1005, configured to, when an image query request is received from a user, perform a content-feature matching query in the image database according to the image query request and return the matching images to the user.
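A compact sketch of how modules 1001 to 1005 could be composed is given below; the class name and constructor arguments are assumed callables introduced only for illustration, since the patent defines the modules functionally rather than as a concrete API.

    class ImageRetrievalApparatus:
        """Illustrative composition mirroring modules 1001-1005."""
        def __init__(self, classify, add_keywords, extract_features,
                     keyword_query, content_query):
            self.classify = classify                   # image classification module 1001
            self.add_keywords = add_keywords           # keyword processing module 1002
            self.extract_features = extract_features   # feature extraction processing module 1003
            self.keyword_query = keyword_query         # text query processing module 1004
            self.content_query = content_query         # image query processing module 1005

        def build_index(self, database):
            # Classify the database images, then annotate keywords and
            # extract content features from the classified images.
            self.classify(database)
            self.add_keywords(database)
            self.extract_features(database)

        def handle_text_query(self, request):
            return self.keyword_query(request)

        def handle_image_query(self, request):
            return self.content_query(request)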
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments may be referred to one another.
The image classification method and apparatus and the image retrieval method and apparatus provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is only intended to help understand the method of the present application and its core idea. Meanwhile, for a person of ordinary skill in the art, changes may be made to the specific implementations and the scope of application in accordance with the idea of the present application. In summary, the contents of this specification should not be construed as limiting the present application.

Claims (10)

1. An image classification method, characterized by comprising:
extracting physical features of an image to be classified;
performing semantic annotation on the image to be classified to obtain corresponding annotation words;
for the annotation words of the image to be classified, matching them against semantic words in a semantic network, and generating a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words, wherein the semantic network stores semantic words and several corresponding semantic characterizations, each described in terms of physical features; and
inputting a feature vector formed by the physical features and the semantic characterization tuple of the image to be classified into an image classifier and outputting a corresponding classification result, wherein the image classifier is a classifier obtained by training on feature vectors formed by the physical features and semantic characterization tuples of image samples under each image category, and the physical features and the arity of the semantic characterization tuple are the same in training and classification.
2. The method of claim 1, characterized in that the image classifier is obtained by the following steps:
collecting image samples under each image category to build a training set;
extracting physical features of each image sample in the training set;
performing semantic annotation on each image sample in the training set to obtain corresponding annotation words;
for the annotation words of each image sample, matching them against semantic words in a semantic network, and generating a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words, wherein the semantic network stores semantic words and several corresponding semantic characterizations, each described in terms of physical features; and
training on the feature vectors formed by the physical features and semantic characterization tuples of the image samples in the training set to obtain the corresponding image classifier.
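As a rough illustration of the training steps of claim 2, the sketch below fits a nearest-centroid classifier on per-category feature vectors. The claim does not prescribe a particular learning algorithm, so the centroid rule, the dictionary layout and the example values are assumptions made only for this sketch.

    def train_classifier(training_set):
        """training_set: {category: [numeric feature vector, ...]}, where each
        vector already concatenates the physical features and the (quantized)
        semantic characterization tuple with the same arity for every sample."""
        centroids = {}
        for category, vectors in training_set.items():
            dim = len(vectors[0])
            centroids[category] = [sum(v[i] for v in vectors) / len(vectors)
                                   for i in range(dim)]

        def classify(vector):
            # Assign the category whose centroid is closest to the input vector.
            return min(centroids,
                       key=lambda c: sum((a - b) ** 2
                                         for a, b in zip(centroids[c], vector)))
        return classify

    # Example with made-up two-dimensional feature vectors.
    clf = train_classifier({"beach":  [[0.9, 0.1], [0.8, 0.2]],
                            "forest": [[0.1, 0.9], [0.2, 0.8]]})
    print(clf([0.85, 0.15]))  # -> 'beach'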
3. The method of claim 1 or 2, characterized in that the step of generating a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words comprises:
when a match succeeds, looking up the semantic characterizations corresponding to the current semantic word in the semantic network, and recording each such semantic characterization and its lookup count; and
selecting the N semantic characterizations with the highest lookup counts to form the semantic characterization tuple, N being a natural number.
4. The method of claim 1 or 2, characterized in that the semantic characterization tuple comprises a color semantic tuple described by color, a texture semantic tuple described by texture, a shape semantic tuple described by shape, or a spatial semantic tuple described by spatial position.
5. The method of claim 4, characterized in that the arity of the color semantic tuple is three, i.e. the color semantic tuple is a color semantic triple.
6. The method of claim 5, characterized in that the dominant hue represented by the first element of the color semantic triple comprises red, orange, brown, yellow, green, blue, purple, pink, cream, magenta and olive green; the secondary hue represented by the second element of the color semantic triple comprises red, brown, yellow, green, blue, purple and pink; and the brightness represented by the third element of the color semantic triple comprises black, dark, grey, bright and white.
7. The method of claim 1 or 2, characterized in that the feature vector formed by the physical features and the semantic characterization tuple comprises a feature vector formed by the physical features and a semantic characterization tuple in string form, or a feature vector formed by the physical features and a semantic characterization tuple in numeric form, wherein the semantic characterization tuple in numeric form is obtained by quantizing the semantic characterization tuple in string form.
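The following sketch illustrates the color semantic triple of claim 6 together with one possible quantization into numeric form as mentioned in claim 7. The value sets follow claim 6, while the index-based coding is an assumption, since the claims do not fix a concrete quantization scheme.

    # Value sets of the color semantic triple as listed in claim 6.
    DOMINANT_HUE = ["red", "orange", "brown", "yellow", "green", "blue", "purple",
                    "pink", "cream", "magenta", "olive green"]
    SECONDARY_HUE = ["red", "brown", "yellow", "green", "blue", "purple", "pink"]
    BRIGHTNESS = ["black", "dark", "grey", "bright", "white"]

    def quantize_color_triple(triple):
        """Map a string-form triple (dominant hue, secondary hue, brightness)
        to numeric form by replacing each element with its index in its value set."""
        dominant, secondary, brightness = triple
        return [DOMINANT_HUE.index(dominant),
                SECONDARY_HUE.index(secondary),
                BRIGHTNESS.index(brightness)]

    print(quantize_color_triple(("blue", "green", "bright")))  # -> [5, 3, 3]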
8. An image classification apparatus, characterized by comprising:
an extraction module, configured to extract physical features of an image to be classified;
a semantic annotation module, configured to perform semantic annotation on the image to be classified to obtain corresponding annotation words;
a matching module, configured to match the annotation words of the image to be classified against semantic words in a semantic network;
a generation module, configured to generate a semantic characterization tuple from the semantic characterizations corresponding to the successfully matched semantic words, the semantic network storing semantic words and several corresponding semantic characterizations, each described in terms of physical features; and
a classifier classification module, configured to input a feature vector formed by the physical features and the semantic characterization tuple of the image to be classified into an image classifier and output a corresponding classification result, the image classifier being a classifier obtained by training on feature vectors formed by the physical features and semantic characterization tuples of image samples under each image category, with the physical features and the arity of the semantic characterization tuple being the same in training and classification.
9. An image retrieval method, characterized by comprising:
classifying images in an image database using the method of any one of claims 1 to 7, and performing keyword annotation or content-feature extraction on the classified images in the image database;
when a text query request is received from a user, performing a keyword matching query in the image database according to the text query request, and returning the matching images to the user; and
when an image query request is received from a user, performing a content-feature matching query in the image database according to the image query request, and returning the matching images to the user.
10. An image retrieval apparatus, characterized by comprising:
an image classification module, configured to classify images in an image database using the apparatus of claim 8;
a keyword processing module, configured to perform keyword annotation on the classified images in the image database;
a feature extraction processing module, configured to extract content features from the classified images in the image database;
a text query processing module, configured to, when a text query request is received from a user, perform a keyword matching query in the image database according to the text query request and return the matching images to the user; and
an image query processing module, configured to, when an image query request is received from a user, perform a content-feature matching query in the image database according to the image query request and return the matching images to the user.
CN2011104448649A 2011-12-27 2011-12-27 Image classification method, image classification device, image retrieval method and image retrieval device Pending CN103186538A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011104448649A CN103186538A (en) 2011-12-27 2011-12-27 Image classification method, image classification device, image retrieval method and image retrieval device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011104448649A CN103186538A (en) 2011-12-27 2011-12-27 Image classification method, image classification device, image retrieval method and image retrieval device

Publications (1)

Publication Number Publication Date
CN103186538A true CN103186538A (en) 2013-07-03

Family

ID=48677709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011104448649A Pending CN103186538A (en) 2011-12-27 2011-12-27 Image classification method, image classification device, image retrieval method and image retrieval device

Country Status (1)

Country Link
CN (1) CN103186538A (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103366175A (en) * 2013-07-14 2013-10-23 西安电子科技大学 Natural image classification method based on potential Dirichlet distribution
CN103927372A (en) * 2014-04-24 2014-07-16 厦门美图之家科技有限公司 Image processing method based on user semanteme
CN103955462A (en) * 2014-03-21 2014-07-30 南京邮电大学 Image marking method based on multi-view and semi-supervised learning mechanism
CN104021228A (en) * 2014-06-25 2014-09-03 厦门大学 Method for matching trademark image fused with object semantic
CN104424257A (en) * 2013-08-28 2015-03-18 北大方正集团有限公司 Information indexing unit and information indexing method
CN104899744A (en) * 2015-06-29 2015-09-09 汪新淮 Coin and postage stamp collection trading system and coin and postage stamp collection trading method
CN104036550B (en) * 2014-06-25 2017-02-15 北京师范大学 Laser radar point-cloud interpretation and reconstruction method for building elevations on basis of shape semantics
WO2017063126A1 (en) * 2015-10-12 2017-04-20 Xiaogang Wang Method and system for clustering objects labeled with attributes
CN107391703A (en) * 2017-07-28 2017-11-24 北京理工大学 The method for building up and system of image library, image library and image classification method
CN107491766A (en) * 2017-08-31 2017-12-19 四川长虹电器股份有限公司 Photo classification method based on image recognition
WO2018048356A1 (en) * 2016-09-08 2018-03-15 Aiq Pte. Ltd. Video ingestion framework for visual search platform
CN108055589A (en) * 2017-12-20 2018-05-18 聚好看科技股份有限公司 Smart television
CN108205684A (en) * 2017-04-25 2018-06-26 北京市商汤科技开发有限公司 Image disambiguation method, device, storage medium and electronic equipment
CN109299306A (en) * 2018-12-14 2019-02-01 央视国际网络无锡有限公司 Image search method and device
CN110232131A (en) * 2019-04-26 2019-09-13 特赞(上海)信息科技有限公司 Intention material searching method and device based on intention label
CN110363251A (en) * 2019-07-23 2019-10-22 杭州嘉云数据科技有限公司 A kind of SKU image classification method, device, electronic equipment and storage medium
US11039196B2 (en) 2018-09-27 2021-06-15 Hisense Visual Technology Co., Ltd. Method and device for displaying a screen shot
US11102441B2 (en) 2017-12-20 2021-08-24 Hisense Visual Technology Co., Ltd. Smart television and method for displaying graphical user interface of television screen shot

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020081024A1 (en) * 2000-12-26 2002-06-27 Sung Hee Park Apparatus and method for retrieving color and shape of image based on natural language
CN101751447A (en) * 2009-07-22 2010-06-23 中国科学院自动化研究所 Network image retrieval method based on semantic analysis
CN102262642A (en) * 2011-01-28 2011-11-30 北京理工大学 Web image search engine and realizing method thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
贾宇 (Jia Yu) et al., "Color semantic feature description extraction and its application in image classification" (颜色语义特征描述提取及其在图像分类中的应用), Journal of Image and Graphics (中国图象图形学报), vol. 16, no. 10, 31 October 2011 (2011-10-31), pages 1866-1874 *

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103366175A (en) * 2013-07-14 2013-10-23 西安电子科技大学 Natural image classification method based on potential Dirichlet distribution
CN104424257A (en) * 2013-08-28 2015-03-18 北大方正集团有限公司 Information indexing unit and information indexing method
CN103955462B (en) * 2014-03-21 2017-03-15 南京邮电大学 A kind of based on multi views and the image labeling method of semi-supervised learning mechanism
CN103955462A (en) * 2014-03-21 2014-07-30 南京邮电大学 Image marking method based on multi-view and semi-supervised learning mechanism
CN103927372A (en) * 2014-04-24 2014-07-16 厦门美图之家科技有限公司 Image processing method based on user semanteme
CN104036550B (en) * 2014-06-25 2017-02-15 北京师范大学 Laser radar point-cloud interpretation and reconstruction method for building elevations on basis of shape semantics
CN104021228B (en) * 2014-06-25 2017-05-24 厦门大学 Method for matching trademark image fused with object semantic
CN104021228A (en) * 2014-06-25 2014-09-03 厦门大学 Method for matching trademark image fused with object semantic
CN104899744B (en) * 2015-06-29 2018-03-16 汪新淮 Coin deposit of stamp product transaction system and method for commerce
CN104899744A (en) * 2015-06-29 2015-09-09 汪新淮 Coin and postage stamp collection trading system and coin and postage stamp collection trading method
WO2017063126A1 (en) * 2015-10-12 2017-04-20 Xiaogang Wang Method and system for clustering objects labeled with attributes
CN108351971B (en) * 2015-10-12 2022-04-22 北京市商汤科技开发有限公司 Method and system for clustering objects marked with attributes
CN108351971A (en) * 2015-10-12 2018-07-31 北京市商汤科技开发有限公司 The method and system that the object that label has is clustered
KR20190051006A (en) * 2016-09-08 2019-05-14 에이아이큐 피티이. 엘티디. Video ingest framework for visual search platform
US11042753B2 (en) 2016-09-08 2021-06-22 Goh Soo Siah Video ingestion framework for visual search platform
WO2018048356A1 (en) * 2016-09-08 2018-03-15 Aiq Pte. Ltd. Video ingestion framework for visual search platform
KR102533972B1 (en) 2016-09-08 2023-05-17 고 수 시아 Video Ingest Framework for Visual Search Platform
RU2720536C1 (en) * 2016-09-08 2020-04-30 Гох Су Сиах Video reception framework for visual search platform
CN108205684A (en) * 2017-04-25 2018-06-26 北京市商汤科技开发有限公司 Image disambiguation method, device, storage medium and electronic equipment
CN108205684B (en) * 2017-04-25 2022-02-11 北京市商汤科技开发有限公司 Image disambiguation method, device, storage medium and electronic equipment
US11144800B2 (en) 2017-04-25 2021-10-12 Beijing Sensetime Technology Development Co., Ltd. Image disambiguation method and apparatus, storage medium, and electronic device
CN107391703A (en) * 2017-07-28 2017-11-24 北京理工大学 The method for building up and system of image library, image library and image classification method
CN107391703B (en) * 2017-07-28 2019-11-15 北京理工大学 The method for building up and system of image library, image library and image classification method
CN107491766A (en) * 2017-08-31 2017-12-19 四川长虹电器股份有限公司 Photo classification method based on image recognition
WO2019119800A1 (en) * 2017-12-20 2019-06-27 聚好看科技股份有限公司 Method for processing television screenshot, smart television, and storage medium
US11102441B2 (en) 2017-12-20 2021-08-24 Hisense Visual Technology Co., Ltd. Smart television and method for displaying graphical user interface of television screen shot
CN108055589A (en) * 2017-12-20 2018-05-18 聚好看科技股份有限公司 Smart television
US11558578B2 (en) 2017-12-20 2023-01-17 Hisense Visual Technology Co., Ltd. Smart television and method for displaying graphical user interface of television screen shot
US11601719B2 (en) 2017-12-20 2023-03-07 Juhaokan Technology Co., Ltd. Method for processing television screenshot, smart television, and storage medium
US11812189B2 (en) 2017-12-20 2023-11-07 Hisense Visual Technology Co., Ltd. Smart television and method for displaying graphical user interface of television screen shot
US11039196B2 (en) 2018-09-27 2021-06-15 Hisense Visual Technology Co., Ltd. Method and device for displaying a screen shot
US11812188B2 (en) 2018-09-27 2023-11-07 Hisense Visual Technology Co., Ltd. Method and device for displaying a screen shot
CN109299306B (en) * 2018-12-14 2021-09-07 央视国际网络无锡有限公司 Image retrieval method and device
CN109299306A (en) * 2018-12-14 2019-02-01 央视国际网络无锡有限公司 Image search method and device
CN110232131B (en) * 2019-04-26 2021-04-27 特赞(上海)信息科技有限公司 Creative material searching method and device based on creative tag
CN110232131A (en) * 2019-04-26 2019-09-13 特赞(上海)信息科技有限公司 Intention material searching method and device based on intention label
CN110363251A (en) * 2019-07-23 2019-10-22 杭州嘉云数据科技有限公司 A kind of SKU image classification method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN103186538A (en) Image classification method, image classification device, image retrieval method and image retrieval device
CN102819582B (en) Quick searching method for mass images
CN108595636A (en) The image search method of cartographical sketching based on depth cross-module state correlation study
CN102073748B (en) Visual keyword based remote sensing image semantic searching method
CN103336795B (en) Video index method based on multiple features
CN107506793B (en) Garment identification method and system based on weakly labeled image
CN101853295B (en) Image search method
CN106126581A (en) Cartographical sketching image search method based on degree of depth study
CN102385592B (en) Image concept detection method and device
CN106682233A (en) Method for Hash image retrieval based on deep learning and local feature fusion
Bui et al. Scalable sketch-based image retrieval using color gradient features
CN106126585B (en) The unmanned plane image search method combined based on quality grading with perceived hash characteristics
CN103377376A (en) Method and system for image classification, and method and system for image retrieval
CN104281572B (en) A kind of target matching method and its system based on mutual information
CN104376105A (en) Feature fusing system and method for low-level visual features and text description information of images in social media
CN105824862A (en) Image classification method based on electronic equipment and electronic equipment
CN107291825A (en) With the search method and system of money commodity in a kind of video
CN109213853A (en) A kind of Chinese community's question and answer cross-module state search method based on CCA algorithm
CN103778206A (en) Method for providing network service resources
Wang et al. Remote-sensing image retrieval by combining image visual and semantic features
CN109993187A (en) A kind of modeling method, robot and the storage device of object category for identification
CN107506362A (en) Image classification based on customer group optimization imitates brain storage method
CN110992217B (en) Method and device for expressing and searching multi-view features of design patent
CN110287369A (en) A kind of semantic-based video retrieval method and system
CN106611016B (en) A kind of image search method based on decomposable word packet model

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1182797

Country of ref document: HK

RJ01 Rejection of invention patent application after publication

Application publication date: 20130703

RJ01 Rejection of invention patent application after publication