CN102385592B - Image concept detection method and device - Google Patents

Publication number: CN102385592B (granted); application CN201010271693.XA; earlier publication CN102385592A
Authority: CN (China)
Original language: Chinese (zh)
Key terms: concept, word list, sub-word list, local feature, training data
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Inventors: 冯明, 梁笃国, 张艳霞, 曹宁, 邓涛
Assignee (original and current): China Telecom Corp Ltd (the listed assignees may be inaccurate)
Application filed by China Telecom Corp Ltd; priority to CN201010271693.XA; publication of CN102385592A, followed by grant and publication of CN102385592B
Classification: Image Analysis
Abstract

The invention discloses an image concept detection method and device. The method comprises: acquiring the local features of the data to be detected and of the training data of multiple concepts; clustering word lists of different lengths according to different quantization strategies and computing histograms of the local features of the data to be detected and of each concept's training data; training binary support vector machine classifiers, computing the average detection precision of the local features of each concept's training data, and training a classification model for each concept; selecting each concept's best sub-word list by cross-validation and taking the trained classification model corresponding to that best sub-word list as the concept's final concept detection classifier; and inputting the histograms of the local features of the data to be detected, counted on each concept's best sub-word list, into each concept's final concept detection classifier to determine the probability that each concept appears in the data to be detected.

Description

Image concept detection method and device
Technical field
The present invention relates to the field of multimedia information detection, and more particularly to an image concept detection method and device.
Background technology
In recent years, the rapid growth of video and image resources on the network has produced massive digital image collections, and helping users quickly locate useful resources among them has become a hot topic for many research units. One of the key technologies for solving this problem is an effective image search method. Since the early 1990s, content-based image retrieval (CBIR) has gradually attracted attention. CBIR describes an image by low-level features such as color, shape, texture and region, uses these descriptions as the image index, computes the similarity distance between the query image and target images, retrieves by similarity matching, and returns the group of images in the library whose content descriptions best satisfy the query.
However, because the similarity of low-level visual features is not fully equivalent to the similarity a person judges subjectively, users usually formulate their requests at the concept level when retrieving images, and judge subjectively whether the returned images meet their needs. Therefore, to realize a natural-language retrieval mode closer to the user's understanding, semantics-based image retrieval has become the development direction of the image search field. Concept detection is the key link of semantics-based image retrieval, and progress in concept detection can to a great extent improve semantics-based retrieval results.
Concept detection is a typical pattern recognition technique, and feature extraction is a very important link within it. Because high-level semantic information cannot be obtained directly from the visual features of an image, the effectiveness of the features extracted in this step directly affects the classifier and thus the performance of the whole recognition process. The features most worth extracting are those that are clearly discriminative, easy to extract, and insensitive to noise.
In recent years, many research units at home and abroad have studied feature extraction extensively; image features can roughly be divided into global features and local features. Global features are mostly color, texture, shape and region features extracted from raw pixel values. They can express the most essential characteristics of an image, but they also have significant limitations. For example, color features are strongly affected by image brightness and chromaticity, so images of the same content but different chromaticity or brightness differ greatly in color features; texture and shape features perform poorly when recognizing images with translation, rotation, and scale changes. These problems all reflect the limitations of global features.
To address these problems, David G. Lowe in 2004 summarized existing invariant-based feature detection methods and formally proposed the Scale-Invariant Feature Transform (SIFT), a local feature operator based on scale space that remains invariant to image scaling, rotation, and even affine transformation. In recent years, many institutions have studied how to use the SIFT operator for concept detection; the bag-of-words model proposed by Li Fei-Fei for processing SIFT features has shown good concept recognition performance and has been very widely applied.
However, the above methods are too rigid in their choice of the bag-of-words vocabulary: all concepts use word lists of the same length, which makes concept detection inefficient and places very high demands on computing power.
Summary of the invention
The technical problem to be solved by the present invention is to provide an image concept detection method that improves detection efficiency while guaranteeing the detection effect.
The invention provides an image concept detection method, comprising: using the SIFT algorithm to obtain the local features of the data to be detected and of the training data of multiple concepts; according to different quantization strategies, using K-means clustering on the local features of each concept's training data to cluster word lists of different lengths for each concept, merging the word lists of the multiple concepts into a bag-of-words model {B_1, B_2, ..., B_i, ..., B_N}, and computing histograms of the local features of the data to be detected and of each concept's training data, where a histogram counts how many times the local features occur in every sub-word list of each word list B_i of the bag-of-words model, B_i is the word list of the multiple concepts under the i-th quantization strategy and comprises multiple sub-word lists, one per concept, each sub-word list being a concept's word list under the i-th quantization strategy, the length of each sub-word list is determined by the quantization strategy, the quantization strategy is determined by the value of K, 1 ≤ i ≤ N, N ≥ 2, K > 1; dividing the local features of each concept's training data into a training set and a validation set, training binary support vector machine classifiers using the training set, the validation set, the concept annotation of the training data, and the histograms of the local features of each concept's training data, and using the validation set to compute on the classifiers the average detection precision of the training-data local features of the concept corresponding to every sub-word list of every word list B_i, and to train the classification model of the concept corresponding to every sub-word list of every B_i; cross-validating the computed average detection precisions so as to select, within {B_1, B_2, ..., B_i, ..., B_N}, the sub-word list with the maximum average detection precision as each concept's best sub-word list, taking the trained classification model corresponding to each concept's best sub-word list as that concept's final concept detection classifier, and merging the final classifiers of the multiple concepts into the best concept detection classifier; and inputting the histograms of the local features of the data to be detected, counted on each concept's best sub-word list, into the best concept detection classifier to determine the probability of each concept appearing in the data to be detected.
According to one embodiment of the method, before the step of clustering word lists of different lengths for each concept by K-means according to different quantization strategies, the method further comprises adding a background class to the multiple concepts.
According to another embodiment of the method, 20 ≤ K ≤ 200.
According to a further embodiment of the method, the method further comprises selecting the training data of each concept, including annotation information, according to a sampling strategy.
According to yet another embodiment of the method, the sampling strategy is

    n_i = N_i,                if N_i ≤ 100
    n_i = a_i × (N_i − 100),  if N_i > 100

where N_i is the positive-sample count of the i-th concept before sampling, n_i is the amount of training data of the i-th concept after sampling, and a_i is a sampling-strategy parameter between 0 and 1.
The image concept detection method of the present invention adaptively selects the best sub-word list for each concept through cross-validation: a concept with fewer local features uses a shorter word list, which both achieves a good detection effect and improves detection efficiency, while a concept rich in local features is given a word list long enough to guarantee the detection effect. By selecting word lists of different lengths for different concepts, the invention improves detection efficiency while guaranteeing the detection effect.
Another technical problem to be solved by the present invention is to provide an image concept detection device that improves detection efficiency while guaranteeing the detection effect.
The invention provides an image concept detection device, comprising: a local feature extraction module for using the SIFT algorithm to obtain the local features of the data to be detected and of the training data of multiple concepts; a clustering module, connected to the local feature extraction module, for clustering, according to different quantization strategies, word lists of different lengths for each concept by applying K-means to the local features of each concept's training data, merging the word lists of the multiple concepts into a bag-of-words model {B_1, B_2, ..., B_i, ..., B_N}, and computing histograms of the local features of the data to be detected and of each concept's training data, where a histogram counts how many times the local features occur in every sub-word list of each word list B_i, B_i is the word list of the multiple concepts under the i-th quantization strategy and comprises multiple sub-word lists, one per concept, the length of each sub-word list is determined by the quantization strategy, the quantization strategy is determined by the value of K, 1 ≤ i ≤ N, N ≥ 2, K > 1; a classification model training module, connected to the local feature extraction module and the clustering module, for dividing the local features of each concept's training data into a training set and a validation set, training binary support vector machine classifiers using the training set, the validation set, the concept annotation of the training data, and the histograms of the local features of each concept's training data, and using the validation set to compute on the classifiers the average detection precision of, and to train the classification model of, the concept corresponding to every sub-word list of every word list B_i; a cross-validation module, connected to the clustering module and the classification model training module, for cross-validating the computed average detection precisions so as to select, within {B_1, B_2, ..., B_i, ..., B_N}, the sub-word list with the maximum average detection precision as each concept's best sub-word list, taking the trained classification model corresponding to each concept's best sub-word list as that concept's final concept detection classifier, and merging the final classifiers of the multiple concepts into the best concept detection classifier; and a concept detection module, connected to the clustering module and the cross-validation module, for inputting the histograms of the local features of the data to be detected, counted on each concept's best sub-word list, into the best concept detection classifier to determine the probability of each concept appearing in the data to be detected.
According to one embodiment of the device, the device further comprises a class addition module, connected to the clustering module, for adding a background class to the multiple concepts.
According to another embodiment of the device, 20 ≤ K ≤ 200.
According to a further embodiment of the device, the device further comprises a sampling module, connected to the local feature extraction module, for selecting the training data of each concept, including annotation information, according to a sampling strategy.
According to yet another embodiment of the device, the sampling strategy is

    n_i = N_i,                if N_i ≤ 100
    n_i = a_i × (N_i − 100),  if N_i > 100

where N_i is the positive-sample count of the i-th concept before sampling, n_i is the amount of training data of the i-th concept after sampling, and a_i is a sampling-strategy parameter between 0 and 1.
The image concept detection device of the present invention adaptively selects the best sub-word list for each concept through cross-validation: a concept with fewer local features uses a shorter word list, which both achieves a good detection effect and improves detection efficiency, while a concept rich in local features is given a word list long enough to guarantee the detection effect. By selecting word lists of different lengths for different concepts, the invention improves detection efficiency while guaranteeing the detection effect.
Brief description of the drawings
The drawings described herein are provided for a further understanding of the present invention and form part of this application. In the drawings:
Fig. 1 is a flowchart of the first embodiment of the method of the present invention.
Fig. 2 is a flowchart of the second embodiment of the method of the present invention.
Fig. 3 is a flowchart of the third embodiment of the method of the present invention.
Fig. 4 is a structural diagram of the first embodiment of the device of the present invention.
Fig. 5 is a structural diagram of the second embodiment of the device of the present invention.
Fig. 6 is a structural diagram of the third embodiment of the device of the present invention.
Fig. 7 is a structural diagram of the fourth embodiment of the device of the present invention.
Detailed description
The present invention is described more fully below with reference to the drawings, in which exemplary embodiments are shown. The exemplary embodiments and their description are used to explain the invention and do not unduly limit it.
The object of the invention is to propose a concept detection method and device based on a bag-of-words model with best sub-word lists, which overcomes the low concept detection efficiency of the prior art. For each concept, a word list of appropriate length (i.e., the best sub-word list) is selected through cross-validation; a classification model is learned from each concept's best sub-word list by a binary support vector machine; the classification models of the multiple concepts are merged into the best concept detection classifier, which is then used to detect concepts in the images to be detected. The invention outperforms the traditional bag-of-words model and has obtained good experimental results.
Fig. 1 is a flowchart of the first embodiment of the method of the present invention.
As shown in Fig. 1, this embodiment may comprise the following steps:
S102: use the SIFT algorithm to obtain the local features of the data to be detected and of the training data of multiple concepts.
For example, for each image (i.e., data to be detected or training data): first find local extrema among three adjacent scales of the difference-of-Gaussians scale space, where a point is an extremum among the 26 neighboring points in its own scale and the two adjacent scales; then fit a three-dimensional quadratic function to determine the position and scale of each local extremum accurately; next, use the gradient orientation distribution of the neighborhood of each local extremum to assign it an orientation parameter, so that the operator is rotation invariant — a local extremum representing specific information in the image can be called an interest point; finally, extract a 128-dimensional feature vector from an 8 × 8 window centered on each interest point. The interest points of an image are a subset of its local extrema that carry orientation information.
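As a hedged illustration of one part of this step, the 26-neighbour extremum test in the difference-of-Gaussians scale space could be sketched in NumPy as follows (the function name and the array layout `dog[scale, y, x]` are assumptions made for this sketch, not from the patent):

```python
import numpy as np

def is_scale_space_extremum(dog: np.ndarray, s: int, y: int, x: int) -> bool:
    """Check whether dog[s, y, x] is an extremum among its 26 neighbours
    in its own scale and the two adjacent difference-of-Gaussians scales."""
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]  # 3x3x3 neighbourhood, 27 values
    centre = dog[s, y, x]
    others = np.delete(cube.ravel(), 13)  # drop the centre, keep the 26 neighbours
    return bool(np.all(centre > others) or np.all(centre < others))
```

A full SIFT pipeline would then refine each surviving extremum by quadratic fitting, assign its orientation, and extract the 128-dimensional descriptor, as the step above describes.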
S104: according to different quantization strategies, use K-means clustering on the local features of each concept's training data to cluster word lists of different lengths for each concept, merge the word lists of the multiple concepts into a bag-of-words model {B_1, B_2, ..., B_i, ..., B_N}, and compute histograms of the local features of the data to be detected and of each concept's training data, where a histogram counts how many times the local features occur in every sub-word list of each word list B_i; B_i is the word list of the multiple concepts under the i-th quantization strategy and comprises multiple sub-word lists, one per concept; the length of each sub-word list is determined by the quantization strategy, the quantization strategy is determined by the value of K, 1 ≤ i ≤ N, N ≥ 2, K > 1.
To illustrate: the local features of all training images of each concept (i.e., the interest-point feature vectors obtained by the SIFT algorithm) are clustered by K-means. Different quantization strategies (i.e., different values of K during clustering) yield different numbers of clusters; each cluster can be regarded as one word of a word list, so different numbers of clusters mean word lists of different lengths. The word lists of different lengths of all concepts are then merged together to form the bag-of-words model, i.e., multiple word lists of different lengths are built and denoted {B_1, B_2, ..., B_i, ..., B_N}. Here a different quantization strategy means a different value of K; K is an integer, usually K > 1, and preferably 20 ≤ K ≤ 200. For example, when K = 20, K-means yields word list B_1, containing M sub-word lists of length 20, one per concept; when K = 30, it yields word list B_2 with sub-word lists of length 30; and so on, until K = 200 yields word list B_N with sub-word lists of length 200. The word lists of different lengths of the M concepts are merged into the bag-of-words model {B_1, B_2, ..., B_i, ..., B_N}. Then, for the SIFT local features extracted from every image in the image set (both test and training images), a histogram is computed over every sub-word list of every word list, i.e., for each image, how many times its local features occur in every sub-word list of every B_i. In other words, the histogram records the frequency with which the image's interest points fall on each sub-word list, and it serves as the image's feature vector input to the concept detection classifier.
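The clustering and histogram computation above can be sketched minimally in NumPy; this is an illustrative implementation under stated assumptions (a plain Lloyd-style k-means and Euclidean nearest-word assignment; the function names and the normalisation of the histogram are choices made here, not specified by the patent):

```python
import numpy as np

def kmeans(features: np.ndarray, k: int, iters: int = 20, seed: int = 0) -> np.ndarray:
    """Plain k-means: returns a (k, d) array of cluster centres (the 'words')."""
    rng = np.random.default_rng(seed)
    centres = features[rng.choice(len(features), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign every local feature to its nearest centre
        d = np.linalg.norm(features[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # skip empty clusters
                centres[j] = features[labels == j].mean(axis=0)
    return centres

def bow_histogram(features: np.ndarray, vocab: np.ndarray) -> np.ndarray:
    """Count how many of an image's local features fall on each word of a
    (sub-)word list, normalised to occurrence frequencies."""
    d = np.linalg.norm(features[:, None, :] - vocab[None, :, :], axis=2)
    hist = np.bincount(d.argmin(axis=1), minlength=len(vocab)).astype(float)
    return hist / max(hist.sum(), 1.0)
```

Running `kmeans` once per concept and per K value, and concatenating the resulting vocabularies, would correspond to building one word list B_i per quantization strategy.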
S106: divide the local features of each concept's training data into a training set and a validation set, train binary support vector machine classifiers using the training set, the validation set, the concept annotation of the training data, and the histograms of the local features of each concept's training data, and use the validation set to compute on the classifiers the average detection precision of, and to train the classification model of, the concept corresponding to every sub-word list of every word list B_i.
For example, a binary support vector machine can be selected as the basic classifier, and the concept detection classifiers are trained on the training image library by machine learning and pattern recognition techniques. A binary support vector machine is an algorithm that optimizes the margin between training samples on a decision hyperplane (i.e., it maps the vectors into a higher-dimensional space and builds a maximum-margin hyperplane there, with two parallel hyperplanes on either side of the separating hyperplane, thereby realizing a learning algorithm for data classification).
Specifically, the interest-point set of the training image library (i.e., the training data) after SIFT processing can be divided into a training set and a validation set. Choose word list B_1 (suppose B_1 comprises M sub-word lists B_11, B_12, ..., B_1M). Using all images annotated with concept C_1 and the histograms of their SIFT local features counted on each sub-word list of B_1, train a binary SVM classifier on the training set, adjust its parameters, and test it on the validation set. The kernel parameters are tuned (a radial basis function (RBF) kernel is generally adopted, whose parameters C and δ are selected by cross-validation on the validation data) to determine the parameters for which the classifier for the sub-word list of B_1 corresponding to concept C_1 is optimal, i.e., performs best on the validation set with the highest average detection precision computed there. This yields the classification model of concept C_1 corresponding to sub-word list B_11 of B_1. In the same way, the classification models of the other concepts (C_2, ..., C_M) corresponding to sub-word lists B_12, ..., B_1M of B_1 can be trained. Then change the word list and repeat the above steps, training with the histograms counted under the new word list B_i as feature vectors, to obtain the classification models of concepts C_1, C_2, ..., C_M corresponding to sub-word lists B_i1, B_i2, ..., B_iM of B_i, and the average detection precision of each concept's training-data local features on those sub-word lists.
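As a hedged sketch of the per-(concept, sub-word-list) training loop, the RBF-kernel SVM and the parameter search on the validation set could look as follows using scikit-learn (the library choice, the parameter grids, and the use of average precision as the validation score are assumptions of this sketch; the patent only specifies an RBF kernel with parameters C and δ tuned on the validation set):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import average_precision_score

def train_concept_svm(train_X, train_y, val_X, val_y):
    """Fit a binary RBF-kernel SVM for one (concept, sub-word-list) pair,
    picking C and gamma by average precision on the validation set."""
    best_clf, best_ap = None, -1.0
    for C in (0.1, 1.0, 10.0):
        for gamma in (0.01, 0.1, 1.0):
            clf = SVC(C=C, gamma=gamma, kernel="rbf",
                      probability=True, random_state=0)
            clf.fit(train_X, train_y)
            ap = average_precision_score(val_y, clf.predict_proba(val_X)[:, 1])
            if ap > best_ap:
                best_clf, best_ap = clf, ap
    return best_clf, best_ap  # model and its validation average precision
```

Repeating this for every sub-word list of every word list B_i fills the average-precision table that step S108 cross-validates.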
S108: cross-validate the computed average detection precisions of the training-data local features of the concept corresponding to every sub-word list of every word list B_i, so as to select, within the bag-of-words model {B_1, B_2, ..., B_i, ..., B_N}, the sub-word list with the maximum average detection precision as each concept's best sub-word list; take the trained classification model corresponding to each concept's best sub-word list as that concept's final concept detection classifier, and merge the final classifiers of the multiple concepts into the best concept detection classifier. Optionally, the best sub-word lists of the multiple concepts can also be merged into a best bag-of-words model.
For example, step S106 yields a performance table of the different concepts under the different sub-word lists (e.g., a table of average detection precisions). By cross-validation (i.e., comparing the multiple average detection precisions of the same concept under different sub-word lists), the best-performing sub-word list in this table is chosen as each concept's best sub-word list, and the classification model learned by the binary SVM from each concept's best sub-word list becomes the concept's final concept detection classifier. The M final concept detection classifiers are merged into the best concept detection classifier, which is used to detect which concepts a test image contains.
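The selection over the performance table reduces to an argmax per concept; the following sketch illustrates it with a made-up table (the concept names and all average-precision values are invented for illustration only):

```python
# Hypothetical average-precision table from step S106:
# ap_table[concept][K] = validation average precision under sub-word-list length K.
ap_table = {
    "night": {20: 0.41, 50: 0.47, 100: 0.44, 200: 0.40},
    "beach": {20: 0.35, 50: 0.52, 100: 0.58, 200: 0.57},
}

def select_best_sub_word_lists(ap_table):
    """For each concept, keep the sub-word-list length with the maximal AP."""
    return {c: max(scores, key=scores.get) for c, scores in ap_table.items()}

best = select_best_sub_word_lists(ap_table)
# With the numbers above, "night" needs only a short word list while "beach"
# benefits from a longer one — the adaptive behaviour the patent describes.
```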
S110: input the histograms of the local features of the data to be detected, counted on each concept's best sub-word list, into the best concept detection classifier to determine the probability of each concept appearing in the data to be detected.
For example, the histograms of the SIFT local features of the image set to be detected, counted on the best sub-word lists selected in S108, are input into the best concept detection classifier, which outputs detection results for all images to be detected with respect to the M concepts {C_1, C_2, ..., C_M}. The classifier's verdict can be expressed as a probability judgment, i.e., it outputs a decimal between 0 and 1 expressing the confidence that the concept is present.
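The merged detection step amounts to routing each concept's histogram to its own final classifier and reading off the positive-class probability. A minimal sketch, using a stub in place of the trained SVMs (the stub class and all names here are illustrative, not part of the patent):

```python
import numpy as np

class StubClassifier:
    """Stands in for one concept's trained final SVM; predict_proba
    returns [[P(absent), P(present)]] regardless of the input."""
    def __init__(self, p: float):
        self.p = p
    def predict_proba(self, X: np.ndarray) -> np.ndarray:
        return np.array([[1.0 - self.p, self.p]])

def detect_concepts(histograms: dict, classifiers: dict) -> dict:
    """Feed each concept's best-sub-word-list histogram to that concept's
    final classifier; return per-concept confidences in [0, 1]."""
    return {c: float(classifiers[c].predict_proba(h[None, :])[0, 1])
            for c, h in histograms.items()}
```

With real models, each value in the returned dictionary is the "decimal between 0 and 1" the step describes.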
This embodiment combines image processing and pattern recognition techniques to detect the semantic concepts of an image. It can choose the most suitable word-list length for each concept, forming a bag-of-words model with adaptive word-list lengths. At the same time, because each concept is clustered separately and the word lists are obtained by merging, the computation can be parallelized to improve detection efficiency. Moreover, in semantics-based image retrieval, its performance exceeds that of a bag-of-words model with a single original vocabulary and can significantly improve retrieval performance.
Fig. 2 is the schematic flow sheet of second embodiment of the inventive method.
As shown in Figure 2, this embodiment can comprise the following steps:
S202, utilizes SIFT algorithm to obtain respectively the local feature of the local feature of testing data and the training data of multiple concepts;
S204, for multiple concepts are added a background classes, , for all need detect concepts (for example, M concept) an additional background classes, adding of background classes can provide a lot of background informations for word bag model, the background information of detected data can be proposed on the one hand, to detect more accurately the concept in data to be tested, the pure background information that does not comprise any concept can also be grouped in background classes on the other hand, pure background information is grouped in the corresponding class of certain concept mistakenly preventing, thereby can improve significantly the detection Average Accuracy of concept to be detected,
S206, according to different quantization strategies, utilize the local feature of the training data of K means Method and each concept to assemble the word list about the different length of each concept (concept herein not only comprises concept to be detected and also comprises added background classes), the word list of multiple concepts different length is separately merged into word bag model { B 1, B 2..., B i..., B n, and add up respectively the histogram of the histogram of local feature of testing data and the local feature of the training data of multiple concepts, wherein, histogram is that local feature is at word bag model { B 1, B 2..., B i..., B neach word list B ievery sub-word list in the number of times that occurs, word list B ifor the word list of multiple concepts under i quantization strategy, word list B icomprise the multiple sub-word list corresponding with multiple concepts, every sub-word list is the word list of each concept under i quantization strategy, and the length of every sub-word list is determined by quantization strategy, quantization strategy is determined by K value, 1≤i≤N, N>=2, K > 1;
S208, the local feature of the training data of each concept is divided into training set and checksum set, utilize the histogram training binary support vector machine classifier of the concept markup information of training data of training set, checksum set, each concept and the local feature of the training data of each concept, and utilize checksum set on binary support vector machine classifier, to calculate and word bag model { B 1, B 2..., B i..., B neach word list B iin the corresponding each concept of every sub-word list training data local feature detection Average Accuracy and train and each word list B iin the disaggregated model of the corresponding each concept of every sub-word list;
S210, performing cross validation on the calculated average detection precisions of the local features of the training data of each concept corresponding to every sub-word list in each word list B_i of the bag-of-words model {B_1, B_2, …, B_N}, selecting from the bag-of-words model the sub-word list corresponding to the maximum average detection precision of each concept as the best sub-word list of that concept, taking the trained classification model corresponding to the best sub-word list of each concept as the final concept detection classifier of that concept, and merging the final concept detection classifiers of the multiple concepts into a best concept detection classifier;
S212, inputting the histograms of the local features of the data to be detected counted on the best sub-word list of each concept into the best concept detection classifier, to determine the probability that each concept occurs in the data to be detected.
Fig. 3 is a schematic flowchart of the third embodiment of the method of the invention.
As shown in Fig. 3, this embodiment can comprise the following steps:
S302, choosing the training data of each concept, which comprises annotation information, according to a sampling strategy, wherein the sampling strategy can be n_i = a_i × (N_i − 100), N_i being the number of positive samples of the i-th concept before sampling, n_i being the quantity of training data (that is, positive samples) of the i-th concept after sampling, and a_i being a sampling strategy parameter between 0 and 1;
To illustrate, the sampling strategy is used to choose the training data of each of the M concepts that have been manually annotated (that is, for every picture or video frame it has been marked whether it contains certain concepts); the M selected concepts (which can be expressed as {C_1, C_2, …, C_M}) have training data expressed as {T_1, T_2, …, T_M}, where T_1, T_2, …, T_M are chosen with the sampling strategy n_i = a_i × (N_i − 100);
If the number of positive samples of a concept (that is, samples that contain the concept) is less than or equal to 100, since the positive samples are few, all positive samples are used for training so that the training data contains sufficient information; if the number of positive samples of a concept exceeds 100, a sampling strategy parameter a_i is selected (usually, a_i is between 0 and 1), and n_i = a_i × (N_i − 100) positive samples are drawn for training. For example, if the concept "night" has 252 positive samples and a sampling strategy parameter a_i = 0.5 is adopted, 76 positive samples participate in training the concept "night";
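The positive-sample arithmetic above can be checked in a few lines; `positive_sample_count` is a hypothetical helper name, and rounding a_i × (N_i − 100) down to an integer is an assumption the text does not state.

```python
def positive_sample_count(n_positive, a=0.5, floor=100):
    """Sampling strategy sketched in the text: keep every positive
    sample when there are at most `floor` of them, otherwise draw
    a * (n_positive - floor) of them (assumed rounded down)."""
    if n_positive <= floor:
        return n_positive
    return int(a * (n_positive - floor))

# The "night" example from the text: 252 positives, a = 0.5
print(positive_sample_count(252, a=0.5))  # → 76
print(positive_sample_count(80))          # → 80 (all kept)
```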
S304, using the SIFT algorithm to obtain the local features of the data to be detected and the local features of the training data of the multiple concepts, respectively;
S306, according to different quantization strategies, clustering the local features of the training data of each concept with K-means to obtain word lists of different lengths for each concept, merging the word lists of the different lengths of the multiple concepts into a bag-of-words model {B_1, B_2, …, B_i, …, B_N}, and separately computing the histogram of the local features of the data to be detected and the histograms of the local features of the training data of the multiple concepts, wherein a histogram records the number of times the local features occur in every sub-word list of each word list B_i of the bag-of-words model; the word list B_i is the word list of the multiple concepts under the i-th quantization strategy and comprises multiple sub-word lists, one per concept; every sub-word list is the word list of one concept under the i-th quantization strategy, and its length is determined by the quantization strategy, which is determined by the value K, with 1≤i≤N, N≥2, K > 1;
S308, dividing the local features of the training data of each concept into a training set and a validation set, training a binary support vector machine classifier with the training set, the validation set, the concept annotation information of the training data of each concept and the histograms of the local features of the training data of each concept, and using the validation set on the binary support vector machine classifier to calculate the average detection precision of the local features of the training data of each concept corresponding to every sub-word list in each word list B_i of the bag-of-words model {B_1, B_2, …, B_N}, and to train the classification model of each concept corresponding to every sub-word list in each word list B_i;
S310, performing cross validation on the calculated average detection precisions of the local features of the training data of each concept corresponding to every sub-word list in each word list B_i of the bag-of-words model {B_1, B_2, …, B_N}, selecting from the bag-of-words model the sub-word list corresponding to the maximum average detection precision of each concept as the best sub-word list of that concept, taking the trained classification model corresponding to the best sub-word list of each concept as the final concept detection classifier of that concept, and merging the final concept detection classifiers of the multiple concepts into a best concept detection classifier;
S312, inputting the histograms of the local features of the data to be detected counted on the best sub-word list of each concept into the best concept detection classifier, to determine the probability that each concept occurs in the data to be detected.
The positive-sample sampling strategy of this embodiment simplifies the training data while ensuring that the positive samples contain sufficient information, which improves the efficiency of concept detection.
In the above embodiments, preferably, 20≤K≤200. When the value of K is less than 20, the constructed word list generally cannot fully express the characteristic information of the concept, so the detection effect drops sharply; when the value of K is greater than 200, the information in the constructed word list is relatively redundant, which greatly increases the computational burden of the computer without a significant gain in effect.
The fourth embodiment of the method of the invention can comprise the following steps:
Step 1: for the training data that has been manually annotated with semantics, a sampling strategy is applied, where N_i denotes the total number of positive samples of the i-th concept before sampling, n_i denotes the number of positive samples drawn for training, and the sampling strategy parameter a_i is usually between 0 and 1, the sampling strategy being n_i = a_i × (N_i − 100).
Step 2: the SIFT algorithm is used to obtain the local features of all images to be detected and all training images. For example, the SIFT algorithm can build a scale space with the two-dimensional Gaussian kernel G(x, y, σ) = (1/(2πσ²))·exp(−(x² + y²)/(2σ²)) (where σ is the standard deviation of the Gaussian distribution). For a grayscale two-dimensional image, the scale-space representation at different scales is obtained by convolving the image with the Gaussian kernel: L(x, y, σ) = G(x, y, σ) * I(x, y), where (x, y) is a pixel position of the image, I(x, y) is the gray value of that pixel, σ is called the scale-space factor, and L represents the scale space of the image;
After the scale space is built, in order to find stable extreme points, a difference of Gaussians can be used to detect extreme points at local positions, that is, the images at two adjacent scales are subtracted: D(x, y, σ) = L(x, y, kσ) − L(x, y, σ) (for every image, local extreme points can be found between three adjacent scales of the difference-of-Gaussians scale space). A three-dimensional quadratic function is then fitted to accurately determine the position and scale of each local extreme point, and next the gradient direction distribution of the neighbourhood of each determined local extreme point is used to assign a direction parameter to the point, so that the operator possesses rotational invariance;
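The extremum test across three adjacent difference-of-Gaussians scales compares a value with its 26 neighbours in a 3 × 3 × 3 block; a minimal sketch, where `is_local_extremum` is an illustrative helper operating on an assumed 3-dimensional array indexed as [scale][row][column]:

```python
def is_local_extremum(dog, s, y, x):
    """Check whether dog[s][y][x] is an extremum among its 26
    neighbours in a 3x3x3 block spanning three adjacent scales
    of the difference-of-Gaussians pyramid."""
    v = dog[s][y][x]
    neighbours = [dog[s + ds][y + dy][x + dx]
                  for ds in (-1, 0, 1)
                  for dy in (-1, 0, 1)
                  for dx in (-1, 0, 1)
                  if (ds, dy, dx) != (0, 0, 0)]
    return v > max(neighbours) or v < min(neighbours)

# A lone peak in a flat 3x3x3 block is an extremum.
dog = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
dog[1][1][1] = 10.0
print(is_local_extremum(dog, 1, 1, 1))  # → True
```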
Finally, a neighbourhood window is taken centred at each interest point and divided into a 4 × 4 grid of blocks; in each block a gradient histogram over 8 directions is computed and the accumulated value in each direction is calculated, so each interest point is described by a 4 × 4 × 8 = 128-dimensional feature vector;
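The 4 × 4 × 8 binning that yields the 128-dimensional vector can be illustrated on a toy patch. This sketch omits the Gaussian weighting, trilinear interpolation, rotation to the dominant orientation, and normalisation of a real SIFT descriptor; the function name and the assumed 16 × 16 patch are illustrative only.

```python
import math

def sift_like_descriptor(patch):
    """Toy illustration of 4x4x8 binning: `patch` is a 16x16
    grayscale array; gradient magnitudes are accumulated into a
    4x4 grid of cells, each with an 8-direction orientation
    histogram, giving 4*4*8 = 128 numbers."""
    hist = [0.0] * 128
    for y in range(1, 15):            # skip the border pixels
        for x in range(1, 15):
            dx = patch[y][x + 1] - patch[y][x - 1]
            dy = patch[y + 1][x] - patch[y - 1][x]
            mag = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)
            o = min(int(angle / (2 * math.pi) * 8), 7)   # orientation bin
            cell = (y // 4) * 4 + (x // 4)               # 4x4 grid cell
            hist[cell * 8 + o] += mag
    return hist

# A horizontal intensity ramp has all gradients pointing one way.
patch = [[float(x) for x in range(16)] for _ in range(16)]
print(len(sift_like_descriptor(patch)))  # → 128
```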
Step 3: based on extensive experiments, appending a background class to all concepts that need to be detected effectively improves the effect of concept detection. Let M denote the total number of concepts to be detected; with one additional background class there are M+1 concepts in total. For each concept, K-means clustering is applied with different quantization strategies (that is, different K values are chosen; usually K > 1, preferably 20≤K≤200), gathering classes of varying numbers {c_1, c_2, …, c_N} (that is, the K values are respectively c_1, c_2, …, c_N). Each class can be regarded as a word in a word list; the word lists of the multiple different lengths of the M+1 concepts are then merged into the bag-of-words model, that is, word lists of lengths {(M+1)c_1, (M+1)c_2, …, (M+1)c_N} are built, denoted {B_1, B_2, …, B_i, …, B_N}. Then, according to the bag-of-words model {B_1, B_2, …, B_N}, the SIFT local features extracted from all images in the image set are counted, obtaining the histogram of every image with respect to every sub-word list in each word list B_i; this histogram can be input into a concept detection classifier as the feature vector of the image;
Step 4: a binary support vector machine can be selected as the basic classifier, and the concept detection classifiers are trained on the training image library with machine learning and pattern recognition techniques, where the binary support vector machine is an algorithm that optimizes, on a decision hyperplane, the margin between the samples of the training set. Specifically, the set of interest points of the training image library obtained by the SIFT algorithm can be divided into a training set and a validation set; every image in each part carries the annotation information for concept C_i and the histogram of the image's SIFT local features counted on each sub-word list of word list B_i. Based on this information a binary support vector machine classifier is trained on the training set, its parameters are adjusted, and it is tested on the validation set, so as to determine the parameters with which the binary support vector machine classifier corresponding to the sub-word list for concept C_i in word list B_i is in the optimal state, that is, the classifier performs best on the validation set, in other words its average detection precision is highest. This yields the classification model of concept C_i corresponding to the sub-word list in word list B_i, together with the average detection precision of the local features of the training data of concept C_i corresponding to that sub-word list; in the same way, the classification models of the other concepts corresponding to the other sub-word lists in word list B_i, and the corresponding average detection precisions, are obtained;
Step 5: using different word lists, step 4 is repeated, training with the histograms counted under the new word list as feature vectors, to obtain the average detection precision of each concept on the new validation set and the classification models of all concepts corresponding to every sub-word list in the new word list. In the same way, a table P_mn of the detection effect of the word lists of all the different lengths for the different concepts is obtained (m being the total number of word lists and n the total number of concepts), where an element p_ij of table P_mn (i being the index of the word list of a given length and j the index of a concept) denotes the average detection precision of the j-th concept under the word list B_i of length (M+1)c_i. The table can then be split into column vectors (α_1, α_2, …, α_n), and the maximum of each column vector is taken, that is, the best sub-word list with the highest average detection precision is picked out for concept C_j, and the classification model learned by the binary support vector machine with this best sub-word list is taken as the final concept detection classifier of concept C_j; the final concept detection classifiers of the M concepts are merged into the best concept detection classifier;
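The per-concept selection in step 5 reduces to an argmax over the columns of the precision table. The sketch below assumes the table is a plain list of rows, one per word list, with made-up average precision values; `best_vocab_per_concept` is an illustrative name.

```python
def best_vocab_per_concept(ap_table):
    """ap_table[i][j] = average precision of concept j under word
    list i (the table P of step 5).  For every concept, return the
    index of the word list with the highest average precision."""
    n_vocab = len(ap_table)
    n_concepts = len(ap_table[0])
    return [max(range(n_vocab), key=lambda i: ap_table[i][j])
            for j in range(n_concepts)]

# Toy table: 3 word lists (rows) x 2 concepts (columns).
ap = [[0.40, 0.70],
      [0.55, 0.65],
      [0.52, 0.72]]
print(best_vocab_per_concept(ap))  # → [1, 2]
```

Concept 0 is best served by word list 1 and concept 1 by word list 2, which is precisely the adaptive per-concept word list selection the patent describes.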
Step 6: the histograms of the SIFT local features of the image set to be detected, counted on the best sub-word lists selected in step 5, are input into the best concept detection classifier, which outputs the detection results of all images to be detected for the M concepts (that is, {C_1, C_2, …, C_M}). The verdict of the best concept detection classifier can be presented as a probability judgment: a decimal between 0 and 1 is output, representing the confidence that the concept "exists"; if the confidence exceeds 0.5, the image to be detected is judged to contain the concept.
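The probability judgment of step 6 amounts to thresholding each per-concept confidence at 0.5; a minimal sketch, with a hypothetical dictionary of confidences:

```python
def detect_concepts(confidences, threshold=0.5):
    """Turn the classifier's per-concept confidences (decimals
    between 0 and 1) into presence/absence verdicts, using the
    0.5 cut-off stated in step 6."""
    return {concept: p > threshold for concept, p in confidences.items()}

print(detect_concepts({"night": 0.81, "bus": 0.23}))
# → {'night': True, 'bus': False}
```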
In the above embodiments, the image database used consists of key frames of the video data of TRECVID 2008. TRECVID is an authoritative competition in the video retrieval field held by the National Institute of Standards and Technology (NIST). For example, three semantic concepts, "airplane flying", "bus" and "night", are chosen for detection. The whole image database is divided into two parts, a training image set and an image set to be detected, and every image in the database has been manually annotated. For "airplane flying", 42 positive samples and 200 negative samples (that is, samples not containing the concept) are chosen; for "bus", 46 positive samples and 200 negative samples; for "night", 242 positive samples and 500 negative samples; all concepts together comprise 10680 images. The experiment uses the average precision (Average Precision) to assess the overall performance of concept detection with the bag-of-words model that adopts the best sub-word lists. Average precision is an evaluation index that accurately reflects retrieval performance and is widely used in the field of information retrieval.
SIFT descriptors based on interest points found in the difference-of-Gaussians scale space were used to extract local features for all training data and test data of these three concepts. From the classification results of the support vector machine, the word list length suitable for each of the three concepts was then selected: a word list of 50 words for the concept "airplane flying"; a word list of 100 words for the concept "bus"; and a word list of 20 words for the concept "night".
Table 1 below shows, on the image set to be detected, the comparison on these three concepts between the method of the invention and Columbia University, which achieved the highest accuracy in TRECVID 2008, as well as the contrast with the concept detection results obtained using only global features.
Table 1
As can be seen from Table 1, adopting local features greatly improves concept detection compared with adopting global features such as color and texture features. Meanwhile, adopting a word list of suitable length and a suitable positive-sample sampling strategy can effectively improve the effect of concept detection with local features.
The above embodiments improve the traditional bag-of-words model. The traditional bag-of-words model uses a word list of a single fixed length for image concept detection, but for different semantic information, that is, different concepts, the best word list length can differ. A word in a word list is a set of similar local features obtained by K-means clustering. For some concepts (which can also be understood as semantic concepts), a few dozen words suffice to express the features of the concept completely; choosing an overly long word list not only increases the burden on the computer and reduces detection efficiency, but also mixes in much information that interferes with the concept, which instead reduces the detection effect. For example, the scene concept "night" contains few local features, so counting local feature information with a shorter word list both improves detection efficiency and strengthens the detection effect; the object concept "bus" contains rich local feature information, so a short word list cannot cover all the information in the concept, and a longer word list, that is, a word list rich in local feature information, is used to detect this concept effectively.
Compared with the prior art, the present invention adaptively selects the best word list length for each concept through cross validation: for concepts with few local features, a shorter word list is adopted, which achieves a good detection effect and improves detection efficiency; for concepts with rich local features, a sufficiently long word list length is still selected to guarantee the detection effect.
Fig. 4 is a structural schematic diagram of the first embodiment of the apparatus of the present invention.
As shown in Fig. 4, the apparatus of this embodiment comprises:
Local feature extraction module 11, for using the SIFT algorithm to obtain the local features of the data to be detected and the local features of the training data of the multiple concepts, respectively;
For example, for every image (that is, data to be detected and training data), the local feature extraction module 11 first finds local extreme points between three adjacent scales of the difference-of-Gaussians scale space; such a point is an extremum among the 26 neighbouring points at its own scale and the adjacent scales. A three-dimensional quadratic function is then fitted to accurately determine the position and scale of each local extreme point, and next the gradient direction distribution of the neighbourhood of each determined local extreme point is used to assign a direction parameter to the point, so that the operator possesses rotational invariance. The local extreme points that represent specific information in the image can be called interest points. Finally, a window centred at each interest point is used to extract a 128-dimensional feature vector; the interest points in every image are a subset of the local extreme points that carry direction information;
Cluster module 12, connected with the local feature extraction module 11, for clustering, according to different quantization strategies, the local features of the training data of each concept with K-means to obtain word lists of different lengths for each concept, merging the word lists of the different lengths of the multiple concepts into a bag-of-words model {B_1, B_2, …, B_i, …, B_N}, and separately computing the histogram of the local features of the data to be detected and the histograms of the local features of the training data of the multiple concepts, wherein a histogram records the number of times the local features occur in every sub-word list of each word list B_i of the bag-of-words model; the word list B_i is the word list of the multiple concepts under the i-th quantization strategy and comprises multiple sub-word lists corresponding to the concepts; every sub-word list is the word list of one concept under the i-th quantization strategy, and its length is determined by the quantization strategy, which is determined by the value K, with 1≤i≤N, N≥2, K > 1;
To illustrate, the local features (that is, feature vectors based on interest points) obtained by the SIFT algorithm from all training images of each concept (that is, the training data of each concept) are clustered with K-means; according to the different quantization strategies (that is, different K values at clustering time), classes of different numbers are gathered, and each class can be regarded as a word in a word list, so different numbers of classes mean different word list lengths. The word lists of the different lengths of all concepts are then merged together to form the bag-of-words model, building word lists of multiple different lengths, denoted {B_1, B_2, …, B_i, …, B_N}. Here, the difference of quantization strategies refers to the difference of the value K, where K is an integer, usually K > 1, preferably 20≤K≤200. For example, when K equals 20, K-means clustering can compute the word list B_1 of the M concepts whose sub-word lists have length 20; when K equals 30, K-means clustering can compute the word list B_2 of the M concepts whose sub-word lists have length 30; and so on, until, when K equals 200, K-means clustering computes the word list B_N of the M concepts whose sub-word lists have length 200. The word lists of the different lengths of the M concepts are merged into the bag-of-words model {B_1, B_2, …, B_N}, and then, according to the bag-of-words model, the SIFT local features extracted from all images in the image set (including images to be detected and training images) are counted, obtaining the histogram of every image with respect to every sub-word list in each word list (that is, the number of times the local features of every image occur in each sub-word list of each word list B_i of the bag-of-words model {B_1, B_2, …, B_N}); in other words, the histogram represents the frequency with which the feature points of each concept occur in every sub-word list when counted over the image, and this histogram can be input into a concept detection classifier as the feature vector of the image;
Classification model training module 13, connected with the local feature extraction module 11 and the cluster module 12 respectively, for dividing the local features of the training data of each concept into a training set and a validation set, training a binary support vector machine classifier with the training set, the validation set, the concept annotation information of the training data of each concept and the histograms of the local features of the training data of each concept, and using the validation set on the binary support vector machine classifier to calculate the average detection precision of the local features of the training data of each concept corresponding to every sub-word list in each word list B_i of the bag-of-words model {B_1, B_2, …, B_N}, and to train the classification model of each concept corresponding to every sub-word list in each word list B_i;
For example, a binary support vector machine can be selected as the basic classifier, and the concept detection classifiers are trained on the training image library with machine learning and pattern recognition techniques, where the binary support vector machine is an algorithm that optimizes, on a decision hyperplane, the margin between the samples of the training set (that is, the support vector machine maps the vectors into a higher-dimensional space and builds a maximum-margin hyperplane in that space; on the two sides of the hyperplane that separates the data there are two parallel hyperplanes, thereby realizing a learning algorithm for data classification);
Specifically, the set of interest points of the training image library (that is, the training data) obtained by the SIFT algorithm can be divided into a training set and a validation set. Word list B_1 is chosen (suppose word list B_1 comprises M sub-word lists B_11, B_12, …, B_1M); based on all images comprising the annotation information of concept C_1 and the histograms of the images' SIFT local features counted on each sub-word list of word list B_1, a binary support vector machine classifier is trained on the training set, its parameters are adjusted, and it is tested on the validation set. The relevant parameters of the kernel function are adjusted (generally a radial basis function (Radial Basis Function, RBF) kernel is adopted, with the parameters C and δ selected by cross validation on the validation set data to obtain the best parameters), so as to determine the parameters with which the binary support vector machine classifier corresponding to the sub-word list for concept C_1 in word list B_1 is in the optimal state, that is, the binary support vector machine classifier performs best on the validation set, in other words the average detection precision calculated on the validation set is highest; training thus yields the classification model of concept C_1 corresponding to sub-word list B_11 in word list B_1. In the same way, the classification models of the other concepts (C_2, …, C_M) corresponding to sub-word lists B_12, …, B_1M of word list B_1 can be trained. The word list is then changed and the above steps are repeated, training with the histograms counted under the new word list B_i as feature vectors, to obtain the classification models of the concepts (C_1, C_2, …, C_M) corresponding to sub-word lists B_i1, B_i2, …, B_iM of word list B_i, and the average detection precisions of the local features of the training data of each concept corresponding to sub-word lists B_i1, B_i2, …, B_iM in word list B_i;
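The RBF kernel mentioned above can be written down directly; only the kernel itself is sketched here (the cross-validated search over C and the kernel width on the validation set is not shown), and the parameter name `gamma` for the kernel width is an assumption.

```python
import math

def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel K(a, b) = exp(-gamma * ||a - b||^2), the kernel
    generally adopted for the binary SVM; gamma is the width
    parameter tuned together with C on the validation set."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * d2)

# Identical histograms give similarity 1; distant ones decay toward 0.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # → 1.0
```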
Cross validation module 14, connected with the cluster module 12 and the classification model training module 13, for performing cross validation on the calculated average detection precisions of the local features of the training data of each concept corresponding to every sub-word list in each word list B_i of the bag-of-words model {B_1, B_2, …, B_N}, selecting from the bag-of-words model the sub-word list corresponding to the maximum average detection precision of each concept as the best sub-word list of that concept, taking the trained classification model corresponding to the best sub-word list of each concept as the final concept detection classifier of that concept, and merging the final concept detection classifiers of the multiple concepts into the best concept detection classifier; optionally, the best sub-word lists of the multiple concepts can also be merged into a best bag-of-words model;
For example, a performance table of the different concepts under the different sub-word lists (for example, a table of average detection precisions) can be obtained through the classification model training module 13 described above. Through cross validation (that is, mutually comparing the multiple average detection precisions corresponding to the different sub-word lists of the same concept), the sub-word list with the best performance is chosen from this table as the best sub-word list of each concept, and the classification model learned by the binary support vector machine with the best sub-word list of each concept serves as the final concept detection classifier of that concept; the M final concept detection classifiers are merged into the best concept detection classifier, which is used to detect the concepts contained in the images to be detected;
Concept detection module 15, connected with the cluster module 12 and the cross validation module 14, for inputting the histograms of the local features of the data to be detected counted on the best sub-word list of each concept into the best concept detection classifier, to determine the probability that each concept occurs in the data to be detected;
For example, the histograms of the SIFT local features of the image set to be detected, counted on the best sub-word lists selected in S108, are input into the best concept detection classifier, which outputs the detection results of all images to be detected for the M concepts (that is, {C_1, C_2, …, C_M}); the verdict of the best concept detection classifier can be presented as a probability judgment, that is, a decimal between 0 and 1 is output, representing the confidence that the concept "exists".
This embodiment combines image processing and pattern recognition techniques to detect the semantic concepts of images; it can choose the most suitable word list length for different concepts, forming a bag-of-words model with adaptive word list length. Meanwhile, this embodiment clusters each concept separately and then obtains the word lists by merging, which allows the computer to compute in parallel and improves detection efficiency. In addition, when performing semantics-based image retrieval, its performance is better than that of a bag-of-words model with the original vocabulary, and it can significantly improve image retrieval performance.
Fig. 5 is a structural schematic diagram of the second embodiment of the apparatus of the present invention.
As shown in Fig. 5, compared with the embodiment in Fig. 4, the apparatus of this embodiment can also comprise:
Class adding module 21, connected with the cluster module 12, for adding a background class for the multiple concepts, that is, appending one background class to all concepts to be detected (for example, M concepts); adding the background class provides the bag-of-words model with abundant background information: on the one hand, the background information of the data to be detected can be separated out, so that the concepts in the data to be detected are detected more accurately; on the other hand, pure background information that contains no concept can be grouped into the background class, preventing it from being mistakenly grouped into the class of some concept, thereby significantly improving the average detection precision of the concepts to be detected.
Fig. 6 is a structural schematic diagram of the third embodiment of the apparatus of the present invention.
As shown in Fig. 6, compared with the embodiment in Fig. 4, the apparatus of this embodiment can also comprise:
Sampling module 31, connected with the local feature extraction module 11, for choosing the training data of each concept, which comprises annotation information, according to a sampling strategy, wherein the sampling strategy is n_i = a_i × (N_i − 100), N_i being the number of positive samples of the i-th concept before sampling, n_i being the quantity of training data of the i-th concept after sampling, and a_i being a sampling strategy parameter between 0 and 1;
To illustrate, the sampling strategy is used to choose the training data of each of the M concepts that have been manually annotated (that is, for every picture or video frame it has been marked whether it contains certain concepts); the M selected concepts (which can be expressed as {C_1, C_2, …, C_M}) have training data expressed as {T_1, T_2, …, T_M}, where T_1, T_2, …, T_M are chosen with the sampling strategy n_i = a_i × (N_i − 100);
If the number of positive samples of a concept (that is, samples that contain the concept) is less than or equal to 100, since the positive samples are few, all positive samples are used for training so that the training data contains sufficient information; if the number of positive samples of a concept exceeds 100, a sampling strategy parameter a_i is selected (usually, a_i is between 0 and 1), and n_i = a_i × (N_i − 100) positive samples are drawn for training. For example, if the concept "night" has 252 positive samples and a sampling strategy parameter a_i = 0.5 is adopted, 76 positive samples participate in training the concept "night".
The positive-sample sampling strategy of this embodiment simplifies the training data while ensuring that the positive samples contain sufficient information, which improves the efficiency of concept detection.
Fig. 7 is a structural schematic diagram of the fourth embodiment of the apparatus of the present invention.
As shown in Fig. 7, compared with the embodiment in Fig. 6, the apparatus of this embodiment also comprises:
Class adding module 21, connected with the cluster module 12, for adding a background class for the multiple concepts, that is, appending one background class to all concepts to be detected (for example, M concepts); adding the background class provides the bag-of-words model with abundant background information: on the one hand, the background information of the data to be detected can be separated out, so that the concepts in the data to be detected are detected more accurately; on the other hand, pure background information that contains no concept can be grouped into the background class, preventing it from being mistakenly grouped into the class of some concept, thereby significantly improving the average detection precision of the concepts to be detected.
In the above embodiments, 20 ≤ K ≤ 200. When the value of K is less than 20, the constructed word list generally cannot fully express the characteristic information of the concept, so the detection performance drops sharply; when the value of K is greater than 200, the constructed word list is relatively redundant, which greatly increases the computational burden without bringing a significant improvement in performance.
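As a concrete illustration of the word-list construction, the sketch below clusters local descriptors into K visual words with a plain Lloyd's K-means and counts a bag-of-words histogram. It is an illustration under stated assumptions: the toy 8-dimensional vectors stand in for 128-dimensional SIFT descriptors, and the function names are not from the patent:

```python
import numpy as np

def build_vocabulary(features, k, iters=10, seed=0):
    """Cluster local descriptors into k visual words (Lloyd's k-means)."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest center, then re-estimate centers
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            pts = features[labels == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers

def bow_histogram(features, centers):
    """Count how often each visual word occurs among an image's descriptors."""
    d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return np.bincount(d.argmin(axis=1), minlength=len(centers))

# Toy 8-dim data standing in for 128-dim SIFT descriptors.
descs = np.random.default_rng(1).normal(size=(300, 8))
vocab = build_vocabulary(descs, k=20)
hist = bow_histogram(descs, vocab)
print(hist.sum())  # 300: every descriptor maps to exactly one visual word
```

Running the clustering once per quantization strategy (one K value each) yields the sub-word lists of different lengths described above.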
The description of the invention is provided for purposes of illustration and description, and is not intended to be exhaustive or to limit the invention to the disclosed form. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described to better explain the principles of the invention and its practical application, and to enable those of ordinary skill in the art to understand that the invention is suitable for various embodiments with various modifications for particular uses.

Claims (8)

1. A method for detecting image concepts, characterized in that the method comprises:
selecting, according to a sampling strategy, training data of each concept that contains annotation information;
obtaining, using the SIFT algorithm, the local features of the data to be detected and the local features of the training data of multiple concepts respectively;
clustering, according to different quantization strategies, the local features of the training data of each concept into word lists of different lengths for each concept using the K-means clustering algorithm; merging the word lists of the different lengths of the multiple concepts into a bag-of-words model {B_1, B_2, ..., B_i, ..., B_N}; and separately counting the histogram of the local features of the data to be detected and the histograms of the local features of the training data of the multiple concepts, wherein a histogram records the number of times the local features occur in every sub-word list of each word list B_i of the bag-of-words model {B_1, B_2, ..., B_i, ..., B_N}; the word list B_i is the word list of the multiple concepts under the i-th quantization strategy and comprises multiple sub-word lists, one corresponding to each concept; every sub-word list is the word list of one concept under the i-th quantization strategy; the length of every sub-word list is determined by the quantization strategy, and the quantization strategy is determined by the value of K; 1 ≤ i ≤ N, N ≥ 2, K > 1;
dividing the local features of the training data of each concept into a training set and a validation set; training a binary support vector machine classifier using the training set, the validation set, the concept annotation information of the training data of each concept and the histograms of the local features of the training data of each concept; and, using the validation set on the binary support vector machine classifier, calculating the average detection accuracy of the local features of the training data of each concept corresponding to every sub-word list in each word list B_i of the bag-of-words model {B_1, B_2, ..., B_i, ..., B_N} and training the classification model of each concept corresponding to every sub-word list in each word list B_i;
performing cross-validation on the calculated average detection accuracies of the local features of the training data of each concept corresponding to every sub-word list in each word list B_i of the bag-of-words model {B_1, B_2, ..., B_i, ..., B_N}, so as to select from the bag-of-words model {B_1, B_2, ..., B_i, ..., B_N} the sub-word list corresponding to the maximum average detection accuracy of each concept as the best sub-word list of that concept; taking the trained classification model corresponding to the best sub-word list of each concept as the final concept detection classifier of that concept; and merging the final concept detection classifiers of the multiple concepts into a best concept detection classifier;
inputting the histogram of the local features of the data to be detected counted on the best sub-word list of each concept into the best concept detection classifier to determine the probability of each concept occurring in the data to be detected.
2. The method according to claim 1, characterized in that, before the step of clustering, according to different quantization strategies, the local features of the training data of each concept into word lists of different lengths for each concept using the K-means clustering algorithm, the method further comprises:
adding a background class for the multiple concepts.
3. The method according to claim 1, characterized in that 20 ≤ K ≤ 200.
4. The method according to claim 1, characterized in that the sampling strategy is n_i = N_i when N_i ≤ 100, and n_i = a_i × (N_i - 100) when N_i > 100, where N_i is the number of positive samples of the i-th concept before sampling, n_i is the amount of training data of the i-th concept after sampling, and a_i is a sampling strategy parameter between 0 and 1.
5. A device for detecting image concepts, characterized in that the device comprises:
a sampling module, for selecting, according to a sampling strategy, training data of each concept that contains annotation information;
a local feature extraction module, connected with the sampling module, for obtaining, using the SIFT algorithm, the local features of the data to be detected and the local features of the training data of multiple concepts respectively;
a clustering module, connected with the local feature extraction module, for clustering, according to different quantization strategies, the local features of the training data of each concept into word lists of different lengths for each concept using the K-means clustering algorithm, merging the word lists of the different lengths of the multiple concepts into a bag-of-words model {B_1, B_2, ..., B_i, ..., B_N}, and separately counting the histogram of the local features of the data to be detected and the histograms of the local features of the training data of the multiple concepts, wherein a histogram records the number of times the local features occur in every sub-word list of each word list B_i of the bag-of-words model {B_1, B_2, ..., B_i, ..., B_N}; the word list B_i is the word list of the multiple concepts under the i-th quantization strategy and comprises multiple sub-word lists, one corresponding to each concept; every sub-word list is the word list of one concept under the i-th quantization strategy; the length of every sub-word list is determined by the quantization strategy, and the quantization strategy is determined by the value of K; 1 ≤ i ≤ N, N ≥ 2, K > 1;
a classification model training module, connected with the local feature extraction module and the clustering module respectively, for dividing the local features of the training data of each concept into a training set and a validation set, training a binary support vector machine classifier using the training set, the validation set, the concept annotation information of the training data of each concept and the histograms of the local features of the training data of each concept, and, using the validation set on the binary support vector machine classifier, calculating the average detection accuracy of the local features of the training data of each concept corresponding to every sub-word list in each word list B_i of the bag-of-words model {B_1, B_2, ..., B_i, ..., B_N} and training the classification model of each concept corresponding to every sub-word list in each word list B_i;
a cross-validation module, connected with the clustering module and the classification model training module, for performing cross-validation on the calculated average detection accuracies of the local features of the training data of each concept corresponding to every sub-word list in each word list B_i of the bag-of-words model {B_1, B_2, ..., B_i, ..., B_N}, so as to select from the bag-of-words model {B_1, B_2, ..., B_i, ..., B_N} the sub-word list corresponding to the maximum average detection accuracy of each concept as the best sub-word list of that concept, taking the trained classification model corresponding to the best sub-word list of each concept as the final concept detection classifier of that concept, and merging the final concept detection classifiers of the multiple concepts into a best concept detection classifier;
a concept detection module, connected with the clustering module and the cross-validation module, for inputting the histogram of the local features of the data to be detected counted on the best sub-word list of each concept into the best concept detection classifier to determine the probability of each concept occurring in the data to be detected.
6. The device according to claim 5, characterized in that the device further comprises:
a class adding module, connected with the clustering module, for adding a background class for the multiple concepts.
7. The device according to claim 5, characterized in that 20 ≤ K ≤ 200.
8. The device according to claim 5, characterized in that the sampling strategy is n_i = N_i when N_i ≤ 100, and n_i = a_i × (N_i - 100) when N_i > 100, where N_i is the number of positive samples of the i-th concept before sampling, n_i is the amount of training data of the i-th concept after sampling, and a_i is a sampling strategy parameter between 0 and 1.
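The selection step recited in claims 1 and 5 (train one classification model per sub-word list, score each on a validation split, keep the sub-word list with the highest accuracy) can be sketched for a single concept as follows. This is an illustrative sketch only: a simple nearest-centroid classifier stands in for the binary support vector machine, and the function names and toy data are assumptions, not part of the patent:

```python
import numpy as np

def centroid_classifier(train_x, train_y):
    """Stand-in for the binary SVM: score = distance to the negative centroid
    minus distance to the positive centroid (positive score means 'concept')."""
    pos = train_x[train_y == 1].mean(axis=0)
    neg = train_x[train_y == 0].mean(axis=0)
    return lambda x: np.linalg.norm(x - neg, axis=1) - np.linalg.norm(x - pos, axis=1)

def select_best_sub_word_list(hists_per_vocab, labels):
    """For one concept, train a model per candidate sub-word list and keep
    the one with the highest validation accuracy (the 'best sub-word list')."""
    best_i, best_acc, best_model = -1, -1.0, None
    for i, hists in enumerate(hists_per_vocab):
        n = len(hists) // 2                     # first half trains, second half validates
        model = centroid_classifier(hists[:n], labels[:n])
        pred = (model(hists[n:]) > 0).astype(int)
        acc = float((pred == labels[n:]).mean())
        if acc > best_acc:
            best_i, best_acc, best_model = i, acc, model
    return best_i, best_model

labels = np.array([0, 1] * 20)                  # alternating, so both splits see both classes
rng = np.random.default_rng(0)
h_flat = np.zeros((40, 5))                      # an uninformative sub-word list
h_good = labels[:, None] * 3.0 + rng.normal(scale=0.1, size=(40, 5))
best_i, _ = select_best_sub_word_list([h_flat, h_good], labels)
print(best_i)  # 1: the more discriminative sub-word list is selected
```

In the patent's pipeline this selection runs once per concept, and the model retained for the winning sub-word list becomes that concept's final detection classifier.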
CN201010271693.XA 2010-09-03 2010-09-03 Image concept detection method and device Active CN102385592B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010271693.XA CN102385592B (en) 2010-09-03 2010-09-03 Image concept detection method and device


Publications (2)

Publication Number Publication Date
CN102385592A CN102385592A (en) 2012-03-21
CN102385592B true CN102385592B (en) 2014-07-09

Family

ID=45825012

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010271693.XA Active CN102385592B (en) 2010-09-03 2010-09-03 Image concept detection method and device

Country Status (1)

Country Link
CN (1) CN102385592B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104299010B (en) * 2014-09-23 2017-11-10 深圳大学 A kind of Image Description Methods and system based on bag of words
CN104657427A (en) * 2015-01-23 2015-05-27 华东师范大学 Bag-of-visual-words information amount weight optimization-based image concept detection method
CN104657742A (en) * 2015-01-23 2015-05-27 华东师范大学 Image concept detection method based on Hamming embedding kernel, and Hamming embedding kernel thereof
CN105825178A (en) * 2016-03-14 2016-08-03 民政部国家减灾中心 Functional region dividing method and device based on remote-sensing image
CN106650778B (en) * 2016-10-14 2019-08-06 北京邮电大学 A kind of method and device of bag of words optimization and image recognition
CN109726726B (en) * 2017-10-27 2023-06-20 北京邮电大学 Event detection method and device in video
CN110516737B (en) * 2019-08-26 2023-05-26 南京人工智能高等研究院有限公司 Method and device for generating image recognition model
CN111460971B (en) * 2020-03-27 2023-09-12 北京百度网讯科技有限公司 Video concept detection method and device and electronic equipment
CN111753881B (en) * 2020-05-28 2024-03-29 浙江工业大学 Concept sensitivity-based quantitative recognition defending method against attacks
CN113222018B (en) * 2021-05-13 2022-06-28 郑州大学 Image classification method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101398846A (en) * 2008-10-23 2009-04-01 上海交通大学 Image, semantic and concept detection method based on partial color space characteristic

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Ana P. B. Lopes et al., "A bag-of-features approach based on Hue-SIFT descriptor for nude detection," 17th European Signal Processing Conference (EUSIPCO 2009), Aug. 28, 2009, full text. *
Tian Tian et al., "A new image classification method based on PLSA and the bag-of-words model," Journal of Xianyang Normal University, vol. 25, no. 4, Jul. 2010, full text. *
Huang Jianxin et al., "A comparative study of classifiers in bag-of-words-based image classification," Chinese Conference on Pattern Recognition (CCPR 2009), Dec. 2009, full text. *



Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant