CN107636678B - Method and apparatus for predicting attributes of image samples - Google Patents

Method and apparatus for predicting attributes of image samples

Info

Publication number
CN107636678B
CN107636678B CN201580080731.4A
Authority
CN
China
Prior art keywords
samples
training image
image samples
splitting
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201580080731.4A
Other languages
Chinese (zh)
Other versions
CN107636678A (en)
Inventor
汤晓鸥
黄琛
吕健勤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sensetime Technology Development Co Ltd
Original Assignee
Beijing Sensetime Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sensetime Technology Development Co Ltd filed Critical Beijing Sensetime Technology Development Co Ltd
Publication of CN107636678A
Application granted
Publication of CN107636678B
Active legal status (current)
Anticipated expiration legal status

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F 18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions, with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/243 Classification techniques relating to the number of classes
    • G06F 18/24323 Tree-organised classifiers

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The present application relates to methods and systems for predicting attributes of image samples. The method for predicting attributes of image samples includes: obtaining a plurality of image subsets from a training set comprising a plurality of training image samples; progressively splitting each image subset to generate a decision forest for prediction; determining paths of nodes of the test image sample in the decision forest; merging training image samples at all leaf nodes in each determined path; clustering all the merged training image samples to obtain overlapping clusters, each merged training image sample being clustered into at least one of the overlapping clusters; and predicting attributes of the test image sample from the overlapping clusters.

Description

Method and apparatus for predicting attributes of image samples
Technical Field
The present application relates to machine learning, and in particular to a method and apparatus for predicting attributes of image samples.
Background
Data imbalance exists in many visual tasks, ranging from high-level face age estimation and head pose estimation down to low-level edge detection. In the widely used FG-NET and MORPH databases, images of young subjects are far more numerous than images of older subjects; the human head rarely exhibits extreme poses; and image edge structures obey a power-law distribution on the BSDS500 database.
Without dealing with this imbalance problem, conventional vision algorithms tend to have a strong learning bias toward the majority classes and poor prediction accuracy for the minority classes, whose importance is usually equal or even greater (e.g., the few edges in a natural image may convey its most important semantic information). The poor learning of minority classes stems from their weak, or entirely absent, representation caused by a limited number of examples or even no examples at all, especially for small data sets. For example, the FG-NET and Pointing'04 head pose datasets contain only 1002 and 2790 images in total, respectively (8 images with age 60+ and 60 images with 90° pitch angle); and FG-NET has no images at all for some age classes above 60. This makes it even more challenging to infer unseen data from minority-class samples, which typically exhibit high variability. Worse still, small unbalanced data sets may be accompanied by class overlap, which further exacerbates the difficulty of learning.
In the field of machine learning, there are three common approaches for dealing with the negative effects of data imbalance: resampling, cost-sensitive learning, and ensemble learning. The goal of resampling methods is to equalize the class priors by undersampling the majority class or oversampling the minority class (or both), but doing so tends to discard valuable information or to introduce noise. By adjusting the misclassification cost associated with the samples, cost-sensitive learning is generally considered to work better than random resampling; however, the true cost is often unknown. One effective technique, with room for further improvement, is to resort to ensemble learning (even without any prior). Chen et al. combine bagging with cost-sensitive decision trees to generate a weighted version of random forests which, to our knowledge, is the only imbalanced learning method based on random forests. They use class weights both to balance the Gini criterion during node splitting and in the aggregation at leaf nodes.
The above methods share two common shortcomings: 1) they are designed for either classification or regression, with no general solution covering both; and 2) they have limited ability to account for unseen appearance or to synthesize new labels beyond the observed data space. This is more severe when imbalance is combined with a small number of samples, where minority classes are severely underrepresented due to an excessively small (or even zero) number of samples/labels. Herein, the problems of data imbalance and unseen data inference are addressed for both the classification and the regression case.
Disclosure of Invention
One aspect of the present application discloses a method for predicting attributes of image samples. The method for predicting attributes of image samples may comprise: obtaining a plurality of image subsets from a training set comprising a plurality of training image samples; progressively splitting each of these image subsets to generate a decision forest for prediction; determining paths of nodes of the test image sample in the decision forest; merging training image samples at all leaf nodes in each determined path; clustering all of the merged training image samples to obtain overlapping clusters, each of the merged training image samples being clustered into at least one of the overlapping clusters; and predicting attributes of the test image sample from the overlapping clusters.
According to embodiments of the present application, splitting may comprise: clustering training image samples into different classes at each node of a decision forest; assigning weights to the clustered classes, wherein greater weights are assigned to classes with fewer training image samples and lesser weights are assigned to classes with more training image samples; and splitting the training image samples based on the assigned weights.
According to embodiments of the present application, the decision forest may have a depth such that all training image samples in each class have the same attributes.
According to embodiments of the present application, the information gain of a decision forest may be below a first fixed threshold.
According to an embodiment of the present application, the number of training image samples at a leaf node of the decision forest may be below a second fixed threshold.
According to embodiments of the present application, splitting may comprise: the training image samples are split by a cost sensitive linear support vector machine for classification.
According to embodiments of the present application, splitting may comprise: the training image samples are split by cost sensitive linear support vector regression for regression.
According to an embodiment of the present application, clustering may include: calculating an inter-bias-point distance between two samples of the merged training image samples; and assigning each of the merged training image samples to at least one cluster based on the inter-bias-point distance to obtain overlapping clusters, wherein, if the two merged training image samples have the same attribute, the inter-bias-point distance is their Euclidean distance multiplied by a factor equal to or greater than 1, and otherwise the inter-bias-point distance is the Euclidean distance multiplied by a factor less than 1.
According to an embodiment of the present application, the predicting may include: finding a cluster of the overlapping clusters that approximates the test image sample; calculating a coefficient estimate for the test image sample from the found cluster; updating the coefficient estimate via a nearest-neighbor approximation; and predicting attributes of the test image sample using the updated coefficient estimate.
Another aspect of the application discloses a system for predicting attributes of image samples. The system for predicting attributes of image samples may comprise: splitting means for obtaining a plurality of image subsets from a training set comprising a plurality of training image samples and progressively splitting each of these subsets to generate a decision forest for prediction; a determining device electrically connected with the splitting device and used for determining the path of the node of the test image sample in the decision forest; clustering means electrically connected to the determining means and adapted to combine the training samples at all leaf nodes in each determined path and to locally cluster all combined training samples to obtain overlapping clusters, each of the overlapping clusters having at least two attributes; and a prediction means electrically connected to the clustering means and for predicting the property of the test sample from the overlapping clusters.
According to embodiments of the present application, the splitting apparatus may further comprise: a clustering unit for clustering the training image samples into different classes at each node of the decision forest; a first assigning unit electrically connected to the clustering unit and configured to assign weights to the clustered classes, wherein a greater weight is assigned to classes having fewer training image samples and a smaller weight is assigned to classes having more training image samples; and a splitting unit electrically connected to the first assigning unit and configured to split the training image samples based on the assigned weights.
According to an embodiment of the application, the splitting unit may be a cost sensitive linear support vector machine for classification.
According to embodiments of the present application, the splitting unit may be a cost sensitive linear support vector regression for regression.
According to an embodiment of the present application, the clustering device may further include: a calculating unit for calculating an inter-bias-point distance between two of the merged training image samples; and a second assigning unit electrically connected to the calculating unit and configured to assign each of the merged training image samples to at least one cluster based on the inter-bias-point distance to obtain overlapping clusters, wherein the calculating unit may calculate the inter-bias-point distance as the Euclidean distance of the two merged training image samples multiplied by a factor equal to or greater than 1 if the two samples have the same attribute, and otherwise as the Euclidean distance multiplied by a factor less than 1.
According to an embodiment of the present application, the prediction apparatus may further include: a finding unit for finding a cluster of the overlapping clusters that approximates the test image sample; an estimating unit electrically connected to the finding unit and configured to calculate a coefficient estimation value of the test image sample from the found cluster; an updating unit electrically connected to the estimating unit and configured to update the coefficient estimation value via a nearest neighbor approximation; and a prediction unit electrically connected to the update unit and configured to predict an attribute of the test image sample using the updated coefficient estimation value.
Another aspect of the application relates to a system for predicting attributes of image samples. The system may include: a memory that may store executable components; and a processor electrically coupled to the memory that can execute components to perform operations of the system, wherein the executable components can include: a splitting component for obtaining a plurality of image subsets from a training set comprising a plurality of training image samples and progressively splitting each subset to generate a decision forest for prediction; determining means for determining paths of nodes of the test image sample in the decision forest; clustering means for merging training samples at all leaf nodes in each determined path and locally clustering all merged training samples to obtain overlapping clusters; and a prediction component for predicting an attribute of the test sample from the overlapping cluster.
According to embodiments of the present application, the splitting component may further comprise: a clustering subcomponent for clustering the training image samples into distinct classes at each node of the decision forest; a first assigning sub-component for assigning weights to the cluster-processed classes, wherein a greater weight is assigned to the class with fewer training image samples and a lesser weight is assigned to the class with more training image samples; and a splitting subcomponent for splitting the training image samples based on the assigned weights.
According to an embodiment of the present application, the clustering means may further include: a calculating subcomponent for calculating an inter-bias-point distance between two of the merged training image samples; and a second assigning subcomponent for assigning each of the merged training image samples to at least one cluster based on the inter-bias-point distance to obtain overlapping clusters, wherein the calculating subcomponent can calculate the inter-bias-point distance as the Euclidean distance of the two merged training image samples multiplied by a factor equal to or greater than 1 if the two samples have the same attribute, and otherwise as the Euclidean distance multiplied by a factor less than 1.
According to an embodiment of the application, the prediction component may further comprise: a finding subcomponent for finding a cluster in the overlapping clusters that approximates the test image sample; an estimation subcomponent for calculating coefficient estimates for the test image samples from the found clusters; an update subcomponent for updating the coefficient estimates via a nearest-neighbor approximation; and a prediction subcomponent for predicting an attribute of the test image sample using the updated coefficient estimate.
The present application combines ensemble learning with cost sensitive learning in a natural way without resampling, thereby avoiding information loss and adding noise.
Drawings
Illustrative, non-limiting embodiments of the invention are described below with reference to the accompanying drawings. The figures are illustrative and are generally not drawn to exact scale. The same reference numbers will be used throughout the drawings to refer to the same or like elements.
Fig. 1 illustrates a method for predicting attributes of image samples according to an embodiment of the present application.
Fig. 2 illustrates sub-steps of generating a decision forest according to an embodiment of the present application.
Fig. 3 illustrates sub-steps of obtaining overlapping clusters according to an embodiment of the present application.
Fig. 4 illustrates a system for predicting attributes of image samples according to an embodiment of the present application.
Fig. 5 illustrates a schematic block diagram of a splitting apparatus according to an embodiment of the present application.
Fig. 6 illustrates a schematic block diagram of a clustering apparatus according to an embodiment of the present application.
Fig. 7 illustrates a schematic block diagram of a prediction apparatus according to an embodiment of the present application.
Fig. 8 illustrates a system for predicting attributes of image samples according to an embodiment of the present application.
Detailed Description
Hereinafter, embodiments of the present application will be described in detail with reference to the detailed description and the accompanying drawings.
Various embodiments of the present application are described with reference to a training set $S = \{s_i = (x_i, y_i)\}_{i=1}^{N}$, where $x_i \in \mathbb{R}^D$ is the feature vector of sample $s_i$, and $y_i$ is the label of sample $s_i$. The present application aims at unbiased prediction from the sample features x (even in the presence of severe imbalance and small data sets). The label $y \in C$ refers to a class index for classification (e.g., edge classes) and to a numerical value for regression (e.g., age and pose angle). To identify both the correct decision regions for the majority classes and the more important decision regions for the minority classes, the present application relies on efficient and robust random decision forests. A random decision forest is an ensemble of decision trees learned from a plurality of random subsets of the data. Each tree recursively divides the input space into disjoint partitions, generating candidate decision regions in a coarse-to-fine manner.
Fig. 1 illustrates a method 1000 for predicting attributes of an image sample according to an embodiment of the present application.
In step S100, a training set $S = \{s_i = (x_i, y_i)\}_{i=1}^{N}$ is received, and multiple image subsets are obtained from the training set by, for example, sampling.
Then, in step S200, each of the image subsets is progressively split to generate a decision tree. The generated decision trees constitute a decision forest for predicting attributes of the test image sample.
Step S200 will now be described in detail with reference to fig. 2.
As shown in fig. 2, in step S210, the training samples $S_j$ at node j are clustered into two classes $\{S_j^1, S_j^2\}$, for example by employing the well-known K-means technique. For classification, the training samples $S_j$ at node j are clustered into the two classes to be split into the left or right child node. For a multi-class case (e.g., a 10-class case), the 10 classes are grouped into one part comprising 5 similar classes and another part comprising the other 5 classes, and the two parts are then split progressively. Then, in step S220, to prevent a bias toward the majority class, weights are assigned to the clustered classes. In this application, the weights are defined as a function of the cluster distribution. For example, the weight may be associated with the factor

$$f(p_k) = (1 - p_k)/p_k, \quad \text{where } p_k = |S_j^k| / |S_j|.$$

Obviously, $f(p_k)$ gives the minority class more weight without sacrificing overall performance. Then, in step S230, the local samples $S_j$ at node j can be cost-sensitively split into $S_j^L$ and $S_j^R$. In particular, the cost-sensitive splitting may employ the factor $f(p_k)$. Step S230 stops when the maximum depth is reached or when the local sample size $|S_j|$ falls below the second fixed threshold. For classification, step S230 may also stop if the information gain described in equation (1) falls below the first fixed threshold. The information gain is defined as:

$$I_j = H(S_j) - \sum_{i \in \{L, R\}} \frac{|S_j^i|}{|S_j|} H(S_j^i) \qquad (1)$$

where H denotes the class entropy. For regression, the information gain may be computed with the label variance $H(S) = \sum_{y} (y - \mu)^2 / |S|$, where $\mu = \sum_{y} y / |S|$, in place of the entropy. A decision forest is thus obtained.
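As a concrete illustration of steps S210 to S230, the following Python sketch computes the cluster weights $f(p_k)$ and evaluates the split criterion of equation (1), with the label-variance variant for regression. The function names and the use of scikit-learn's KMeans are illustrative assumptions, not part of the patent.

```python
# Illustrative sketch of the cost-sensitive split criterion (steps S210-S230).
# Assumes NumPy and scikit-learn; all names here are hypothetical.
import numpy as np
from sklearn.cluster import KMeans

def cluster_weights(assignments):
    """f(p_k) = (1 - p_k) / p_k with p_k = |S_j^k| / |S_j| (step S220)."""
    _, counts = np.unique(assignments, return_counts=True)
    p = counts / counts.sum()
    return (1.0 - p) / p

def entropy(y):
    """Class entropy H(S) for the classification criterion."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def label_variance(y):
    """H(S) = sum_y (y - mu)^2 / |S| for the regression criterion."""
    return float(np.mean((np.asarray(y) - np.mean(y)) ** 2))

def information_gain(y, left_mask, impurity=entropy):
    """Equation (1): impurity of S_j minus size-weighted child impurities."""
    y = np.asarray(y)
    y_l, y_r = y[left_mask], y[~left_mask]
    n = len(y)
    return impurity(y) - len(y_l) / n * impurity(y_l) - len(y_r) / n * impurity(y_r)

# Step S210: cluster the local samples S_j at node j into two classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                    # toy local features
y = rng.integers(0, 3, size=200)                  # toy labels
side = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(cluster_weights(side), information_gain(y, side == 0))
```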
For example, to perform classification, the splitting function used in step S230 may be determined from a cost-sensitive version of the linear SVM:

$$\min_{w} \frac{1}{2}\|w\|^2 + C \sum_{i} f(p_{k_i}) \max\big(0,\, 1 - z_i w^T x_i\big) \qquad (2)$$

where w is the weight vector, C is the regularization parameter, and $z_i = 1$ if $x_i$ belongs to the first of the two clustered classes, otherwise $z_i = -1$. Finally, each training sample is sent to $S_j^L$ or $S_j^R$ according to $\mathrm{sgn}(w^T x_i)$.

To perform regression, the splitting function used in step S230 may be determined from a cost-sensitive linear SVR:

$$\min_{w} \frac{1}{2}\|w\|^2 + C \sum_{i} f(p_{k_i}) \max\big(0,\, |y_i - w^T x_i| - \epsilon\big) \qquad (3)$$

where $\epsilon \ge 0$. The node then branches each sample to the left or to the right by comparing its predicted value $w^T x_i$ with the local mean of the labels $\{y_i\}$.
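The split functions of equations (2) and (3) can be realised with off-the-shelf weighted solvers; the sketch below uses scikit-learn's LinearSVC and LinearSVR with per-sample weights $f(p_{k_i})$ as a stand-in for the patent's exact optimisation, so the solver choice and parameter values are assumptions.

```python
# Hedged sketch of cost-sensitive node splitting via weighted linear SVM/SVR.
import numpy as np
from sklearn.svm import LinearSVC, LinearSVR

def cost_sensitive_split(X, targets, weights, task="classification"):
    """Fit a weighted linear model at node j and route samples left or right.

    targets: z_i in {+1, -1} derived from the two clustered classes
             (classification), or the raw labels y_i (regression).
    weights: f(p_k) of the class each sample belongs to.
    """
    if task == "classification":
        model = LinearSVC(C=1.0)
        model.fit(X, targets, sample_weight=weights)
        return model.decision_function(X) >= 0          # sgn(w^T x_i)
    model = LinearSVR(C=1.0, epsilon=0.1)
    model.fit(X, targets, sample_weight=weights)
    # Branch by comparing the predicted value w^T x_i with the local label mean.
    return model.predict(X) <= np.mean(targets)

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 8))
z = np.where(rng.random(120) < 0.25, 1.0, -1.0)         # imbalanced clusters
w = np.where(z > 0, 3.0, 1.0)                           # heavier minority weight
left_mask = cost_sensitive_split(X, z, w)
```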
Then, in step S300, a test image sample is input to each decision tree of the decision forest generated in step S200. According to the splitting criterion of each node of the decision tree, the nodes that can be reached by the test image sample can be determined in each of the decision trees, and thus the paths of the nodes of the test image sample in the decision forest can be determined.
Then, in step S400, the training samples at all leaf nodes in each determined path are merged, so that a broader decision region covering as many minority samples as possible is carved out. That is, all the sample sets $\{S_l\}$ of the leaf nodes reachable by the test sample are merged into a larger sample set $\tilde{S} = \bigcup_l S_l$.
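The routing and merging of steps S300 and S400 can be pictured with the minimal sketch below. The Node structure and the sign-based routing rule are illustrative assumptions; the patent only requires that each node's split function decide the branch.

```python
# Minimal sketch: route a test sample down each tree, merge leaf sample sets.
from dataclasses import dataclass
from typing import Optional
import numpy as np

@dataclass
class Node:
    sample_ids: list                     # training-sample indices at this node
    split: Optional[np.ndarray] = None   # weight vector w of the split function
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def leaf_samples(root: Node, x: np.ndarray) -> list:
    """Follow the node path of test sample x down to a leaf (step S300)."""
    node = root
    while node.left is not None and node.right is not None:
        node = node.left if float(node.split @ x) >= 0.0 else node.right
    return node.sample_ids

def merge_leaf_samples(forest: list, x: np.ndarray) -> set:
    """Union the leaf sample sets over all determined paths (step S400)."""
    merged = set()
    for root in forest:
        merged.update(leaf_samples(root, x))
    return merged
```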
Then, in step S500, the merged training samples are clustered into overlapping clusters. That is, each merged training image sample may belong to at least one cluster, so that the overlapping clusters can have complementary shapes, thereby enriching the cluster representation.
Step S500 will now be described in detail with reference to fig. 3.
As shown in fig. 3, in step S510, the inter-bias-point distance between two of the merged training samples is calculated. For example, the inter-point distance $\tilde{d}(x_i, x_j)$ between $x_i$ and $x_j$ has a label-biased attribute:

$$\tilde{d}(x_i, x_j) = d(x_i, x_j)\,\big(1 + \mathbb{1}(y_i \neq y_j)\, g(y_i)\big) \qquad (4)$$

where d is the Euclidean distance, $\mathbb{1}(y_i \neq y_j) = 1$ when the classes of $x_i$ and $x_j$ differ, and $g(y) = \tau y / (\max\{y\} - y)$ is an increasing function with trade-off parameter τ. The biased distance makes the clustering discriminative because it favors "same-class" data over data of different classes. In the extreme case (e.g., for classification), it can still form multiple clusters drawn entirely from a single class (even if the cluster members differ significantly in shape), which is suitable for classification. The inter-bias-point distance may be used within a K-means technique for clustering.
Then, in step S520, each of the merged training samples is assigned to at least one cluster based on the inter-bias-point distance. For example, the clusters are allowed to overlap one another by releasing, in each iteration, sample $x_i$ not only to its nearest centroid $c^*$ but to every centroid $c_k$ whose biased distance to $x_i$ satisfies $\omega\,\tilde{d}(x_i, c_k) \le \tilde{d}(x_i, c^*)$ (empirically, ω = 0.8).
Hereinafter, step S500 will now be discussed in detail in examples according to embodiments of the present application.
Given N training samples, to cluster the N training samples into K overlapping clusters, the following steps are performed:
I. determining the centroids of the K clusters;
II. calculating the biased inter-point distances between the remaining N-K image samples and the centroids of the K clusters;
III. assigning each of the image samples to one or more centroids, i.e., allowing the clusters to overlap each other: in each iteration, sample $x_i$ is released not only to its nearest centroid $c^*$ but to every centroid $c_k$ satisfying $\omega\,\tilde{d}(x_i, c_k) \le \tilde{d}(x_i, c^*)$. The N image samples are thereby clustered into K clusters by a modified K-means technique (based on the biased distance and on multiple assignment), which results in overlapping clusters, each possibly containing some "inter-class" samples, but with complementary shapes that enrich the cluster representation;
IV. updating the centroids of the K clusters; and
V. repeating steps II-IV until the centroids of the K overlapping clusters converge.
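The Python sketch below is one possible reading of steps I to V: a K-means variant that uses the label-biased distance of equation (4) and releases each sample to every centroid sufficiently close to its nearest one. The exact form of the bias term and of the release rule governed by ω are assumptions, not the patent's prescribed formulas.

```python
# Hedged sketch of overlapping K-means with a label-biased distance (step S500).
import numpy as np

def biased_distance(xi, xj, yi, yj, tau=1.0, y_max=10.0):
    """Euclidean distance, enlarged across classes via g(y) = tau*y/(max{y}-y)."""
    d = float(np.linalg.norm(xi - xj))
    if yi != yj:
        d *= 1.0 + tau * yi / max(y_max - yi, 1e-8)
    return d

def overlapping_kmeans(X, y, K, omega=0.8, iters=20, seed=0):
    """X: (N, D) features; y: integer class labels of the merged leaf samples."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=K, replace=False)
    centroids = X[idx].astype(float)                    # step I
    cent_y = y[idx].copy()
    members = [[] for _ in range(K)]
    for _ in range(iters):                              # steps II-V
        members = [[] for _ in range(K)]
        for i in range(len(X)):                         # step II
            d = np.array([biased_distance(X[i], c, y[i], cy, y_max=float(y.max()))
                          for c, cy in zip(centroids, cent_y)])
            # Step III: release x_i to every centroid whose biased distance is
            # within a factor 1/omega of the nearest one, so clusters overlap.
            for k in np.flatnonzero(omega * d <= d.min()):
                members[k].append(i)
        for k in range(K):                              # step IV
            if members[k]:
                centroids[k] = X[members[k]].mean(axis=0)
                vals, cnts = np.unique(y[members[k]], return_counts=True)
                cent_y[k] = vals[np.argmax(cnts)]       # majority member label
    return centroids, members
```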
Then, in step S600, the properties of the test samples may be predicted from the overlapping clusters.
Step S600 will be described in detail below. Given the K overlapping clusters $\{C_k\}_{k=1}^{K}$ generated in step S500, with their feature matrices $L_k = [x_1, \ldots, x_{n_k}]$ and label vectors $y_k$, the label of a test sample q is predicted in step S600.
Specifically, in step S600, each of the overlapping clusters is first modeled by an affine hull model $AH_k$, which is able to account for unseen variations of the data. Each $AH_k$ covers all possible affine combinations of its samples and can be parameterized as:

$$AH_k = \{x = \mu_k + U_k v_k\}, \quad k = 1, \ldots, K \qquad (5)$$

where $\mu_k$ is the centroid, $U_k$ is the orthonormal basis obtained from the SVD of the centered $L_k$, and $v_k$ is a coefficient vector.

Then, the affine hull model used to approximate the sample q is determined by computing $k = \arg\min_k \min_{v_k} \|q - (\mu_k + U_k v_k)\|_2$ (i.e., by determining the index k). The sample q is then updated as $\mu_k + U_k v_k$.
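The affine hull model of equation (5) and the nearest-hull selection can be sketched as below. Retaining basis vectors up to a fixed fraction of the SVD energy is an assumption, since the patent does not specify the dimensionality of $U_k$.

```python
# Minimal sketch of the affine hull model AH_k and nearest-hull selection.
import numpy as np

def affine_hull(Lk, energy=0.98):
    """Return (mu_k, U_k) for cluster feature matrix L_k (columns = samples)."""
    mu = Lk.mean(axis=1, keepdims=True)                 # centroid of the cluster
    U, s, _ = np.linalg.svd(Lk - mu, full_matrices=False)
    r = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), energy)) + 1
    return mu.ravel(), U[:, :r]                         # orthonormal basis U_k

def nearest_hull(q, hulls):
    """Pick the index k minimising ||q - (mu_k + U_k v_k)|| and project q."""
    best = None
    for k, (mu, U) in enumerate(hulls):
        v = U.T @ (q - mu)                              # least-squares v_k
        proj = mu + U @ v                               # updated q on AH_k
        err = float(np.linalg.norm(q - proj))
        if best is None or err < best[0]:
            best = (err, k, proj)
    return best[1], best[2]

rng = np.random.default_rng(0)
hulls = [affine_hull(rng.normal(size=(32, 12))) for _ in range(4)]
k, q_updated = nearest_hull(rng.normal(size=32), hulls)
```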
Then, based on the updated q and the determined index k, a robust estimate $\hat{\alpha}_k$ of the coefficient vector is computed (equation (6)). Based on the estimate $\hat{\alpha}_k$, the sparse coefficient vector $\alpha_k$ can then be determined (equation (7)), so that the sparse coefficients approximate q by a sparse combination of its neighboring samples within the cluster.
Then, the joint optimization of the clustering and of its approximation under the nearest-neighbor sparse prior is formulated as equation (8), where $\epsilon \ge 0$ and λ and γ are tuning parameters.
These operations are repeated until convergence is reached. The label of sample q is then predicted as $\hat{y} = y_k^T \alpha_k$ for regression, or, for classification, by majority voting among the labels in $y_k$ that correspond to the nonzero (sparse) components of $\alpha_k$.
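To make the final prediction of step S600 concrete, the sketch below approximates the updated query by a sparse combination of the chosen cluster's samples and reads the label off the coefficients. An off-the-shelf l1 solver (scikit-learn's Lasso) stands in for the joint optimisation of equation (8), and the normalised regression output and weighted vote are illustrative choices rather than the patent's exact formulas.

```python
# Hedged sketch of label prediction from sparse coefficients (step S600).
import numpy as np
from sklearn.linear_model import Lasso

def predict_label(q, Lk, yk, lam=0.01, task="regression"):
    """Sparse alpha_k with q ~= Lk @ alpha_k; columns of Lk are cluster samples."""
    alpha = Lasso(alpha=lam, fit_intercept=False, max_iter=5000).fit(Lk, q).coef_
    if task == "regression":
        s = np.abs(alpha).sum()
        return float(yk @ alpha / s) if s > 0 else float(np.mean(yk))
    votes = {}                         # classification: vote weighted by |alpha|
    for w, label in zip(np.abs(alpha), yk):
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)

rng = np.random.default_rng(0)
Lk = rng.normal(size=(32, 10))         # D x n_k feature matrix of the cluster
yk = rng.normal(size=10)               # labels of the cluster samples
q = Lk[:, :3].mean(axis=1)             # toy query near the cluster
print(predict_label(q, Lk, yk))
```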
The above operations for the method for predicting attributes of a sample according to an embodiment of the present application are summarized in an algorithm listing, reproduced as an image in the original publication.
The application also relates to a system for predicting attributes of image samples according to embodiments of the application.
Fig. 4 illustrates a system 2000 for predicting attributes of image samples according to an embodiment of the present application. The system 2000 is described with reference to the training set $S = \{s_i = (x_i, y_i)\}_{i=1}^{N}$ mentioned above.
As shown in fig. 4, the system 2000 includes a splitting means 100, a determining means 200, a clustering means 300, and a predicting means 400.
As shown in fig. 5, the splitting apparatus 100 includes a clustering unit 110, a first assigning unit 120, and a splitting unit 130. The training set $S$ is input into the clustering unit 110. The clustering unit 110 is used to generate a plurality of image subsets from the training set, e.g., by sampling. In addition, the clustering unit 110 clusters the training samples $S_j$ at node j into two classes $\{S_j^1, S_j^2\}$, for example by employing the well-known K-means technique.
The first assigning unit 120 is electrically connected to the clustering unit 110. The first assigning unit 120 is configured to assign weights to the clustered classes according to the output of the clustering unit 110. The weights are the same as those mentioned in step S220, and a detailed description thereof is not repeated here.
The splitting unit 130 is electrically connected to the first assigning unit 120. Based on the assigned weights, the splitting unit 130 may cost-sensitively split the local samples $S_j$ at node j into $S_j^L$ and $S_j^R$; the splitting unit 130 may employ the factor $f(p_k)$ to perform the cost-sensitive split of $S_j$. The splitting unit 130 may stop splitting when the maximum depth is reached or when the local sample size $|S_j|$ falls below the second fixed threshold. For classification, the splitting unit 130 may also stop splitting if the information gain described in equation (1) falls below the first fixed threshold. To perform regression, the information gain may be replaced with the above-mentioned label variance. A decision forest is thus obtained.
The determining means 200 is electrically connected to the splitting means 100. The generated decision forest is output by the splitting means 100 to the determining means 200, and the test sample is input to the determining means 200. The determining means 200 is then used to determine the nodes in each decision tree that can be reached by the test image sample, and hence to determine the paths of the nodes of the test image sample in the decision forest.
The clustering means 300 is electrically connected to the determining means 200. The clustering means 300 is arranged to merge the training samples at all leaf nodes in each determined path, thereby carving out a broader decision region covering as many minority samples as possible. That is, the clustering means 300 merges all the sample sets $\{S_l\}$ of the leaf nodes reachable by the test sample into a larger sample set $\tilde{S} = \bigcup_l S_l$.
Then, the clustering means 300 clusters the merged training samples into overlapping clusters.
As shown in fig. 6, the clustering means 300 further includes a calculating unit 310 and a second assigning unit 320.
The calculating unit 310 is configured to calculate the inter-bias-point distance between two of the merged training samples, for example the inter-bias-point distance defined in equation (4).
The second assigning unit 320 is electrically connected to the calculating unit 310, and the inter-bias-point distance is output by the calculating unit 310 to the second assigning unit 320. The second assigning unit 320 is then used to assign each of the merged training samples to at least one cluster based on the inter-bias-point distance. For example, the second assigning unit 320 may allow the clusters to overlap one another by releasing, in each iteration, sample $x_i$ to more than one centroid near its nearest centroid, in the same manner as described for step S520 (empirically, ω = 0.8).
The prediction apparatus 400 is electrically connected to the clustering apparatus 300. The overlapping clusters are output by the clustering means 300 to the prediction means 400. The prediction means 400 is then used to predict the properties of the test samples from the overlapping clusters.
As shown in fig. 7, the prediction apparatus 400 includes a finding unit 410, an estimation unit 420, an updating unit 430, and a prediction unit 440.
The finding unit 410 is used to find clusters that approximate the test sample among the overlapping clusters. The estimation unit 420 is electrically connected to the finding unit 410 and is configured to calculate coefficient estimates for the test image samples from the found clusters. The updating unit 430 is electrically connected to the estimation unit 420 and is configured to update the coefficient estimates via a nearest-neighbor approximation. The prediction unit 440 is electrically connected to the updating unit 430 and is configured to predict an attribute of the test image sample using the updated coefficient estimate. The operation of the prediction apparatus is substantially the same as that described for step S600.
The present application also relates to a system 3000 for predicting properties of a test sample according to embodiments of the present application.
As shown in fig. 8, system 3000 includes: a memory 3100 storing executable components; and a processor 3200 coupled to the memory 3100 and for executing the executable components to perform the operations of system 3000. These executable components include: a splitting component 3110 for obtaining a plurality of image subsets from a training set comprising a plurality of training image samples, and progressively splitting each of the subsets to generate a decision forest for prediction; a determining component 3120 for determining paths of nodes of the test image sample in the decision forest; a clustering component 3130 for merging the training samples at all leaf nodes in each determined path and locally clustering all merged training samples to obtain overlapping clusters; and a prediction component 3140 for predicting the attributes of the test samples from the overlapping clusters.
According to an embodiment of the present application, the splitting component 3110 further comprises: a clustering subcomponent for clustering the training image samples into distinct classes at each node of the decision forest; a first assigning sub-component for assigning weights to the cluster-processed classes, wherein a greater weight is assigned to the class with fewer training image samples and a lesser weight is assigned to the class with more training image samples; and a splitting subcomponent for splitting the training image samples based on the assigned weights.
According to an embodiment of the present application, clustering component 3130 further comprises: a calculating subcomponent for calculating an inter-bias-point distance between two of the merged training image samples; and a second assigning subcomponent for assigning each of the merged training image samples to at least one cluster based on the inter-bias-point distance to obtain overlapping clusters, wherein the calculating subcomponent calculates the inter-bias-point distance as the Euclidean distance of the two merged training image samples multiplied by a factor equal to or greater than 1 if the two samples have the same attribute, and otherwise as the Euclidean distance multiplied by a factor less than 1.
According to an embodiment of the present application, the prediction component 3140 further includes: a finding subcomponent for finding a cluster in the overlapping clusters that approximates the test image sample; an estimation subcomponent for calculating a coefficient estimate of the test image sample from the found cluster; an update subcomponent for updating the coefficient estimate via a nearest-neighbor approximation; and a prediction subcomponent for predicting an attribute of the test image sample using the updated coefficient estimate.
Embodiments within the scope of the present invention may be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. Apparatus within the scope of the invention may be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and method acts within the scope of the invention may be performed by a programmable processor executing a program of instructions to perform functions of the invention by operating on input data and generating output.
Embodiments within the scope of the present invention are advantageously implemented in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. Each computer program may be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language if desired; and in any case, the language may be a compiled or interpreted language. By way of example, suitable processors include both general purpose and special purpose microprocessors. Generally, a processor will receive instructions and data from a read-only memory and/or a random access memory. Generally, a computer will include one or more mass storage devices for storing data files.
Embodiments within the scope of the present invention include computer-readable media for carrying or having computer-executable instructions, computer-readable instructions, or data structures stored thereon. Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer system. Examples of computer readable media may include: physical storage media such as RAM, ROM, EPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices; or any other medium which can be used to carry or store desired program code means in the form of computer-executable instructions, computer-readable instructions, or data structures and which can be accessed by a general purpose or special purpose computer system. Any of the above may be supplemented by, or incorporated in, ASICs (application-specific integrated circuits). While embodiments of the present invention have been shown and described, changes and modifications may be made to such embodiments without departing from the true scope of the present invention.
While preferred examples of the present invention have been described, variations or modifications in those examples may occur to those skilled in the art upon learning of the basic inventive concepts. It is intended that the appended claims be construed to include preferred examples and that all such variations or modifications are within the scope of the invention.
It will be apparent to those skilled in the art that variations and modifications of the present invention can be made without departing from the spirit and scope of the invention. Thus, if these changes or modifications fall within the scope of claims and equivalent techniques, they may also fall within the scope of the present invention.

Claims (16)

1. A method for predicting attributes of image samples, comprising:
obtaining a plurality of image subsets from a training set comprising a plurality of training image samples;
gradually splitting each of the image subsets to generate a decision forest for prediction;
determining paths of nodes of the test image sample in the decision forest;
merging training image samples at all leaf nodes in each determined path;
clustering all merged training image samples to obtain overlapping clusters, each merged training image sample being clustered into at least one of the overlapping clusters; and
predicting properties of the test image sample from the overlapping clusters,
wherein the splitting comprises:
clustering the training image samples into different classes at each node of the decision forest;
the clustered classes are assigned weights, with classes with fewer training image samples being assigned more weight and classes with more training image samples being assigned less weight.
2. The method of claim 1, wherein the splitting further comprises:
splitting the training image samples based on the assigned weights.
3. The method of claim 2, wherein the decision forest has a depth such that all training image samples of each said class have the same attribute.
4. The method of claim 2, wherein an information gain of the decision forest is below a first fixed threshold.
5. The method of claim 1, wherein the number of training image samples at a leaf node of the decision forest is below a second fixed threshold.
6. The method of claim 2, wherein the splitting comprises:
the training image samples are split by a cost sensitive linear support vector machine for classification.
7. The method of claim 2, wherein the splitting comprises:
the training image samples are split by cost sensitive linear support vector regression for regression.
8. The method of claim 1, wherein the clustering comprises:
calculating an inter-bias-point distance between two samples of the merged training image samples; and
assigning each of the merged training image samples to at least one cluster based on the inter-bias-point distance to obtain the overlapping clusters,
wherein, if the two samples of the merged training image samples have the same attribute, the inter-bias-point distance is their Euclidean distance multiplied by a factor equal to or greater than 1, and otherwise the inter-bias-point distance is the Euclidean distance multiplied by a factor less than 1.
9. A system for predicting attributes of image samples, comprising:
splitting means for obtaining a plurality of image subsets from a training set comprising a plurality of training image samples and progressively splitting each of said image subsets to generate a decision forest for prediction;
determining means electrically connected to the splitting means and for determining paths of nodes of a test image sample in the decision forest;
clustering means electrically connected to the determining means and configured to combine the training samples at all leaf nodes in each determined path and to locally cluster all combined training samples to obtain overlapping clusters, each of the overlapping clusters having at least two attributes; and
prediction means electrically connected to said clustering means and for predicting an attribute of a test sample from said overlapping clusters,
wherein the splitting apparatus further comprises:
a clustering unit that clusters the training image samples into different classes at each node of the decision forest;
a first assigning unit electrically connected to the clustering unit and assigning weights to the clustered classes, wherein a greater weight is assigned to a class having fewer training image samples and a lesser weight is assigned to the class having more training image samples.
10. The system of claim 9, wherein the splitting device further comprises:
a splitting unit electrically connected with the assigning unit and splitting the training image samples based on the assigned weights.
11. The system of claim 10, wherein the splitting unit is a cost sensitive linear support vector machine for classification.
12. The system of claim 10, wherein the splitting unit is a cost sensitive linear support vector regression for regression.
13. The system of claim 9, wherein the clustering means further comprises:
a calculating unit for calculating an inter-bias-point distance between two samples of the merged training image samples; and
a second assigning unit electrically connected to the calculating unit and assigning each of the merged training image samples to at least one cluster based on the inter-bias-point distance to obtain the overlapping clusters,
wherein the calculating unit calculates the inter-bias-point distance as the Euclidean distance of the two samples of the merged training image samples multiplied by a factor equal to or greater than 1 if the two samples have the same attribute, and otherwise as the Euclidean distance multiplied by a factor less than 1.
14. A system for predicting attributes of image samples, comprising:
a memory storing executable components; and
a processor electrically coupled to the memory to execute the executable components to perform operations of the system, wherein the executable components comprise:
a splitting component for obtaining a plurality of image subsets from a training set comprising a plurality of training image samples and progressively splitting each of said image subsets to generate a decision forest for prediction;
determining means for determining paths of nodes of a test image sample in the decision forest;
clustering means for merging training image samples at all leaf nodes in each determined path and clustering all merged training samples to obtain overlapping clusters, each merged training image sample being clustered into at least one of the overlapping clusters; and
a prediction component for predicting an attribute of a test sample from the overlapping clusters,
wherein the splitting component further comprises:
a clustering subcomponent for clustering the training image samples into distinct classes at each node of the decision forest;
a first assigning subcomponent for assigning weights to the cluster-processed classes, wherein greater weights are assigned to classes with fewer training image samples and lesser weights are assigned to classes with more training image samples.
15. The system of claim 14, wherein the splitting component further comprises:
a splitting subcomponent for splitting the training image samples based on the assigned weights.
16. The system of claim 14, wherein the clustering component further comprises:
a calculating subcomponent for calculating an inter-bias-point distance between two of the merged training image samples; and
a second assigning subcomponent for assigning each of the merged training image samples to at least one cluster based on the inter-bias-point distance to obtain the overlapping clusters,
wherein the calculating subcomponent calculates the inter-bias-point distance as the Euclidean distance of the two samples of the merged training image samples multiplied by a factor equal to or greater than 1 if the two samples have the same attribute, and otherwise as the Euclidean distance multiplied by a factor less than 1.
CN201580080731.4A 2015-06-29 2015-06-29 Method and apparatus for predicting attributes of image samples Active CN107636678B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/082645 WO2017000118A1 (en) 2015-06-29 2015-06-29 Method and apparatus for predicting attribute for image sample

Publications (2)

Publication Number Publication Date
CN107636678A CN107636678A (en) 2018-01-26
CN107636678B true CN107636678B (en) 2021-12-14

Family

ID=57607394

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201580080731.4A Active CN107636678B (en) 2015-06-29 2015-06-29 Method and apparatus for predicting attributes of image samples

Country Status (2)

Country Link
CN (1) CN107636678B (en)
WO (1) WO2017000118A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11531927B2 (en) * 2017-11-28 2022-12-20 Adobe Inc. Categorical data transformation and clustering for machine learning using natural language processing
EP3806065A1 (en) 2019-10-11 2021-04-14 Aptiv Technologies Limited Method and system for determining an attribute of an object at a pre-determined time point
CN112215186B (en) * 2020-10-21 2024-04-05 深圳市赛为智能股份有限公司 Classification method, device, computer equipment and storage medium for marsh wetland vegetation

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102592147A (en) * 2011-12-30 2012-07-18 深圳市万兴软件有限公司 Method and device for detecting human face
CN103049514A (en) * 2012-12-14 2013-04-17 杭州淘淘搜科技有限公司 Balanced image clustering method based on hierarchical clustering
CN103268317A (en) * 2012-02-06 2013-08-28 微软公司 System and method for semantically annotating images
CN103679132A (en) * 2013-07-15 2014-03-26 北京工业大学 A sensitive image identification method and a system
CN103971112A (en) * 2013-02-05 2014-08-06 腾讯科技(深圳)有限公司 Image feature extracting method and device
CN103971097A (en) * 2014-05-15 2014-08-06 武汉睿智视讯科技有限公司 Vehicle license plate recognition method and system based on multiscale stroke models
CN103984953A (en) * 2014-04-23 2014-08-13 浙江工商大学 Cityscape image semantic segmentation method based on multi-feature fusion and Boosting decision forest
CN104573715A (en) * 2014-12-30 2015-04-29 百度在线网络技术(北京)有限公司 Recognition method and device for image main region
CN104603291A (en) * 2012-06-22 2015-05-06 Htg分子诊断有限公司 Molecular malignancy in melanocytic lesions
CN104683686A (en) * 2013-11-27 2015-06-03 富士施乐株式会社 Image processing apparatus and image processing method
CN104680559A (en) * 2015-03-20 2015-06-03 青岛科技大学 Multi-view indoor pedestrian tracking method based on movement behavior mode

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004053778A2 (en) * 2002-12-11 2004-06-24 Koninklijke Philips Electronics N.V. Computer vision system and method employing illumination invariant neural networks
US9519868B2 (en) * 2012-06-21 2016-12-13 Microsoft Technology Licensing, Llc Semi-supervised random decision forests for machine learning using mahalanobis distance to identify geodesic paths
US20140015855A1 (en) * 2012-07-16 2014-01-16 Canon Kabushiki Kaisha Systems and methods for creating a semantic-driven visual vocabulary
CN104680118B (en) * 2013-11-29 2018-06-15 华为技术有限公司 A kind of face character detection model generation method and system

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102592147A (en) * 2011-12-30 2012-07-18 深圳市万兴软件有限公司 Method and device for detecting human face
CN103268317A (en) * 2012-02-06 2013-08-28 微软公司 System and method for semantically annotating images
CN104603291A (en) * 2012-06-22 2015-05-06 Htg分子诊断有限公司 Molecular malignancy in melanocytic lesions
CN103049514A (en) * 2012-12-14 2013-04-17 杭州淘淘搜科技有限公司 Balanced image clustering method based on hierarchical clustering
CN103971112A (en) * 2013-02-05 2014-08-06 腾讯科技(深圳)有限公司 Image feature extracting method and device
CN103679132A (en) * 2013-07-15 2014-03-26 北京工业大学 A sensitive image identification method and a system
CN104683686A (en) * 2013-11-27 2015-06-03 富士施乐株式会社 Image processing apparatus and image processing method
CN103984953A (en) * 2014-04-23 2014-08-13 浙江工商大学 Cityscape image semantic segmentation method based on multi-feature fusion and Boosting decision forest
CN103971097A (en) * 2014-05-15 2014-08-06 武汉睿智视讯科技有限公司 Vehicle license plate recognition method and system based on multiscale stroke models
CN104573715A (en) * 2014-12-30 2015-04-29 百度在线网络技术(北京)有限公司 Recognition method and device for image main region
CN104680559A (en) * 2015-03-20 2015-06-03 青岛科技大学 Multi-view indoor pedestrian tracking method based on movement behavior mode

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Supervising Random Forest Using Attribute Interaction Networks; Qinxin Pan et al.; Proceedings of the 11th European Conference on Evolutionary Computation; 2013-04-30; full text *
Facial pose analysis based on regression forests; Qiao Tizhou et al.; Journal of Computer-Aided Design & Computer Graphics; July 2014; Vol. 26, No. 7; full text *

Also Published As

Publication number Publication date
CN107636678A (en) 2018-01-26
WO2017000118A1 (en) 2017-01-05

Similar Documents

Publication Publication Date Title
US9002101B2 (en) Recognition device, recognition method, and computer program product
US7542954B1 (en) Data classification by kernel density shape interpolation of clusters
CN106557485B (en) Method and device for selecting text classification training set
Carbonera et al. A density-based approach for instance selection
US10540566B2 (en) Image processing apparatus, image processing method, and program
US9563822B2 (en) Learning apparatus, density measuring apparatus, learning method, computer program product, and density measuring system
US11636164B2 (en) Search system for providing web crawling query prioritization based on classification operation performance
JP6897749B2 (en) Learning methods, learning systems, and learning programs
JP2012038244A (en) Learning model creation program, image identification information giving program, learning model creation device, image identification information giving device
CN107636678B (en) Method and apparatus for predicting attributes of image samples
Yones et al. Genome-wide pre-miRNA discovery from few labeled examples
US10671663B2 (en) Generation device, generation method, and non-transitory computer-readable recording medium
CN114972737B (en) Remote sensing image target detection system and method based on prototype contrast learning
WO2019184480A1 (en) Item recommendation
Mostafaei et al. OUBoost: boosting based over and under sampling technique for handling imbalanced data
US20210158901A1 (en) Utilizing a neural network model and hyperbolic embedded space to predict interactions between genes
US11727464B2 (en) Utilizing machine learning models to determine and recommend new releases from cloud providers to customers
Vahidipour et al. Comparing weighted combination of hierarchical clustering based on Cophenetic measure
Jung et al. Scaling of class-wise training losses for post-hoc calibration
Bouchachia et al. Incremental spectral clustering
CN111723199A (en) Text classification method and device and computer readable storage medium
Rosyid et al. Optimizing K-Means Initial Number of Cluster Based Heuristic Approach: Literature Review Analysis Perspective
US20240134616A1 (en) Intelligent adaptive self learning framework for data processing on cloud data fusion
Dik et al. A new dynamic algorithm for unsupervised learning
CN103150574A (en) Image spam detection method based on nearest tag propagation algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant