WO2017000118A1 - Method and apparatus for predicting attribute for image sample


Info

Publication number
WO2017000118A1
Authority
WO
WIPO (PCT)
Prior art keywords
training image
image samples
splitting
predicting
samples
Application number
PCT/CN2015/082645
Other languages
French (fr)
Inventor
Xiaoou Tang
Chen Huang
Chen Change Loy
Original Assignee
Xiaoou Tang
Application filed by Xiaoou Tang
Priority to CN201580080731.4A (CN107636678B)
Priority to PCT/CN2015/082645 (WO2017000118A1)
Publication of WO2017000118A1

Classifications

    • G - PHYSICS
      • G06 - COMPUTING; CALCULATING OR COUNTING
        • G06F - ELECTRIC DIGITAL DATA PROCESSING
          • G06F18/00 - Pattern recognition
            • G06F18/20 - Analysing
              • G06F18/23 - Clustering techniques
                • G06F18/232 - Non-hierarchical techniques
                  • G06F18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
                    • G06F18/23213 - with fixed number of clusters, e.g. K-means clustering
              • G06F18/24 - Classification techniques
                • G06F18/243 - Classification techniques relating to the number of classes
                  • G06F18/24323 - Tree-organised classifiers

Definitions

  • Embodiments within the scope of the present invention can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations thereof. Apparatus within the scope of the present invention can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and method actions within the scope of the present invention can be performed by a programmable processor executing a program of instructions to perform functions of the invention by operating on input data and generating output.
  • Embodiments within the scope of the present invention can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device.
  • Each computer program can be implemented in a high-level procedural or object oriented programming language, or in assembly or machine language if desired; and in any case, the language can be a compiled or interpreted language.
  • Suitable processors include, by way of example, both general and special purpose microprocessors.
  • a processor will receive instructions and data from a read-only memory and/or a random access memory.
  • a computer will include one or more mass storage devices for storing data files.
  • Embodiments within the scope of the present invention include computer-readable media for carrying or having computer-executable instructions, computer-readable instructions, or data structures stored thereon.
  • Such computer-readable media may be any available media, which is accessible by a general-purpose or special-purpose computer system.
  • Examples of computer-readable media may include physical storage media such as RAM, ROM, EPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other media which can be used to carry or store desired program code means in the form of computer-executable instructions, computer-readable instructions, or data structures and which may be accessed by a general-purpose or special-purpose computer system. Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits). While particular embodiments of the present invention have been shown and described, changes and modifications may be made to such embodiments without departing from the true scope of the invention.

Abstract

The present application relates to a method and a system for predicting an attribute for an image sample. The method for predicting the attribute for the image sample comprises: obtaining a plurality of image subsets from a training set comprising a plurality of training image samples; splitting progressively each of the image subsets to generate a decision forest for prediction; determining paths of nodes in the decision forest for a test image sample; merging the training image samples at all leaf nodes in each of the determined paths; clustering all the merged training image samples to obtain overlapping clusters, each of the merged training image samples being clustered into at least one of the overlapping clusters; and predicting, from the overlapping clusters, an attribute for the test image sample.

Description

METHOD AND APPARATUS FOR PREDICTING ATTRIBUTE FOR IMAGE SAMPLE Technical Field
The present application relates to machine learning, and in particular to a method and an apparatus for predicting an attribute for an image sample.
Background of the Application
Data imbalance exists in many vision tasks ranging from low-level edge detection to high-level facial age estimation and head pose estimation. There are often many more images of the young than of the old in the widely used FG-NET and MORPH datasets, the human head rarely exhibits extreme poses, and the various image edge structures obey a power-law distribution on the BSDS500 dataset.
Without handling this imbalance issue, conventional vision algorithms have a very strong learning bias towards the majority class, with poor predictive accuracy for the minority class, which is usually of equal or greater interest (e.g. rare edges may convey the most important semantic information about natural images). The insufficient learning for the minority class is due to its underrepresentation by a limited number of, or even no, examples, especially in the presence of small datasets. For instance, the FG-NET and Pointing'04 head pose datasets have only 1002 and 2790 images in total, with 8 images of ages 60+ and 60 images with pitch angle 90°, respectively; and FG-NET has no images for certain age classes above 60. This presents a bigger challenge to unseen data extrapolation from the few minority class samples, which usually have high variability. Even worse, small imbalanced datasets can be accompanied by the class overlap problem, which further compounds the learning difficulty.
In the machine learning community, there are three common approaches to counter the negative impact of data imbalance: resampling, cost-sensitive learning and ensemble learning. Resampling approaches aim to make class priors equal by under-sampling the majority class or over-sampling the minority class (or both), but can easily eliminate valuable information or introduce noise. Cost-sensitive learning is often reported to outperform random resampling by adjusting the misclassification costs associated with samples; however, the true costs are often unknown. An effective technique for further improvement is to resort to ensemble learning, even without any priors. Chen et al. combined bagging and cost-sensitive decision trees to generate a weighted version of random forest, which is, to the best of our knowledge, the only imbalanced learning method based on random forest. They used the class weights for balancing the Gini criterion during node splitting and for aggregation at the leaf nodes.
The above approaches have two common drawbacks: 1) They are designed for either classification or regression, without a universal solution to both. 2) They have a limited ability to account for unseen appearances or synthesize novel labels beyond the observed data space. This is more critical in the case of combined imbalance and small sample size, where the minority class is underrepresented by an excessively reduced number of samples/labels, or even none. In the present application, the problems of data imbalance and unseen data extrapolation are addressed in both classification and regression scenarios.
Summary of the Application
One aspect of the present application discloses a method for predicting an attribute for an image sample. The method for predicting the attribute for the image sample may comprise: obtaining a plurality of image subsets from a training set comprising a plurality of training image samples; splitting progressively each of the image subsets to generate a decision forest for prediction; determining paths of nodes in the decision forest for a test image sample; merging the training image samples at all leaf nodes in each of the determined paths; clustering all the merged training image samples to obtain overlapping clusters, each of the merged training image samples being clustered into at least one of the overlapping clusters; and predicting, from the overlapping clusters, an attribute for the test image sample.
According to an embodiment of the present application, the splitting may comprise: clustering the training image samples into different classes at each node of the decision forest; assigning weights to the clustered classes, wherein a greater weight is assigned to a class having fewer training image samples, and a smaller weight is assigned to a class having more training image samples; and splitting the training image samples based on the assigned weights.
According to an embodiment of the present application, the decision forest may have a depth such that all training image samples in each of the classes have a same attribute.
According to an embodiment of the present application, an information gain of the decision forest may be lower than a fixed threshold.
According to an embodiment of the present application, the training image samples at the leaf node of the decision forest may have a size lower than a fixed threshold.
According to an embodiment of the present application, the splitting may comprise: splitting the training image samples by a cost-sensitive linear support vector machine for classification.
According to an embodiment of the present application, the splitting may comprise: splitting the training image samples by a cost-sensitive linear support vector regression for regression.
According to an embodiment of the present application, the clustering may comprise: calculating a biased inter-point distance between two of the merged training image samples; and assigning, based on the biased inter-point distance, each of the merged training image samples to at least one cluster to obtain the overlapping clusters, wherein the biased inter-point distance is a Euclidean distance between the two of the merged training image samples multiplied by a factor equal to or greater than one if the two of the merged training image samples have a same attribute, and otherwise the biased inter-point distance is the Euclidean distance multiplied by a factor less than one.
According to an embodiment of the present application, the predicting may comprise: finding a cluster of the overlapping clusters which approximates the test image sample; calculating a coefficient estimate for the test image sample from the found cluster; updating the coefficient estimate via a class-neighbor approximation; and predicting the attribute for the test image sample using the updated coefficient estimate.
Another aspect of the present application discloses a system for predicting an attribute for an image sample. The system for predicting the attribute for the image sample may comprise: a splitting device for obtaining a plurality of image subsets from a training set comprising a plurality of training image samples, and splitting progressively each of the subsets to generate a decision forest for prediction; a determining device being electrically connected with the splitting device and for determining paths of the nodes in the decision forest for a test image sample; a clustering device being electrically connected with the determining device and for merging the training samples at all leaf nodes in each of the determined paths, and clustering locally all the merged training samples to obtain overlapping clusters, each of which has at least two attributes; and a predicting device being electrically connected with the clustering device and for predicting, from the overlapping clusters, an attribute for the test sample.
According to an embodiment of the present application, the splitting device may further comprise: a clustering unit for clustering the training image samples into different classes at each node of the decision forest; a first assigning unit being electrically connected with the clustering unit and for assigning weights to the clustered classes, wherein a greater weight is assigned to a class having fewer training image samples, and a smaller weight is assigned to a class having more training image samples; and a splitting unit being electrically connected with the first assigning unit and for splitting the training image samples based on the assigned weights.
According to an embodiment of the present application, the splitting unit may be a cost-sensitive linear support vector machine for classification.
According to an embodiment of the present application, the splitting unit may be a cost-sensitive linear support vector regression for regression.
According to an embodiment of the present application, the clustering device may further comprise: a calculating unit for calculating a biased inter-point distance between two of the merged training image samples; and a second assigning unit being electrically connected with the calculating unit and for assigning, based on the biased inter-point distance, one of the merged training image samples to at least one cluster to obtain the overlapping clusters, wherein the calculating unit may calculate the biased inter-point distance by calculating a Euclidean distance between the two of the merged training image samples multiplied by a factor equal to or greater than one if the two of the merged training image samples have a same attribute, and otherwise by calculating the Euclidean distance multiplied by a factor less than one.
According to an embodiment of the present application, the predicting device may further comprise: a finding unit for finding a cluster of the overlapping clusters which approximates the test image sample; an estimating unit being electrically connected with the finding unit and for calculating a coefficient estimate for the test image sample from the found cluster; an updating unit being electrically connected with the estimating unit and for updating the coefficient estimate via a class-neighbor approximation; and a predicting unit being electrically connected with the updating unit and for predicting the attribute for the test image sample using the updated coefficient estimate.
Another aspect of the present application relates to a system for predicting an attribute for an image sample. The system may comprise a memory that may store executable components; and a processor electrically coupled to the memory that may execute the executable components to perform operations of the system, wherein the executable components may comprise: a splitting component configured for obtaining a plurality of image subsets from a training set comprising a plurality of training image samples, and splitting progressively each of the subsets to generate a decision forest for prediction; a determining component configured for determining paths of the nodes in the decision forest for a test image sample; a clustering component configured for merging the training samples at all leaf nodes in each of the determined paths, and clustering locally all the merged training samples to obtain overlapping clusters; and a predicting component configured for predicting, from the overlapping clusters, an attribute for the test sample.
According to an embodiment of the present application, the splitting component may further comprise: a clustering sub-component configured for clustering the training image samples into different classes at each node of the decision forest; a first assigning sub-component configured for assigning weights to the clustered classes, wherein a greater weight is assigned to a class having fewer training image samples, and a smaller weight is assigned to a class having more training image samples; and a splitting sub-component configured for splitting the training image samples based on the assigned weights.
According to an embodiment of the present application, the clustering component may further comprise: a calculating sub-component configured for calculating a biased inter-point distance between two of the merged training image samples; and a second assigning sub-component configured for assigning, based on the biased inter-point distance, one of the merged training image samples to at least one cluster to obtain the overlapping clusters, wherein the calculating sub-component may calculate the biased inter-point distance by calculating a Euclidean distance between the two of the merged training image samples multiplied by a factor equal to or greater than one if the two of the merged training image samples have a same attribute, and otherwise by calculating the Euclidean distance multiplied by a factor less than one.
According to an embodiment of the present application, the predicting component may further comprise: a finding sub-component configured for finding a cluster of the overlapping clusters which approximates the test image sample; an estimating sub-component configured for calculating a coefficient estimate for the test image sample from the found cluster; an updating sub-component configured for updating the coefficient estimate via a class-neighbor approximation; and a predicting sub-component configured for predicting the attribute for the test image sample using the updated coefficient estimate.
The present application combines ensemble learning and cost-sensitive learning in a natural manner and without resampling, thereby avoiding information loss and added noise.
Brief Description of the Drawing
Exemplary non-limiting embodiments of the present invention are described below with reference to the attached drawings. The drawings are illustrative and generally not to an exact scale. The same or similar elements on different figures are referenced with the same reference numbers.
Fig. 1 illustrates a method for predicting an attribute for an image sample according to an embodiment of the present application.
Fig. 2 illustrates sub-steps of generating a decision forest according to an embodiment of the present application.
Fig. 3 illustrates sub-steps of obtaining overlapping clusters according to an embodiment of the present application.
Fig. 4 illustrates a system for predicting an attribute for an image sample according to an embodiment of the present application.
Fig. 5 illustrates a schematic block diagram of a splitting device according to an embodiment of the present application.
Fig. 6 illustrates a schematic block diagram of a clustering device according to an embodiment of the present application.
Fig. 7 illustrates a schematic block diagram of a predicting device according to an embodiment of the present application.
Fig. 8 illustrates a system for predicting an attribute for an image sample according to an embodiment of the present application.
Detailed Description
Hereinafter, the embodiments of the present application will be described in detail with reference to the detailed description as well as the drawings.
Various embodiments of the present application will be described with reference to a training set $S = \{s_i = (x_i, y_i)\}_{i=1}^{N}$, where $x_i \in \mathbb{R}^D$ is the feature vector of sample $s_i$, and $y_i$ is the label of the sample $s_i$. The present application aims to make unbiased predictions for a sample feature $x$ even in the presence of severely imbalanced and small datasets. The label $y \in C$ refers to a class index (e.g. edge class) for classification and a numeric value (e.g. age and pose angle) for regression. In order to identify correct decision regions for the majority class, and more importantly for the minority class, the present application resorts to a random decision forest, which is efficient and robust. The random decision forest is an ensemble of decision trees learned from multiple random data subsets. Each tree recursively divides the input space into disjoint partitions, generating candidate decision regions in a coarse-to-fine manner.
Fig. 1 illustrates a method 1000 for predicting an attribute for an image sample according to an embodiment of the present application.
In step S100, the training set $S$ is received and a plurality of image subsets are obtained from the training set by, for example, sampling.
Then, in step S200, each of the image subsets is progressively split to generate a decision tree. The generated decision trees constitute the decision forest, which is used for predicting an attribute of the test image sample.
Step S200 will now be described in detail with reference to Fig. 2.
As shown in Fig. 2, in step S210, the training samples $S_j$ at a node $j$ are clustered into two classes $\{c_k\}_{k=1,2}$, for example by adopting the well-known K-means technique. For classification, the training samples $S_j$ at the node $j$ are clustered into two classes so as to be split into the left node or the right node. For a multi-class scenario, for example a ten-class scenario, the ten classes are clustered into a part comprising five similar classes and another part comprising the other five classes, and then the two parts are progressively split. Then, in step S220, in order to prevent the split from being biased toward the majority class, weights are assigned to the clustered classes. In the present application, the weight is defined as a function of the cluster distribution. For example, the weight may be associated with a factor $f(p_k) = (1 - p_k)/p_k$, where $p_k$ is the fraction of the samples in $S_j$ that fall into class $c_k$. Obviously, $f(p_k)$ gives larger weights to the minority classes without losing the overall performance. Then, in step S230, for a node $j$ with local samples $S_j$, $S_j$ may be cost-sensitively split into a left subset $S_j^L$ and a right subset $S_j^R$. Specifically, the cost-sensitive splitting may employ the factor $f(p_k)$. Step S230 stops when a maximum depth is reached or the local sample size $|S_j|$ falls below a fixed threshold. For classification, step S230 may also stop if the information gain described in Equation (1) falls below a fixed threshold. The information gain is defined as:

$I = H(S_j) - \sum_{i \in \{L, R\}} \frac{|S_j^i|}{|S_j|} H(S_j^i)$   (1)

where $H$ denotes the class entropy. For regression, the information gain can be replaced by the label variance, which is defined as $H(S) = \sum_{y} (y - \mu)^2 / |S|$, where $\mu = \sum_{y} y / |S|$. Accordingly, the decision forest is obtained.
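As a minimal illustration of the weighting factor and the regression stopping criterion (the function names below are ours, not from the application):

```python
import numpy as np

def minority_weight(p_k):
    """Cost factor f(p_k) = (1 - p_k) / p_k from step S220: clusters
    holding a small fraction p_k of the node samples get large weights."""
    return (1.0 - p_k) / p_k

def label_variance(y):
    """Regression stopping criterion H(S) = sum_y (y - mu)^2 / |S|,
    used in place of the class entropy of Equation (1)."""
    y = np.asarray(y, dtype=float)
    return float(np.mean((y - y.mean()) ** 2))

# A node whose two clusters hold 90% and 10% of the local samples:
print(minority_weight(0.90))  # ~0.11, the majority cluster is down-weighted
print(minority_weight(0.10))  # 9.0, the minority cluster is up-weighted
```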
For example, for classification, the splitting function used in step S230 may be determined by a cost-sensitive version of linear SVM:

$\min_{w} \frac{1}{2}\|w\|^2 + C \sum_{i} f(p_{k_i}) \max\big(0,\, 1 - z_i w^T x_i\big)$   (2)

where $w$ is the weight vector, $C$ is a regularization parameter, $f(p_{k_i})$ is the cost factor of the class containing $x_i$, and $z_i = 1$ if $x_i$ belongs to the first cluster $c_1$, otherwise $z_i = -1$. Each training sample is finally sent to $S_j^L$ or $S_j^R$ by $\mathrm{sgn}(w^T x_i)$. For regression, the splitting function used in step S230 may be determined by a cost-sensitive version of linear SVR:

$\min_{w} \frac{1}{2}\|w\|^2 + C \sum_{i} f(p_{k_i}) \max\big(0,\, |y_i - w^T x_i| - \varepsilon\big)$   (3)

where $\varepsilon \ge 0$. The node branches left or right by comparing the numeric predictions $\{w^T x_i\}$ with the local mean of labels $\bar{y}_j = \sum_{i \in S_j} y_i / |S_j|$.
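A sketch of one cost-sensitive node split is given below, assuming that scikit-learn's standard weighted linear SVM with per-class weights f(p_k) stands in for Equation (2); the intercept handling and the exact objective are assumptions on our part. For regression, sklearn.svm.LinearSVR with weights derived the same way would play the role of Equation (3).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def cost_sensitive_split(X_node):
    """One node split (steps S210-S230): K-means produces two pseudo-
    classes, f(p_k) = (1 - p_k)/p_k weights them, and a weighted linear
    SVM learns the splitting direction w."""
    z = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_node)
    p = np.bincount(z, minlength=2) / len(z)              # cluster fractions p_k
    class_weight = {k: (1.0 - p[k]) / p[k] for k in (0, 1)}
    svm = LinearSVC(C=1.0, class_weight=class_weight).fit(X_node, z)
    side = svm.decision_function(X_node) >= 0             # sign of w^T x_i
    return svm, X_node[~side], X_node[side]               # S_j^L, S_j^R
```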
Then, in step S300, the test image sample is inputted into each decision tree of the decision forest generated in step S200. According to the splitting criteria of each node of the decision trees, the nodes that can be reached by the test image sample can be determined in each of the decision trees, and thus paths of the nodes for the test image sample in the decision forest can be determined.
Then, in step S400, the training samples at all leaf nodes in each of the determined paths are merged, carving a broader decision region covering as many minority samples as possible. That is, all the sample sets $\{S_t\}_{t=1}^{T}$ of the leaf nodes that may be reached by the test sample in the $T$ trees are merged into a larger set $S^{*} = \cup_{t=1}^{T} S_t$.
Then, in step S500, the merged training samples are clustered into overlapping clusters. That is, one of the merged training image samples may belong to more than one cluster, so that the overlapping clusters may have complementary appearances, enriching the cluster representations.
Step S500 will now be described in detail with reference to Fig. 3.
As shown in Fig. 3, in step S510, a biased inter-point distance between two of the merged training samples is calculated. For example, the inter-point distance $\tilde{d}(x_i, x_j)$ between $x_i$ and $x_j$ is label-biased (Equation (4)): the Euclidean distance $d(x_i, x_j)$ is scaled by a label-dependent factor built from the indicator $\mathbb{1}(y_i \neq y_j)$, which equals 1 if the class labels of $x_i$ and $x_j$ are different, and from the reciprocal increasing function $g(y) = \tau y/(\max\{y\} - y)$, where $\tau$ is the trade-off parameter. The biased distance makes the clustering discriminative by preferring the “same-class” data pairs to those from different classes. In extreme cases, for example in classification scenarios, it forms clusters each purely from one class even if the cluster members differ remarkably in appearance, which is suitable for classification. The biased inter-point distance may be used in the K-means technique for clustering.
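Equation (4) itself is reproduced only as an image in the source, so the sketch below assumes one plausible form consistent with the surrounding text: the Euclidean distance is kept for same-label pairs and inflated by g applied to the label gap otherwise.

```python
import numpy as np

def biased_distance(x_i, x_j, y_i, y_j, y_max, tau=1.0):
    """Label-biased inter-point distance of step S510 (ASSUMED form;
    Equation (4) is not reproduced in the source text)."""
    d = float(np.linalg.norm(np.asarray(x_i, float) - np.asarray(x_j, float)))
    if y_i == y_j:
        return d                                 # same-class pairs stay close
    gap = abs(y_i - y_j)
    g = tau * gap / max(y_max - gap, 1e-12)      # g(y) = tau*y / (max{y} - y)
    return d * (1.0 + g)                         # different-class pairs pushed apart
```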
Then, in step S520, each of the merged training samples is assigned to at least one cluster based on the biased inter-point distance. For example, the clusters are allowed to overlap with each other by relaxing the cluster assignment of a sample $x_i$ from only its nearest centroid to more than one sufficiently close centroid in each iteration ($\omega = 0.8$ empirically, where $\omega$ controls how close to the nearest-centroid distance a further centroid must be in order to also receive the sample).
Hereinafter, step S500 will be discussed in detail with an example according to an embodiment of the present application.
Given N training samples, in order to cluster the N training samples into K overlapping clusters, the following steps are performed (a code sketch follows the list):
I. Determining centroids of K clusters;
II. Calculating the biased distance between the remaining N - K image samples and the centroids of the K clusters;
III. Assigning each of the image samples to more than one centroid; that is, the clusters are allowed to overlap with each other by relaxing the cluster assignment of a sample $x_i$ from only its nearest centroid to more than one sufficiently close centroid in each iteration, and then clustering the N image samples into the K clusters by using the modified K-means technique, which is based on the biased distance and the multi-assignment. This results in overlapping clusters, each containing some "inter-class" samples but with complementary appearances that enrich the cluster representations;
IV. Updating the centroids of the K clusters; and
V. Repeating II to IV until the centroids of the K overlapping clusters converge.
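A compact sketch of steps I to V under stated assumptions: plain Euclidean sample-to-centroid distances (the biased distance above would plug into the same place), and a relaxed membership rule in which a sample joins every centroid within a factor 1/ω of its nearest-centroid distance.

```python
import numpy as np

def overlapping_kmeans(X, n_clusters, omega=0.8, n_iters=20, seed=0):
    """Overlapping K-means (steps I-V). Returns centroids and, for each
    cluster, the indices of its (possibly shared) member samples."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), n_clusters, replace=False)]   # step I
    for _ in range(n_iters):                                       # repeat II-IV (step V)
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        nearest = d.min(axis=1, keepdims=True)                     # step II
        member = d <= nearest / omega                              # step III: multi-assignment
        for k in range(n_clusters):                                # step IV
            if member[:, k].any():
                centroids[k] = X[member[:, k]].mean(axis=0)
    return centroids, [np.flatnonzero(member[:, k]) for k in range(n_clusters)]
```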
Then, in step S600, the attribute of the test sample can be predicted from the overlapping clusters.
Step S600 will be described in detail hereinafter. Given that step S500 generates $K$ overlapping clusters $\{\mathcal{C}_k\}_{k=1}^{K}$ with their feature matrices $\{L_k\}_{k=1}^{K}$ and labels $\{y_k\}_{k=1}^{K}$, the label for a sample $q$ is predicted in step S600.
Specifically, in step S600, first, each of the overlapping clusters is modeled by an affine hull model $AH_k$ that is able to account for unseen data of different modes. Every single $AH_k$ covers all possible affine combinations of its samples and can be parameterized as

$AH_k = \{x = \mu_k + U_k v_k\},\ k = 1, \dots, K$   (5)

where $\mu_k$ is the centroid (the sample mean of cluster $k$), $U_k$ is the orthonormal basis obtained from the SVD of the centered $L_k$, and $v_k$ is the coefficient vector.
Then, it is determined which affine hull model is used to approximate the sample $q$, by calculating $k^{*} = \arg\min_{k} \min_{v_k} \|q - (\mu_k + U_k v_k)\|_2^2$; that is, the index $k$ is determined. The sample $q$ is then updated using $\mu_k + U_k v_k$.
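A sketch of the affine hull construction of Equation (5) and the nearest-hull search; the energy threshold used to truncate the SVD basis is our assumption.

```python
import numpy as np

def affine_hull(L_k, energy=0.98):
    """AH_k = {x = mu_k + U_k v_k} per Equation (5): mu_k is the sample
    mean of the cluster's feature matrix L_k (one sample per column) and
    U_k an orthonormal basis of the centered data from its SVD."""
    mu = L_k.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(L_k - mu, full_matrices=False)
    r = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), energy)) + 1
    return mu.ravel(), U[:, :r]

def nearest_hull(q, hulls):
    """Find k* minimizing ||q - (mu_k + U_k v_k)|| over v_k; with U_k
    orthonormal the minimizer is v_k = U_k^T (q - mu_k)."""
    errs, projs = [], []
    for mu, U in hulls:
        proj = mu + U @ (U.T @ (q - mu))     # projection onto the hull
        projs.append(proj)
        errs.append(np.linalg.norm(q - proj))
    k_star = int(np.argmin(errs))
    return k_star, projs[k_star]             # index k* and updated sample
```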
Then, based on the updated $q$ and the determined index $k$, a robust coefficient estimate $\hat{\alpha}_k$ is computed from the found cluster. Based on the estimate $\hat{\alpha}_k$, a sparse coefficient $\alpha_k$ may be determined; accordingly, the sparse coefficient $\alpha_k$ is constrained by the estimate $\hat{\alpha}_k$. Then, a joint optimization, with a tolerance $\varepsilon \ge 0$ and regularization parameters $\lambda$ and $\gamma$, is formulated over the belonging cluster and its approximation with a class-neighbor sparsity prior.
These operations are repeated until convergence is reached. Then, the label for the sample $q$ is predicted from the sparse coefficients: by a weighted combination of the cluster labels $y_k$ for regression, or by majority voting among $y_k$ with sparse components for classification.
A list of operations of the method for predicting an attribute for a sample according to an embodiment of the present application is given in the original publication as an algorithm listing summarizing steps S100 to S600.
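For orientation, here is a hedged end-to-end sketch of steps S300 to S600; tree.leaf_samples is a hypothetical accessor returning the training samples stored at the leaf reached by q, and the sparse class-neighbor refinement is deliberately simplified to the affine-hull projection, since its equations are not reproduced in the text.

```python
import numpy as np

def predict_attribute(forest, q, regression=True, n_clusters=5):
    """End-to-end sketch of steps S300-S600 for one test sample q,
    reusing overlapping_kmeans, affine_hull and nearest_hull above."""
    # S300-S400: follow q down every tree and merge the reached leaves.
    X_parts, y_parts = zip(*(tree.leaf_samples(q) for tree in forest))
    X, y = np.vstack(X_parts), np.concatenate(y_parts)
    # S500: cluster the merged samples into overlapping clusters.
    _, clusters = overlapping_kmeans(X, min(n_clusters, len(X)))
    valid = [idx for idx in clusters if len(idx) >= 2]
    hulls = [affine_hull(X[idx].T) for idx in valid]
    # S600: approximate q by its nearest affine hull, then fuse labels.
    k_star, _ = nearest_hull(np.asarray(q, float), hulls)
    members = valid[k_star]
    if regression:
        return float(np.mean(y[members]))            # simplified label fusion
    vals, counts = np.unique(y[members], return_counts=True)
    return vals[np.argmax(counts)]                   # majority vote
```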
The present application also relates to a system for predicting an attribute for an image sample according to an embodiment of the present application.
Fig. 4 illustrates a system 2000 for predicting an attribute for an image sample according to an embodiment of the present application. The system 2000 will be described with reference to the training set $S$ mentioned above.
As shown in Fig. 4, the system 2000 comprises a splitting device 100, a determining device 200, a clustering device 300 and a predicting device 400.
As shown in Fig. 5, the splitting device 100 comprises a clustering unit 110, a first assigning unit 120, and a splitting unit 130. The training set $S$ is inputted into the clustering unit 110. The clustering unit 110 is configured for generating a plurality of image subsets from the training set by, for example, sampling. Further, the clustering unit 110 clusters the training samples $S_j$ at the node $j$ into two classes $\{c_k\}_{k=1,2}$ by adopting, for example, the well-known K-means technique.
The first assigning unit 120 is electrically connected with the clustering unit 110. The first assigning unit 120 is configured for assigning weights to the clustered classes according to the output of the clustering unit 110. The weight is the same as that mentioned in step S220, the detailed description of which will not be repeated herein.
The splitting unit 130 is electrically connected with the first assigning unit 120. Based on the assigned weights, the splitting unit 130 may cost-sensitively split the local samples $S_j$ at a node $j$ into $S_j^L$ and $S_j^R$. The splitting unit 130 may employ the factor $f(p_k)$ to perform the cost-sensitive splitting of the local samples $S_j$. The splitting unit 130 may stop splitting when a maximum depth is reached or the local sample size $|S_j|$ falls below a fixed threshold. For classification, the splitting unit 130 may also stop splitting if the information gain described in Equation (1) falls below a fixed threshold. For regression, the information gain can be replaced by the above-mentioned label variance. Accordingly, the decision forest is obtained.
The determining device 200 is electrically connected with the splitting device 100. The generated decision forest is outputted by the splitting device 100 to the determining device 200. A test sample is inputted into the determining device 200. Then, the determining device 200 is configured for determining the nodes that can be reached by the test image sample in each of the decision trees and thus determines paths of the nodes for the test image sample in the decision forest.
The clustering device 300 is electrically connected with the determining device 200. The clustering device 300 is configured for merging the training samples at all leaf nodes in each of the determined paths, carving a broader decision region covering as many minority samples as possible. That is, the clustering device 300 merges all the sample sets $\{S_t\}_{t=1}^{T}$ of the leaf nodes that may be reached by the test sample into a larger set $S^{*} = \cup_{t=1}^{T} S_t$. Then, the clustering device 300 clusters the merged training samples into overlapping clusters.
As shown in Fig. 6, the clustering device 300 further comprises a calculating unit 310 and a second assigning unit 320.
The calculating unit 310 is configured for calculating a biased inter-point distance between two of the merged training samples. For example, the biased inter-point distance may be the one defined in Equation (4).
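Since Equation (4) is not reproduced in this excerpt, the following sketch only mirrors its verbal definition (same attribute: Euclidean distance scaled by a factor of at least one; different attributes: scaled by a factor below one); the factor values are illustrative:

    import numpy as np

    def biased_distance(x_i, x_j, y_i, y_j, same=1.0, diff=0.5):
        # Euclidean distance, biased by whether the attributes agree.
        d = np.linalg.norm(np.asarray(x_i, float) - np.asarray(x_j, float))
        return d * (same if y_i == y_j else diff)      # same >= 1, diff < 1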
The second assigning unit 320 is electrically connected with the calculating unit 310. The biased inter-point distance is outputted by the calculating unit 310 to the second assigning unit 320. The second assigning unit 320 is configured for assigning each of the merged training samples to at least one cluster based on the biased inter-point distance. For example, the second assigning unit 320 allows the clusters to overlap with each other by relaxing the cluster assignment of a sample xi from its nearest centroid alone to more than one nearby centroid in each iteration (ω = 0.8 empirically).
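The precise relaxation rule is not spelled out in this excerpt; one natural reading, sketched below under that assumption, is that a sample joins every cluster whose centroid lies within 1/ω of its nearest-centroid distance:

    import numpy as np

    def overlapping_assignments(X, centroids, omega=0.8):
        # Pairwise sample-to-centroid distances: shape (n_samples, n_centroids).
        D = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        nearest = D.min(axis=1, keepdims=True)
        # Assumed rule: admit every centroid within nearest/omega, so each
        # sample belongs to at least one, and possibly several, clusters.
        return D <= nearest / omega                    # boolean membership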
The predicting device 400 is electrically connected with the clustering device 300. The overlapping clusters are outputted by the clustering device 300 to the predicting device 400. Then, the predicting device 400 is configured for predicting an attribute for the test sample from the overlapping clusters.
As shown in Fig. 7, the predicting device 400 comprises a finding unit 410, an estimating unit 420, an updating unit 430 and a predicting unit 440.
The finding unit 410 is configured for finding a cluster of the overlapping clusters which approximates the test sample. The estimating unit 420 is electrically connected with the finding unit 410 and is configured for calculating a coefficient estimate for the test image sample from the found cluster. The updating unit 430 is electrically connected with the estimating unit 420 and is configured for updating the coefficient estimate via a class-neighbor approximation. The predicting unit 440 is electrically connected with the updating unit 430 and is configured for predicting the attribute for the test image sample using the updated coefficient estimate. The operations of the predicting device 400 are substantially the same as those described in step S600.
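A hedged skeleton of that pipeline in Python, with ridge-regularized least squares standing in for the coefficient estimate and a simple disagreement-damping loop standing in for the class-neighbor approximation; lam, gamma and eps echo the parameters λ, γ and ε above, while everything else is illustrative:

    import numpy as np

    def ridge_coefficients(A, q, lam):
        # Regularized least-squares coefficients of q over the columns of A.
        return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ q)

    def predict_via_cluster(q, clusters, lam=0.1, gamma=0.1, eps=1e-4,
                            max_iter=50):
        # clusters: list of (X_c, y_c) pairs; rows of X_c are cluster samples.
        def recon_error(X_c):                          # finding unit
            b = ridge_coefficients(X_c.T, q, lam)
            return np.linalg.norm(q - X_c.T @ b)
        X_c, y_c = min(clusters, key=lambda c: recon_error(c[0]))

        beta = ridge_coefficients(X_c.T, q, lam)       # estimating unit
        y_hat = y_c[np.argmax(np.abs(beta))]
        for _ in range(max_iter):                      # updating unit
            # Damp samples whose label disagrees with the tentative class,
            # then re-estimate until the coefficients stabilize.
            scale = np.where(y_c == y_hat, 1.0, 1.0 / (1.0 + gamma))
            beta_new = scale * ridge_coefficients(X_c.T * scale, q, lam)
            if np.linalg.norm(beta_new - beta) < eps:
                beta = beta_new
                break
            beta = beta_new
            y_hat = y_c[np.argmax(np.abs(beta))]
        # Predicting unit: the final label then follows the prediction rule
        # sketched after the convergence discussion above.
        return beta, y_hat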
The present application also relates to a system 3000 for predicting an attribute for a test sample according to an embodiment of the present application.
As shown in Fig. 8, the system 3000 comprises a memory 3100 that stores executable components and a processor 3200 coupled to the memory 3100 and configured for executing the executable components to perform operations of the system 3000. The executable components comprise: a splitting component 3110 configured for obtaining a plurality of image subsets from a training set comprising a plurality of training image samples, and splitting progressively each of the subsets to generate a decision forest for prediction; a determining component 3120 configured for determining paths of the nodes in the decision forest for a test image sample; a clustering component 3130 configured for merging the training samples at all leaf nodes in each of the determined paths, and clustering locally all the merged training samples to obtain overlapping clusters; and a predicting component 3140 configured for predicting, from the overlapping clusters, an attribute for the test sample.
According to an embodiment of the present application, the splitting component 3110 further comprises: a clustering sub-component for clustering the training image samples into different classes at each node of the decision forest; a first assigning sub-component for assigning weights to the clustered classes, wherein a greater weight is assigned to the class having fewer training image samples, and a smaller weight is assigned to the class having more training image samples; and a splitting sub-component for splitting the training image samples based on the assigned weights.
According to an embodiment of the present application, the clustering component 3130 further comprises: a calculating sub-component for calculating a biased inter-point distance between two of the merged training image samples; and a second assigning sub-component for assigning, based on the biased inter-point distance, one of the merged training image samples to at least one cluster to obtain the overlapping clusters, wherein the calculating sub-component calculates the biased inter-point distance by calculating a Euclidean distance of the two of the merged training image samples multiplied by a factor equal to or greater than one if the two of the merged training image samples have a same attribute, and otherwise by calculating the Euclidean distance multiplied by a factor less than one.
According to an embodiment of the present application, the predicting component 3140 further comprises: a finding sub-component for finding a cluster of the overlapping clusters which approximates the test image sample; a coefficient estimate calculating sub-component for calculating a coefficient estimate for the test image sample from the found cluster; an updating sub-component for updating the coefficient estimate via a class-neighbor approximation; and a predicting sub-component for predicting the attribute for the test image sample using the updated coefficient estimate.
Embodiments within the scope of the present invention can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations thereof. Apparatus within the scope of the present invention can be implemented in a computer program product tangibly embodied in a machine-readable storage device for execution by a programmable processor; and method actions within the scope of the present invention can be performed by a programmable processor executing a program of instructions to perform functions of the invention by operating on input data and generating output.
Embodiments within the scope of the present invention can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. Each computer program can be implemented in a high-level procedural or object-oriented programming language, or in assembly or machine language if desired; and in any case, the language can be a compiled or interpreted language. Suitable processors include, by way of example, both general and special purpose microprocessors. Generally, a processor will receive instructions and data from a read-only memory and/or a random access memory. Generally, a computer will include one or more mass storage devices for storing data files.
Embodiments within the scope of the present invention include computer-readable media for carrying or having computer-executable instructions, computer-readable instructions, or data structures stored thereon. Such computer-readable media may be any available media that are accessible by a general-purpose or special-purpose computer system. Examples of computer-readable media include physical storage media such as RAM, ROM, EPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other media which can be used to carry or store desired program code means in the form of computer-executable instructions, computer-readable instructions, or data structures and which may be accessed by a general-purpose or special-purpose computer system. Any of the foregoing can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits). While particular embodiments of the present invention have been shown and described, changes and modifications may be made to such embodiments without departing from the true scope of the invention.
Although the preferred examples of the present invention have been described, those skilled in the art can make variations or modifications to these examples upon knowing the basic inventive concept. The appended claims are intended to be construed as comprising the preferred examples and all the variations or modifications falling within the scope of the present invention.
Obviously, those skilled in the art can make variations or modifications to the present invention without departing from the spirit and scope of the present invention. As such, if these variations or modifications belong to the scope of the claims and their equivalents, they also fall within the scope of the present invention.

Claims (19)

  1. A method for predicting an attribute for an image sample, comprising:
    obtaining a plurality of image subsets from a training set comprising a plurality of training image samples;
    splitting progressively each of the image subsets to generate a decision forest for prediction;
    determining paths of nodes in the decision forest for a test image sample;
    merging the training image samples at all leaf nodes in each of the determined paths;
    clustering all the merged training image samples to obtain overlapping clusters, each of the merged training image samples being clustered into at least one of the overlapping clusters; and
    predicting, from the overlapping clusters, an attribute for the test image sample.
  2. The method according to claim 1, wherein the splitting comprises:
    clustering the training image samples into different classes at each node of the decision forest;
    assigning weights to the clustered classes, wherein a greater weight is assigned to the class having fewer training image samples, and a smaller weight is assigned to the class having more training image samples; and
    splitting the training image samples based on the assigned weights.
  3. The method according to claim 2, wherein the decision forest has a depth such that all training image samples in each of the classes have a same attribute.
  4. The method according to claim 2, wherein an information gain of the decision forest is lower than a fixed threshold.
  5. The method according to claim 1, wherein the training image samples at the leaf node of the decision forest have a size lower than a fixed threshold.
  6. The method according to claim 2, wherein the splitting comprises:
    splitting the training image samples by a cost-sensitive linear support vector machine for classification.
  7. The method according to claim 2, wherein the splitting comprises:
    splitting the training image samples by a cost-sensitive linear support vector regression for regression.
  8. The method according to claim 1, wherein the clustering comprises:
    calculating a biased inter-point distance between two of the merged training image samples; and
    assigning, based on the biased inter-point distance, each of the merged training image samples to at least one cluster to obtain the overlapping clusters,
    wherein the biased inter-point distance is a Euclidean distance of the two of the merged training image samples multiplied by a factor equal to or greater than one if the two of the merged training image samples have a same attribute, and otherwise the biased inter-point distance is the Euclidean distance multiplied by a factor less than one.
  9. The method according to claim 1, wherein the predicting comprises:
    finding a cluster of the overlapping clusters which approximates the test image sample;
    calculating a coefficient estimate for the test image sample from the found cluster;
    updating the coefficient estimate via a class-neighbor approximation;
    predicting the attribute for the test image sample using the updated coefficient estimate.
  10. A system for predicting an attribute for an image sample, comprising:
    a splitting device for obtaining a plurality of image subsets from a training set comprising a plurality of training image samples, and splitting progressively each of the subsets to generate a decision forest for prediction;
    a determining device being electrically connected with the splitting device and for determining paths of the nodes in the decision forest for a test image sample;
    a clustering device being electrically connected with the determining device and for merging the training samples at all leaf nodes in each of the determined paths, and clustering locally all the merged training samples to obtain overlapping clusters, each of which has at least two attributes; and
    a predicting device being electrically connected with the clustering device and for predicting, from the overlapping clusters, an attribute for the test sample.
  11. The system according to claim 10, wherein the splitting device further comprises:
    a clustering unit for clustering the training image samples into different classes at each node of the decision forest;
    a first assigning unit being electrically connected with the clustering unit and for assigning weights to the clustered classes, wherein a greater weight is assigned to the class having fewer training image samples, and a smaller weight is assigned to the class having more training image samples; and
    a splitting unit being electrically connected with the first assigning unit and for splitting the training image samples based on the assigned weights.
  12. The system according to claim 11, wherein the splitting unit is a cost-sensitive linear support vector machine for classification.
  13. The system according to claim 11, wherein the splitting unit is a cost-sensitive linear support vector regression for regression.
  14. The system according to claim 10, wherein the clustering device further comprises:
    a calculating unit for calculating a biased inter-point distance between two of the merged training image samples; and
    a second assigning unit being electrically connected with the calculating unit and for assigning, based on the biased inter-point distance, one of the merged training image samples to at least one cluster to obtain the overlapping clusters,
    wherein the calculating unit calculates the biased inter-point distance by calculating a Euclidean distance of the two of the merged training image samples multiplied by a factor equal to or greater than one if the two of the merged training image samples have a same attribute, and otherwise by calculating the Euclidean distance multiplied by a factor less than one.
  15. The system according to claim 10, wherein the predicting device further comprises:
    a finding unit for finding a cluster of the overlapping clusters which approximates the test image sample;
    an estimating unit being electrically connected with the finding unit and for calculating a coefficient estimate for the test image sample from the found cluster;
    an updating unit being electrically connected with the estimating unit and for updating the coefficient estimate via a class-neighbor approximation; and
    a predicting unit being electrically connected with the updating unit and for predicting the attribute for the test image sample using the updated coefficient estimate.
  16. A system for predicting an attribute for an image sample, comprising:
    a memory that stores executable components; and
    a processor electrically coupled to the memory that executes the executable components to perform operations of the system, wherein the executable components comprise:
    a splitting component configured for obtaining a plurality of image subsets from a training set comprising a plurality of training image samples, and splitting progressively each of the subsets to generate a decision forest for prediction;
    a determining component configured for determining paths of the nodes in the decision forest for a test image sample;
    a clustering component configured for merging the training samples at all leaf nodes in each of the determined paths, and clustering locally all the merged training samples to obtain overlapping clusters; and
    a predicting component configured for predicting, from the overlapping clusters, an attribute for the test sample.
  17. The system according to claim 16, wherein the splitting component further comprises:
    a clustering sub-component configured for clustering the training image samples into different classes at each node of the decision forest;
    a first assigning sub-component configured for assigning weights to the clustered classes, wherein a greater weight is assigned to the class having fewer training image samples, and a smaller weight is assigned to the class having more training image samples; and
    a splitting sub-component configured for splitting the training image samples based on the assigned weights.
  18. The system according to claim 16, wherein the clustering component further comprises:
    a calculating sub-component configured for calculating a biased inter-point distance between two of the merged training image samples; and
    a second assigning sub-component configured for assigning, based on the biased inter-point distance, one of the merged training image samples to at least one cluster to obtain the overlapping clusters,
    wherein the calculating sub-component calculates the biased inter-point distance by calculating a Euclidean distance of the two of the merged training image samples multiplied by a factor equal to or greater than one if the two of the merged training image samples have a same attribute, and otherwise by calculating the Euclidean distance multiplied by a factor less than one.
  19. The system according to claim 16, wherein the predicting component further comprises:
    a finding sub-component configured for finding a cluster of the overlapping clusters which approximates the test image sample;
    an estimating sub-component configured for calculating a coefficient estimate for the test image sample from the found cluster;
    an updating sub-component configured for updating the coefficient estimate via a class-neighbor approximation; and
    a predicting sub-component configured for predicting the attribute for the test image sample using the updated coefficient estimate.