US8447119B2 - Method and system for image classification - Google Patents
- Publication number
- US8447119B2 (application US12/818,156)
- Authority
- US
- United States
- Prior art keywords
- image
- vector
- coding
- local
- descriptors
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/10—Terrestrial scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
- G06V10/763—Non-hierarchical techniques, e.g. based on statistics of modelling distributions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
Definitions
- The invention relates to a method and system for image classification.
- Image classification, including object recognition and scene classification, remains a major challenge for the computer vision community. Perhaps one of the most significant developments in the last decade is the application of local features to image classification, including the introduction of the “bag-of-visual-words” representation.
- VQ vector quantization
- A further extension is to incorporate the spatial information of local descriptors in an image by partitioning images into regions at different locations and scales and computing region-based histograms, instead of computing a global histogram for the entire image. These region-based histograms are concatenated to form the feature vector for the image. A nonlinear SVM is then applied for classification. This approach is called the “spatial pyramid matching kernel” (SPMK) method. SPMK is regarded as the state-of-the-art method for image classification.
- SVMs use pyramid matching kernels, biologically-inspired models, and KNN methods.
- SPM spatial pyramid matching
- the recent improvements were often achieved by combining different types of local descriptors, without any fundamental change of the underlying classification method.
- Nonlinear SVMs scale at least quadratically with the size of the training data, which makes it nontrivial to handle large-scale training sets. It is thus necessary to design algorithms that are computationally more efficient.
- methods and systems for image classification coding an image by nonlinearly mapping an image descriptor to form a high-dimensional sparse vector; spatially pooling each local region to form an image-level feature vector using a probability kernel incorporating a similarity metric of local descriptors; and classifying the image.
- a method for image classification includes nonlinearly mapping one or more descriptors of an image to form a high-dimensional sparse vector using Super-Vector nonlinear coding; spatial pooling each local region by aggregating codes of the descriptors in each local region to form a single vector, and concatenating vectors of different regions to form the image-level feature vector using probability kernel incorporating the similarity metric of local descriptors; and image classifying by normalizing image-level feature vector using linear SVMs.
- a system for image classification includes means for coding descriptor of an image by nonlinearly mapping to form a high-dimensional sparse vector using Super-Vector nonlinear coding method; means for spatial pooling each local region by aggregating the codes of all the descriptors in each local region to form a single vector, and concatenating vectors of different regions to form the image-level feature vector using probability kernel incorporating the similarity metric of local descriptors; and means for image classifying by normalizing image-level feature vector using linear SVMs.
- a method for image classification includes extracting local image descriptors from a grid of locations in an image; nonlinearly coding extracted image descriptors to form a high-dimensional sparse vector; spatial pooling each image by partitioning into regions in different scales and locations, aggregating the codes of the descriptors in each region to form a single vector, and concatenating vectors of different regions to form the image-level feature vector; and linear classifying image-level feature vector.
- the system for image classification includes means for extracting local image descriptors from a grid of locations in an image; means for nonlinearly coding extracted image descriptors to form a high-dimensional sparse vector; means for spatial pooling each image by partitioning into regions in different scales and locations, aggregating the codes of all the descriptors in each region to form a single vector, means for concatenating vectors of different regions to form the image-level feature vector; and means for linear classifying image-level feature vector.
- Image classification can be done using local visual descriptors.
- the system is more scalable in computation, more transparent in classification, and more accurate than conventional systems.
- the overall image classification framework enjoys a linear training complexity, and also a great interpretability that is missing from conventional systems.
- FIG. 1 is a flow chart showing image classification method.
- FIG. 2 shows an exemplary system to perform image classification.
- the method receives an input image in 110 .
- the method performs a descriptor extraction in 120 .
- This operation extracts local image descriptors, such as SIFT, SURF, or any other local features, from a grid of locations in the image.
- the image is represented as a set of descriptor vectors with their 2D location coordinates.
- the method performs nonlinear coding in 130 . Each descriptor of an image is nonlinearly mapped to form a high-dimensional sparse vector.
- the invention proposes a novel nonlinear coding method called Super-Vector (SV) coding, which enjoys better theoretical properties than Vector Quantization (VQ) coding.
- the method performs spatial pooling where each image is partitioned into regions in different scales and locations. For each region, the codes of all the descriptors in it are aggregated to form a single vector, then vectors of different regions are concatenated to form the image-level feature vector.
- a probability kernel incorporating the similarity metric of local descriptors can be used in one embodiment as described in detail below.
- the process performs linear classification in 150 .
- the image-level feature vector is normalized and fed into a classifier to detect an object such as a cat in 160 .
- Linear SVMs, which scale linearly with the size of the training data, are used in the method. In contrast, the previous state-of-the-art systems used nonlinear SVMs, which require quadratic or higher-order computational complexity for training.
- the descriptor coding enjoys appealing theoretical properties. The goal is to learn a smooth nonlinear function $f(x)$ defined on a high-dimensional space $\mathbb{R}^d$.
- the question is how to derive a good coding scheme (or nonlinear mapping) $\phi(x)$ such that $f(x)$ can be well approximated by a linear function on it, namely $w^T \phi(x)$. The assumption here is that $f(x)$ should be sufficiently smooth.
- VQ Vector Quantization
- $v^*(x) = \arg\min_{v \in C} \lVert x - v \rVert$, where $\lVert \cdot \rVert$ is the Euclidean norm (2-norm).
- $f(x)$ is $\beta$-Lipschitz derivative smooth if for all $x, x' \in \mathbb{R}^d$:
- $\phi(x) = [\underbrace{0, \ldots, 0}_{d+1\ \text{dim.}},\ \underbrace{s, (x - v)^T}_{d+1\ \text{dim.}},\ \underbrace{0, \ldots, 0}_{d+1\ \text{dim.}}]^T$ (3)
- SV coding may achieve a lower function approximation error than VQ coding. It should be noted that the popular bag-of-features image classification method essentially employs VQ to obtain histogram representations. The proposed SV coding is a simple extension of VQ, and may lead to a better approach to image classification.
- Each image can be represented as a set of descriptor vectors $x$ that follows an image-specific distribution, represented as a probability density function $p(x)$ with respect to an image-independent background measure $d\mu(x)$. Let us first ignore the spatial locations of $x$ and address spatial pooling later.
- a kernel-based method for image classification is based on a kernel on the probability distributions over $x \in \Omega$, $K: \mathcal{P} \times \mathcal{P} \to \mathbb{R}$.
- A well-known example is the Bhattacharyya kernel:
- $K_b(p, q) = \int p(x)^{1/2} q(x)^{1/2} \, d\mu(x)$.
- KL Kullback Leibler
- $K(X, X') = \frac{1}{N N'} \sum_{x \in X} \sum_{x' \in X'} p(x)^{-1/2} q(x')^{-1/2} \kappa(x, x')$ (4)
- N and N′ are the sizes of the descriptor sets from two images.
- $X_k$ is the subset of $X$ falling into the k-th cluster.
- weighting by histogram $p_k$ is equivalent to treating density $p(x)$ as piecewise constant around each VQ basis, under a specific choice of background measure $\mu(x)$ that equalizes different partitions.
- This representation is not sensitive to the choice of background measure $\mu(x)$, which is image independent.
- Let each image be evenly partitioned into 1×1, 2×2, and 3×1 blocks, respectively, in 3 different levels. Based on which block each descriptor comes from, the whole set $X$ of an image is then organized into three levels of subsets: $X_{11}^1$, $X_{11}^2$, $X_{12}^2$, $X_{21}^2$, $X_{22}^2$, $X_{11}^3$, $X_{12}^3$, and $X_{13}^3$. Then the pooling operation introduced in the last subsection can be applied to each of the subsets.
- the above equation provides an interesting insight into the classification process: patch-level pattern matching is performed everywhere in the image, and the responses are then aggregated to generate a score indicating how likely it is that a particular category of objects is present. This observation is well aligned with biologically-inspired vision models, such as Convolutional Neural Networks and the HMAX model, which mostly employ feed-forward pattern matching for object recognition.
- the classification model enjoys the advantages of interpretability and computational scalability.
- Eq. (5) suggests that one can compute a response map based on $g(x)$, which visualizes where the classifier focuses in the image. Since the proposed method naturally requires a linear classifier, its training cost scales linearly with the number of training images, while nonlinear kernel-based methods suffer quadratic or higher complexity.
- the classification model is more related to local coordinate coding (LCC), which points out that in some cases a desired sparsity of $\phi(x)$ should come from a locality of the coding scheme.
- LCC local coordinate coding
- the proposed SV coding leads to a highly sparse representation $\phi(x)$, as defined by Eq. (2), which activates those coordinates associated with the neighborhood of $x$.
- the computation of SV coding is much simpler than sparse coding approaches.
- the method can be further improved by considering a soft assignment of x to bases C.
- the underlying interpretation of $f(x) \approx w^T \phi(x)$ is the approximation $f(x) \approx f(v^*(x)) + \nabla f(v^*(x))^T (x - v^*(x))$, which essentially uses the unknown function's Taylor expansion at a nearby location $v^*(x)$ to interpolate $f(x)$.
- One natural idea to improve this is to use several neighbors in $C$ instead of the nearest one. Consider a soft K-means that computes $p_k(x)$, the posterior probability of cluster assignment for $x$. Then the function approximation can be handled as the expectation
- the invention may be implemented in hardware, firmware or software, or a combination of the three.
- the invention is implemented in a computer program executed on a programmable computer having a processor, a data storage system, volatile and non-volatile memory and/or storage elements, at least one input device and at least one output device.
- the computer preferably includes a processor, random access memory (RAM), a program memory (preferably a writable read-only memory (ROM) such as a flash ROM) and an input/output (I/O) controller coupled by a CPU bus.
- the computer may optionally include a hard drive controller which is coupled to a hard disk and CPU bus. Hard disk may be used for storing application programs, such as the present invention, and data. Alternatively, application programs may be stored in RAM or ROM.
- I/O controller is coupled by means of an I/O bus to an I/O interface.
- I/O interface receives and transmits data in analog or digital form over communication links such as a serial link, local area network, wireless link, and parallel link.
- a display, a keyboard and a pointing device may also be connected to I/O bus.
- separate connections may be used for I/O interface, display, keyboard and pointing device.
- Programmable processing system may be preprogrammed or it may be programmed (and reprogrammed) by downloading a program from another source (e.g., a floppy disk, CD-ROM, or another computer).
- the system of FIG. 2 receives images to be classified. Each image is represented by a set of local descriptors with their spatial coordinates.
- the descriptor can be SIFT, or any other local features, computed from image patches at locations on a 2D grid.
- the images are processed by a descriptor coding module where each descriptor of an image is nonlinearly mapped to form a high-dimensional sparse vector.
- a nonlinear coding method called vector machine coding can be used, which is an extension of Vector Quantization (VQ) coding.
- VQ Vector Quantization
- For each region, the codes of all the descriptors in it are aggregated to form a single vector, then vectors of different regions are concatenated to form the image-level feature vector.
- This step is based on a novel probability kernel incorporating the similarity metric of local descriptors.
- the image-level feature vector is normalized and fed into a classifier. Linear SVMs, which scale linearly with the size of the training data, are used in this step.
- Each computer program is tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein.
- the inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.
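The classification step described earlier in this section (normalize the image-level feature vector, then apply a linear SVM) can be sketched as follows. This is a minimal illustration under stated assumptions, not the solver used in the patent: the helper names `l2_normalize`, `train_linear_svm`, and `predict`, the choice of L2 normalization, the full-batch subgradient method, and the hyperparameters `lam`, `lr`, and `epochs` are all hypothetical. It is only meant to show that each pass over the data costs time linear in the number of training images.

```python
import numpy as np

def l2_normalize(features, eps=1e-12):
    """Scale each row (one image-level feature vector) to unit L2 norm."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularized hinge loss.

    y must contain labels in {-1, +1}. Each epoch costs O(N * D), so the
    cost of one pass grows linearly with the number of training images N.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1.0                     # examples violating the margin
        y_act = y[active][:, None]
        grad_w = lam * w - (y_act * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Sign of the linear score, returned as labels in {-1, +1}."""
    return np.where(X @ w + b >= 0.0, 1, -1)

# Toy image-level feature vectors (illustrative values only).
X = l2_normalize(np.array([[2.0, 0.0], [3.0, 1.0], [2.5, 0.5],
                           [0.0, 2.0], [1.0, 3.0], [0.5, 2.5]]))
y = np.array([1, 1, 1, -1, -1, -1])
w, b = train_linear_svm(X, y)
labels = predict(X, w, b)
```

In practice a dedicated linear SVM solver would replace the hand-written loop; the loop is kept here only so the sketch has no dependencies beyond NumPy.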
Abstract
Description
where $\gamma(x) = [\gamma_v(x)]_{v \in C}$ are the coefficients, and sometimes $\sum_v \gamma_v(x) = 1$. By restricting the cardinality of nonzeros of $\gamma(x)$ to be 1 and $\gamma_v(x) \ge 0$, the Vector Quantization (VQ) method is obtained as:
$$v^*(x) = \arg\min_{v \in C} \lVert x - v \rVert,$$
where $\lVert \cdot \rVert$ is the Euclidean norm (2-norm). The VQ method uses the coding $\gamma_v(x) = 1$ if $v = v^*(x)$ and $\gamma_v(x) = 0$ otherwise. $f(x)$ is $\beta$-Lipschitz derivative smooth if for all $x, x' \in \mathbb{R}^d$:
It immediately implies the following simple function approximation bound via VQ coding: for all $x \in \mathbb{R}^d$:
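The displayed inequalities that the two preceding sentences introduce are not reproduced in this text. A standard form of the smoothness condition, together with the bound that follows from it by setting $x' = v^*(x)$, is sketched below. Both are assumptions chosen to be consistent with the Taylor-expansion interpretation given later in the description; they are not quoted from the patent.

```latex
% Assumed form of beta-Lipschitz derivative smoothness:
\left| f(x) - f(x') - \nabla f(x')^{T}(x - x') \right|
  \le \frac{\beta}{2}\,\lVert x - x' \rVert^{2}
  \quad \text{for all } x, x' \in \mathbb{R}^{d}

% Assumed form of the VQ approximation bound, taking x' = v^*(x):
\left| f(x) - f(v^{*}(x)) - \nabla f(v^{*}(x))^{T}(x - v^{*}(x)) \right|
  \le \frac{\beta}{2}\,\lVert x - v^{*}(x) \rVert^{2}
  \qquad (1)
```

Under this reading, the error of the linear approximation is controlled by the squared distance from $x$ to its nearest codeword, which is the quantity K-means minimizes when learning the codebook $C$.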
Eq. (1) also suggests that the approximation to ƒ(x) can be expressed as a linear function on a nonlinear coding scheme
$f(x) \approx g(x) \equiv w^T \phi(x)$,
where $\phi(x)$ is called the Super-Vector (SV) coding of $x$, defined by
$\phi(x) = [s\gamma_v(x),\ \gamma_v(x)(x - v)^T]_{v \in C}^T$ (2)
where s is a nonnegative constant. It is not difficult to see that
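which can be regarded as unknown parameters to be estimated. Because $\gamma_v(x) = 1$ if $v = v^*(x)$, otherwise $\gamma_v(x) = 0$, the obtained $\phi(x)$ is a highly sparse representation, with dimensionality $|C|(d+1)$. For example, if $|C| = 3$ and $\gamma(x) = [0, 1, 0]$, then
$$\phi(x) = [\underbrace{0, \ldots, 0}_{d+1\ \text{dim.}},\ \underbrace{s, (x - v)^T}_{d+1\ \text{dim.}},\ \underbrace{0, \ldots, 0}_{d+1\ \text{dim.}}]^T \quad (3)$$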
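For the $|C| = 3$ example, a minimal NumPy sketch of hard-assignment SV coding per Eq. (2) may help. The function name `sv_code`, the toy codebook, and the value of `s` are illustrative assumptions; in practice the codebook would be learned (for example by K-means) from training descriptors.

```python
import numpy as np

def sv_code(x, codebook, s=1.0):
    """Hard-assignment Super-Vector code of x, following Eq. (2).

    The output has |C| blocks of d+1 entries; only the block of the
    nearest codeword v*(x) is nonzero and holds [s, (x - v)^T].
    """
    num_codewords, d = codebook.shape
    # VQ step: index of the codeword nearest to x in Euclidean norm.
    k = int(np.argmin(np.linalg.norm(codebook - x, axis=1)))
    phi = np.zeros(num_codewords * (d + 1))
    start = k * (d + 1)
    phi[start] = s
    phi[start + 1:start + d + 1] = x - codebook[k]
    return phi

# Toy codebook with |C| = 3 codewords in d = 2 dimensions (illustrative values).
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]])
x = np.array([1.5, 0.5])
phi = sv_code(x, codebook, s=2.0)
print(phi)  # only the middle block of d+1 entries is nonzero
```

Here `x` is nearest to the second codeword, so the code has the same block layout as the $\gamma(x) = [0, 1, 0]$ example, with total length $|C|(d+1) = 9$.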
where $\kappa(x, x')$ is an RKHS kernel on $\Omega$ that reflects the similarity structure of $x$. In the extreme case where $\kappa(x, x') = \delta(x - x')$ is the delta function with respect to $\mu(\cdot)$, the above kernel reduces to the Bhattacharyya kernel.
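As a concrete illustration of the Bhattacharyya kernel mentioned here, the sketch below evaluates it for two discrete distributions, such as visual-word histograms. It assumes a counting measure over a finite set of bins, so the integral becomes a sum of $\sqrt{p_k q_k}$; the function name and the toy histograms are hypothetical.

```python
import numpy as np

def bhattacharyya_kernel(p, q):
    """Bhattacharyya kernel between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Normalize so each histogram sums to 1 before comparing.
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))

hist_a = [4, 4, 0, 0]   # toy visual-word counts for image A
hist_b = [0, 4, 4, 0]   # toy visual-word counts for image B
k_ab = bhattacharyya_kernel(hist_a, hist_b)  # partial overlap
k_aa = bhattacharyya_kernel(hist_a, hist_a)  # identical histograms
```

The kernel equals 1 for identical distributions and shrinks toward 0 as the distributions stop sharing bins, which is why bins that never co-occur contribute nothing.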
where N and N′ are the sizes of the descriptor sets from two images.
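The empirical kernel of Eq. (4) can be sketched directly as a double sum over the two descriptor sets. In the sketch below the descriptor-level kernel is a Gaussian (a hypothetical choice, named `gaussian_kappa`), and the density values `p_vals` and `q_vals` are passed in as given numbers; how they are estimated is outside the scope of this illustration.

```python
import numpy as np

def gaussian_kappa(x, x_prime, bandwidth=1.0):
    """Hypothetical choice of descriptor-level kernel kappa(x, x')."""
    diff = x - x_prime
    return np.exp(-np.dot(diff, diff) / (2.0 * bandwidth ** 2))

def set_kernel(X, X_prime, p_vals, q_vals, kappa=gaussian_kappa):
    """Empirical kernel of Eq. (4) between two descriptor sets.

    p_vals[i] is the density p(x_i) for X; q_vals[j] is q(x'_j) for X'.
    """
    n, n_prime = len(X), len(X_prime)
    total = 0.0
    for x, p_x in zip(X, p_vals):
        for x_p, q_x in zip(X_prime, q_vals):
            # Each pair is weighted by p(x)^(-1/2) * q(x')^(-1/2).
            total += kappa(x, x_p) / np.sqrt(p_x * q_x)
    return total / (n * n_prime)

# Toy descriptor sets with constant densities (illustrative values only).
X = np.array([[0.0, 0.0], [1.0, 0.0]])
X_prime = np.array([[0.0, 0.0]])
value = set_kernel(X, X_prime, p_vals=[1.0, 1.0], q_vals=[1.0])
value_swapped = set_kernel(X_prime, X, p_vals=[1.0], q_vals=[1.0, 1.0])
```

The double loop costs $O(N N')$ kernel evaluations per image pair. When $\kappa$ is an inner product of codes, the sum factorizes into an inner product of two pooled vectors, which is what makes a linear classifier applicable.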
where $X_k$ is the subset of $X$ falling into the k-th cluster. Furthermore, assume that $p(x)$ remains constant within each cluster partition, i.e., $p(x)$ gives rise to a histogram $[p_k]_{k=1}^{|C|}$, then
$\phi_s(X) = [\Phi(X_{11}^1), \Phi(X_{11}^2), \Phi(X_{12}^2), \Phi(X_{21}^2), \Phi(X_{22}^2), \Phi(X_{11}^3), \Phi(X_{12}^3), \Phi(X_{13}^3)]$
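The concatenation above can be sketched as follows. The pooling operator $\Phi$ is not written out in this text, so the sketch assumes $\Phi(X) = \frac{1}{N} \sum_k p_k^{-1/2} \sum_{x \in X_k} \phi(x)$, which is what Eq. (4) yields when $\kappa$ is the inner product of SV codes and $p(x)$ is constant within each cluster. Taking $p_k$ as the fraction of a region's descriptors assigned to cluster $k$, the function names, and the reading of "3×1" as three rows by one column are all assumptions; passing `(1, 3)` instead gives three columns.

```python
import numpy as np

def pool_region(codes, assignments, num_clusters):
    """Pool the SV codes of one region into a single vector.

    Assumed form: (1/N) * sum_k p_k^(-1/2) * sum_{x in X_k} phi(x), with
    p_k the fraction of the region's descriptors assigned to cluster k.
    """
    n = len(codes)
    if n == 0:
        return np.zeros(codes.shape[1])
    counts = np.bincount(assignments, minlength=num_clusters)
    p = counts / n
    weights = 1.0 / np.sqrt(p[assignments])  # p_k > 0 for every occupied cluster
    return (weights[:, None] * codes).sum(axis=0) / n

def spatial_pyramid(codes, assignments, xy, width, height, num_clusters,
                    grids=((1, 1), (2, 2), (3, 1))):
    """Concatenate pooled vectors over all blocks of each (rows, cols) grid."""
    pooled = []
    for rows, cols in grids:
        # Block index of each descriptor from its (x, y) location.
        col_idx = np.minimum((xy[:, 0] * cols / width).astype(int), cols - 1)
        row_idx = np.minimum((xy[:, 1] * rows / height).astype(int), rows - 1)
        for r in range(rows):
            for c in range(cols):
                mask = (row_idx == r) & (col_idx == c)
                pooled.append(pool_region(codes[mask], assignments[mask],
                                          num_clusters))
    return np.concatenate(pooled)

# Toy input: 4 descriptors, |C| = 2, d = 2, so each code has 2 * (2 + 1) = 6 entries.
codes = np.arange(24, dtype=float).reshape(4, 6)
assignments = np.array([0, 1, 0, 1])
xy = np.array([[10.0, 10.0], [90.0, 10.0], [10.0, 90.0], [90.0, 90.0]])
feature = spatial_pyramid(codes, assignments, xy, width=100, height=100,
                          num_clusters=2)
```

With 1 + 4 + 3 = 8 blocks, the image-level feature vector has $8 \cdot |C|(d+1)$ entries; blocks that contain no descriptors contribute zero vectors.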
where $g(x) = w^T \phi(x)$. The above equation provides an interesting insight into the classification process: patch-level pattern matching is performed everywhere in the image, and the responses are then aggregated to generate a score indicating how likely it is that a particular category of objects is present. This observation is well aligned with biologically-inspired vision models, such as Convolutional Neural Networks and the HMAX model, which mostly employ feed-forward pattern matching for object recognition.
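The patch-level responses $g(x) = w^T \phi(x)$ can be arranged on the sampling grid to visualize where a classifier responds. The sketch below is a simplification under stated assumptions: it uses a single weight vector and ignores the per-region pooling weights and pyramid levels, and the function name, toy codes, and toy weights are hypothetical.

```python
import numpy as np

def response_map(codes, w, grid_rows, grid_cols):
    """Per-descriptor responses g(x) = w^T phi(x), arranged on the sampling grid.

    codes: (grid_rows * grid_cols, D) array of codes in row-major grid order.
    w:     (D,) weight vector of a trained linear classifier.
    """
    responses = codes @ w
    return responses.reshape(grid_rows, grid_cols)

# Toy codes for a 2 x 2 grid of patches and toy classifier weights.
codes = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
w = np.array([2.0, -1.0])
heat = response_map(codes, w, 2, 2)
```

Positive entries mark patches that push the score toward the category and negative entries mark patches that push against it.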
$f(x) \approx f(v^*(x)) + \nabla f(v^*(x))^T (x - v^*(x))$
which essentially uses the unknown function's Taylor expansion at a nearby location $v^*(x)$ to interpolate $f(x)$. One natural idea to improve this is to use several neighbors in $C$ instead of the nearest one. Consider a soft K-means that computes $p_k(x)$, the posterior probability of cluster assignment for $x$. Then the function approximation can be handled as the expectation
Then the pooling step becomes a computation of the expectation
where
and $s$ comes from Eq. (2). This approach differs from image classification using Gaussian mixture models (GMMs). Those GMM methods consider only the distribution kernel, while the inventive method incorporates nonlinear coding into the distribution kernel. Furthermore, the model according to the invention stays close to VQ: the soft version requires all the components to share the same isotropic diagonal covariance. This leaves far fewer parameters to estimate, and therefore a significantly higher accuracy can be obtained.
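The expectation formulas for the soft-assignment variant are not reproduced in this text, so the following is only one plausible reading. It assumes equal cluster priors and a shared isotropic covariance $\sigma^2 I$, and it forms the code by using the posterior $p_k(x)$ in place of $\gamma_v(x)$ in Eq. (2). The function name `soft_sv_code` and the parameter `sigma` are hypothetical.

```python
import numpy as np

def soft_sv_code(x, codebook, sigma=1.0, s=1.0):
    """Soft-assignment variant of the SV code (a sketch, not the patent's formula).

    Posteriors p_k(x) come from a mixture with equal priors and a shared
    isotropic covariance sigma^2 * I.
    """
    sq_dist = np.sum((codebook - x) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    logits -= logits.max()                    # stabilize the softmax
    post = np.exp(logits)
    post /= post.sum()
    # Block k is p_k(x) * [s, (x - v_k)^T], i.e. Eq. (2) with gamma_v(x) = p_k(x).
    ones = np.full((len(codebook), 1), s)
    blocks = post[:, None] * np.hstack([ones, x - codebook])
    return blocks.ravel(), post

# Toy example: x is equidistant from both codewords.
codebook = np.array([[0.0, 0.0], [2.0, 0.0]])
x = np.array([1.0, 0.0])
code, post = soft_sv_code(x, codebook)
```

As `sigma` shrinks, the posterior concentrates on the nearest codeword and the code approaches the hard-assignment SV code. The only extra parameter relative to VQ is the shared `sigma`, in line with the remark above about the small number of parameters.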
Claims (6)
$\phi(x) = [s\gamma_v(x),\ \gamma_v(x)(x - v)^T]_{v \in C}^T$,
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/818,156 US8447119B2 (en) | 2010-03-16 | 2010-06-18 | Method and system for image classification |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US31438610P | 2010-03-16 | 2010-03-16 | |
US12/818,156 US8447119B2 (en) | 2010-03-16 | 2010-06-18 | Method and system for image classification |
Publications (2)
Publication Number | Publication Date |
---|---|
US20110229045A1 (en) | 2011-09-22 |
US8447119B2 (en) | 2013-05-21 |
Family
ID=44647307
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
US12/818,156 (US8447119B2) | Active, expires 2031-07-07 | 2010-03-16 | 2010-06-18 | Method and system for image classification |
Country Status (1)
Country | Link |
---|---|
US (1) | US8447119B2 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9424493B2 (en) | 2014-10-09 | 2016-08-23 | Microsoft Technology Licensing, Llc | Generic object detection in images |
CN106203396A (en) * | 2016-07-25 | 2016-12-07 | 南京信息工程大学 | Aerial Images object detection method based on degree of depth convolution and gradient rotational invariance |
CN107071858A (en) * | 2017-03-16 | 2017-08-18 | 许昌学院 | A kind of subdivision remote sensing image method for parallel processing under Hadoop |
US9858496B2 (en) | 2016-01-20 | 2018-01-02 | Microsoft Technology Licensing, Llc | Object detection and classification in images |
US20180053057A1 (en) * | 2016-08-18 | 2018-02-22 | Xerox Corporation | System and method for video classification using a hybrid unsupervised and supervised multi-layer architecture |
US11134221B1 (en) | 2017-11-21 | 2021-09-28 | Daniel Brown | Automated system and method for detecting, identifying and tracking wildlife |
Families Citing this family (29)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120213426A1 (en) * | 2011-02-22 | 2012-08-23 | The Board Of Trustees Of The Leland Stanford Junior University | Method for Implementing a High-Level Image Representation for Image Analysis |
JP6050223B2 (en) | 2011-11-02 | 2016-12-21 | Panasonic Intellectual Property Corporation of America | Image recognition apparatus, image recognition method, and integrated circuit |
US9412020B2 (en) | 2011-11-09 | 2016-08-09 | Board Of Regents Of The University Of Texas System | Geometric coding for billion-scale partial-duplicate image search |
CN103164713B (en) * | 2011-12-12 | 2016-04-06 | 阿里巴巴集团控股有限公司 | Image classification method and device |
JP2015529365A (en) * | 2012-09-05 | 2015-10-05 | エレメント,インク. | System and method for biometric authentication associated with a camera-equipped device |
CN103049760B (en) * | 2012-12-27 | 2016-05-18 | 北京师范大学 | Based on the rarefaction representation target identification method of image block and position weighting |
CN103106265B (en) * | 2013-01-30 | 2016-10-12 | 北京工商大学 | Similar image sorting technique and system |
US9141885B2 (en) * | 2013-07-29 | 2015-09-22 | Adobe Systems Incorporated | Visual pattern recognition in an image |
CN103499584B (en) * | 2013-10-16 | 2016-02-17 | 北京航空航天大学 | Railway wagon hand brake chain bar loses the automatic testing method of fault |
CN103544483B (en) * | 2013-10-25 | 2016-09-14 | 合肥工业大学 | A kind of joint objective method for tracing based on local rarefaction representation and system thereof |
CN103927540B (en) * | 2014-04-03 | 2019-01-29 | 华中科技大学 | A kind of invariant feature extraction method based on biological vision hierarchical mode |
US9913135B2 (en) | 2014-05-13 | 2018-03-06 | Element, Inc. | System and method for electronic key provisioning and access management in connection with mobile devices |
CN106664207B (en) | 2014-06-03 | 2019-12-13 | 埃利蒙特公司 | Attendance verification and management in relation to mobile devices |
EP3204888A4 (en) * | 2014-10-09 | 2017-10-04 | Microsoft Technology Licensing, LLC | Spatial pyramid pooling networks for image processing |
US10339422B2 (en) * | 2015-03-19 | 2019-07-02 | Nec Corporation | Object detection device, object detection method, and recording medium |
US9514391B2 (en) * | 2015-04-20 | 2016-12-06 | Xerox Corporation | Fisher vectors meet neural networks: a hybrid visual classification architecture |
CN105512677B (en) * | 2015-12-01 | 2019-02-01 | 南京信息工程大学 | Classifying Method in Remote Sensing Image based on Hash coding |
CN105654122B (en) * | 2015-12-28 | 2018-11-16 | 江南大学 | Based on the matched spatial pyramid object identification method of kernel function |
CN106909895B (en) * | 2017-02-17 | 2020-09-22 | 华南理工大学 | Gesture recognition method based on random projection multi-kernel learning |
CN107220659B (en) * | 2017-05-11 | 2019-10-25 | 西安电子科技大学 | High Resolution SAR image classification method based on total sparse model |
CA3076038C (en) | 2017-09-18 | 2021-02-02 | Element Inc. | Methods, systems, and media for detecting spoofing in mobile authentication |
US11080324B2 (en) * | 2018-12-03 | 2021-08-03 | Accenture Global Solutions Limited | Text domain image retrieval |
KR20220004628A (en) | 2019-03-12 | 2022-01-11 | 엘리먼트, 인크. | Detection of facial recognition spoofing using mobile devices |
CN110198473B (en) * | 2019-06-10 | 2021-07-20 | 北京字节跳动网络技术有限公司 | Video processing method and device, electronic equipment and computer readable storage medium |
US11507248B2 (en) | 2019-12-16 | 2022-11-22 | Element Inc. | Methods, systems, and media for anti-spoofing using eye-tracking |
CN112001399B (en) * | 2020-09-07 | 2023-06-09 | 中国人民解放军国防科技大学 | Image scene classification method and device based on local feature saliency |
CN113326880A (en) * | 2021-05-31 | 2021-08-31 | 南京信息工程大学 | Unsupervised image classification method based on community division |
CN113793327B (en) * | 2021-09-18 | 2023-12-26 | 北京中科智眼科技有限公司 | Token-based high-speed rail foreign matter detection method |
CN116092701B (en) * | 2023-03-07 | 2023-06-30 | 南京康尔健医疗科技有限公司 | Control system and method based on health data analysis management platform |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050058339A1 (en) * | 2003-09-16 | 2005-03-17 | Fuji Xerox Co., Ltd. | Data recognition device |
US20110044530A1 (en) * | 2009-08-19 | 2011-02-24 | Sen Wang | Image classification using range information |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050058339A1 (en) * | 2003-09-16 | 2005-03-17 | Fuji Xerox Co., Ltd. | Data recognition device |
US20110044530A1 (en) * | 2009-08-19 | 2011-02-24 | Sen Wang | Image classification using range information |
Non-Patent Citations (2)
Title |
---|
Jianchao Yang,et al., Linear Spatial Pyramid Matching Using Sparse Coding for Image Classification, IEEE Conference on Computer Vision and Pattern Recognition, 2009. CVPR 2009. |
Svetlana Lazebnik, et al., Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2006. |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9424493B2 (en) | 2014-10-09 | 2016-08-23 | Microsoft Technology Licensing, Llc | Generic object detection in images |
US9858496B2 (en) | 2016-01-20 | 2018-01-02 | Microsoft Technology Licensing, Llc | Object detection and classification in images |
CN106203396A (en) * | 2016-07-25 | 2016-12-07 | 南京信息工程大学 | Aerial Images object detection method based on degree of depth convolution and gradient rotational invariance |
CN106203396B (en) * | 2016-07-25 | 2019-05-10 | 南京信息工程大学 | Aerial Images object detection method based on depth convolution sum gradient rotational invariance |
US20180053057A1 (en) * | 2016-08-18 | 2018-02-22 | Xerox Corporation | System and method for video classification using a hybrid unsupervised and supervised multi-layer architecture |
US9946933B2 (en) * | 2016-08-18 | 2018-04-17 | Xerox Corporation | System and method for video classification using a hybrid unsupervised and supervised multi-layer architecture |
CN107071858A (en) * | 2017-03-16 | 2017-08-18 | 许昌学院 | A kind of subdivision remote sensing image method for parallel processing under Hadoop |
US11134221B1 (en) | 2017-11-21 | 2021-09-28 | Daniel Brown | Automated system and method for detecting, identifying and tracking wildlife |
Also Published As
Publication number | Publication date |
---|---|
US20110229045A1 (en) | 2011-09-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8447119B2 (en) | Method and system for image classification | |
Zhou et al. | Image classification using super-vector coding of local image descriptors | |
US10796145B2 (en) | Method and apparatus for separating text and figures in document images | |
US8374442B2 (en) | Linear spatial pyramid matching using sparse coding | |
US9978002B2 (en) | Object recognizer and detector for two-dimensional images using Bayesian network based classifier | |
Wu et al. | Conformal transformation of kernel functions: A data-dependent way to improve support vector machine classifiers | |
US9053392B2 (en) | Generating a hierarchy of visual pattern classes | |
US8532399B2 (en) | Large scale image classification | |
US9031331B2 (en) | Metric learning for nearest class mean classifiers | |
JP5373536B2 (en) | Modeling an image as a mixture of multiple image models | |
US8233711B2 (en) | Locality-constrained linear coding systems and methods for image classification | |
US9141885B2 (en) | Visual pattern recognition in an image | |
Kiang et al. | An evaluation of self-organizing map networks as a robust alternative to factor analysis in data mining applications | |
US8428397B1 (en) | Systems and methods for large scale, high-dimensional searches | |
US20170061257A1 (en) | Generation of visual pattern classes for visual pattern regonition | |
US20210110215A1 (en) | Information processing device, information processing method, and computer-readable recording medium recording information processing program | |
De la Torre et al. | Multimodal oriented discriminant analysis | |
Wei et al. | Region ranking SVM for image classification | |
Kumar et al. | Semi-supervised robust mixture models in RKHS for abnormality detection in medical images | |
Han et al. | High-order statistics of microtexton for hep-2 staining pattern classification | |
Yang et al. | Projective non-negative matrix factorization with applications to facial image processing | |
Kumar et al. | Kernel generalized Gaussian and robust statistical learning for abnormality detection in medical images | |
Sohail et al. | Classification of ultrasound medical images using distance based feature selection and fuzzy-SVM | |
US11398057B2 (en) | Imaging system and detection method | |
Hammer et al. | How to visualize large data sets? |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
 | AS | Assignment | Owner name: NEC CORPORATION, JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NEC LABORATORIES AMERICA, INC.;REEL/FRAME:031998/0667 Effective date: 20140113 |
 | FPAY | Fee payment | Year of fee payment: 4 |
 | AS | Assignment | Owner name: NEC CORPORATION, JAPAN Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE REMOVE 8538896 AND ADD 8583896 PREVIOUSLY RECORDED ON REEL 031998 FRAME 0667. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT;ASSIGNOR:NEC LABORATORIES AMERICA, INC.;REEL/FRAME:042754/0703 Effective date: 20140113 |
 | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |