US20150371397A1 - Object Detection with Regionlets Re-localization - Google Patents

Object Detection with Regionlets Re-localization

Publication number
US20150371397A1
Authority
US
United States
Prior art keywords
detector
location
regionlets
localization
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US14/716,435
Other versions
US9235904B1 (en
Inventor
Xiaoyu Wang
Yuanqing Lin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Laboratories America Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Laboratories America Inc
Priority to US14/716,435 (patent US9235904B1)
Publication of US20150371397A1
Application granted
Publication of US9235904B1
Assigned to NEC CORPORATION. Assignment of assignors interest (see document for details). Assignors: NEC LABORATORIES AMERICA, INC.
Legal status: Active
Anticipated expiration

Classifications

    • G06T7/0081
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/77 Determining position or orientation of objects or cameras using statistical methods
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01B MEASURING LENGTH, THICKNESS OR SIMILAR LINEAR DIMENSIONS; MEASURING ANGLES; MEASURING AREAS; MEASURING IRREGULARITIES OF SURFACES OR CONTOURS
    • G01B11/00 Measuring arrangements characterised by the use of optical techniques
    • G01B11/14 Measuring arrangements characterised by the use of optical techniques for measuring distance or clearance between spaced objects or spaced apertures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06K9/6256
    • G06T7/0046
    • G06T7/0089
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/255 Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/20141

Definitions

  • the present invention relates to object detection systems and methods.
  • Standard sliding window based object detection requires dense classifier evaluation on densely sampled locations in scale space in order to achieve an accurate localization.
  • selective search based algorithms only evaluate the classifier on a small subset of object proposals. Notwithstanding the demonstrated success, object proposals do not guarantee perfect overlap with the object, leading to a suboptimal detection accuracy.
  • an object detector includes a bottom-up object hypotheses generation unit; a top-down object search with supervised descent unit; and an object re-localization unit with a localization model.
  • a method takes advantage of the rich spatial information encoded in the Regionlets object detection model for location prediction.
  • the method transfers the Regionlets feature extracted from the Regionlets model to a high dimensional sparse binary vector. This binary vector implicitly encodes thousands of object locations. Then the method learns a regression model based on the binary vector to predict the actual object location.
  • Implementations of the above aspects may include one or more of the following.
  • the system first relaxes the dense sampling of the scale space with coarse object proposals generated from bottom-up segmentations. Based on detection results on these proposals, the system conducts a top-down search to more precisely localize the object using supervised descent.
  • This two-stage detection strategy, dubbed location relaxation, is able to localize the object in the continuous parameter space.
  • the system and method leverage the rich spatial information learned from the Regionlets detection framework to determine where the object is precisely localized.
  • FIGS. 1A-1C show an exemplary process to perform accurate object detection with Location Relaxation and Regionlets Re-localization.
  • FIGS. 2A-2D show an illustration of an object detection framework.
  • FIG. 3 shows sample detection results on the PASCAL VOC 2007 dataset.
  • FIG. 4 shows an exemplary computer system to perform accurate object detection with Location Relaxation and Regionlets Re-localization.
  • An object may appear at any location and scale in an image, within the continuous parameter space spanned by (x, y, s, a), where (x, y) is the object center point, and s and a are the scale and aspect ratio of the object.
  • different aspect ratios generally correspond to different viewpoints, leaving a difficult open question for robust object detection.
  • FIG. 1 shows an exemplary process for detecting objects.
  • the process includes receiving an input image, extracting features therefrom, applying an object detector, and re-localizing the object ( 100 ).
  • Given a testing image, the object detection framework extracts features from the image and then applies the learned object detector to each possible location to detect the object. A binary decision, i.e., whether the location contains an object, is made based on scores provided by the object detector. In addition to traditional approaches, a location regression step improves the localization of the object. The approach has two steps shown in FIGS. 2 and 3, respectively.
  • the system and method transfer the 1-D feature extracted from Regionlets into an 8-dimensional binary vector as shown in 101.
  • These features have rich spatial information which helps to localize the object.
  • the system and method use least-squares learning to learn the coefficients for location regression based on the binary vector as shown in 102.
  • the detection framework is capable of precisely searching for the object in a full parameter space with favorable efficiency.
  • the system and method first relax dense sampling of the object location and scale, a step dubbed location relaxation, and only evaluate the detector at a much coarser set of locations and scales.
  • the system and method apply supervised descent search to find potential object hypotheses by simultaneously optimizing their center point, scale, and aspect ratio.
  • the resulting detections are much improved by supervised descent search but are still not sufficiently accurate in terms of localization.
  • the system and method use Regionlets Re-localization, which is naturally built based on the quantized Regionlets features, to directly predict the true object location based on results from supervised descent search.
  • By applying an object detector to bottom-up object proposals, the system and method obtain coarse detections, i.e., the bounding boxes shown in FIG. 2(b). Among them, the red box is a relatively confident detection compared to the others. Through the supervised descent search starting from the red bounding box, a better detection is obtained, shown as the dashed box in FIG. 2(c). Finally, the system and method apply Regionlets Re-localization to determine the object location as shown in FIG. 2(d). We show some sample detection results on the PASCAL VOC 2007 dataset in FIG. 3.
  • the system has three aspects. Firstly, coarse detection plus supervised descent search in a fully parameterized location space for generic object detection shows promising performance. Secondly, a novel Regionlets Re-localization method complements the suboptimal object localization performance given by object detectors. Finally, our detection framework achieves the best performance on the PASCAL VOC 2007 dataset without using any outside data. It also demonstrates superior performance on our self-collected car dataset.
  • Our object detection framework is composed of three key components: bottom-up object hypotheses generation, top-down object search with supervised descent and object re-localization with a localization model.
  • There are several alternatives for obtaining object hypotheses, for example through the objectness measurement, saliency analysis, or their combination, or by using segmentation cues. Because our top-down search algorithm is applied locally, the system and method expect the bottom-up object hypotheses to split the object location space evenly, to avoid the search algorithm converging to the same local minimum. To this end, the system and method employ low-level segmentation to propose the object hypotheses.
  • the superpixel segmentation merges similar pixels locally into disjoint sets, which perfectly matches our need. However, over-segments only provide small object candidates. To obtain object hypotheses for large objects, the over-segmented superpixels are gradually merged to produce larger candidates.
  • the detection with location relaxation takes coarse detection results from a detector applied on the bottom-up object proposals. Then it searches for the object location guided by a discriminatively learned descent model inspired by Xiong and De la Torre.
  • the learned supervised descent model is used to predict the next, more accurate object location to explore based on observations from the current location.
  • Although our method is applicable to any black-box object detector, the system and method use the Regionlets detector due to its outstanding performance and its flexibility to detect objects from any viewpoint.
  • the system and method employ a segmentation based bottom-up scheme to generate our initial set of candidate searching locations.
  • over-segments i.e., superpixels
  • a segmented region $r_i$ is described by several characteristics, i.e., the size of the region (total number of pixels), color histograms, and the texture information (gradient orientation histograms).
  • Four neighbor region similarities are defined based on these characteristics as shown in the following equations:
  • $c_i^k$ is the $k$-th dimension of the color histogram
  • $sz(r_i)$ is the number of pixels in image region $r_i$
  • $im$ stands for the whole image
  • $t_i^k$ is the $k$-th dimension of the texture histogram
  • $bb_{ij}$ is the rectangular region which tightly bounds regions $r_i$ and $r_j$.
  • $S_c$, $S_s$ and $S_t$ are the color similarity, size similarity, and texture similarity, respectively.
  • $S_f$ measures how well the two combined regions fill the rectangular bounding box which tightly bounds them.
  • the similarity of two adjacent regions can be determined by any combination of the four similarities.
  • each merging step produces a bounding box which bounds the merged two regions.
  • the system and method want regions from the same object to be merged together.
  • Each low-level cue contributes from its own aspect.
  • the color similarity measures the color intensity correlation between neighboring regions, which encourages regions similar in color to be merged together.
  • the size similarity encourages small regions to merge first.
  • the fill similarity encourages the bounding box to tightly bound the merged region.
  • the texture similarity measures the similarity of appearance in gradient, which is complementary to color similarity. The usage of similarity measures and segmentation parameters is detailed in the experiment section.
  • the system and method apply an object detector to determine relatively confident detections.
  • the top-down supervised descent search is only applied to these confident detections.
  • Supervised descent is a general approach to optimizing an objective function which is neither analytically differentiable nor practical to approximate numerically. It is well suited to vision problems in which visual features are involved in optimizing the objective function, because most visual features, such as SIFT, HOG, and LBP histograms, are not differentiable with respect to location.
  • supervised descent uses a large number of examples to train a regression model to predict the descent direction.
  • the training process requires the features, which serve as the regressors, to form a fixed-length vector, while bottom-up segmentation naturally produces proposals of arbitrary size.
  • the system and method normalize the bounding boxes to a fixed size. In the following, the system and method explain how the supervised descent is adopted to find objects in a full parameter space.
  • the goal of the supervised descent training process is hence to learn a sequence of K models to predict the optimal descent direction of the bounding box for each step of the supervised descent, where the needed supervised descent step K is also automatically identified from the training process.
  • $\Phi(o_{k-1})$ is the $n$-dimensional feature vector extracted from the bounding box defined by $o_{k-1}$ in step $k-1$ of the supervised descent process
  • $\Phi(\cdot)$ indicates the extracted feature, which is the HOG and LBP histogram in our experiments.
  • the system and method update the new locations determined by the previous model $R_{k-1}$ and $b_{k-1}$,
  • $o_k^i = o_{k-1}^i + R_{k-1}^T \Phi(o_{k-1}^i) + b_{k-1}$. (6)
  • Given a testing image, the system and method first apply the cascade Regionlets detector [23] to the coarse bottom-up object candidates. Object hypotheses which produce high detection scores are fed to the iterative supervised descent search process to perform local search. New locations output by supervised descent search are re-evaluated by the object detector to obtain the detection score. By ranking all the detection scores from searched locations, the system and method keep the most confident detections.
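The iterative update of Equation (6) can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function name `supervised_descent`, the `(x, y, s, a)` array layout, and the toy feature function are assumptions made for the example. In the text the features are HOG and LBP histograms and the models are learned from training data; here both are stand-ins so that the loop can run on its own.

```python
import numpy as np

def supervised_descent(o0, models, extract_features):
    """Apply o_k = o_{k-1} + R_{k-1}^T phi(o_{k-1}) + b_{k-1} for each learned step.

    o0               : initial box parameters, assumed layout (x, y, s, a)
    models           : sequence of (R, b) pairs, one per descent step
    extract_features : callable mapping box parameters to a feature vector
    """
    o = np.asarray(o0, dtype=float)
    for R, b in models:
        phi = extract_features(o)   # features observed at the current location
        o = o + R.T @ phi + b       # predicted move toward the object
    return o

# Toy stand-in (assumption): the "feature" is the offset to a known target,
# so a model with R = 0.5 * I halves the remaining error at every step.
target = np.array([8.0, 4.0, 2.0, 1.0])
toy_features = lambda o: target - o
toy_models = [(0.5 * np.eye(4), np.zeros(4)) for _ in range(3)]

refined = supervised_descent(np.zeros(4), toy_models, toy_features)
```

In a real pipeline each refined location would then be re-scored by the detector, as the paragraph above describes.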
  • the supervised descent search introduced in the previous subsection significantly improves the detection rate by scanning more predicted object candidates.
  • the system and method assume the object has already been detected, but with imperfect localization.
  • the system and method train a model specific to object localization, taking advantage of features extracted from the Regionlets detection model.
  • the Regionlets detector is composed of thousands of weak classifiers learned with RealBoost. These weak classifiers are formed as several cascades for early rejection, yielding fast object detection. The cascade structure is not related to our re-localization approach and is omitted from the following presentation without loss of clarity.
  • the input of each weak classifier in the Regionlets model is a 1-D feature extracted from a rectangular region in the detection window. In the training process, these 1-D features are greedily chosen to minimize the logistic loss over all training samples, which is based on classification errors.
  • Not only does the Regionlets training process greedily select discriminative visual appearances, but it also determines the spatial regions from which to extract the 1-D feature.
  • the resulting weak features extracted from regionlets implicitly encode thousands of spatial locations, which could be used to further predict the precise location of an object. It is worth noting that the detector learning only targets minimizing the classification error, which does not necessarily guarantee that the localization error is also minimized at the same time.
  • the system and method let each Regionlet vote for the object's position.
  • the problem is equivalent to predicting the localization error ($\Delta l_n$, $\Delta t_n$, $\Delta r_n$, $\Delta b_n$) of the current detection so that the true object location is computed as:
  • (l*, t*, r*, b*) is the ground truth object location.
  • (l, t, r, b) is the bounding box detected with the Regionlets model.
  • ($\Delta l_n$, $\Delta t_n$, $\Delta r_n$, $\Delta b_n$) are the relative localization errors between the ground truth and the current detection. They are normalized by the width and height of the detected objects. Because detections from the Regionlets model have various sizes, the system and method observe that normalizing displacement errors is critical to stabilizing training and prediction.
  • $\Delta L$ is either $\Delta l_n$, $\Delta t_n$, $\Delta r_n$, or $\Delta b_n$
  • R is the feature extracted from regionlets.
  • Equation (9) is the regularization term, while C is a trade-off factor between the regularization and the sum of squared error, ⁇ is the tolerance factor.
  • the feature R is extracted from the discriminatively learned Regionlets detection model.
  • Regionlets features produces poor performance.
  • the system and method transfer the 1-D Regionlet feature into a sparse binary vector.
  • Each Regionlets weak classifier is a piece-wise linear function implemented using a lookup table:
  • $f_i$ is the 1-D feature extracted from a group of regionlets
  • $Q(f_i)$ quantizes the feature $f_i$ into an integer from 1 to 8.
  • $\{w_{i,j}\}_{j=1}^{8}$ are the classifier weights learned in the boosting training process.
  • the Regionlets object detector is a combination of N weak classifiers:
  • each Regionlets feature f i has 8 options to vote for the actual object location depending on the binarized feature vector r i .
  • Learning the weight vector V in Equation (9) is to jointly determine the votes (regression coefficients) in 8 different scenarios for all Regionlets features.
  • the sparse binary features extracted from regionlets are very high dimensional. We observed a significant over-fitting problem when there are not enough training samples. To avoid over-fitting during training, the system and method randomly sample 80k bounding boxes around ground truth objects to train the localization model.
  • the supervised descent search is designed to search more object candidates in a principled way to increase the detection rate, and a following discriminative visual model (Regionlets detector) is mandatory to determine the detection scores of new locations.
  • Regionlets Re-localization is only used to predict the accurate object location. No detector follows it to evaluate the new location, as is done in the supervised descent search. Thus it adjusts the detection to a more precise location without changing the detection score. In contrast, using the object detector to re-evaluate the detection score decreases the performance, because the newly predicted location usually gives a lower detection score, which causes the predicted location to be eliminated in the subsequent non-max suppression process.
  • the role of supervised descent search is to find objects based on detections with coarse locations. Regionlets Re-localization is conducted on fine detections from supervised descent search. It aims at further improving localization accuracy based on the reasonably good localizations from supervised descent search. Leaving out either of these two schemes would significantly hurt the detection performance according to our observation.
  • the computer preferably includes a processor, random access memory (RAM), a program memory (preferably a writable read-only memory (ROM) such as a flash ROM) and an input/output (I/O) controller coupled by a CPU bus.
  • the computer may optionally include a hard drive controller which is coupled to a hard disk and CPU bus. Hard disk may be used for storing application programs, such as the present invention, and data. Alternatively, application programs may be stored in RAM or ROM.
  • I/O controller is coupled by means of an I/O bus to an I/O interface.
  • I/O interface receives and transmits data in analog or digital form over communication links such as a serial link, local area network, wireless link, and parallel link.
  • a display, a keyboard and a pointing device may also be connected to I/O bus.
  • separate connections may be used for I/O interface, display, keyboard and pointing device.
  • Programmable processing system may be preprogrammed or it may be programmed (and reprogrammed) by downloading a program from another source (e.g., a floppy disk, CD-ROM, or another computer).
  • Each computer program is tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein.
  • the inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An object detector includes a bottom-up object hypotheses generation unit; a top-down object search with supervised descent unit; and an object re-localization unit with a localization model.

Description

  • This application claims priority to Provisional Application Ser. No. 62/014,787 filed Jun. 20, 2014, the content of which is incorporated by reference.
  • The present invention relates to object detection systems and methods.
  • BACKGROUND
  • Current object detection algorithms focus on robustly detecting the target object. Even when the detection window does not precisely overlap the object, the object detector can still respond with a high detection score. This conflicts with applications that require localization to be as accurate as possible.
  • Standard sliding window based object detection requires dense classifier evaluation on densely sampled locations in scale space in order to achieve an accurate localization. To avoid such dense evaluation, selective search based algorithms only evaluate the classifier on a small subset of object proposals. Notwithstanding the demonstrated success, object proposals do not guarantee perfect overlap with the object, leading to a suboptimal detection accuracy.
  • SUMMARY
  • In one aspect, an object detector includes a bottom-up object hypotheses generation unit; a top-down object search with supervised descent unit; and an object re-localization unit with a localization model.
  • In another aspect, a method takes advantage of the rich spatial information encoded in the Regionlets object detection model for location prediction. The method transfers the Regionlets feature extracted from the Regionlets model to a high dimensional sparse binary vector. This binary vector implicitly encodes thousands of object locations. Then the method learns a regression model based on the binary vector to predict the actual object location.
  • Implementations of the above aspects may include one or more of the following. The system first relaxes the dense sampling of the scale space with coarse object proposals generated from bottom-up segmentations. Based on detection results on these proposals, the system conducts a top-down search to more precisely localize the object using supervised descent. This two-stage detection strategy, dubbed location relaxation, is able to localize the object in the continuous parameter space. Furthermore, there is a conflict between accurate object detection and robust object detection, because achieving the latter requires accommodating inaccurate and perturbed object locations in the training phase. To address this conflict, the system and method leverage the rich spatial information learned from the Regionlets detection framework to determine where the object is precisely localized. Our proposed approaches are extensively validated on the PASCAL VOC 2007 dataset and a self-collected large-scale car dataset. Our method boosts the mean average precision of the current state of the art from 41.7% to 44.1% on the PASCAL VOC 2007 dataset. To the best of our knowledge, it is the best performance reported without using outside data.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIGS. 1A-1C show an exemplary process to perform accurate object detection with Location Relaxation and Regionlets Re-localization.
  • FIGS. 2A-2D show an illustration of an object detection framework.
  • FIG. 3 shows sample detection results on the PASCAL VOC 2007 dataset.
  • FIG. 4 shows an exemplary computer system to perform accurate object detection with Location Relaxation and Regionlets Re-localization.
  • DESCRIPTION
  • An object may appear at any location and scale in an image, within the continuous parameter space spanned by (x, y, s, a), where (x, y) is the object center point, and s and a are the scale and aspect ratio of the object. In particular, different aspect ratios generally correspond to different viewpoints, leaving a difficult open question for robust object detection.
  • FIG. 1 shows an exemplary process for detecting objects. The process includes receiving an input image, extracting features therefrom, applying an object detector, and re-localizing the object (100).
  • Given a testing image, the object detection framework extracts features from the image and then applies the learned object detector to each possible location to detect the object. A binary decision, i.e., whether the location contains an object, is made based on scores provided by the object detector. In addition to traditional approaches, a location regression step improves the localization of the object. The approach has two steps shown in FIGS. 2 and 3, respectively.
  • In FIG. 2, the system and method transfer the 1-D feature extracted from Regionlets into an 8-dimensional binary vector as shown in 101. These features have rich spatial information which helps to localize the object. We concatenate the binary vectors obtained from all Regionlets features for regression training.
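The transfer from 1-D features to a concatenated binary vector can be sketched as follows. This is a minimal illustration under stated assumptions: the patent says each feature is quantized into one of 8 values but does not give the bin boundaries, so uniform bins over an assumed range `[lo, hi)` are used here, and the names `quantize` and `binarize` are hypothetical.

```python
import numpy as np

NUM_BINS = 8  # each 1-D Regionlets feature maps to an 8-dimensional binary vector

def quantize(f, lo=0.0, hi=1.0):
    """Map a scalar feature to a bin index in 1..8 (uniform bins: an assumption)."""
    scaled = (f - lo) / (hi - lo) * NUM_BINS
    return int(np.clip(np.floor(scaled), 0, NUM_BINS - 1)) + 1

def binarize(features, lo=0.0, hi=1.0):
    """Concatenate one 8-dim one-hot vector per feature into a sparse binary vector."""
    vec = np.zeros(len(features) * NUM_BINS, dtype=np.uint8)
    for i, f in enumerate(features):
        vec[i * NUM_BINS + quantize(f, lo, hi) - 1] = 1
    return vec

# Three hypothetical feature values give a 24-dimensional vector with three ones.
r = binarize([0.05, 0.5, 0.99])
```

With N features the vector has 8N entries and exactly N of them are set, which is why the representation is both high dimensional and sparse.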
  • In FIG. 3, the system and method use least-squares learning to learn the coefficients for location regression based on the binary vector as shown in 102.
  • $\min_V \|V\|^2 + C \sum_{m=1}^{M} (\Delta L_m - V^T R_m)^2$ (102)
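The regularized least-squares objective in 102 has a closed-form minimizer, which the sketch below computes with NumPy. Setting the gradient to zero gives $(I + C R^T R) V = C R^T \Delta L$, where the rows of the matrix $R$ are the binary vectors $R_m$. The function names and the synthetic data are assumptions for illustration only; the patent does not state which solver is used, and one such regressor would be fitted for each of the four box coordinates.

```python
import numpy as np

def fit_location_regressor(R, delta_L, C=1.0):
    """Minimize ||V||^2 + C * sum_m (delta_L[m] - V^T R[m])^2 in closed form."""
    R = np.asarray(R, dtype=float)
    delta_L = np.asarray(delta_L, dtype=float)
    d = R.shape[1]
    A = np.eye(d) + C * (R.T @ R)   # positive definite, so the solve is well posed
    b = C * (R.T @ delta_L)
    return np.linalg.solve(A, b)

def objective(V, R, delta_L, C=1.0):
    """Value of the regularized least-squares objective at V."""
    residual = delta_L - R @ V
    return float(V @ V + C * (residual @ residual))

# Synthetic demo data (assumption): random sparse binary rows and noiseless targets.
rng = np.random.default_rng(0)
R_demo = (rng.random((200, 16)) < 0.125).astype(float)
V_true = rng.normal(size=16)
dL_demo = R_demo @ V_true

V_fit = fit_location_regressor(R_demo, dL_demo, C=100.0)
```

A larger `C` weights the squared error more heavily relative to the regularizer, so the fit follows the training targets more closely.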
  • The detection framework is capable of precisely searching for the object in a full parameter space with favorable efficiency. To achieve this goal, the system and method first relax dense sampling of the object location and scale, a step dubbed location relaxation, and only evaluate the detector at a much coarser set of locations and scales. For coarse detection windows which have a relatively high response, the system and method apply supervised descent search to find potential object hypotheses by simultaneously optimizing their center point, scale, and aspect ratio. The resulting detections are much improved by supervised descent search but are still not sufficiently accurate in terms of localization. For this, the system and method use Regionlets Re-localization, which is naturally built based on the quantized Regionlets features, to directly predict the true object location based on results from supervised descent search.
  • By applying an object detector to bottom-up object proposals, the system and method obtain coarse detections, i.e., the bounding boxes shown in FIG. 2(b). Among them, the red box is a relatively confident detection compared to the others. Through the supervised descent search starting from the red bounding box, a better detection is obtained, shown as the dashed box in FIG. 2(c). Finally, the system and method apply Regionlets Re-localization to determine the object location as shown in FIG. 2(d). We show some sample detection results on the PASCAL VOC 2007 dataset in FIG. 3.
  • The system has three aspects. Firstly, coarse detection plus supervised descent search in a fully parameterized location space for generic object detection shows promising performance. Secondly, a novel Regionlets Re-localization method complements the suboptimal object localization performance given by object detectors. Finally, our detection framework achieves the best performance on the PASCAL VOC 2007 dataset without using any outside data. It also demonstrates superior performance on our self-collected car dataset.
  • Our object detection framework is composed of three key components: bottom-up object hypotheses generation, top-down object search with supervised descent and object re-localization with a localization model.
  • There are several alternatives for obtaining object hypotheses, for example through the objectness measurement, saliency analysis, or their combination, or by using segmentation cues. Because our top-down search algorithm is applied locally, the system and method expect the bottom-up object hypotheses to split the object location space evenly, to avoid the search algorithm converging to the same local minimum. To this end, the system and method employ low-level segmentation to propose the object hypotheses. The superpixel segmentation merges similar pixels locally into disjoint sets, which perfectly matches our need. However, over-segments only provide small object candidates. To obtain object hypotheses for large objects, the over-segmented superpixels are gradually merged to produce larger candidates.
  • The detection with location relaxation takes coarse detection results from a detector applied on the bottom-up object proposals. Then it searches for the object location guided by a discriminatively learned descent model inspired by Xiong and De la Torre. The learned supervised descent model is used to predict the next, more accurate object location to explore based on observations from the current location. Although our method is applicable to any black-box object detector, the system and method use the Regionlets detector due to its outstanding performance and its flexibility to detect objects from any viewpoint.
  • All the detection results, including the original coarse detections as well as detections generated by supervised descent search, are fed to our Regionlets Re-localization process to more accurately locate the target objects.
  • To complement our top-down searching strategy, the system and method employ a segmentation based bottom-up scheme to generate our initial set of candidate searching locations. We start with over-segments (i.e., superpixels) of an image and then hierarchically group these small regions to generate object hypotheses. We generate superpixel segments. A segmented region ri is described by several characteristics, i.e., the size of the region (total number of pixels), color histograms, and the texture information (gradient orientation histograms). Four neighbor region similarities are defined based on these characteristics as shown in the following equations:
  • $$S_c(r_i, r_j) = \sum_{k=1}^{n} \min(c_i^k, c_j^k), \tag{1}$$
    $$S_s(r_i, r_j) = 1 - \frac{sz(r_i) + sz(r_j)}{sz(im)}, \tag{2}$$
    $$S_t(r_i, r_j) = \sum_{k=1}^{n} \min(t_i^k, t_j^k), \tag{3}$$
    $$S_f(r_i, r_j) = 1 - \frac{sz(bb_{ij}) - sz(r_i) - sz(r_j)}{sz(im)}. \tag{4}$$
  • where $c_i^k$ is the k-th dimension of the color histogram, $sz(r_i)$ is the number of pixels in image region $r_i$, $im$ stands for the whole image, $t_i^k$ is the k-th dimension of the texture histogram, and $bb_{ij}$ is the rectangular region which tightly bounds regions $r_i$ and $r_j$. $S_c$, $S_s$ and $S_t$ are the color, size and texture similarities, respectively. $S_f$ measures how well the two combined regions fill the rectangular bounding box which tightly bounds them. The similarity of two adjacent regions can be determined by any combination of the four similarities.
  • The two regions with the highest similarity under the chosen similarity measure are merged first, and this greedy process is repeated following an agglomerative clustering scheme. Each merging step produces a bounding box which bounds the two merged regions. In principle, the system and method want regions from the same object to be merged together. Each low-level cue contributes from its own aspect. For example, the color similarity measures the color intensity correlation between neighboring regions, which encourages regions similar in color to be merged together. The size similarity encourages small regions to merge first. The fill similarity encourages the bounding box to tightly bound the merged region. The texture similarity measures the similarity of appearance in gradient, which is complementary to color similarity. The usage of similarity measures and segmentation parameters is detailed in the experiment section.
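  • The greedy grouping described above can be sketched as follows. This is a minimal illustration only, not the implementation used in the experiments. The region dictionaries, the function names, the unweighted sum of the four similarities, and the size-weighted averaging of histograms on merge are all assumptions made for the example.

```python
import numpy as np

def bbox_union(a, b):
    """Tight box (x0, y0, x1, y1), inclusive, around two boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def bbox_area(bb):
    return (bb[2] - bb[0] + 1) * (bb[3] - bb[1] + 1)

def similarity(ri, rj, im_size):
    """Assumed combination: plain sum of Eqs. (1)-(4)."""
    s_c = np.minimum(ri["color"], rj["color"]).sum()          # Eq. (1)
    s_s = 1.0 - (ri["size"] + rj["size"]) / im_size           # Eq. (2)
    s_t = np.minimum(ri["texture"], rj["texture"]).sum()      # Eq. (3)
    bb = bbox_union(ri["bbox"], rj["bbox"])
    s_f = 1.0 - (bbox_area(bb) - ri["size"] - rj["size"]) / im_size  # Eq. (4)
    return s_c + s_s + s_t + s_f

def merge(ri, rj):
    """Merge two regions; histograms are size-weighted averages (assumption)."""
    size = ri["size"] + rj["size"]
    return {
        "size": size,
        "color": (ri["size"] * ri["color"] + rj["size"] * rj["color"]) / size,
        "texture": (ri["size"] * ri["texture"] + rj["size"] * rj["texture"]) / size,
        "bbox": bbox_union(ri["bbox"], rj["bbox"]),
    }

def hierarchical_grouping(regions, neighbors, im_size):
    """regions: {id: region}; neighbors: iterable of adjacent id pairs.
    Returns the bounding boxes of all initial and merged regions."""
    regions = dict(regions)
    neighbors = {tuple(sorted(p)) for p in neighbors}
    proposals = [r["bbox"] for r in regions.values()]
    next_id = max(regions) + 1
    while neighbors:
        # Greedily pick the most similar pair of adjacent regions.
        i, j = max(neighbors,
                   key=lambda p: similarity(regions[p[0]], regions[p[1]], im_size))
        regions[next_id] = merge(regions[i], regions[j])
        proposals.append(regions[next_id]["bbox"])
        # Neighbors of i or j become neighbors of the merged region.
        touching = {p for p in neighbors if i in p or j in p}
        neighbors -= touching
        for p in touching:
            for other in p:
                if other not in (i, j):
                    neighbors.add((other, next_id))
        del regions[i], regions[j]
        next_id += 1
    return proposals
```

  • With a connected adjacency graph of n superpixels, the loop performs n−1 merges, so the sketch returns 2n−1 candidate boxes, the last of which bounds all regions.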
  • Once the coarse object hypotheses are obtained, the system and method apply an object detector to determine relatively confident detections. The top-down supervised descent search is only applied to these confident detections.
  • Supervised descent is a general approach to optimize an objective function which is neither analytically differentiable nor practical to approximate numerically. It is well suited to vision problems in which visual features are involved in the objective function, because most visual features such as SIFT, HOG, and LBP histograms are not differentiable with respect to location. Instead of computing the descent direction from the gradient, supervised descent uses a large number of examples to train a regression model to predict the descent direction. The training process requires the features, which serve as the regressors, to form a fixed-length vector, while bottom-up segmentation naturally produces proposals of arbitrary size. To deal with this issue, the system and method normalize the bounding boxes to a fixed size. In the following, the system and method explain how supervised descent is adopted to find objects in a full parameter space.
  • Given an initial object hypothesis location $o_0=[x_0, y_0, s_0, a_0]^T$, which may not accurately bound the object, our objective is to use supervised descent to greedily adjust the bounding box by a local movement $\Delta o=[\Delta x, \Delta y, \Delta s, \Delta a]^T$, leading to a more accurate localization of the object. The goal of the supervised descent training process is hence to learn a sequence of K models to predict the optimal descent direction of the bounding box for each step of the supervised descent, where the number of supervised descent steps K is also automatically identified from the training process.
  • More specifically, denote by $\Phi(o_{k-1})$ the n-dimensional feature vector extracted from the bounding box defined by $o_{k-1}$ in step k−1 of the supervised descent process. The system and method learn an n×4 linear projection matrix $R_{k-1}=[r_{k-1}^x, r_{k-1}^y, r_{k-1}^s, r_{k-1}^a]^T$ and a four-dimensional bias vector $b_{k-1}=[b_{k-1}^x, b_{k-1}^y, b_{k-1}^s, b_{k-1}^a]^T$ so that the bounding box movement can be predicted as $\Delta o_k=R_{k-1}^T\Phi(o_{k-1})+b_{k-1}$ based on the location from step k−1. $\Phi(\cdot)$ denotes the extracted feature, which is the HOG and LBP histogram in our experiments.
  • We first explain the training process for the first supervised descent model, followed by details of training the subsequent models sequentially. Given a set of labeled ground truth object locations $\{o_*^i=(x_*^i, y_*^i, s_*^i, a_*^i)\}$, the system and method construct the starting locations $\{o_0^i=(x_0^i, y_0^i, s_0^i, a_0^i)\}$ of the object by applying a random perturbation to the ground truth while ensuring that each starting location still overlaps its ground truth. Training the projection matrix $R_0$ and the bias $b_0$ amounts to solving the following optimization problem:
  • $$\arg\min_{R_0, b_0} \sum_i \left\| \Delta o_{0*}^{i} - \Delta o_0^{i} \right\|^2, \tag{5}$$
  • where $\Delta o_{0*}^i=o_*^i-o_0^i$ is the true movement and $\Delta o_0^i=R_0^T\Phi(o_0^i)+b_0$ is the predicted displacement of the state vector. The optimal $R_0$ and $b_0$ are computed in closed form by a linear least squares method.
  • The subsequent $R_k$ and $b_k$ for k=1, 2, ... can be learned iteratively. At each iteration, the system and method update the locations using the previous model $R_{k-1}$ and $b_{k-1}$:

  • $$o_k^i = o_{k-1}^i + R_{k-1}^T \Phi(o_{k-1}^i) + b_{k-1}. \tag{6}$$
  • By updating $\Delta o_{k*}^i=o_*^i-o_k^i$ and $\Delta o_k^i=R_k^T\Phi(o_k^i)+b_k$, the optimal $R_k$ and $b_k$ can be learned from a new linear regression problem by minimizing
  • $$\arg\min_{R_k, b_k} \sum_i \left\| \Delta o_{k*}^{i} - \Delta o_k^{i} \right\|^2. \tag{7}$$
  • The error empirically decreases as more iterations are added. In our experiments, this training of supervised descent models often converged in 20-30 steps.
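  • The sequential training in Equations (5) to (7) can be sketched in NumPy as below. This is an illustration under stated assumptions: `phi` is a hypothetical callable returning a fixed-length feature for each box (the experiments use HOG and LBP histograms, which are not reproduced here), the function names are invented for the example, and the number of steps is passed in rather than identified automatically.

```python
import numpy as np

def fit_linear(F, D):
    """Closed-form least squares for D ~ F @ R + b.
    F: (m, n) features, D: (m, 4) target movements."""
    A = np.hstack([F, np.ones((F.shape[0], 1))])   # append a bias column
    W, *_ = np.linalg.lstsq(A, D, rcond=None)
    return W[:-1], W[-1]                           # R: (n, 4), b: (4,)

def train_supervised_descent(phi, o_star, o_init, num_steps):
    """Learn a sequence of (R_k, b_k) models.
    phi(o) maps boxes of shape (m, 4) to features of shape (m, n)."""
    models, o = [], o_init.copy()
    for _ in range(num_steps):
        F = phi(o)
        R, b = fit_linear(F, o_star - o)           # Eq. (5) for k=0, Eq. (7) after
        models.append((R, b))
        o = o + F @ R + b                          # Eq. (6): move the boxes
    return models

def apply_supervised_descent(phi, models, o):
    """Run the learned descent steps from starting boxes o of shape (m, 4)."""
    for R, b in models:
        o = o + phi(o) @ R + b
    return o
```

  • Because each least squares fit includes a bias term, the zero update is always a feasible solution, so the summed squared training error cannot increase from one step to the next in this sketch.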
  • Given a testing image, the system and method first apply the cascade Regionlets detector [23] to the coarse bottom-up object candidates. Object hypotheses which produce high detection scores are fed to the iterative supervised descent search process to perform local search. New locations output by the supervised descent search are re-evaluated by the object detector to obtain their detection scores. By ranking all the detection scores from the searched locations, the system and method keep the most confident detections.
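  • The test-time flow in the preceding paragraph can be outlined as below. This is a schematic sketch: `score_fn` stands in for the detector, `refine_fn` for the learned supervised descent search, and `score_threshold` and `keep` are hypothetical parameters that the text does not specify.

```python
def detect_with_search(candidates, score_fn, refine_fn, score_threshold, keep):
    """Score coarse candidates, refine the confident ones, re-score, and rank.

    candidates: list of boxes; score_fn(box) -> float; refine_fn(box) -> box.
    Returns the `keep` highest-scoring (score, box) pairs.
    """
    scored = [(score_fn(c), c) for c in candidates]
    # Only relatively confident coarse detections seed the local search.
    confident = [c for s, c in scored if s >= score_threshold]
    for c in confident:
        new_c = refine_fn(c)
        # Searched locations are re-evaluated by the detector.
        scored.append((score_fn(new_c), new_c))
    scored.sort(key=lambda sc: sc[0], reverse=True)
    return scored[:keep]
```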
  • The supervised descent search introduced in the previous subsection significantly improves the detection rate by scanning more predicted object candidates. In this section, the system and method assume the object has already been detected, but with imperfect localization. To further improve the object detection system, the system and method train a model specifically for object localization, taking advantage of features extracted from the Regionlets detection model.
  • The Regionlets detector is composed of thousands of weak classifiers learned with RealBoost. These weak classifiers are organized into several cascades for early rejection, yielding fast object detection. The cascade structure is not related to our re-localization approach and is omitted from the following presentation without loss of clarity. The input of each weak classifier in the Regionlets model is a 1-D feature extracted from a rectangular region in the detection window. In the training process, these 1-D features are greedily chosen to minimize the logistic loss over all training samples, which is based on classification errors.
  • Not only does the Regionlets training process greedily select discriminative visual appearances, but it also determines the spatial regions from which to extract the 1-D features. Thus the resulting weak features extracted from regionlets implicitly encode thousands of spatial locations, which could be used to further predict the precise location of an object. It is worth noting that the detector learning only targets minimizing the classification error, which does not guarantee that the localization error is also minimized at the same time.
  • To leverage the rich spatial information encoded in the Regionlets model, the system and method let each Regionlet vote for the object's position. Given the object location (l, t, r, b) detected by the object detector, where (l, t, r, b) represents the object's left, top, right and bottom coordinates, respectively, the problem is equivalent to predicting the localization error $(\Delta l_n, \Delta t_n, \Delta r_n, \Delta b_n)$ of the current detection so that the true object location is computed as:

  • $$l^* = l + w\,\Delta l_n, \quad t^* = t + h\,\Delta t_n, \quad r^* = r + w\,\Delta r_n, \quad b^* = b + h\,\Delta b_n. \tag{8}$$
  • Here $(l^*, t^*, r^*, b^*)$ is the ground truth object location, and (l, t, r, b) is the bounding box detected with the Regionlets model. w=r−l+1 and h=b−t+1 are the width and height of the detected bounding box, respectively. $(\Delta l_n, \Delta t_n, \Delta r_n, \Delta b_n)$ are the relative localization errors between the ground truth and the current detection, normalized by the width and height of the detected object. Because detections from the Regionlets model have various sizes, the system and method observe that normalizing the displacement errors is critical to stabilize training and prediction.
  • Training the localization model amounts to learning a vector V so that the system and method can predict the localization error as $\Delta L=V^TR$, where $\Delta L$ is one of $\Delta l_n$, $\Delta t_n$, $\Delta r_n$, or $\Delta b_n$, and R is the feature extracted from the regionlets. We minimize the squared localization error in the model training phase. More specifically, the system and method solve a support vector regression problem for each of the four coordinates separately:
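  • As a small worked illustration of Equation (8), the snippet below computes the normalized localization errors used as regression targets and applies predicted errors to a detection. The function names are chosen for this example only.

```python
def localization_targets(det, gt):
    """Normalized errors (dl_n, dt_n, dr_n, db_n) between a detection and ground truth.
    Boxes are (l, t, r, b) with inclusive pixel coordinates."""
    l, t, r, b = det
    w, h = r - l + 1, b - t + 1
    return ((gt[0] - l) / w, (gt[1] - t) / h, (gt[2] - r) / w, (gt[3] - b) / h)

def relocalize(det, offsets):
    """Apply Eq. (8): shift each side by the predicted error scaled by w or h."""
    l, t, r, b = det
    w, h = r - l + 1, b - t + 1
    dl, dt, dr, db = offsets
    return (l + w * dl, t + h * dt, r + w * dr, b + h * db)
```

  • For a detection (10, 20, 109, 69), w=100 and h=50, so a ground truth box of (15, 25, 119, 64) corresponds to normalized errors of (0.05, 0.1, 0.1, −0.1).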
  • $$\min_V \left\{ \|V\|^2 + C \sum_{m=1}^{M} \max\left(0, \left|\Delta L_m - V^T R_m\right| - \varepsilon\right)^2 \right\}, \tag{9}$$
  • where V is the coefficient vector to be learned, $\Delta L_m$ is the normalized localization error of training sample m, $R_m$ is the feature extracted from all the Regionlets in the object detection model for the m-th sample as explained in the following, and M is the total number of training examples. The first term in Equation (9) is the regularization term, C is a trade-off factor between the regularization and the sum of squared errors, and ε is the tolerance factor. The problem can be effectively solved using the publicly available liblinear package.
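  • To make the objective concrete, the sketch below evaluates the squared ε-insensitive loss with regularization and minimizes it by plain gradient descent. It is a stand-in written for illustration and is not the liblinear solver mentioned above. The function names, default parameter values, and the fixed step size derived from a smoothness bound are assumptions of this example.

```python
import numpy as np

def svr_objective(V, R, dL, C, eps):
    """||V||^2 + C * sum(max(0, |dL - R V| - eps)^2).
    R: (M, d) features, dL: (M,) normalized localization errors."""
    slack = np.maximum(0.0, np.abs(dL - R @ V) - eps)
    return V @ V + C * np.sum(slack ** 2)

def fit_svr(R, dL, C=1.0, eps=0.01, iters=500):
    """Gradient descent on the objective above, starting from V = 0."""
    V = np.zeros(R.shape[1])
    # The objective is smooth with constant at most 2 + 2*C*||R||_2^2,
    # so a step of 1/L never increases the objective.
    step = 1.0 / (2.0 + 2.0 * C * np.linalg.norm(R, 2) ** 2)
    for _ in range(iters):
        err = dL - R @ V
        slack = np.maximum(0.0, np.abs(err) - eps)
        grad = 2.0 * V - 2.0 * C * (R.T @ (slack * np.sign(err)))
        V -= step * grad
    return V
```

  • One such model would be fitted per coordinate, i.e., four calls with the targets $\Delta l_n$, $\Delta t_n$, $\Delta r_n$ and $\Delta b_n$ in turn.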
  • The feature R is extracted from the discriminatively learned Regionlets detection model. However, directly applying Regionlets features produces poor performance. Based on the weak classifier learned on each Regionlets feature, the system and method transform the 1-D Regionlet feature into a sparse binary vector. Each Regionlets weak classifier is a piece-wise linear function implemented using a lookup table:
  • $$h_i = \sum_{j=1}^{8} w_{i,j}\, \delta\left(Q(f_i) - j\right), \tag{10}$$
  • where $f_i$ is the 1-D feature extracted from a group of regionlets, and $Q(f_i)$ quantizes the feature $f_i$ into an integer from 1 to 8. $\delta(x)=1$ when x=0 and 0 otherwise. $\{w_{i,j}\}_{j=1}^{8}$ are the classifier weights learned in the boosting training process. We transform $Q(f_i)$ into an 8-dimensional binary vector r, where the j-th dimension is computed as $r(j)=\mathbf{1}(Q(f_i)=j)$, and $\mathbf{1}(\cdot)$ is the indicator function. By construction, there is one and only one nonzero dimension in r. Note that the Regionlets object detector is a combination of N weak classifiers:
  • $$H = \sum_{i=1}^{N} h_i. \tag{11}$$
  • Thus, by concatenating these binary vectors from all weak classifiers, the detection model naturally produces 8N-dimensional sparse vectors, denoted as $R=(r_1^T, r_2^T, \ldots, r_N^T)^T$. This vector serves as the feature vector $R_m$ in Equation (9). Intuitively, each Regionlets feature $f_i$ has 8 options to vote for the actual object location, depending on the binarized feature vector $r_i$. Learning the weight vector V in Equation (9) jointly determines the votes (regression coefficients) in the 8 different scenarios for all Regionlets features.
  • The sparse binary features extracted from regionlets are very high dimensional. We observed significant over-fitting if there are not enough training samples. To avoid over-fitting during training, the system and method randomly sample 80k bounding boxes around ground truth objects to train the localization model.
  • The supervised descent search is designed to search more object candidates in a principled way to increase the detection rate, and a subsequent discriminative visual model (the Regionlets detector) is required to determine the detection scores of the new locations. Regionlets Re-localization is used only to predict the accurate object location. No detector is applied afterwards to evaluate the new location, as is done in the supervised descent search. Thus it adjusts the detection to a more precise location without changing the detection score. In contrast, using the object detector to re-evaluate the detection score decreases the performance, because the newly predicted location usually gives a lower detection score, which causes the predicted location to be eliminated in the subsequent non-max suppression process. To summarize, the role of supervised descent search is to find objects based on detections with coarse locations. Regionlets Re-localization is conducted on the fine detections from supervised descent search. It aims at further improving localization accuracy, starting from the reasonably good localizations produced by supervised descent search. Leaving out either of these two schemes would significantly hurt the detection performance according to our observation.
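  • The binarization can be illustrated with the short sketch below. The uniform quantizer over [0, 1] is a hypothetical stand-in for Q, whose actual bin boundaries are not given in the text, and the function names are chosen for this example.

```python
import numpy as np

NUM_BINS = 8

def quantize(features, lo=0.0, hi=1.0):
    """Hypothetical Q: map each 1-D feature in [lo, hi] to a bin index 1..8."""
    f = np.asarray(features, dtype=float)
    idx = np.floor((f - lo) / (hi - lo) * NUM_BINS).astype(int) + 1
    return np.clip(idx, 1, NUM_BINS)

def binarize(features):
    """Concatenate the one-hot bin indicators of N features into an 8N vector R."""
    q = quantize(features)
    R = np.zeros(len(q) * NUM_BINS)
    R[np.arange(len(q)) * NUM_BINS + (q - 1)] = 1.0
    return R

def detector_score(features, weights):
    """Eqs. (10)-(11): H = sum_i w[i, Q(f_i)], with weights of shape (N, 8)."""
    q = quantize(features)
    return weights[np.arange(len(q)), q - 1].sum()
```

  • Under this encoding, the detector score is the inner product of the flattened lookup-table weights with R, and the localization model reuses the same R with its own learned coefficients V in place of the boosting weights.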
  • By way of example, a block diagram of a computer to support the system is discussed next. The computer preferably includes a processor, random access memory (RAM), a program memory (preferably a writable read-only memory (ROM) such as a flash ROM) and an input/output (I/O) controller coupled by a CPU bus. The computer may optionally include a hard drive controller which is coupled to a hard disk and CPU bus. Hard disk may be used for storing application programs, such as the present invention, and data. Alternatively, application programs may be stored in RAM or ROM. I/O controller is coupled by means of an I/O bus to an I/O interface. I/O interface receives and transmits data in analog or digital form over communication links such as a serial link, local area network, wireless link, and parallel link. Optionally, a display, a keyboard and a pointing device (mouse) may also be connected to I/O bus. Alternatively, separate connections (separate buses) may be used for I/O interface, display, keyboard and pointing device. Programmable processing system may be preprogrammed or it may be programmed (and reprogrammed) by downloading a program from another source (e.g., a floppy disk, CD-ROM, or another computer).
  • Each computer program is tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein. The inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.
  • The invention has been described herein in considerable detail in order to comply with the patent Statutes and to provide those skilled in the art with the information needed to apply the novel principles and to construct and use such specialized components as are required. However, it is to be understood that the invention can be carried out by specifically different equipment and devices, and that various modifications, both as to the equipment details and operating procedures, can be accomplished without departing from the scope of the invention itself.

Claims (18)

What is claimed is:
1. An object detector, comprising:
a bottom-up object hypotheses generation unit;
a top-down object search with supervised descent unit; and
an object re-localization unit with a localization model.
2. The detector of claim 1, comprising a feature extractor that extracts features from the image, and a learning module to train the object detector, wherein the learned object detector is applied to each possible location to detect the object.
3. The detector of claim 1, wherein the object detector makes a binary decision on whether the location presents an object based on scores provided by the object detector.
4. The detector of claim 1, comprising a location regression module to improve localization of the object.
5. The detector of claim 1, comprising Regionlets having extracted features into a multi-dimensional binary vector and wherein binary vectors obtained from all Regionlets features are concatenated for regression training.
6. The detector of claim 1, comprising a least square learning module to learn the coefficients for location regression based on the binary vector.
7. The detector of claim 1, comprising determining
$$\min_V \left\{ \|V\|^2 + C \sum_{m=1}^{M} \max\left(0, \left|\Delta L_m - V^T R_m\right| - \varepsilon\right)^2 \right\},$$
where V is a coefficient vector to be learned, ΔLm is a normalized localization error of training sample m, Rm is a feature extracted from all the Regionlets in an object detection model for the m th sample as explained in the following, M is the total number of training examples, C is a trade-off factor between a regularization and the sum of squared error, ε is a tolerance factor.
8. The detector of claim 1, wherein the object hypotheses are formed through objectness measurement, saliency analysis or their combinations, or segmentation cues.
9. The detector of claim 1, wherein the bottom-up object hypotheses generation unit splits the object location space evenly to avoid the search algorithm converging to the same local minimum.
10. The detector of claim 1, comprising a low-level segmentation unit to propose the object hypotheses.
11. The detector of claim 1, comprising a superpixel segmentation unit to merge similar pixels locally into disjoint sets.
12. The detector of claim 1, wherein over segmented superpixels are gradually merged to produce larger candidates.
13. The detector of claim 1, wherein the detection with location relaxation takes coarse detection results and searches the object location guided by a discriminatively learned descent model.
14. The detector of claim 1, wherein a learned supervised descent model is used to predict the next more accurate object location to explore based on observations from the current location.
15. The detector of claim 1, comprising a segmentation based bottom-up module to generate an initial set of candidate searching locations.
16. The detector of claim 1, comprising a module to receive over-segments or superpixels of an image and then hierarchically group these small regions to generate object hypotheses and to generate superpixel segments.
17. The detector of claim 16, wherein a segmented region ri is described by a plurality of characteristics including size of the region (total number of pixels), color histograms, and texture information or gradient orientation histograms.
18. The detector of claim 17, wherein four neighbor region similarities are defined based on these characteristics as shown in the following equations:
$$S_c(r_i, r_j) = \sum_{k=1}^{n} \min(c_i^k, c_j^k),$$
where $c_i^k$ is the k-th dimension of the color histogram, $sz(r_i)$ is the number of pixels in image region $r_i$, $im$ stands for the whole image, $t_i^k$ is the k-th dimension of the texture histogram, and $bb_{ij}$ is the rectangular region which tightly bounds regions $r_i$ and $r_j$; $S_c$, $S_s$ and $S_t$ are the color, size and texture similarities, respectively; and $S_f$ measures how the combined two regions will occupy the rectangular bounding box which tightly bounds them.
US14/716,435 2014-06-20 2015-05-19 Object detection with Regionlets re-localization Active US9235904B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/716,435 US9235904B1 (en) 2014-06-20 2015-05-19 Object detection with Regionlets re-localization

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201462014787P 2014-06-20 2014-06-20
US14/716,435 US9235904B1 (en) 2014-06-20 2015-05-19 Object detection with Regionlets re-localization

Publications (2)

Publication Number Publication Date
US20150371397A1 true US20150371397A1 (en) 2015-12-24
US9235904B1 US9235904B1 (en) 2016-01-12

Family

ID=54870114

Family Applications (1)

Application Number Title Priority Date Filing Date
US14/716,435 Active US9235904B1 (en) 2014-06-20 2015-05-19 Object detection with Regionlets re-localization

Country Status (1)

Country Link
US (1) US9235904B1 (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170140231A1 (en) * 2015-11-13 2017-05-18 Honda Motor Co., Ltd. Method and system for moving object detection with single camera
US9881380B2 (en) * 2016-02-16 2018-01-30 Disney Enterprises, Inc. Methods and systems of performing video object segmentation
US10165258B2 (en) * 2016-04-06 2018-12-25 Facebook, Inc. Efficient determination of optical flow between images
CN110268440A (en) * 2017-03-14 2019-09-20 欧姆龙株式会社 Image analysis apparatus, method for analyzing image and image analysis program
CN110675453A (en) * 2019-10-16 2020-01-10 北京天睿空间科技股份有限公司 Self-positioning method for moving target in known scene
US10540771B2 (en) * 2015-03-20 2020-01-21 Ventana Medical Systems, Inc. System and method for image segmentation
US10783409B2 (en) 2016-09-19 2020-09-22 Adobe Inc. Font replacement based on visual similarity
US10950017B2 (en) 2019-07-08 2021-03-16 Adobe Inc. Glyph weight modification
US10984295B2 (en) 2015-10-06 2021-04-20 Adobe Inc. Font recognition using text localization
US11107231B2 (en) * 2017-03-22 2021-08-31 Nec Corporation Object detection device, object detection method, and object detection program
US11295181B2 (en) 2019-10-17 2022-04-05 Adobe Inc. Preserving document design using font synthesis
CN114526682A (en) * 2022-01-13 2022-05-24 华南理工大学 Deformation measurement method based on image feature enhanced digital volume image correlation method
US11544507B2 (en) * 2018-10-17 2023-01-03 Samsung Electronics Co., Ltd. Method and apparatus to train image recognition model, and image recognition method and apparatus

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1416727A1 (en) * 2002-10-29 2004-05-06 Accenture Global Services GmbH Moving virtual advertising
US7616807B2 (en) * 2005-02-24 2009-11-10 Siemens Corporate Research, Inc. System and method for using texture landmarks for improved markerless tracking in augmented reality applications
FR2957164B1 (en) * 2010-03-03 2012-05-11 Airbus Operations Sas METHODS AND DEVICES FOR CONFIDURING VALIDATION OF A COMPLEX MULTICLEUM SYSTEM

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10540771B2 (en) * 2015-03-20 2020-01-21 Ventana Medical Systems, Inc. System and method for image segmentation
US10984295B2 (en) 2015-10-06 2021-04-20 Adobe Inc. Font recognition using text localization
US10176390B2 (en) * 2015-11-13 2019-01-08 Honda Motor Co., Ltd. Method and system for moving object detection with single camera
US20170140231A1 (en) * 2015-11-13 2017-05-18 Honda Motor Co., Ltd. Method and system for moving object detection with single camera
US10019637B2 (en) * 2015-11-13 2018-07-10 Honda Motor Co., Ltd. Method and system for moving object detection with single camera
US9881380B2 (en) * 2016-02-16 2018-01-30 Disney Enterprises, Inc. Methods and systems of performing video object segmentation
US10257501B2 (en) 2016-04-06 2019-04-09 Facebook, Inc. Efficient canvas view generation from intermediate views
US10165258B2 (en) * 2016-04-06 2018-12-25 Facebook, Inc. Efficient determination of optical flow between images
US10783409B2 (en) 2016-09-19 2020-09-22 Adobe Inc. Font replacement based on visual similarity
CN110268440A (en) * 2017-03-14 2019-09-20 欧姆龙株式会社 Image analysis apparatus, method for analyzing image and image analysis program
US11188781B2 (en) * 2017-03-14 2021-11-30 Omron Corporation Image analyzer, image analysis method, and image analysis program
US11107231B2 (en) * 2017-03-22 2021-08-31 Nec Corporation Object detection device, object detection method, and object detection program
US11544507B2 (en) * 2018-10-17 2023-01-03 Samsung Electronics Co., Ltd. Method and apparatus to train image recognition model, and image recognition method and apparatus
US11403794B2 (en) 2019-07-08 2022-08-02 Adobe Inc. Glyph weight modification
US10950017B2 (en) 2019-07-08 2021-03-16 Adobe Inc. Glyph weight modification
CN110675453A (en) * 2019-10-16 2020-01-10 北京天睿空间科技股份有限公司 Self-positioning method for moving target in known scene
US11295181B2 (en) 2019-10-17 2022-04-05 Adobe Inc. Preserving document design using font synthesis
US11710262B2 (en) 2019-10-17 2023-07-25 Adobe Inc. Preserving document design using font synthesis
CN114526682A (en) * 2022-01-13 2022-05-24 华南理工大学 Deformation measurement method based on image feature enhanced digital volume image correlation method

Also Published As

Publication number Publication date
US9235904B1 (en) 2016-01-12

Legal Events

Date Code Title Description
STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: NEC CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NEC LABORATORIES AMERICA, INC.;REEL/FRAME:037941/0595

Effective date: 20160309

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8