US8824548B2 - Object detecting with 1D range sensors - Google Patents
- Publication number
- US8824548B2 (granted from application US13/092,408 / US201113092408A)
- Authority
- US
- United States
- Prior art keywords
- sequence
- image
- classifier
- labels
- scanner
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
All listed classifications fall under H—ELECTRICITY › H04—ELECTRIC COMMUNICATION TECHNIQUE › H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION › H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals:
- H04N19/615—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding using motion compensated temporal filtering [MCTF]
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
- H04N19/46—Embedding additional information in the video signal during the compression process
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
- H04N19/63—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding using sub-band based transform, e.g. wavelets
- H04N19/13—Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
Definitions
- This invention relates generally to image processing, and more particularly to classifying objects using range scanners in computer vision applications.
- Object classification is widely used in computer vision applications. While most common applications use 2D camera images, there is a need for accurate classification methods for 3D range data. For example, the objects can be parts moving on an assembly line.
- Object classification can use several types of data acquisition techniques, such as inductive loop detectors, video detectors, acoustic detectors, range sensors, and infrared detectors.
- One system uses a laser sensor that outputs range and intensity information for object detection and classification.
- The embodiments of the invention provide a method and system for classifying objects based on maximum-margin classification and discriminative probabilistic sequential modeling of range data acquired by a scanner with a set of one or more 1D laser line scanners.
- The method includes pre-processing and classification phases. Different techniques, such as median filtering, background and foreground detection, 3D reconstruction, and object prior information, are used during the pre-processing steps to denoise the range data and extract the most discriminative features. Then, a classifier is trained.
- The classifier is composed of an appearance classifier, a sequence classifier with different inference techniques, and state machine enforcement.
- FIG. 1 is a block diagram of object classification according to embodiments of the invention.
- FIG. 2 is a schematic of a scanner with 1D laser line scanners according to embodiments of the invention.
- FIG. 1 shows a system and method for classifying an object 80 according to embodiments of our invention.
- Range data 101 are acquired by a scanner 90 from the object 80 as input to the method.
- The scanner 90 includes a 1D laser line sensor.
- The scanner is arranged on a pole 202 near the object to be identified. It is understood that the invention can be worked with just one sensor.
- FIG. 2 also shows the field of view 203 for each sensor.
- The sensor acquires one or more side views of the object.
- The 1D (line) measurements of the range data are accumulated over time, and a 2D image of the range profile of the object is constructed.
- The 2D range image is used for object type classification.
- The output is a class 109 of the object.
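The accumulation of successive 1D line measurements into a 2D range image can be sketched as follows. This is an illustrative stand-in, not the patent's implementation; the function name is hypothetical.

```python
import numpy as np

def accumulate_scans(scan_stream):
    """Stack successive 1D range profiles (one per time step) into a 2D
    range image: rows are positions along the laser line, columns are time."""
    columns = [np.asarray(scan, dtype=float) for scan in scan_stream]
    # Each column is one 1D laser-line measurement; stacking over time
    # yields the 2D range profile used for classification.
    return np.stack(columns, axis=1)

# Three consecutive 4-sample line scans yield a 4x3 range image.
range_image = accumulate_scans([[1.0, 1.2, 1.1, 0.9],
                                [1.0, 1.3, 1.0, 0.9],
                                [1.1, 1.2, 1.0, 0.8]])
```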
- The method includes a preprocessing phase and a classifying phase.
- During preprocessing, we denoise 110 the range data, remove 120 irrelevant background information, 3D project 130 the remaining foreground pixels using the range information and sensor scanning geometries, correct 140 the range, and extract features 155.
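A minimal sketch of the denoising and foreground-extraction steps, assuming a background depth learned beforehand. The function and parameter names are hypothetical, and a simple vertical median filter stands in for the patent's median filtering step.

```python
import numpy as np

def preprocess(range_image, background_depth, noise_tol=0.05, win=3):
    """Median-filter a 2D range image along each scan line, then keep only
    pixels whose depth departs from the static background."""
    half = win // 2
    padded = np.pad(range_image, ((half, half), (0, 0)), mode='edge')
    filtered = np.empty_like(range_image)
    for r in range(range_image.shape[0]):
        # Vertical median over `win` neighbouring samples suppresses speckle.
        filtered[r] = np.median(padded[r:r + win], axis=0)
    # Foreground mask: pixels that deviate from the background depth.
    foreground = np.abs(filtered - background_depth) > noise_tol
    return filtered, foreground

# One scan column at background depth 2.0 with an object (depth 1.0) in rows 1-3.
col = np.array([[2.0], [1.0], [1.0], [1.0], [2.0]])
filt, fg = preprocess(col, background_depth=2.0)
```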
- During classification 170, we use the outputs of an appearance classifier, such as a multi-class support vector machine (SVM), as features for a sequence classifier, such as a conditional random field (CRF), to obtain initial class labels, enforce 180 the object structure using discriminative properties of objects, feature attributes, and the sequential structure, and finally obtain the object class 109.
- SVM multi-class support vector machine
- CRF conditional random field
- The noise level of the measurements changes based on the surface reflectance. For example, a black object can result in noisy measurements.
- Each column of measurements comes from a vertical line in 3D space.
- Different lines of scans can have different depth values (for example, a pole and a body can be at different depths).
- Classification is performed in the following steps. First, the height features are classified by the appearance classifier 160, and the appearance classification output is denoised using the sequence classifier 170. This approach is highly accurate because it benefits from both the maximum-margin nature of the appearance classifier, such as the SVM, and the power of a discriminative probabilistic sequential model, such as the CRF. Finally, we apply structure enforcement using a finite state machine to prevent invalid predictions, e.g., an object with only a single tire.
- The multi-class max-margin classifier (SVM) assigns initial labels to each time step of the image sequence.
- The sequential structure of the data is not taken into account during learning in this step, except for the windowing procedure in feature extraction.
- The SVM takes the 70×11-dimensional height feature described above and labels each feature as either object, body, tire, or pole.
- The window of length 11 is shifted along the time axis, and each column of the range data is classified in that manner during testing.
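The sliding-window labeling can be sketched as below. The multi-class SVM is abstracted into a pluggable `classifier` callable (a trained SVM's predict function could be plugged in); here a trivial mean-height rule stands in for it, and all names are illustrative.

```python
import numpy as np

def classify_columns(range_image, classifier, win=11):
    """Label every column of a 2D range image by classifying the window of
    `win` columns centred on it (mirroring the 70x11 height feature)."""
    half = win // 2
    padded = np.pad(range_image, ((0, 0), (half, half)), mode='edge')
    labels = []
    for t in range(range_image.shape[1]):
        feature = padded[:, t:t + win].ravel()  # flatten window to one vector
        labels.append(classifier(feature))
    return labels

# Stand-in classifier: a real system would plug in a trained multi-class SVM.
rule = lambda f: 'body' if f.mean() > 0.5 else 'background'
img = np.hstack([np.zeros((4, 5)), np.ones((4, 5))])
labels = classify_columns(img, rule, win=3)
```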
- The SVM assigns initial labels but does not consider the sequential structure of the object. Therefore, we use a CRF as an additional layer to exploit the sequential correlations between time steps. This stage acts as a denoiser on the predictions of the SVM, removing inconsistencies.
- MEMM Maximum Entropy Markov Model
- An inference process labels a test sequence.
- Accurately predicting the whole label sequence is very difficult, so individual predictions are used. This is achieved by predicting y_{i,t} from a marginal distribution p(y_{i,t} | x_i) using a dynamic-programming forward-backward procedure, with the forward recursion
- α_t(j) ∝ Σ_i Ψ(j, i, x_t) α_{t−1}(i), where α_t(j) are the forward variables.
- The backward recursion is
- β_t(i) ∝ Σ_j Ψ(j, i, x_{t+1}) β_{t+1}(j), where β_t(i) are the backward variables, from which the marginal probabilities can be determined.
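The two recursions can be implemented directly. Below is a sketch in which `psi[t, j, i]` plays the role of Ψ(j, i, x_t); the per-step renormalization is an added numerical-stability detail not specified in the text, and the uniform initial-state prior is an assumption.

```python
import numpy as np

def forward_backward(psi):
    """Per-step marginal state probabilities from pairwise potentials
    psi[t, j, i] = Psi(j, i, x_t)."""
    T, S, _ = psi.shape
    alpha = np.zeros((T, S))
    beta = np.zeros((T, S))
    alpha[0] = psi[0].sum(axis=1)             # uniform prior over the initial state
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = psi[t] @ alpha[t - 1]      # alpha_t(j) ∝ Σ_i Ψ(j,i,x_t) alpha_{t-1}(i)
        alpha[t] /= alpha[t].sum()            # rescale to avoid underflow
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = psi[t + 1].T @ beta[t + 1]  # beta_t(i) ∝ Σ_j Ψ(j,i,x_{t+1}) beta_{t+1}(j)
        beta[t] /= beta[t].sum()
    marginals = alpha * beta                  # p(y_t = i | x) ∝ alpha_t(i) beta_t(i)
    return marginals / marginals.sum(axis=1, keepdims=True)

# Potentials that favour state 0 at every step.
psi = np.ones((3, 2, 2))
psi[:, 0, :] = 2.0
marginals = forward_backward(psi)
```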
- The final step of classification is the enforcement of object constraints.
- This module takes the output of the CRF. If the labels do not correspond to a valid object, that is, the labels are not accepted by some finite state machine, we convert the labels to those of the most similar valid object model defined in an object grammar. If the CRF result is valid, no correction is needed; this is the case for the great majority of objects.
- The process is an error-correcting regular grammar parser.
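A toy version of this error-correcting step, with a hypothetical object grammar encoded as a regular expression over per-column labels (B = body, T = tire). The real grammar is defined by the patent's finite state machine, so both the grammar and the Hamming-style distance used here are illustrative assumptions.

```python
import re

# Hypothetical grammar: a valid object is body runs interleaved with at
# least two tire runs (so "an object with only a single tire" is invalid).
VALID = re.compile(r'^B+T+B+(?:T+B+)+$')

def enforce_structure(labels, valid_models):
    """Keep a CRF labeling that parses as a valid object; otherwise
    replace it with the most similar valid object model."""
    seq = ''.join(labels)
    if VALID.fullmatch(seq):
        return seq
    # Error correction: choose the valid model with the fewest mismatches.
    def distance(a, b):
        return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))
    return min(valid_models, key=lambda m: distance(seq, m))

models = ['BTBTB', 'BBTBBTB']
ok = enforce_structure(list('BTBTB'), models)     # already valid: unchanged
fixed = enforce_structure(list('BBTBB'), models)  # single tire: corrected
```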
Abstract
Description
The training set is {(x_i, y_i)}_{i=1}^N, where x_i = x_{i,1}, x_{i,2}, . . . , x_{i,T_i} is the observation sequence and y_i = y_{i,1}, y_{i,2}, . . . , y_{i,T_i} is the label sequence.

The CRF models the conditional distribution

p(y | x) = (1/Z(x)) Π_t Ψ(y_t, y_{t−1}, x_t),

where Ψ(y_t, y_{t−1}, x_t) = exp(Σ_j λ_j g_j(y_{t−1}, y_t) + Σ_k μ_k f_k(y_t, x_t)) is the potential function, g_j is the transition feature function from state y_{t−1} to state y_t, and f_k is the state feature function at state y_t; λ_j and μ_k are the parameters estimated in the learning process, and Z(x) is the normalization factor as a function of the observation sequence. Maximum likelihood parameter estimation of the above exponential-family distribution corresponds to the maximum entropy solution.

The most likely label sequence follows the recursion

δ_t(j) = max_i Ψ(j, i, x_t) δ_{t−1}(i),

which propagates the most likely path based on the max-product rule. However, in many applications, accurately predicting the whole label sequence is very difficult, so individual predictions are used. This is achieved by predicting y_{i,t} from a marginal distribution p(y_{i,t} | x_i) using a dynamic-programming forward-backward procedure,

α_t(j) ∝ Σ_i Ψ(j, i, x_t) α_{t−1}(i),

where α_t(j) are the forward variables. The backward recursion is

β_t(i) ∝ Σ_j Ψ(j, i, x_{t+1}) β_{t+1}(j),

where β_t(i) are the backward variables, from which the marginal probabilities can be determined.
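The max-product recursion for the most likely path can be sketched as follows; `psi[t, j, i]` again stands for Ψ(j, i, x_t), and the per-step rescaling is an added numerical-stability detail, not part of the text.

```python
import numpy as np

def viterbi(psi):
    """Most likely state path under delta_t(j) = max_i Psi(j,i,x_t) delta_{t-1}(i)."""
    T, S, _ = psi.shape
    delta = np.zeros((T, S))
    back = np.zeros((T, S), dtype=int)
    delta[0] = psi[0].max(axis=1)
    for t in range(1, T):
        scores = psi[t] * delta[t - 1]   # scores[j, i] = Psi(j,i,x_t) * delta_{t-1}(i)
        back[t] = scores.argmax(axis=1)  # best predecessor for each state j
        delta[t] = scores.max(axis=1)
        delta[t] /= delta[t].sum()       # rescale to avoid underflow
    path = [int(delta[T - 1].argmax())]
    for t in range(T - 1, 0, -1):        # trace back pointers to recover the path
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Potentials encouraging the path: state 1, then 1 -> 0, then stay at 0.
psi = np.ones((3, 2, 2))
psi[0] = [[1, 1], [5, 5]]
psi[1] = [[1, 5], [1, 1]]
psi[2] = [[5, 1], [1, 1]]
path = viterbi(psi)
```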
Claims (15)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/092,408 US8824548B2 (en) | 2006-03-21 | 2011-04-22 | Object detecting with 1D range sensors |
JP2012090679A JP5773935B2 (en) | 2011-04-22 | 2012-04-12 | Method for classifying objects in a scene |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/385,620 US7903737B2 (en) | 2005-11-30 | 2006-03-21 | Method and system for randomly accessing multiview videos with known prediction dependency |
US13/092,408 US8824548B2 (en) | 2006-03-21 | 2011-04-22 | Object detecting with 1D range sensors |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/385,620 Division US7903737B2 (en) | 2005-11-30 | 2006-03-21 | Method and system for randomly accessing multiview videos with known prediction dependency |
Publications (2)
Publication Number | Publication Date |
---|---|
US20110200229A1 US20110200229A1 (en) | 2011-08-18 |
US8824548B2 true US8824548B2 (en) | 2014-09-02 |
Family
ID=47470252
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/092,408 Expired - Fee Related US8824548B2 (en) | 2006-03-21 | 2011-04-22 | Object detecting with 1D range sensors |
Country Status (1)
Country | Link |
---|---|
US (1) | US8824548B2 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9740937B2 (en) | 2012-01-17 | 2017-08-22 | Avigilon Fortress Corporation | System and method for monitoring a retail environment using video content analysis with depth sensing |
US9858923B2 (en) * | 2015-09-24 | 2018-01-02 | Intel Corporation | Dynamic adaptation of language models and semantic tracking for automatic speech recognition |
US10371512B2 (en) * | 2016-04-08 | 2019-08-06 | Otis Elevator Company | Method and system for multiple 3D sensor calibration |
US10729382B2 (en) * | 2016-12-19 | 2020-08-04 | Mitsubishi Electric Research Laboratories, Inc. | Methods and systems to predict a state of the machine using time series data of the machine |
CN114205621A (en) * | 2018-02-28 | 2022-03-18 | 三星电子株式会社 | Encoding method and device, and decoding method and device |
CN110751188B (en) * | 2019-09-26 | 2020-10-09 | 华南师范大学 | User label prediction method, system and storage medium based on multi-label learning |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020159628A1 (en) * | 2001-04-26 | 2002-10-31 | Mitsubishi Electric Research Laboratories, Inc | Image-based 3D digitizer |
US7043084B2 (en) * | 2002-07-30 | 2006-05-09 | Mitsubishi Electric Research Laboratories, Inc. | Wheelchair detection using stereo vision |
US20080063264A1 (en) * | 2006-09-08 | 2008-03-13 | Porikli Fatih M | Method for classifying data using an analytic manifold |
US20080063285A1 (en) * | 2006-09-08 | 2008-03-13 | Porikli Fatih M | Detecting Moving Objects in Video by Classifying on Riemannian Manifolds |
US7599555B2 (en) * | 2005-03-29 | 2009-10-06 | Mitsubishi Electric Research Laboratories, Inc. | System and method for image matting |
US20090315981A1 (en) * | 2008-06-24 | 2009-12-24 | Samsung Electronics Co., Ltd. | Image processing method and apparatus |
US7835568B2 (en) * | 2003-08-29 | 2010-11-16 | Samsung Electronics Co., Ltd. | Method and apparatus for image-based photorealistic 3D face modeling |
US7903737B2 (en) * | 2005-11-30 | 2011-03-08 | Mitsubishi Electric Research Laboratories, Inc. | Method and system for randomly accessing multiview videos with known prediction dependency |
- 2011-04-22: US application US13/092,408 published as patent US8824548B2 (en); status: not active (Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
US20110200229A1 (en) | 2011-08-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8594431B2 (en) | Adaptive partial character recognition | |
US8824548B2 (en) | Object detecting with 1D range sensors | |
Schindler | An overview and comparison of smooth labeling methods for land-cover classification | |
Moser et al. | Dictionary-based stochastic expectation-maximization for SAR amplitude probability density function estimation | |
US7813581B1 (en) | Bayesian methods for noise reduction in image processing | |
Korus et al. | Evaluation of random field models in multi-modal unsupervised tampering localization | |
Bentabet et al. | Road vectors update using SAR imagery: a snake-based method | |
Zhang et al. | Hierarchical conditional random fields model for semisupervised SAR image segmentation | |
CN107680120A (en) | Tracking Method of IR Small Target based on rarefaction representation and transfer confined-particle filtering | |
CN105590020B (en) | Improved data comparison method | |
US20070127817A1 (en) | Change region detection device and change region detecting method | |
US20030215155A1 (en) | Calculating noise estimates of a digital image using gradient analysis | |
US20110274356A1 (en) | Image pattern recognition | |
US11663840B2 (en) | Method and system for removing noise in documents for image processing | |
US20180122097A1 (en) | Apparatus, method, and non-transitory computer-readable storage medium for storing program for position and orientation estimation | |
Hong et al. | Selective image registration for efficient visual SLAM on planar surface structures in underwater environment | |
Mohammad et al. | Contour-based character segmentation for printed Arabic text with diacritics | |
Nguyen et al. | UnfairGAN: An enhanced generative adversarial network for raindrop removal from a single image | |
Jana et al. | A fuzzy C-means based approach towards efficient document image binarization | |
CN113313179A (en) | Noise image classification method based on l2p norm robust least square method | |
Ghoshal et al. | An improved scene text and document image binarization scheme | |
CN113239828A (en) | Face recognition method and device based on TOF camera module | |
JP5773935B2 (en) | Method for classifying objects in a scene |
Gómez-Moreno et al. | A “salt and pepper” noise reduction scheme for digital images based on support vector machines classification and regression | |
Sharma et al. | A Noise-Resilient Super-Resolution framework to boost OCR performance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: MITSUBISHI ELECTRIC RESEARCH LABORATORIES, INC., M Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TUZEL, CUNEYT ONCEL;POLATKAN, GUNGOR;SIGNING DATES FROM 20120302 TO 20120307;REEL/FRAME:027883/0406 |
Owner name: MITSUBISHI ELECTRIC RESEARCH LABORATORIES, INC, MA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TUZEL, CUNEYT ONCEL;POLATKAN, GUNGOR;SIGNING DATES FROM 20120302 TO 20120307;REEL/FRAME:027883/0469 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551) Year of fee payment: 4 |
|
FEPP | Fee payment procedure |
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
LAPS | Lapse for failure to pay maintenance fees |
Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20220902 |