CN108710761A - Robust model fitting method for removing outliers based on spectral clustering - Google Patents


Publication number
CN108710761A
CN108710761A
Authority
CN
China
Prior art keywords
model
outlier
classification
spectral clustering
interior
Prior art date
Legal status
Pending
Application number
CN201810494460.2A
Other languages
Chinese (zh)
Inventor
李琦铭
李俊
Current Assignee
Quanzhou Institute of Equipment Manufacturing
Original Assignee
Quanzhou Institute of Equipment Manufacturing
Priority date
Filing date
Publication date
Application filed by Quanzhou Institute of Equipment Manufacturing filed Critical Quanzhou Institute of Equipment Manufacturing
Priority to CN201810494460.2A priority Critical patent/CN108710761A/en
Publication of CN108710761A publication Critical patent/CN108710761A/en
Pending legal-status Critical Current

Classifications

    • G - Physics
    • G06 - Computing; Calculating or Counting
    • G06F - Electric Digital Data Processing
    • G06F30/00 - Computer-aided design [CAD]
    • G06F30/20 - Design optimisation, verification or simulation
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06F18/232 - Non-hierarchical techniques
    • G06F18/2323 - Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
    • G06F18/25 - Fusion techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Discrete Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a robust model fitting method that removes outliers based on spectral clustering. Model hypotheses are generated by repeatedly sampling the input data, and a preference matrix and the similarity matrix between its row vectors are constructed; outliers are then removed and multi-structure model data classes are generated by spectral clustering; finally, a stopping function over the multi-structure model instance data decides whether the fitting method terminates, yielding the final fitting result, i.e., the model parameters of the multi-structure models. The invention studies in depth how outliers are distinguished from inliers during spectral clustering, removes outliers efficiently, estimates multi-structure model parameters accurately, and the inlier/outlier classification result can guide the subsequent sampling of a local sampling strategy.

Description

Robust model fitting method for removing outliers based on spectral clustering
Technical field
The present invention relates to the field of computer vision model fitting techniques, and in particular to a robust model fitting method that removes outliers based on spectral clustering.
Background technology
Model fitting, an important basic research task, plays an important role in the field of machine vision and is widely applied in visual SLAM, motion segmentation, three-dimensional reconstruction, panoramic photography, and so on. As shown in Figure 1, the task of a model fitting method is to generate model hypotheses by repeatedly sampling the data, and then estimate the model parameters of the structure instances through model selection. Since the original image and video information obtained from sensors is affected by intrinsic camera parameters, shooting angle and distance, illumination, and other environmental changes, key feature descriptors in the image (such as the local feature descriptor SIFT) are often selected as the input data for model fitting. Noise and outliers are inevitably present in these data, so accurately estimating the number of models and their corresponding parameters remains a highly challenging task.
Existing model fitting methods based on outlier removal usually require an independent step to discriminate and remove the data containing outliers. If outliers are removed before model estimation, some inliers of model instances may be removed at the same time; if they are removed after model estimation, the presence of the outliers affects the accuracy of the parameter estimates. Moreover, most methods do not fully use model-related information to guide the subsequent sampling of a local sampling strategy, so they cannot quickly sample clean data subsets from multi-structure model instance data containing noise and a high proportion of outliers.
Summary of the invention
In view of the above problems, the purpose of the present invention is to provide a robust model fitting method that removes outliers based on spectral clustering, which can efficiently remove outliers and improve the accuracy of the model parameters.
To achieve the above object, the technical solution adopted by the present invention is:
A robust model fitting method for removing outliers based on spectral clustering, specifically comprising the following steps:
Step 1: construct the preference matrix and the similarity matrix between its row vectors;
From N observation data points X = {x_1, x_2, ..., x_N}, M subsets are randomly sampled and M model hypotheses Θ = {θ_1, θ_2, ..., θ_M} are estimated; each model hypothesis is then assigned a corresponding weight:
where n_m is the number of inliers of the m-th model hypothesis; r_i^m is the residual of the i-th data point relative to the m-th model hypothesis; S_m is the inlier noise scale estimated by the IKOSE method; ψ(·) and h_m are the kernel function and its corresponding bandwidth;
Model hypotheses with low weights are removed via an adaptive weight threshold obtained by the IKOSE method, retaining the G more robust model hypotheses Θ = {θ_1, θ_2, ..., θ_G}; the preference matrix of the N observation data points relative to these G model hypotheses can then be expressed as:
where P is an N×G two-dimensional matrix; r_i^g is the residual of the i-th data point relative to the g-th model hypothesis; S_g is the inlier noise scale;
The similarity matrix is constructed from the distance d(P(i,:), P(j,:)) between the row vectors P(i,:) and P(j,:) of the preference matrix:
where σ is the parameter of the exponential function;
Step 2: obtain the subspace classes of the similarity matrix by spectral clustering, and discriminate outliers according to the class scores of the concept subspaces;
A spectral clustering method that can automatically determine the number of subspaces is used to obtain the k subspaces {C_1, C_2, ..., C_k} of the similarity matrix; the concept space of each subspace class of the similarity matrix is then constructed, and the class score of each concept subspace is computed, where the label assigning each data point to a structure model instance is I = {i_1, i_2, ..., i_N};
The class score of each concept subspace is then discriminated by the following formula:
where n(C_k) denotes the inliers belonging to class C_k; when S(C_k) exceeds a certain value, the class is an inlier class, otherwise it is an outlier class; outlier classes are removed, and the inliers corresponding to the multi-structure model data classes are retained;
Step 3: judge whether to stop the model fitting according to a stopping function;
First, the inliers of each obtained model data class are fitted by least squares, giving the parameters of each model:
where θ_l is the model parameter of the l-th model structure obtained by least squares, computed over all inliers of the l-th model data class obtained by spectral clustering;
Then, the residual sum of squares of the inliers corresponding to each model parameter is computed:
Finally, the condition for stopping the model fitting is obtained from the difference of the residual sums of squares of two successive iterations:
f_t(θ_l) - f_{t-1}(θ_l) < δ (8);
If the difference is less than the threshold δ, the iteration terminates and the outlier classes and the model parameters of the multi-structure model instances are output; otherwise, point subsets continue to be sampled from the inlier data of the multi-structure model data classes, model hypotheses are estimated and the similarity matrix is rebuilt, and steps 2 and 3 are repeated for the next iteration until formula (8) is satisfied or the maximum number of iterations is reached.
The kernel function is:
where t is a preset threshold constant.
The maximum number of iterations is 5.
With the above scheme, the present invention generates model hypotheses by repeatedly sampling the input data and constructs the preference matrix and the similarity matrix between its row vectors; outliers are then removed and multi-structure model data classes are generated by spectral clustering; finally, a stopping function over the multi-structure model instance data decides whether the fitting method terminates, yielding the final fitting result, i.e., the model parameters of the multi-structure models. The invention studies in depth how outliers are distinguished from inliers during spectral clustering, removes outliers efficiently, estimates multi-structure model parameters accurately, and the inlier/outlier classification result can guide the subsequent sampling of a local sampling strategy.
Description of the drawings
Fig. 1 is a schematic diagram of an existing model fitting method;
Fig. 2 is a flowchart of the method of the present invention;
Fig. 3 is a schematic diagram of the distance distribution of the subspace classes obtained by the spectral clustering method.
Detailed description of the embodiments
As shown in Fig. 2, the present invention discloses a robust model fitting method for removing outliers based on spectral clustering, specifically comprising the following steps:
Step 1: construct the preference matrix and the similarity matrix between its row vectors;
Suppose that from N observation data points X = {x_1, x_2, ..., x_N}, M subsets are randomly sampled and M model hypotheses Θ = {θ_1, θ_2, ..., θ_M} are estimated. Then, to reduce the influence of redundant and poorly robust model hypotheses during model fitting, we assign each model hypothesis a corresponding weight:
where n_m is the number of inliers of the m-th model hypothesis; r_i^m is the residual of the i-th data point relative to the m-th model hypothesis; S_m is the inlier noise scale estimated by the IKOSE method; ψ(·) and h_m are the kernel function and its corresponding bandwidth.
The expression formula for the kernel function that the present invention uses is as follows:
Wherein, t is the threshold constant of setting, we set it to 2.5.
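The weighting step above can be sketched in Python. Since the patent's weight formula and kernel expression are images that did not survive extraction, the form below (a kernel-weighted average of scaled residuals over each hypothesis's inliers, with a truncated Gaussian kernel and the threshold t = 2.5) is an assumption consistent with the stated variable definitions n_m, r_i^m, S_m, ψ(·), and h_m; it is a sketch, not the patent's exact formula.

```python
import numpy as np

def hypothesis_weights(residuals, noise_scales, t=2.5):
    """residuals: (N, M) array, residuals[i, m] = r_i^m.
    noise_scales: (M,) array of IKOSE inlier noise scales S_m.
    Returns one weight per hypothesis (assumed form, see lead-in)."""
    M = residuals.shape[1]
    weights = np.zeros(M)
    for m in range(M):
        scaled = residuals[:, m] / noise_scales[m]
        inliers = scaled < t              # points within t noise scales count as inliers
        n_m = max(int(inliers.sum()), 1)  # guard against empty inlier sets
        h_m = noise_scales[m]             # bandwidth tied to noise scale (assumption)
        kernel = np.exp(-0.5 * scaled[inliers] ** 2) / h_m  # truncated Gaussian psi
        weights[m] = kernel.sum() / n_m
    return weights
```

A hypothesis supported by many small residuals thus receives a high weight, while one with no points inside the truncation radius receives zero weight and is discarded by the adaptive threshold.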
An adaptive weight threshold obtained by the IKOSE method is then used to remove model hypotheses with lower weights, retaining the G more robust model hypotheses Θ = {θ_1, θ_2, ..., θ_G}. The preference matrix of the N observation data points relative to these G model hypotheses can then be expressed as:
where P is an N×G two-dimensional matrix; r_i^g is the residual of the i-th data point relative to the g-th model hypothesis; S_g is the inlier noise scale.
The similarity matrix can be constructed from the distance d(P(i,:), P(j,:)) between the row vectors P(i,:) and P(j,:) of the preference matrix:
where σ is the parameter of the exponential function.
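A minimal Python sketch of the preference and similarity matrices follows. The exponential preference entry exp(-r_i^g / S_g) and the Gaussian form exp(-d² / (2σ²)) are assumptions, since the corresponding equations are images in the source; the sketch only follows the stated definitions (residuals r_i^g, noise scales S_g, row-vector distance d, exponential parameter σ).

```python
import numpy as np
from scipy.spatial.distance import cdist

def build_preference_matrix(residuals, noise_scales):
    """residuals: (N, G) residuals to the G retained hypotheses.
    Assumed exponential preference: P[i, g] = exp(-r_i^g / S_g)."""
    return np.exp(-residuals / noise_scales[None, :])

def build_similarity_matrix(P, sigma=1.0):
    """Gaussian similarity over pairwise distances d(P(i,:), P(j,:))."""
    D = cdist(P, P)                        # Euclidean distance between row vectors
    return np.exp(-D ** 2 / (2.0 * sigma ** 2))
```

Points that prefer the same hypotheses have nearby preference rows and therefore a similarity close to 1, which is what the subsequent spectral clustering exploits.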
Step 2: obtain the subspace classes of the similarity matrix by spectral clustering, and discriminate outliers according to the class scores of the concept subspaces;
Previous research has found that constructing a concept space from the row vectors P(i, m) of the preference matrix yields different distance distributions for inliers and outliers, so outliers can be removed from the data according to the magnitude of the distance distribution. Through experiments, this patent finds that the concept space of each subspace class of the similarity matrix has a similar property; Fig. 3 is a schematic diagram of the distance distribution, in the concept space, of the data points within the subspace classes obtained by the spectral clustering method on five-line data.
Therefore, the present invention uses a spectral clustering method (Self-Tuning) that can automatically determine the number of subspaces to obtain the k subspaces {C_1, C_2, ..., C_k} of the similarity matrix, then constructs the distance distribution of each subspace class of the similarity matrix in the concept space, computes the class score of each concept subspace, and judges its class attribute (inlier class or outlier class), where the label assigning each data point to a structure model instance (class) is I = {I_1, I_2, ..., I_N}. On this basis, not only can the classes containing a large proportion of outliers be obtained, but the inliers of the model instances comprising the multiple structured data can also be classified. The numerical score by which an outlier class is discriminated is obtained by formula (5):
where I_i indicates which subspace class the i-th data point belongs to, and each concept subspace class C_k obtains its score value S(C_k) from all the data points in the class; n(C_k) denotes the inliers belonging to class C_k; when S(C_k) exceeds a certain value, the class is an inlier class, otherwise it is an outlier class. Outlier classes are removed, and the inliers corresponding to each multi-structure model data class are retained.
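The clustering and score-based class discrimination can be sketched as follows. The patent uses a Self-Tuning spectral clustering that determines the number of subspaces automatically; scikit-learn's SpectralClustering with a fixed number of clusters is used here as a stand-in, and the class score (mean intra-cluster similarity below) is an assumed substitute for the concept-subspace score S(C_k) of formula (5), which is not reproduced in this text.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def classify_clusters(A, n_clusters, score_threshold):
    """A: (N, N) precomputed similarity matrix.
    Returns per-point labels and the list of clusters judged inlier classes.
    The score here (mean intra-cluster similarity) is an assumed stand-in
    for the patent's concept-subspace score S(C_k)."""
    sc = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                            random_state=0)
    labels = sc.fit_predict(A)
    inlier_classes = []
    for k in range(n_clusters):
        idx = np.where(labels == k)[0]
        score = A[np.ix_(idx, idx)].mean()  # cohesion of cluster k
        if score > score_threshold:         # cohesive cluster -> inlier class
            inlier_classes.append(k)
    return labels, inlier_classes
```

Clusters whose score falls below the threshold are treated as outlier classes and dropped; note that, as the text observes, several distinct outlier classes may coexist, so outliers need not form a single cluster.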
The advantage of discriminating outliers by constructing a concept space over the subspaces of the similarity matrix is that the generated subspace classes may contain multiple outlier classes, so the outliers are fitted by multiple different distributions, which is closer to the real data.
Step 3: judge whether to stop the model fitting according to a stopping function;
First, the inliers of each obtained model data class are fitted by least squares, giving the parameters of each model:
where θ_l is the model parameter of the l-th model structure obtained by least squares, computed over all inliers of the l-th model data class obtained by spectral clustering.
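For the five-line experiment described later, the least-squares fit of equation (6) reduces to fitting a line to each inlier class. The sketch below assumes 2D points and a line model y = a·x + b, which the text does not spell out; equation (6) itself is an image in the source.

```python
import numpy as np

def fit_models(points, labels, inlier_classes):
    """points: (N, 2) array; labels: cluster index per point.
    Least-squares line fit y = a*x + b per inlier class (a stand-in
    for the patent's equation (6), which is an image in the source)."""
    params = {}
    for k in inlier_classes:
        xy = points[labels == k]
        a, b = np.polyfit(xy[:, 0], xy[:, 1], deg=1)
        params[k] = (float(a), float(b))
    return params
```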
Then, the residual sum of squares of the inliers corresponding to each model parameter is computed:
Finally, the condition for stopping the model fitting is obtained from the difference of the residual sums of squares of two successive iterations:
f_t(θ_l) - f_{t-1}(θ_l) < δ (8);
If the difference is less than the threshold δ, the iteration terminates and the outlier classes and the model parameters of the multi-structure model instances are output; otherwise, point subsets continue to be sampled from the inlier data of the multi-structure model data classes, model hypotheses are estimated and the similarity matrix is rebuilt, and steps 2 and 3 are repeated for the next iteration until formula (8) is satisfied or the maximum number of iterations (5) is reached.
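The stopping test of equations (7) and (8) can be sketched as below, again assuming 2D line models. Aggregating over models with all() is an interpretation: the patent states the criterion per model parameter θ_l but does not say how the per-model differences are combined.

```python
import numpy as np

def rss_per_model(points, labels, params):
    """Equation (7): residual sum of squares of each model's inliers
    (assuming 2D points and line models y = a*x + b)."""
    out = {}
    for k, (a, b) in params.items():
        xy = points[labels == k]
        r = xy[:, 1] - (a * xy[:, 0] + b)  # vertical residuals to the fitted line
        out[k] = float(r @ r)
    return out

def converged(prev_rss, curr_rss, delta):
    """Equation (8): stop once every model's RSS change is below delta."""
    if prev_rss is None:                   # no previous iteration yet
        return False
    return all(abs(prev_rss[k] - curr_rss[k]) < delta
               for k in curr_rss if k in prev_rss)
```

An outer loop would alternate sampling, clustering, and fitting, calling converged() after each pass and stopping after at most 5 iterations.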
For multi-structure model instances of relatively simple lines, one or two passes of the clustering process are enough to accurately output the outliers and the model parameters of the multi-structure data. When facing more complex cases, such as estimating the homography matrices and fundamental matrices of multiple structural models, the inlier data obtained from one clustering pass can guide the sampling of the next iteration, generating more robust model hypotheses containing more inliers. The advantage is that with each iteration we obtain cleaner and more robust model hypotheses and better distinguish the outlier information.
To verify the performance of the present invention, the above model fitting method was implemented in the Matlab programming language; the hardware platform for running the code was an 8-core 3.4 GHz processor. A five-line multi-structure model containing a high proportion of outliers was selected as the experimental test data set: each of the five lines has 50 inliers, there are 250 outliers, and the total number of data points is 500.
The experiment was repeated 50 times on the test set, with 2000 point subsets initially sampled at random each time. We use the average false detection rate and the minimum false detection rate over the 50 runs as evaluation criteria, and also give the running time of each algorithm for comparison, where the calculation formula of the false detection rate is as follows:
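The false detection rate ("false drop rate") formula itself is an image in the source; the sketch below assumes the usual definition, the percentage of points whose predicted label disagrees with the ground truth. In practice predicted cluster labels must first be aligned to the ground-truth labels (e.g. by the best permutation), which this sketch does not do.

```python
def false_detection_rate(true_labels, pred_labels):
    """Assumed definition of the false detection rate: percentage of
    points whose predicted label disagrees with the ground truth."""
    wrong = sum(t != p for t, p in zip(true_labels, pred_labels))
    return 100.0 * wrong / len(true_labels)
```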
Table 1 gives the comparison between the model fitting method of the present invention and classical model fitting methods based on outlier removal.
Method | Average false detection rate (%) | Minimum false detection rate (%) | Running time (s)
KF | 25.02 | 16.6 | 2.59
T-linkage | 26.07 | 19.6 | 24.87
Method of the present invention | 16.02 | 12.6 | 1.94
Table 1
In Table 1, KF is a method that removes outliers before model estimation, and T-linkage is a method that removes outliers after model estimation. The results show that the model fitting method of the present invention clearly outperforms the other methods, achieving the lowest average false detection rate (16.02%) and the lowest minimum false detection rate (12.6%); meanwhile, its running time (1.94 s) is also lower than that of the compared methods, reflecting the high efficiency of the method of this patent.
In summary, the robust model fitting method for removing outliers based on spectral clustering proposed by the present invention achieves efficient and accurate results, thereby providing a better theoretical basis for the practical application of multi-structure model fitting methods on data containing a high proportion of outliers.
The above is only an embodiment of the present invention and is not intended to limit the scope of the present invention; therefore, any subtle modifications, equivalent variations, and improvements made to the above embodiment according to the technical essence of the invention still fall within the scope of the technical solution of the present invention.

Claims (3)

1. A robust model fitting method for removing outliers based on spectral clustering, characterized in that the model fitting method specifically comprises the following steps:
Step 1: construct the preference matrix and the similarity matrix between its row vectors;
From N observation data points X = {x_1, x_2, ..., x_N}, M subsets are randomly sampled and M model hypotheses Θ = {θ_1, θ_2, ..., θ_M} are estimated; each model hypothesis is then assigned a corresponding weight:
where n_m is the number of inliers of the m-th model hypothesis; r_i^m is the residual of the i-th data point relative to the m-th model hypothesis; S_m is the inlier noise scale estimated by the IKOSE method; ψ(·) and h_m are the kernel function and its corresponding bandwidth;
Model hypotheses with low weights are removed via an adaptive weight threshold obtained by the IKOSE method, retaining the G more robust model hypotheses Θ = {θ_1, θ_2, ..., θ_G}; the preference matrix of the N observation data points relative to these G model hypotheses can then be expressed as:
where P is an N×G two-dimensional matrix; r_i^g is the residual of the i-th data point relative to the g-th model hypothesis; S_g is the inlier noise scale;
The similarity matrix is constructed from the distance d(P(i,:), P(j,:)) between the row vectors P(i,:) and P(j,:) of the preference matrix:
where σ is the parameter of the exponential function;
Step 2: obtain the subspace classes of the similarity matrix by spectral clustering, and discriminate outliers according to the class scores of the concept subspaces;
A spectral clustering method that can automatically determine the number of subspaces is used to obtain the k subspaces {C_1, C_2, ..., C_k} of the similarity matrix; the concept space of each subspace class of the similarity matrix is then constructed, and the class score of each concept subspace is computed, where the label assigning each data point to a structure model instance is I = {i_1, i_2, ..., i_N};
The class score of each concept subspace is then discriminated by the following formula:
where n(C_k) denotes the inliers belonging to class C_k; when S(C_k) exceeds a certain value, the class is an inlier class, otherwise it is an outlier class; outlier classes are removed, and the inliers corresponding to the multi-structure model data classes are retained;
Step 3: judge whether to stop the model fitting according to a stopping function;
First, the inliers of each obtained model data class are fitted by least squares, giving the parameters of each model:
where θ_l is the model parameter of the l-th model structure obtained by least squares, computed over all inliers of the l-th model data class obtained by spectral clustering;
Then, the residual sum of squares of the inliers corresponding to each model parameter is computed:
Finally, the condition for stopping the model fitting is obtained from the difference of the residual sums of squares of two successive iterations:
f_t(θ_l) - f_{t-1}(θ_l) < δ (8);
If the difference is less than the threshold δ, the iteration terminates and the outlier classes and the model parameters of the multi-structure model instances are output; otherwise, point subsets continue to be sampled from the inlier data of the multi-structure model data classes, model hypotheses are estimated and the similarity matrix is rebuilt, and steps 2 and 3 are repeated for the next iteration until formula (8) is satisfied or the maximum number of iterations is reached.
2. The robust model fitting method for removing outliers based on spectral clustering according to claim 1, characterized in that the kernel function is:
where t is a preset threshold constant.
3. The robust model fitting method for removing outliers based on spectral clustering according to claim 1, characterized in that the maximum number of iterations is 5.
CN201810494460.2A 2018-05-22 2018-05-22 A kind of robust Model approximating method removing outlier based on spectral clustering Pending CN108710761A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810494460.2A CN108710761A (en) 2018-05-22 2018-05-22 A kind of robust Model approximating method removing outlier based on spectral clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810494460.2A CN108710761A (en) 2018-05-22 2018-05-22 A kind of robust Model approximating method removing outlier based on spectral clustering

Publications (1)

Publication Number Publication Date
CN108710761A true CN108710761A (en) 2018-10-26

Family

ID=63868523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810494460.2A Pending CN108710761A (en) 2018-05-22 2018-05-22 A kind of robust Model approximating method removing outlier based on spectral clustering

Country Status (1)

Country Link
CN (1) CN108710761A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109961086A (en) * 2019-01-28 2019-07-02 平安科技(深圳)有限公司 Abnormal point ratio optimization method and device based on cluster and SSE
CN109961086B (en) * 2019-01-28 2024-05-31 平安科技(深圳)有限公司 Clustering and SSE-based outlier proportion optimization method and device
CN110163865A (en) * 2019-05-28 2019-08-23 闽江学院 A kind of method of sampling for unbalanced data in models fitting
CN110163865B (en) * 2019-05-28 2021-06-01 闽江学院 Sampling method for unbalanced data in model fitting
CN110163298A (en) * 2019-05-31 2019-08-23 闽江学院 A kind of pattern fitting method of the sampling of fusant collection and model selection
US11113580B2 (en) 2019-12-30 2021-09-07 Industrial Technology Research Institute Image classification system and method
CN112132204A (en) * 2020-09-18 2020-12-25 厦门大学 Robust model fitting method based on preference probability weighted sampling
CN112132204B (en) * 2020-09-18 2022-05-24 厦门大学 Robust model fitting method based on preference probability weighted sampling


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20181026)