CN103605984B - Indoor scene classification method based on hypergraph learning - Google Patents
- Publication number
- CN103605984B (application CN201310566625.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- hypergraph
- linear regression
- semi-supervised learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An indoor scene classification method based on hypergraph learning, relating to indoor scene classification. On the order of a hundred object detectors are used to extract objects from an image, and the resulting object descriptors are combined into a super descriptor that serves as the feature descriptor of the image. A hypergraph is built over the image descriptors with the k-nearest-neighbor method, its Laplacian matrix is computed, and a semi-supervised learning framework is constructed. A linear regression model is built and added to the semi-supervised learning framework. Using the constructed framework together with the extracted feature descriptors, a portion of the image descriptors are labeled so that the framework can automatically and iteratively predict the labels of the unlabeled images, thereby completing image classification; the linear regression model is initialized during this automatic iteration. Using the linear regression model together with the extracted feature descriptors, newly added data can be classified directly, without rebuilding the hypergraph.
Description
Technical field
The present invention relates to indoor scene classification, and in particular to an indoor scene classification method based on hypergraph learning.
Background art
At present, indoor scene classification generally relies on low-level feature descriptors, which mainly capture information such as color, texture, and shape. These low-level descriptors work well for outdoor scene classification, but because indoor scenes contain many kinds of objects that are complex and overlap one another, they perform only moderately on indoor scenes. As related techniques developed, several improved image feature descriptors were introduced to raise classification performance, such as spatial pyramid matching ([1] S. Lazebnik, C. Schmid, and J. Ponce, “Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories,” in Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, 2006, vol. 2, pp. 2169-2178) and global descriptors ([2] C. Siagian and L. Itti, “Rapid biologically-inspired scene classification using features shared with visual attention,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 2, pp. 300-312, Feb. 2007). Because these improved descriptors do not address the core problem of indoor scene images, however, they do not substantially improve indoor scene classification. High-level feature descriptors that carry image semantics retain much of an image's semantic content and can identify multiple objects in an indoor scene, so they play an important role in improving indoor scene classification.
Among high-level image descriptors, early research proposed a series of semantic image attributes to describe image content, and these descriptions achieved good results in image acquisition and image classification. A Stanford University laboratory also proposed a new super descriptor ([3] L. Li, H. Su, E. Xing, and F. Li, “Object Bank: A High-Level Image Representation for Scene Classification and Semantic Feature Sparsification,” Proceedings of the Neural Information Processing Systems (NIPS), 2010), which describes images containing complex objects, indoor images in particular, especially well. These approaches, however, still classify images with conventional fully supervised methods, which cannot take into account the relationship between the global attribute information of all the data and the local information in the data, so their classification performance remains mediocre.
Summary of the invention
The object of the present invention is to provide an indoor scene classification method based on hypergraph learning.
The present invention comprises the following steps:
(1) Use on the order of a hundred object detectors to extract objects from an image, and combine the resulting object descriptors into a super descriptor that serves as the feature descriptor of the image;
(2) Use the k-nearest-neighbor method to build a hypergraph over all the generated image descriptors, compute the Laplacian matrix of that hypergraph, and then construct a semi-supervised learning framework;
(3) Build a linear regression model and add it to the semi-supervised learning framework;
(4) Using the semi-supervised learning framework constructed in step (3) together with the image feature descriptors extracted in step (1), label a portion of the image descriptors so that the framework can automatically and iteratively predict the labels of the unlabeled images, thereby completing image classification; the linear regression model of step (3) is initialized during this automatic iteration;
(5) Using the linear regression model of step (3) together with the image feature descriptors extracted in step (1), classify newly added data directly, without rebuilding the hypergraph.
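Step (1) can be pictured with a small sketch. The text does not say how the detector outputs are pooled, so the block below assumes each detector yields a 2-D response map and that the map is max-pooled over a small spatial pyramid, loosely in the spirit of Object Bank [3]. The function names `pool_response_map` and `super_descriptor` and the pyramid levels are hypothetical.

```python
import numpy as np

def pool_response_map(resp, levels=(1, 2)):
    """Max-pool one detector response map over a spatial pyramid (assumed pooling)."""
    h, w = resp.shape
    feats = []
    for g in levels:
        # Split rows and columns into g roughly equal bands, giving a g x g grid.
        row_bands = np.array_split(np.arange(h), g)
        col_bands = np.array_split(np.arange(w), g)
        for rows in row_bands:
            for cols in col_bands:
                feats.append(resp[np.ix_(rows, cols)].max())
    return np.array(feats)

def super_descriptor(response_maps, levels=(1, 2)):
    """Concatenate the pooled responses of every detector into one feature vector."""
    return np.concatenate([pool_response_map(m, levels) for m in response_maps])

# Two toy "detectors" on a 4 x 4 image grid.
maps = [np.arange(16.0).reshape(4, 4), np.zeros((4, 4))]
desc = super_descriptor(maps)
```

With two detectors and pyramid levels 1 and 2, the descriptor has 2 x (1 + 4) = 10 entries; with a hundred or more detectors the same concatenation yields the high-dimensional descriptor that the later steps consume.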
In step (2), the semi-supervised learning framework may be constructed as follows:
First, compute the pairwise Euclidean distances between the feature descriptors of the extracted images, and from them obtain the incidence matrix H, in which v denotes a vertex of the hypergraph and e denotes a hyperedge.
Next, compute the weight w(e) of each hyperedge, the degree d(v) of each vertex, and the degree δ(e) of each hyperedge. Using w(e), d(v), and δ(e) as diagonal entries gives the corresponding diagonal matrices W, D_v, and D_e. From these three diagonal matrices and the incidence matrix, the intermediate result Θ is computed. Subtracting Θ from the identity matrix I gives:
L = I - Θ
The result L is the Laplacian matrix of the hypergraph. The regularization term of the semi-supervised learning framework is constructed from this Laplacian matrix:
Ω(f) = f^T L f
where f is the label vector of the images to be predicted and f^T is its transpose. The semi-supervised framework is then constructed, in which Y denotes the label matrix of the annotated images, tr denotes the matrix trace, and the parameter λ is a non-negative number that controls the balance between model complexity and the empirical loss. Solving this framework yields the predicted labels F for all the data.
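For reference, the quantities named above match the usual normalized hypergraph Laplacian construction. The block below is a sketch under that assumption, not text recovered from the patent; in particular, the exact choice of hyperedge weight w(e) and the squared-error form of the loss term are assumptions.

```latex
\begin{aligned}
h(v,e) &= \begin{cases} 1, & v \in e \\ 0, & \text{otherwise} \end{cases} \\
d(v) &= \sum_{e} w(e)\, h(v,e), \qquad \delta(e) = \sum_{v} h(v,e) \\
\Theta &= D_v^{-1/2}\, H\, W\, D_e^{-1}\, H^{T}\, D_v^{-1/2}, \qquad L = I - \Theta \\
F^{*} &= \arg\min_{F}\; \operatorname{tr}\!\left(F^{T} L F\right) + \lambda\, \lVert F - Y \rVert_F^{2}
\end{aligned}
```

Under this reading, each image contributes one hyperedge containing itself and its k nearest neighbors, so H has one column per image.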
In step (3), the purpose of the linear regression model is to classify newly added data directly, without rebuilding the hypergraph. The linear regression model is:
g(x) = Q^T x + θ
where Q is the first-order (coefficient) parameter of the linear regression model and θ is the constant term. Embedding this model in the semi-supervised learning framework gives a new framework, in which X denotes the feature descriptors of the images, and α and γ are non-negative regularization parameters that control the balance between model complexity and the empirical loss.
Because the new framework is convex, the optimal F can be found by taking the partial derivative with respect to each parameter. Denote the framework by J and set its partial derivatives with respect to F and Q to zero. Substituting the Q obtained from the second equation into the first gives F:
F = (K - αXM)^{-1} Y
where the intermediate result K denotes L + (λ + α)I and the intermediate result M denotes (αX^T X + γI)^{-1} αX^T. Substituting this F back into the equation for the partial derivative with respect to Q gives:
Q = (αX^T X + γI)^{-1} αX^T F = MF
The resulting Q is the parameter of the linear regression model. When new data arrive, no hypergraph needs to be built for them; their label information can be obtained directly from g(x) = Q^T x + θ.
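The closed form for F and Q can be checked with a short derivation. The sketch below assumes the combined objective has the common form shown in its first line, with X holding one image descriptor per row and the constant term θ left out; these are assumptions, since the objective itself is not reproduced in the text.

```latex
\begin{aligned}
J(F,Q) &= \operatorname{tr}(F^{T} L F) + \lambda \lVert F - Y\rVert_F^{2}
        + \alpha \lVert XQ - F\rVert_F^{2} + \gamma \lVert Q\rVert_F^{2} \\
\tfrac{1}{2}\,\frac{\partial J}{\partial F} &= LF + \lambda (F - Y) + \alpha (F - XQ) = 0 \\
\tfrac{1}{2}\,\frac{\partial J}{\partial Q} &= \alpha X^{T}(XQ - F) + \gamma Q = 0
  \;\Rightarrow\; Q = (\alpha X^{T}X + \gamma I)^{-1}\alpha X^{T} F = MF \\
&\Rightarrow\; \bigl(L + (\lambda+\alpha) I - \alpha X M\bigr) F = \lambda Y
  \;\Rightarrow\; F = \lambda\,(K - \alpha X M)^{-1} Y
\end{aligned}
```

Under this assumed objective, and for λ > 0, the result differs from the expression in the text only by the positive scalar λ. That factor rescales every entry of F equally, so the predicted class of each image (the largest entry in its row of F) is unchanged.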
The present invention builds a hypergraph from the raw image data and uses a semi-supervised learning framework to predict the labels of the unlabeled images. Because a hypergraph retains richer information than an ordinary graph, and because the semi-supervised learning framework considers both the global attribute information of the data and the local information between labeled and unlabeled data, the invention achieves good results in indoor scene classification.
The invention has the following advantages: classifying indoor scenes with image descriptors that carry semantic information and with a semi-supervised learning framework effectively improves the accuracy of indoor scene classification. In addition, the trained linear regression model speeds up label prediction for new data. The invention provides a new technical approach for robot path selection and indoor surveillance, and effectively improves the efficiency of applying indoor scene classification techniques.
Brief description of the drawings
Fig. 1 is a flow chart of an embodiment of the present invention.
Fig. 2 is a schematic comparison of the classification performance of the present invention and other classification methods. In Fig. 2, the abscissa is the labeled proportion of the training data (%) and the ordinate is the classification accuracy (%); curve a is the hypergraph learning method of the invention, curve b is the ordinary graph method, curve c is the k-nearest-neighbor method, curve d is the Laplacian support vector machine, curve e is the progressive transductive support vector machine, and curve f is the ordinary support vector machine.
Fig. 3 is a schematic diagram of the image label prediction results of the linear regression model used by the present invention. In Fig. 3, the abscissa is the labeled proportion of the training data (%) and the ordinate is the classification accuracy (%); curves a, b, c, d, and e correspond to the parameter Q generated from 10%, 20%, 30%, 40%, and 50% of the training data, respectively.
Detailed description of the invention
The specific technical scheme and implementation steps of the indoor scene classification method based on hypergraph learning proposed by the present invention are described below with reference to Fig. 1:
Step 1: Use on the order of a hundred object detectors to extract objects from an image, and combine the resulting object descriptors into a super descriptor that serves as the feature descriptor of the image;
Step 2: Use the k-nearest-neighbor method to build a hypergraph over all the generated image descriptors, compute the Laplacian matrix of that hypergraph, and then construct a semi-supervised learning framework;
Step 3: Build a linear regression model and add it to the semi-supervised learning framework;
Step 4: Using the semi-supervised learning framework constructed in step 3 together with the image feature descriptors extracted in step 1, label a portion of the image descriptors so that the framework can automatically and iteratively predict the labels of the unlabeled images, thereby completing image classification. The linear regression model of step 3 is initialized during this automatic iteration;
Step 5: Using the linear regression model of step 3 together with the image feature descriptors extracted in step 1, classify newly added data directly, without rebuilding the hypergraph.
The semi-supervised learning framework mentioned in step 2 is constructed as follows. First, build the hypergraph from the feature descriptors of the extracted images and compute its incidence matrix H, in which v denotes a vertex of the hypergraph and e denotes a hyperedge. Next, compute the weight w(e) of each hyperedge, the degree d(v) of each vertex, and the degree δ(e) of each hyperedge. Using w(e), d(v), and δ(e) as diagonal entries gives the corresponding diagonal matrices W, D_v, and D_e. From these three diagonal matrices and the incidence matrix, the intermediate result Θ is computed. Subtracting Θ from the identity matrix I gives:
L = I - Θ
The result L is the Laplacian matrix of the hypergraph. The regularization term of the semi-supervised learning framework is constructed from this Laplacian matrix:
Ω(f) = f^T L f
where f is the label vector of the images to be predicted and f^T is its transpose. The semi-supervised framework is then constructed, in which Y denotes the label matrix of the annotated images, tr denotes the matrix trace, and the parameter λ is a non-negative number that controls the balance between model complexity and the empirical loss. Solving this framework yields the predicted labels F for all the data.
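The construction in step 2 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the patented implementation: it assumes one hyperedge per image (the image plus its k nearest neighbors), unit hyperedge weights by default, the usual normalized form Θ = D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2}, and a squared-error loss weighted by λ. All function names are hypothetical.

```python
import numpy as np

def knn_incidence(X, k):
    """Incidence matrix H: column j is the hyperedge of image j and its k nearest neighbours."""
    n = X.shape[0]
    # Pairwise Euclidean distances between feature descriptors.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    H = np.zeros((n, n))
    for j in range(n):
        others = [i for i in np.argsort(dist[j]) if i != j][:k]
        H[others, j] = 1.0
        H[j, j] = 1.0  # every image belongs to its own hyperedge
    return H

def hypergraph_laplacian(H, w=None):
    """L = I - D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} (assumed normalized form)."""
    n_v, n_e = H.shape
    w = np.ones(n_e) if w is None else np.asarray(w, dtype=float)
    d_v = H @ w          # vertex degrees d(v)
    d_e = H.sum(axis=0)  # hyperedge degrees delta(e)
    dv_inv_sqrt = np.diag(1.0 / np.sqrt(d_v))
    theta = dv_inv_sqrt @ H @ np.diag(w / d_e) @ H.T @ dv_inv_sqrt
    return np.eye(n_v) - theta

def propagate_labels(L, Y, lam):
    """Minimiser of tr(F^T L F) + lam * ||F - Y||_F^2 (assumed objective)."""
    n = L.shape[0]
    return np.linalg.solve(L + lam * np.eye(n), lam * Y)

# Toy data: two well-separated groups of five 2-D descriptors, one labelled image per group.
A = np.array([[0.0, 0.0], [-0.1, 0.1], [0.1, -0.1], [0.0, 0.2], [-0.1, -0.2]])
X = np.vstack([A, A + 3.0])
Y = np.zeros((10, 2))
Y[0, 0] = 1.0  # image 0 is labelled as class 0
Y[5, 1] = 1.0  # image 5 is labelled as class 1

H = knn_incidence(X, k=3)
L = hypergraph_laplacian(H)
F = propagate_labels(L, Y, lam=1.0)
pred = F.argmax(axis=1)  # predicted class index for every image
```

Here the two labelled images propagate their labels to the other members of their own group, because no hyperedge spans both groups. On real descriptors, k, λ, and the hyperedge weights would need tuning.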
The purpose of the linear regression model mentioned in step 3 is to classify newly added data directly, without rebuilding the hypergraph. The linear regression model is:
g(x) = Q^T x + θ
where Q is the first-order (coefficient) parameter of the linear regression model and θ is the constant term. Embedding this linear model in the semi-supervised learning framework gives a new framework, in which X denotes the feature descriptors of the images, and α and γ are non-negative regularization parameters that control the balance between model complexity and the empirical loss.
Because the new framework is convex, the optimal F can be found by taking the partial derivative with respect to each parameter. Denote the framework by J and set its partial derivatives with respect to F and Q to zero. Substituting the Q obtained from the second equation into the first gives F:
F = (K - αXM)^{-1} Y
where the intermediate result K denotes L + (λ + α)I and the intermediate result M denotes (αX^T X + γI)^{-1} αX^T. Substituting this F back into the equation for the partial derivative with respect to Q gives:
Q = (αX^T X + γI)^{-1} αX^T F = MF
The resulting Q is the parameter of the linear regression model. When new data arrive, no hypergraph needs to be built for them; their label information can be obtained directly from g(x) = Q^T x + θ.
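The closed-form expressions for F and Q translate directly into NumPy. The sketch below is an illustration under stated assumptions: X stores one descriptor per row, θ is taken as 0 (the closed form in the text does not solve for it), and the small helper that builds a k-nearest-neighbor hypergraph Laplacian uses unit hyperedge weights and the usual normalized form. The function names and the toy data are hypothetical.

```python
import numpy as np

def knn_hypergraph_laplacian(X, k):
    """Normalized Laplacian of a k-NN hypergraph with unit hyperedge weights (assumed form)."""
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    H = np.zeros((n, n))
    for j in range(n):
        H[np.argsort(dist[j])[:k + 1], j] = 1.0  # image j and its nearest neighbours
        H[j, j] = 1.0                            # image j always joins its own hyperedge
    d_v, d_e = H.sum(axis=1), H.sum(axis=0)
    S = H / np.sqrt(d_v)[:, None]                # D_v^{-1/2} H
    return np.eye(n) - (S / d_e) @ S.T           # I - D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2}

def fit_joint(L, X, Y, lam, alpha, gamma):
    """F = (K - alpha X M)^{-1} Y and Q = M F, with K and M as defined in the text."""
    n, d = X.shape
    K = L + (lam + alpha) * np.eye(n)
    M = np.linalg.solve(alpha * X.T @ X + gamma * np.eye(d), alpha * X.T)  # d x n
    F = np.linalg.solve(K - alpha * X @ M, Y)
    Q = M @ F
    return F, Q

def predict(Q, X_new, theta=0.0):
    """Classify new descriptors with g(x) = Q^T x + theta; no hypergraph is rebuilt."""
    return (X_new @ Q + theta).argmax(axis=1)

# Toy data: two mirrored groups of 2-D descriptors, one labelled image per group.
A = np.array([[-2.0, 0.0], [-2.1, 0.1], [-1.9, -0.1], [-2.0, 0.2], [-2.1, -0.2]])
X = np.vstack([A, -A])
Y = np.zeros((10, 2))
Y[0, 0] = 1.0
Y[5, 1] = 1.0

L = knn_hypergraph_laplacian(X, k=3)
F, Q = fit_joint(L, X, Y, lam=0.1, alpha=1.0, gamma=1.0)
train_pred = F.argmax(axis=1)
new_pred = predict(Q, np.array([[-2.0, 0.0], [2.0, 0.0]]))
```

Because θ is fixed at 0, the toy groups are placed on opposite sides of the origin; with data that are not centered, a constant feature could be appended to X so that the regression can learn an offset. Solving linear systems with `np.linalg.solve` avoids forming explicit matrix inverses.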
Claims (2)
1. An indoor scene classification method based on hypergraph learning, characterized in that it comprises the following steps:
(1) Use on the order of a hundred object detectors to extract objects from an image, and combine the resulting object descriptors into a super descriptor that serves as the feature descriptor of the image;
(2) Use the k-nearest-neighbor method to build a hypergraph over all the generated image descriptors, compute the Laplacian matrix of that hypergraph, and then construct a semi-supervised learning framework; the semi-supervised learning framework is constructed as follows:
First, compute the pairwise Euclidean distances between the feature descriptors of the extracted images, and from them obtain the incidence matrix H, in which v denotes a vertex of the hypergraph and e denotes a hyperedge;
Next, compute the weight w(e) of each hyperedge, the degree d(v) of each vertex, and the degree δ(e) of each hyperedge; using w(e), d(v), and δ(e) as diagonal entries gives the corresponding diagonal matrices W, D_v, and D_e; from these three diagonal matrices and the incidence matrix, the intermediate result Θ is computed; subtracting Θ from the identity matrix I gives:
L = I - Θ
The result L is the Laplacian matrix of the hypergraph; the regularization term of the semi-supervised learning framework is constructed from this Laplacian matrix:
Ω(f) = f^T L f
where f is the label vector of the images to be predicted and f^T is its transpose; the semi-supervised framework is then constructed, in which Y denotes the label matrix of the annotated images, tr denotes the matrix trace, and the parameter λ is a non-negative number that controls the balance between model complexity and the empirical loss; solving this framework yields the predicted labels F for all the data;
(3) Build a linear regression model and add it to the semi-supervised learning framework;
(4) Using the semi-supervised learning framework constructed in step (3) together with the image feature descriptors extracted in step (1), label a portion of the image descriptors so that the framework can automatically and iteratively predict the labels of the unlabeled images, thereby completing image classification; the linear regression model of step (3) is initialized during this automatic iteration;
(5) Using the linear regression model of step (3) together with the image feature descriptors extracted in step (1), classify newly added data directly, without rebuilding the hypergraph.
2. The indoor scene classification method based on hypergraph learning according to claim 1, characterized in that, in step (3), the purpose of the linear regression model is to classify newly added data directly, without rebuilding the hypergraph; the linear regression model is:
g(x) = Q^T x + θ
where Q is the first-order (coefficient) parameter of the linear regression model and θ is the constant term; embedding this model in the semi-supervised learning framework gives a new framework, in which X denotes the feature descriptors of the images, and α and γ are non-negative regularization parameters that control the balance between model complexity and the empirical loss;
Because the new framework is convex, the optimal F can be found by taking the partial derivative with respect to each parameter; denote the framework by J and set its partial derivatives with respect to F and Q to zero; substituting the Q obtained from the second equation into the first gives F:
F = (K - αXM)^{-1} Y
where the intermediate result K denotes L + (λ + α)I and the intermediate result M denotes (αX^T X + γI)^{-1} αX^T; substituting this F back into the equation for the partial derivative with respect to Q gives:
Q = (αX^T X + γI)^{-1} αX^T F = MF
The resulting Q is the parameter of the linear regression model; when new data arrive, no hypergraph needs to be built for them, and their label information can be obtained directly from g(x) = Q^T x + θ.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310566625.XA CN103605984B (en) | 2013-11-14 | 2013-11-14 | Indoor scene classification method based on hypergraph learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310566625.XA CN103605984B (en) | 2013-11-14 | 2013-11-14 | Indoor scene classification method based on hypergraph learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103605984A CN103605984A (en) | 2014-02-26 |
CN103605984B true CN103605984B (en) | 2016-08-24 |
Family
ID=50124204
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310566625.XA Active CN103605984B (en) | 2013-11-14 | 2013-11-14 | Indoor scene classification method based on hypergraph learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103605984B (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105426923A (en) * | 2015-12-14 | 2016-03-23 | 北京科技大学 | Semi-supervised classification method and system |
CN107423547A (en) * | 2017-04-19 | 2017-12-01 | 江南大学 | Incremental localization algorithm based on a semi-supervised extreme learning machine |
CN109300549B (en) * | 2018-10-09 | 2020-03-17 | 天津科技大学 | Food-disease association prediction method based on disease weighting and food category constraint |
CN109492691A (en) * | 2018-11-07 | 2019-03-19 | 南京信息工程大学 | Hypergraph convolutional network model and semi-supervised classification method thereof |
CN111307798B (en) * | 2018-12-11 | 2023-03-17 | 成都智叟智能科技有限公司 | Article checking method adopting multiple acquisition technologies |
CN110097080B (en) * | 2019-03-29 | 2021-04-13 | 广州思德医疗科技有限公司 | Construction method and device of classification label |
CN110097112B (en) * | 2019-04-26 | 2021-03-26 | 大连理工大学 | Graph learning model based on reconstruction graph |
CN110363236B (en) * | 2019-06-29 | 2020-06-19 | 河南大学 | Hyperspectral image extreme learning machine clustering method for embedding space-spectrum combined hypergraph |
CN111259184B (en) * | 2020-02-27 | 2022-03-08 | 厦门大学 | Image automatic labeling system and method for new retail |
CN113963322B (en) * | 2021-10-29 | 2023-08-25 | 北京百度网讯科技有限公司 | Detection model training method and device and electronic equipment |
CN114463602B (en) * | 2022-04-12 | 2022-07-08 | 北京云恒科技研究院有限公司 | Target identification data processing method based on big data |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6598043B1 (en) * | 1999-10-04 | 2003-07-22 | Jarg Corporation | Classification of information sources using graph structures |
CN103020120A (en) * | 2012-11-16 | 2013-04-03 | 南京理工大学 | Hypergraph-based mixed image summary generating method |
-
2013
- 2013-11-14 CN CN201310566625.XA patent/CN103605984B/en active Active
Non-Patent Citations (1)
Title |
---|
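Analysis of a kernel-method-based semi-supervised hypergraph vertex classification algorithm; Jia Zhiyang et al.; Journal of Yunnan Normal University; 2013-01-31; pp. 46-49 * |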
Also Published As
Publication number | Publication date |
---|---|
CN103605984A (en) | 2014-02-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103605984B (en) | Indoor scene classification method based on hypergraph learning | |
Zhou et al. | Multi-scale deep context convolutional neural networks for semantic segmentation | |
Li et al. | Spatial attention pyramid network for unsupervised domain adaptation | |
Chen et al. | This looks like that: deep learning for interpretable image recognition | |
Zhao et al. | Jsnet: Joint instance and semantic segmentation of 3d point clouds | |
Bosquet et al. | STDnet: Exploiting high resolution feature maps for small object detection | |
Zhu et al. | Soft proposal networks for weakly supervised object localization | |
Zhao et al. | Diversified visual attention networks for fine-grained object classification | |
Bazzani et al. | Self-taught object localization with deep networks | |
Vu et al. | Context-aware CNNs for person head detection | |
CN105005794B (en) | Image pixel semantic labeling method fusing multi-granularity contextual information | |
Lei et al. | Region-enhanced convolutional neural network for object detection in remote sensing images | |
Lin et al. | Multiple instance feature for robust part-based object detection | |
CN110414462A (en) | Unsupervised cross-domain person re-identification method and system | |
Shen et al. | A survey on label-efficient deep image segmentation: Bridging the gap between weak supervision and dense prediction | |
Yang et al. | Object-aware dense semantic correspondence | |
CN104504366A (en) | System and method for smiling face recognition based on optical flow features | |
Lin et al. | Ru-net: Regularized unrolling network for scene graph generation | |
Zhu et al. | Efficient action detection in untrimmed videos via multi-task learning | |
CN103745233B (en) | Hyperspectral image classification method based on spatial information transfer | |
CN110956158A (en) | Occluded person re-identification method based on a teacher-student learning framework | |
CN110334718A (en) | Two-dimensional video saliency detection method based on long short-term memory | |
Xu et al. | Consistency-regularized region-growing network for semantic segmentation of urban scenes with point-level annotations | |
Chen et al. | Point-to-box network for accurate object detection via single point supervision | |
Pei et al. | Localized traffic sign detection with multi-scale deconvolution networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |