CN102609715A - Object type identification method combining plurality of interest point testers - Google Patents

Object type identification method combining plurality of interest point testers

Info

Publication number
CN102609715A
CN102609715A CN2012100045450A CN201210004545A
Authority
CN
China
Prior art keywords
interest
different
point
collective
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012100045450A
Other languages
Chinese (zh)
Other versions
CN102609715B (en)
Inventor
罗会兰
井福荣
张彩霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi University of Science and Technology
Original Assignee
Jiangxi University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi University of Science and Technology filed Critical Jiangxi University of Science and Technology
Priority to CN201210004545.0A priority Critical patent/CN102609715B/en
Publication of CN102609715A publication Critical patent/CN102609715A/en
Application granted granted Critical
Publication of CN102609715B publication Critical patent/CN102609715B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention belongs to the technical fields of pattern recognition, computer vision and image understanding, and discloses an object category recognition method that combines multiple interest point detectors. The disclosed method comprises the following steps: first, interest points carrying various kinds of shape, edge-contour and grayscale information are extracted with different interest point detectors, forming different representation vectors of an image. A visual dictionary ensemble is obtained from the different interest point sets, each member exploiting a different image characteristic. A classifier ensemble is then obtained from the generated visual dictionary ensemble, yielding an object category recognition model and a model learning method that adapts its choice of features to the current recognition task. Experiments show that the method can fuse the information found by different interest point detectors and capture different characteristics of an image, effectively improving on the performance of traditional object category recognition methods based on a single visual dictionary.

Description

An object category recognition method combining multiple interest point detectors
Technical field
The invention belongs to the technical fields of pattern recognition, computer vision and image understanding, and specifically relates to an object category recognition method.
Background technology
Object category recognition is a key problem in computer vision. An object category model must strike a good balance between intra-class variation and inter-class similarity. Humans recognize many object categories with ease, but for computers and robots the task remains extremely challenging. In object category recognition, variations in illumination conditions, geometric deformations, occlusion, background clutter and the like all pose difficulties for effective learning and robust recognition. In addition, object category recognition must also cope with the large differences between instances of the same class.
An image contains a great deal of information, and how to characterize an image so that it can be used for effective and efficient recognition is a difficult problem that depends on the recognition task. The bag-of-words model has recently become very popular because it is simple and effective. Its basic idea is to regard an image as a sparse set of interest points (regions of interest, also called salient regions). The model derives from the bag-of-words method in text analysis: an image is viewed as a sparse set of independent patches; representative patches are sampled from the image, each patch is described separately by a feature descriptor, and the distribution of descriptors is used to represent the image.
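The bag-of-words representation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: real local descriptors (e.g. 128-D SIFT vectors) are stood in for by random arrays, and `bow_histogram` is a hypothetical helper name.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in for local descriptors (e.g. 128-D SIFT) pooled from training images.
train_descriptors = rng.normal(size=(600, 128))

# "Visual dictionary": cluster the descriptors; cluster centers are the visual words.
dictionary = KMeans(n_clusters=10, n_init=10, random_state=0).fit(train_descriptors)

def bow_histogram(descriptors, dictionary):
    """Quantize each descriptor to its nearest visual word and count occurrences."""
    words = dictionary.predict(descriptors)
    hist = np.bincount(words, minlength=dictionary.n_clusters).astype(float)
    return hist / hist.sum()      # normalized word histogram represents the image

image_descriptors = rng.normal(size=(60, 128))   # descriptors of one image
h = bow_histogram(image_descriptors, dictionary)
print(h.shape)  # (10,)
```

The resulting fixed-length histogram is what gets fed to a classifier, regardless of how many interest points the image produced.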
Interest point detectors can be divided into three types: contour-based, intensity-based and parametric-model-based. Many computer vision tasks rely on low-level features, and their results depend to a large extent on the detector used. In computer vision, the detection of regions invariant to a class of transformations has reached a certain maturity, and such invariant region detectors are applied in widely different fields, including model-based recognition and object classification. Interest points extracted by different detectors may carry different information. The invention provides a novel method that combines multiple detectors for image classification. The ensemble approach offers an effective way to fuse the information contained in different interest points. The ensemble framework also matches a mechanism of the human visual system, which can accept multiple different cues in parallel to recognize different object categories.
The current consensus in object category recognition research is as follows: first, object shapes and appearances are complex and differ greatly even among similar objects, so the model should be rich (containing many parameters and using mixture descriptions); second, object appearance varies greatly within a class, so the model should be flexible (allowing parameters to vary); third, to handle intra-class variation and occlusion, the model should be composed of features, i.e. of parts, which need not be detected in every instance, and whose relative positions constitute further model information; fourth, it is difficult to model a class from prior knowledge alone, so the model is best learned from training samples; fifth, computational efficiency must be considered.
So utilizing the method for machine learning to carry out object class Study of recognition is current a kind of research tendency.Early stage to set up the method limitation of a fixed model to the manual work of certain objects class very big, possibly not be generalized under multiclass object and the different application scene.But it is generally more intense to the study supervision degree of object class identification at present; The requirement that has is cut apart image in advance; The requirement that has is to the rectangle location of target object; The requirement that has is to image type of giving label, and in addition the most weak supervision sample also can require target object in the sample to occupy the center of sample with absolute predominance, and all samples will have same size.The supervision sample to obtain cost very big, this just means and can not obtain a lot of samples so, sample that also can not all types can both get access to, this has just limited the performance learnt and the width of study.
The human visual system can use multiple kinds of information in parallel to recognize objects and can learn a model for each kind of invariance, which is precisely the idea behind ensemble learning. Unsupervised ensemble learning, i.e. cluster ensembles, has developed considerably in recent years, providing a foundation for using ensemble learning to reduce the supervision required for object category recognition. Many interest point detectors exist, but it is difficult to give a correct answer as to which detector is more suitable for the current task, or how each performs. The invention proposes using different detectors to obtain different cues from an image. Different visual dictionaries are built on the interest points found by different detectors. Based on the different visual dictionaries, the same training image set can be quantized into different training vector sets, which capture different aspects of the image information; different member classifiers can then be learned on the different training vector sets. When these classifiers, each having learned a different aspect of the object model, classify a new image, each member classifier gives its own answer, and integrating them yields a performance gain.
The main contribution of this invention is a method for object category recognition based on unsupervised ensemble learning. The invention can effectively reduce the degree of supervision required for object category recognition, make full use of multiple kinds of effective information, and learn object models in parallel, effectively improving the efficiency and accuracy of object category recognition.
Summary of the invention
To solve the problems of overly complex models, excessive supervision and poor robustness in traditional object category recognition, the invention provides a method that uses a dictionary ensemble to exploit in parallel the multiple kinds of object identification information present in an image.
The invention is a visual dictionary method. It comprises extracting interest points (also called salient regions) from an image, describing them with local descriptors, and labeling the described interest point vectors with a learned visual dictionary. As in text classification, counting the occurrences of each label generates a global histogram that represents the image content. The histogram is input to a classifier to recognize the object category in the image. The visual dictionary is obtained by clustering the descriptor vectors of the training data's interest points. Image classification is particularly difficult for conventional machine learning algorithms, mainly because images contain too much information and the dimensionality is too high; high dimensionality makes conventional learning methods produce very unstable models with poor generalization. The invention applies ensemble learning to image classification. Different interest point detectors are used to form a visual dictionary ensemble, from which different quantized vector sets of the same training dataset can be obtained. On these quantized training sets, which capture different aspects of the features, different classifiers can be trained, yielding a classifier ensemble in which each classifier builds an object model from different information. Using the learned classifier ensemble to recognize new images can give remarkably good results. Ensemble methods improve on existing learning algorithms by combining the predictions of multiple models. In a good ensemble, the diversity among members should be large: if the members are identical, integrating them brings no performance gain. Diversity among members is therefore a key factor determining the generalization error of ensemble learning. The invention proposes a technique for generating a diverse visual dictionary ensemble and for generating a corresponding classifier ensemble based on it.
The content of the invention is set forth as follows:
1. Generating a visual dictionary ensemble containing rich shape, edge-contour and grayscale information with different interest point detectors
The construction of the visual dictionary ensemble is unsupervised; the class labels of the samples are used only when training the classifiers. Inspired by human perception, the motivation of the invention is to exploit multiple available cues in parallel to classify images. Just as humans often use different information to recognize objects, the invention uses different interest point detectors to extract different image information. Interest points carrying rich shape, edge-contour and grayscale information are extracted with different detectors, forming different representation vectors of the image. From the different interest point sets, a visual dictionary ensemble is obtained, each member exploiting a different image characteristic. To increase the diversity of the generated ensemble, when forming a member visual dictionary a subset of images is first selected at random from the training set; after a particular interest point detector has produced all interest points on these images, a random subset of those points is selected to form the visual dictionary. With the visual dictionary ensemble, different quantized vectors of the same image can be obtained.
The process of this method is as follows:
1) extract interest points with a different interest point detector;
2) cluster the described interest points with a clustering algorithm to obtain a visual dictionary;
3) repeat steps 1) to 2) until a visual dictionary ensemble of the preset size has been generated.
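The steps above can be sketched as follows. This is a toy sketch under stated assumptions: each "detector" is modeled as a function returning an array of descriptors, and `build_dictionary_ensemble` is a hypothetical helper name; real detectors (Harris, SUSAN, LoG, …) are stood in for by random data.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_dictionary_ensemble(images, detectors, dict_size=5, image_frac=0.6,
                              points_per_image=20, seed=0):
    """One member dictionary per detector: cluster a random subset of the
    interest point descriptors found on a random subset of the training images."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for detect in detectors:                    # step 1): a different detector each time
        subset = rng.choice(len(images),
                            size=max(1, int(image_frac * len(images))), replace=False)
        descs = []
        for i in subset:
            d = detect(images[i])               # descriptors of the detected interest points
            keep = rng.choice(len(d), size=min(points_per_image, len(d)), replace=False)
            descs.append(d[keep])               # random point subset, for ensemble diversity
        # step 2): cluster the described interest points into a visual dictionary
        ensemble.append(KMeans(n_clusters=dict_size, n_init=5, random_state=seed)
                        .fit(np.vstack(descs)))
    return ensemble                             # step 3): one member per detector

# Toy data: random 16-D "descriptors"; each fake detector shifts them differently,
# standing in for Harris, SUSAN, LoG, etc.
data_rng = np.random.default_rng(1)
images = [data_rng.normal(size=(50, 16)) for _ in range(10)]
detectors = [lambda img, s=s: img + s for s in (0.0, 0.5, 1.0)]
ensemble = build_dictionary_ensemble(images, detectors)
print(len(ensemble))  # 3
```

The random image subset and random point subset implement the diversity mechanism the text describes: even with the same detector, two members would see different data.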
Experimental results show that the method can fuse the interest point information found by the different detectors and capture the characteristics and information of different aspects of an image. Representing images with the visual dictionary ensemble gives better recognition performance than traditional image representations based on a single visual dictionary.
2. Fusing the different image characteristics found by different interest point detectors to generate a classifier ensemble
After the dictionary ensemble has been generated with different interest point detectors, a differently quantized training dataset can be obtained from each member dictionary. Training different classifiers on the quantized training datasets that fuse different information yields a classifier ensemble. Each member classifier builds its model from a different aspect of the object's features. By constructing a diverse visual dictionary ensemble, a classifier ensemble with high diversity can be obtained, and an ensemble with high diversity can effectively reduce the supervision needed to build an accurate model. The invention classifies images by exploiting in parallel the different image characteristics detected by the different detectors, using different visual dictionaries to represent different aspects of the images. Different quantized vector sets of the training dataset are obtained from the resulting visual dictionary ensemble; classifiers learned on these different quantized vector sets of the same training dataset form a classifier ensemble whose different models capture different characteristics. The concrete steps are as follows:
1) generate a visual dictionary ensemble, each member dictionary fusing the different image characteristics found by different detectors;
2) quantize the training data with a member visual dictionary;
3) learn a classifier on the quantized training dataset;
4) repeat steps 2) to 3) to generate a classifier ensemble of the preset size.
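The four steps above can be sketched as follows, with a linear SVM as the member classifier (as in the preferred embodiment below). All data here is synthetic, and `quantize` is a hypothetical helper name, so this is an illustrative sketch rather than the patent's implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)

def quantize(descs, km):
    """Bag-of-words histogram of one image under one member dictionary."""
    h = np.bincount(km.predict(descs), minlength=km.n_clusters).astype(float)
    return h / h.sum()

# Synthetic two-class training set: 20 images, each a set of 8-D descriptors.
labels = np.array([0] * 10 + [1] * 10)
images = [rng.normal(loc=y, size=(40, 8)) for y in labels]

classifier_ensemble = []
for seed in range(3):                                   # one member per "detector"
    km = KMeans(n_clusters=6, n_init=5, random_state=seed).fit(
        np.vstack(images))                              # step 1): member dictionary
    X = np.array([quantize(d, km) for d in images])     # step 2): quantize training data
    clf = LinearSVC(random_state=seed).fit(X, labels)   # step 3): learn a classifier
    classifier_ensemble.append((km, clf))               # step 4): repeat to preset size

print(len(classifier_ensemble))  # 3
```

Each `(dictionary, classifier)` pair is one member; because the dictionaries differ, the members see different quantizations of the same training images.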
3. Integrating the visual dictionary ensemble and the corresponding classifier ensemble to recognize object categories
The member visual dictionaries and the corresponding member classifiers are independent and can be trained in parallel. After the classifier ensemble based on the visual dictionary ensemble has been formed, classifying a new test image likewise goes through interest point extraction and description, quantization of the image into quantization vectors, and application of the learned models. The classification results of the classifier ensemble are integrated, and the integrated result is output to classify the image. The concrete steps are as follows:
1) detect interest points in the new image with the different detectors, and describe these interest points with descriptors;
2) quantize the new image with the corresponding member visual dictionary;
3) classify the new image with the corresponding member classifier to obtain a classification result;
4) repeat steps 2) to 3) until every member classifier has produced its own classification result;
5) integrate the member classifiers' results with an ensemble technique to obtain the final object category label.
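At recognition time the steps above reduce to "each member answers, then the answers are integrated". The sketch below uses a simple majority vote as the integration step; the patent's preferred embodiment uses the CSPA consensus function instead, and `MockMember` is a hypothetical stand-in for a real (dictionary, classifier) pair.

```python
from collections import Counter

class MockMember:
    """Stand-in for one (member dictionary, member classifier) pair."""
    def __init__(self, label):
        self.label = label
    def classify(self, image):
        # A real member would quantize the image with its dictionary,
        # then run its classifier; here we just return a fixed answer.
        return self.label

def recognize(image, members):
    votes = [m.classify(image) for m in members]   # steps 1)-4): each member answers
    return Counter(votes).most_common(1)[0][0]     # step 5): integrate by majority vote

members = [MockMember(l) for l in ("cat", "cat", "dog")]
print(recognize(None, members))  # cat
```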
In summary, the method of the invention first uses different interest point detectors to detect interest points carrying information on different aspects of the training images; the described interest point set is then clustered to obtain a visual dictionary that characterizes one kind of image information. The original training image set is quantized with this visual dictionary, yielding different quantized vector sets, and on each vector set a model is trained that classifies objects according to that specific information. This process runs in parallel, each processor using a different interest point detector to capture different image information and learn a model of the object, as shown in Fig. 1. After the interest points of a new image have been extracted, the members of the visual dictionary ensemble quantize the image in parallel; the corresponding member classifiers then perform recognition, and finally the recognition results of all member classifiers are integrated to give the final recognition result, as shown in Fig. 2.
The invention recognizes objects by generating a visual dictionary ensemble that can express multiple aspects of an object. Compared with object category recognition methods based on a single visual dictionary, the method has advantages such as strong robustness, simple implementation and good average performance. The method fuses the interest point information found by different detectors into separate visual dictionaries, captures the characteristics and information of different aspects of an image, and generates the classifier ensemble in parallel, reducing the complexity of learning; the invention can therefore also effectively improve computational efficiency, reduce the consumption of computing resources, and recognize objects quickly and accurately.
The invention has good average performance on datasets from different fields, is robust, and uses a simple model, making it well suited to general practitioners. It needs no complex parameter tuning, requires a low degree of supervision, and places few requirements on the training data. Exploiting the inherent parallelism of ensemble learning, it can learn in parallel on multiple processors from a small amount of training data, so the invention is also relatively efficient.
Description of drawings
Fig. 1 is an illustration of the invention.
Fig. 2 illustrates classifying a new image with the learned visual dictionary ensemble and classifier ensemble.
Embodiment
A preferred embodiment of the invention:
Each image is resized so that it contains approximately 40,000 pixels (preserving the aspect ratio). Because the SIFT descriptor is the most popular and effective descriptor, and most existing related work uses 128-dimensional SIFT vectors to describe interest points, the preferred embodiment also uses it to describe interest points. Each time, 60% of the images are selected to form a new training subset. From each image, 60 interest points are selected at random, and k-means is used to construct a member visual dictionary. Because of the inherent randomness of the k-means algorithm, forming different member dictionaries is equivalent to using different clusterers. In most research on the bag-of-words model, the visual dictionary size is between 100 and 1000, so this parameter is set to the intermediate value 500. A linear SVM (Support Vector Machine) is trained as a classifier on the quantized vector set of each member dictionary. This process is iterated 9 times, forming a classifier ensemble of size 9. When a new image is tested, the classifier ensemble classifies the image, and the consensus function CSPA is used to integrate the ensemble's results. Based on the classifier ensemble, CSPA computes for each pair of images the probability that they belong to the same class, thereby building a similarity matrix.
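The similarity matrix at the heart of the CSPA step can be illustrated as a co-association matrix: each entry is the fraction of ensemble members that place the two images in the same class. This is a minimal sketch of that one idea (the subsequent graph-partitioning stage of full CSPA is omitted), and the toy labelings are invented for illustration.

```python
import numpy as np

# One labeling of 4 test images per ensemble member (3 members), rows = members.
member_labels = np.array([[0, 0, 1, 1],
                          [0, 0, 1, 0],
                          [0, 1, 1, 1]])

def co_association(member_labels):
    """S[i, j] = fraction of members assigning images i and j the same class."""
    m, n = member_labels.shape
    S = np.zeros((n, n))
    for labels in member_labels:
        S += (labels[:, None] == labels[None, :])   # pairwise same-class indicator
    return S / m

S = co_association(member_labels)
print(S[0, 1])  # 2 of 3 members put images 0 and 1 in the same class
```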
To detect different interest points, the following 9 different interest point detectors are used to extract different information from the image, so an ensemble of size 9 is obtained:
1) the Harris interest point detector;
2) the SUSAN interest point detector;
3) the LoG interest point detector;
4) the Harris-Laplace interest point detector;
5) the Gilles interest point detector;
6) the SIFT interest point detector with parameter PeakThresh=5;
7) the SIFT interest point detector with parameter PeakThresh=0;
8) 100 randomly selected circular regions with radii of 10 to 30 pixels;
9) 500 randomly selected circular regions with radii of 10 to 30 pixels.
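Detectors 8) and 9) involve no feature detection at all: they simply sample circular regions with random centers and radii. A sketch follows; the patent only fixes the radius range and count, so the center-placement scheme (uniform, fully inside the image) and the function name are assumptions.

```python
import random

def random_circular_regions(width, height, n, r_min=10, r_max=30, seed=0):
    """Sample n circles (cx, cy, r) with radius in [r_min, r_max] pixels,
    kept fully inside the image so each region can be described."""
    rng = random.Random(seed)
    regions = []
    while len(regions) < n:
        r = rng.randint(r_min, r_max)
        cx = rng.randint(r, width - r)   # keep the circle inside the image
        cy = rng.randint(r, height - r)
        regions.append((cx, cy, r))
    return regions

# Detector 8): 100 circles; detector 9) would use n=500.
regions = random_circular_regions(256, 160, n=100)
print(len(regions))  # 100
```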
Experimental results show that the preferred embodiment of the invention performs better than traditional recognition methods based on a single visual dictionary, and even surpasses the performance of some complex models with carefully tuned parameters.

Claims (4)

1. An object category recognition method combining multiple interest point detectors, characterized in that different interest point detectors are used to extract interest points carrying rich shape, edge-contour and grayscale information and to form a visual dictionary ensemble, the concrete steps being as follows:
1) extract interest points with a different interest point detector;
2) cluster the described interest points with a clustering algorithm to obtain a visual dictionary;
3) repeat steps 1) to 2) until a visual dictionary ensemble of the preset size has been generated.
2. The method according to claim 1, characterized in that the image's interest points are detected with the following 9 different interest point detectors:
1) the Harris interest point detector;
2) the SUSAN interest point detector;
3) the LoG interest point detector;
4) the Harris-Laplace interest point detector;
5) the Gilles interest point detector;
6) the SIFT interest point detector with parameter PeakThresh=5;
7) the SIFT interest point detector with parameter PeakThresh=0;
8) 100 randomly selected circular regions with radii of 10 to 30 pixels;
9) 500 randomly selected circular regions with radii of 10 to 30 pixels.
3. The method according to claim 2, characterized in that the different image characteristics found by the different detectors are fused to generate a classifier ensemble, the concrete steps being as follows:
1) generate a visual dictionary ensemble, each member dictionary fusing the different image characteristics found by different detectors;
2) quantize the training data with a member visual dictionary;
3) learn a classifier on the quantized training dataset;
4) repeat steps 2) to 3) to generate a classifier ensemble of the preset size.
4. The method according to claim 3, characterized in that the visual dictionary ensemble and the corresponding classifier ensemble are integrated to recognize object categories, the concrete steps being as follows:
1) detect interest points in the new image with the different detectors, and describe these interest points with descriptors;
2) quantize the new image with the corresponding member visual dictionary;
3) classify the new image with the corresponding member classifier to obtain a classification result;
4) repeat steps 2) to 3) until every member classifier has produced its own classification result;
5) integrate the member classifiers' results with an ensemble technique to obtain the final object category label.
CN201210004545.0A 2012-01-09 2012-01-09 Object type identification method combining plurality of interest point testers Expired - Fee Related CN102609715B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210004545.0A CN102609715B (en) 2012-01-09 2012-01-09 Object type identification method combining plurality of interest point testers


Publications (2)

Publication Number Publication Date
CN102609715A true CN102609715A (en) 2012-07-25
CN102609715B CN102609715B (en) 2015-04-08

Family

ID=46527074

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210004545.0A Expired - Fee Related CN102609715B (en) 2012-01-09 2012-01-09 Object type identification method combining plurality of interest point testers

Country Status (1)

Country Link
CN (1) CN102609715B (en)


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1691054A (en) * 2004-04-23 2005-11-02 中国科学院自动化研究所 Content based image recognition method
CN101807259A (en) * 2010-03-25 2010-08-18 复旦大学 Invariance recognition method based on visual vocabulary book collection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Hui-Lan Luo, Hui Wei, Fan-Xing Hu: "Improvements in image categorization using codebook ensembles", Image and Vision Computing *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10825048B2 (en) 2013-05-01 2020-11-03 Cloudsight, Inc. Image processing methods
CN105184212A (en) * 2014-04-04 2015-12-23 卡姆芬德公司 Image processing server
CN108241870A (en) * 2016-12-23 2018-07-03 赫克斯冈技术中心 For distributing specific class method for distinguishing interested in measurement data
CN108241870B (en) * 2016-12-23 2022-01-25 赫克斯冈技术中心 Method for assigning a specific category of interest within measurement data
CN109145936A (en) * 2018-06-20 2019-01-04 北京达佳互联信息技术有限公司 A kind of model optimization method and device
CN109145936B (en) * 2018-06-20 2019-07-09 北京达佳互联信息技术有限公司 A kind of model optimization method and device
CN113837080A (en) * 2021-09-24 2021-12-24 江西理工大学 Small target detection method based on information enhancement and receptive field enhancement
CN113837080B (en) * 2021-09-24 2023-07-25 江西理工大学 Small target detection method based on information enhancement and receptive field enhancement
CN113936738A (en) * 2021-12-14 2022-01-14 鲁东大学 RNA-protein binding site prediction method based on deep convolutional neural network

Also Published As

Publication number Publication date
CN102609715B (en) 2015-04-08

Similar Documents

Publication Publication Date Title
Zhang et al. Slow feature analysis for human action recognition
Bregonzio et al. Fusing appearance and distribution information of interest points for action recognition
Wang et al. Semi-latent dirichlet allocation: A hierarchical model for human action recognition
Peng et al. Exploring Motion Boundary based Sampling and Spatial-Temporal Context Descriptors for Action Recognition.
Liu et al. Depth context: a new descriptor for human activity recognition by using sole depth sequences
CN101807259B (en) Invariance recognition method based on visual vocabulary book collection
Rahmani et al. Discriminative human action classification using locality-constrained linear coding
Nour el houda Slimani et al. Human interaction recognition based on the co-occurence of visual words
CN102609715B (en) Object type identification method combining plurality of interest point testers
CN103854016A (en) Human body behavior classification and identification method and system based on directional common occurrence characteristics
CN104050460B (en) The pedestrian detection method of multiple features fusion
Yingxin et al. A robust hand gesture recognition method via convolutional neural network
Zhang et al. Semantically modeling of object and context for categorization
Song et al. Visual-context boosting for eye detection
Symeonidis et al. Neural attention-driven non-maximum suppression for person detection
Chen et al. Generalized Haar-like features for fast face detection
CN113822134A (en) Instance tracking method, device, equipment and storage medium based on video
Bhattacharya et al. Covariance of motion and appearance featuresfor spatio temporal recognition tasks
Rasel et al. An efficient framework for hand gesture recognition based on histogram of oriented gradients and support vector machine
CN102609718A (en) Method for generating vision dictionary set by combining different clustering algorithms
Aiouez et al. Real-time Arabic Sign Language Recognition based on YOLOv5.
Narang et al. Devanagari character recognition in scene images
Chuang et al. Hand posture recognition and tracking based on bag-of-words for human robot interaction
Li et al. Human Action Recognition Using Multi-Velocity STIPs and Motion Energy Orientation Histogram.
Shebiah et al. Classification of human body parts using histogram of oriented gradients

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20150408

Termination date: 20200109