CN103366181A - Method and device for identifying scene integrated by multi-feature vision codebook - Google Patents


Info

Publication number
CN103366181A
CN103366181A CN2013102689531A CN201310268953A
Authority
CN
China
Prior art keywords
vision
code book
fusion
features
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013102689531A
Other languages
Chinese (zh)
Inventor
覃剑钊
阎镜予
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Security and Surveillance Technology PRC Inc
Original Assignee
China Security and Surveillance Technology PRC Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Security and Surveillance Technology PRC Inc filed Critical China Security and Surveillance Technology PRC Inc
Priority to CN2013102689531A priority Critical patent/CN103366181A/en
Publication of CN103366181A publication Critical patent/CN103366181A/en
Pending legal-status Critical Current

Links

Images

Abstract

The invention provides a method and a device for scene recognition based on multi-feature visual codebook fusion, belonging to the fields of image processing and pattern recognition. The method comprises the following steps: performing multi-feature fusion on local regions of a scene image by means of a local classifier to obtain the multi-feature visual codebook representation of those regions; and performing global fusion and classification on the multi-feature visual codebook representation according to global fusion parameters and classification parameters obtained by prior training. Compared with approaches that first generate single-feature visual codebook representations (or several single-feature representations) from single-feature probability estimates and only then perform global feature fusion, the disclosed method yields more accurate automatic scene classification results.

Description

Scene recognition method and apparatus based on multi-feature visual codebook fusion
Technical field
The present invention relates to video image processing and pattern recognition technology, and in particular to a scene recognition method and apparatus based on multi-feature visual codebook fusion.
Background technology
Image-based scene recognition automatically assigns image data captured by a camera to different scene categories, for example: beach, forest, highway, street, office, bedroom. The technology is applicable to intelligent vehicles and autonomous robot navigation, and image-based scene recognition can also provide useful prior information for other computer vision tasks such as object recognition, object discovery, action classification, image retrieval, and video surveillance.
In recent years, methods based on local features have been widely used in image-based scene recognition. Such methods are insensitive to occlusion, illumination changes, and slight geometric deformation, and are therefore more robust than methods based on global features. Global methods treat the scene image as a whole and extract features from the entire image, for example a color histogram or texture features of the whole image, and then train a classifier on these global features. Local methods extract features from local regions of the scene image and then describe the image as a probability distribution over, or a set of, visual components or visual topics obtained in advance by training. To further improve the performance of scene or object recognition systems, several global multi-feature fusion methods (e.g. multiple kernel learning, linear boosting) have been proposed to fuse the visual codebook representations of several different features of a scene image. These global fusion methods first generate a single-feature visual codebook representation for each individual feature and then use multiple kernel learning or linear boosting to train the fusion and classification parameters for scene recognition. However, global fusion cannot correct errors in the single-feature codebook representations, and those errors propagate into the global feature fusion.
Summary of the invention
In view of this, the technical problem to be solved by the present invention is to provide a scene recognition method and apparatus based on multi-feature visual codebook fusion, which performs multi-feature fusion on local regions of the scene image, thereby correcting errors in the single-feature visual codebook representations, obtaining more accurate codebook representations for the subsequent global fusion, and improving the accuracy of scene recognition.
The technical solution adopted by the present invention to solve the above technical problem is as follows:
According to one aspect of the present invention, a scene recognition method based on multi-feature visual codebook fusion is provided, comprising:
performing multi-feature fusion on local regions of a scene image by means of a local classifier to obtain the multi-feature visual codebook representation of the local regions; and
performing global fusion and classification on the multi-feature visual codebook representation according to global fusion parameters and classification parameters obtained by prior training.
Preferably, performing multi-feature fusion on local image regions by means of a local classifier to obtain their multi-feature visual codebook representation specifically comprises:
uniformly extracting mutually overlapping local image patches at multiple scales from the scene image;
extracting several kinds of features from each local patch; and
performing feature fusion on the local patches of the scene image using the multi-feature visual codebook obtained by prior training, and generating the visual codebook representations of the scene image under the various features.
Preferably, performing feature fusion on the local patches of the scene image using the multi-feature visual codebook obtained by prior training, and generating the visual codebook representations of the scene image under the various features, specifically comprises:
for each kind of feature of a local region, first selecting candidate visual words with a simple classifier and then computing, with a complex classifier, the probability that the local region's feature belongs to each candidate visual word; and
generating the multi-feature visual codebook representation after local feature fusion from the probabilities of each kind of feature.
Preferably, performing global fusion and classification on the multi-feature visual codebook representation according to the global fusion parameters and classification parameters obtained by prior training comprises:
computing the posterior probability that the scene image belongs to each scene category, and selecting the category with the largest posterior probability as the classification result; or
determining the classification result from which side of a decision boundary the scene image falls on.
Preferably, the several kinds of features comprise: histogram of oriented gradients (HOG) features, structured local binary pattern (LBP) features, color features, or structured color features.
Preferably, the method further comprises, beforehand, a step of training on sample images to obtain the multi-feature visual codebook for the local classifier, which specifically comprises:
generating a training data set from sample images whose categories have been manually labeled;
uniformly extracting mutually overlapping local sample patches at multiple scales from the sample images in the training data set;
extracting several kinds of features from each local sample patch;
clustering the local sample features of each kind belonging to each scene category separately, generating a series of visual words; and
grouping the visual words of the different features into separate sets, generating the visual codebook corresponding to each feature.
Preferably, after the visual words of the different features are grouped into the corresponding codebooks, the method further comprises a step of obtaining the global fusion parameters and classification parameters, specifically:
performing feature fusion on the local regions of the sample images in the training set, generating the representations of the sample images over the different feature codebooks; and
training the global multi-feature fusion, and storing the global fusion parameters and classification parameters.
Preferably, training the global multi-feature fusion and storing the global fusion parameters and classification parameters specifically comprises: concatenating the feature vectors of the multi-feature visual codebook representations and using a classifier to compute the fusion parameters and classification parameters;
or computing a kernel matrix from the feature vectors of each visual codebook, and learning the weighting parameters and classification parameters of the kernel matrices by multiple kernel learning;
or training an independent classifier on the feature vectors of each visual codebook, and learning a weighting parameter for each classifier.
According to another aspect of the present invention, a scene recognition apparatus based on multi-feature visual codebook fusion is provided, comprising:
a local fusion module, configured to perform multi-feature fusion on local regions of a scene image by means of a local classifier, obtaining the multi-feature visual codebook representation of the local regions; and
a global fusion module, configured to perform global fusion and classification on the multi-feature visual codebook representation according to global fusion parameters and classification parameters obtained by prior training.
Preferably, the local fusion module comprises:
a local patch acquisition unit, configured to uniformly extract mutually overlapping local image patches at multiple scales from the scene image;
a feature extraction unit, configured to extract several kinds of features from each local patch; and
a visual codebook representation generation unit, configured to perform feature fusion on the local patches of the scene image using the multi-feature visual codebook obtained by prior training, and to generate the visual codebook representations of the scene image under the various features.
Preferably, the visual codebook representation generation unit comprises:
a probability calculation subunit, configured to, for each kind of feature of a local region, first select candidate visual words with a simple classifier and then compute, with a complex classifier, the probability that the local region's feature belongs to each candidate visual word; and
a visual codebook representation calculation subunit, configured to generate the multi-feature visual codebook representation after local feature fusion from the probabilities of each kind of feature.
Preferably, the apparatus further comprises a training module, configured to learn from manually labeled sample images, obtaining the multi-feature visual codebook by performing local fusion on the sample images, and obtaining the global fusion parameters and classification parameters by performing global fusion on the sample images.
In the method and apparatus of the embodiments of the invention, several features are extracted from each local image region, and a classifier trained on local regions estimates the probability that each local patch belongs to the candidate visual words; the resulting per-feature visual codebook representations are then fused globally for the final decision. Compared with estimating probabilities from a single feature and generating a single-feature codebook representation (or several single-feature representations) before global feature fusion, this approach can correct errors caused by the limited information in any single feature, producing more accurate codebook representations; passing these more accurate multi-feature representations through the global fusion decision improves the accuracy of the final scene recognition.
Description of drawings
Fig. 1 is a flowchart of a scene recognition method based on multi-feature visual codebook fusion provided by an embodiment of the invention;
Fig. 2 is a flowchart of a method for local multi-feature fusion provided by a preferred embodiment of the invention;
Fig. 3 is an example of extracting local image patches at multiple scales provided by a preferred embodiment of the invention;
Fig. 4 is a flowchart of a method for training the multi-feature codebook provided by an embodiment of the invention;
Fig. 5 is a flowchart of another method for local multi-feature fusion provided by a preferred embodiment of the invention;
Fig. 6 is a block diagram of a scene recognition apparatus based on multi-feature visual codebook fusion provided by an embodiment of the invention.
Embodiment
To make the technical problem to be solved, the technical solution, and the beneficial effects of the present invention clearer, the invention is further elaborated below in conjunction with the drawings and embodiments. It should be understood that the specific embodiments described herein only serve to explain the present invention and are not intended to limit it.
As shown in Fig. 1, a scene recognition method based on multi-feature visual codebook fusion provided by an embodiment of the invention comprises:
S102: performing multi-feature fusion on local regions of a scene image by means of a local classifier, obtaining the multi-feature visual codebook representation of the local regions.
Referring to Fig. 2, this step may further comprise:
S1021: uniformly extracting mutually overlapping local image patches at multiple scales from the scene image.
Specifically, Fig. 3 shows an example of dividing an image into local patches at multiple scales. At the first scale, the whole image is treated as a single region and features are extracted from it as a whole; at the second scale, the side lengths of each patch are half those of the whole image, and adjacent patches overlap by half a patch; at the third scale, the patch side lengths are half those of the second scale; and so on. This is only one way of obtaining multi-scale patches; where computational resources allow, patches may be extracted at still finer scales to obtain better performance.
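The multi-scale patch layout described above can be sketched as follows. This is a minimal illustration only; the 64x64 image size, the three levels, and the exact stepping are assumptions for the example, not values prescribed by the patent.

```python
import numpy as np

def pyramid_patches(image, levels=3):
    """Extract overlapping patches at multiple scales: level 0 is the
    whole image; at each further level the patch side is halved and
    adjacent patches overlap by half a patch."""
    h, w = image.shape[:2]
    patches = []
    for level in range(levels):
        ph, pw = h // (2 ** level), w // (2 ** level)
        step_h, step_w = max(ph // 2, 1), max(pw // 2, 1)
        for y in range(0, h - ph + 1, step_h):
            for x in range(0, w - pw + 1, step_w):
                patches.append(image[y:y + ph, x:x + pw])
    return patches

img = np.zeros((64, 64), dtype=np.uint8)
pats = pyramid_patches(img, levels=3)   # 1 + 9 + 49 = 59 patches
```

For a 64x64 image this yields 1 whole-image patch, 9 half-size patches, and 49 quarter-size patches.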
S1022: extracting several kinds of features from each local patch.
Specifically, the features include, but are not limited to, HOG (histogram of oriented gradients) features, structured LBP (local binary pattern) features, color features, or structured color features.
The HOG extraction process mainly comprises: first dividing the local patch into several equal cells; then computing the gradient direction and magnitude at each pixel within each cell; then computing the histogram of gradient orientations of each cell; and finally concatenating the per-cell gradient histograms to obtain the HOG feature.
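A minimal sketch of this per-cell gradient-histogram procedure follows. The 2x2 cell grid and 9 orientation bins are illustrative assumptions, not the patent's prescribed parameters.

```python
import numpy as np

def hog_feature(patch, cells=(2, 2), bins=9):
    """Minimal HOG sketch: split the patch into equal cells, compute the
    gradient direction and magnitude at each pixel, build a
    magnitude-weighted orientation histogram per cell, and concatenate."""
    patch = patch.astype(np.float64)
    gy, gx = np.gradient(patch)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)   # unsigned gradient direction
    ch, cw = patch.shape[0] // cells[0], patch.shape[1] // cells[1]
    feats = []
    for i in range(cells[0]):
        for j in range(cells[1]):
            m = mag[i*ch:(i+1)*ch, j*cw:(j+1)*cw].ravel()
            a = ang[i*ch:(i+1)*ch, j*cw:(j+1)*cw].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, np.pi), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

feat = hog_feature(np.arange(256, dtype=float).reshape(16, 16))
```

With a 2x2 grid and 9 bins the descriptor has 36 dimensions.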
The structured LBP extraction process mainly comprises: first dividing the local patch into several equal cells; then extracting the LBP feature from each cell (i.e. comparing the magnitude of each pixel with each of its neighbors to generate a binary code, then computing a histogram of the codes over the cell); and finally concatenating the per-cell LBP features to obtain the structured LBP feature.
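The compare-with-neighbors coding and per-cell histogramming can be sketched as below. The 8-neighbor coding and 2x2 cell grid are common choices assumed here for illustration.

```python
import numpy as np

def lbp_codes(patch):
    """8-neighbour LBP: compare each interior pixel with its neighbours
    and pack the comparison bits into a code in [0, 255]."""
    p = patch.astype(np.int32)
    c = p[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(shifts):
        n = p[1 + dy:p.shape[0] - 1 + dy, 1 + dx:p.shape[1] - 1 + dx]
        codes |= ((n >= c).astype(np.int32) << bit)
    return codes

def structured_lbp(patch, cells=(2, 2)):
    """Structured LBP sketch: per-cell 256-bin histograms of the LBP
    codes, concatenated into one vector."""
    codes = lbp_codes(patch)
    ch, cw = codes.shape[0] // cells[0], codes.shape[1] // cells[1]
    feats = []
    for i in range(cells[0]):
        for j in range(cells[1]):
            block = codes[i*ch:(i+1)*ch, j*cw:(j+1)*cw]
            hist, _ = np.histogram(block, bins=256, range=(0, 256))
            feats.append(hist)
    return np.concatenate(feats)

rep = structured_lbp(np.arange(256).reshape(16, 16))
```

For a 16x16 patch the interior is 14x14, so the four cell histograms together count 196 codes.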
The structured color feature extraction process mainly comprises: first dividing the local patch into several equal cells; then extracting a color histogram from each cell; and finally concatenating the histograms to obtain the structured color feature.
S1023: performing feature fusion on the local patches of the scene image using the multi-feature visual codebook obtained by prior training, and generating the visual codebook representations of the scene image under the various features.
Preferably, this step further comprises: for each kind of feature of a local region, first selecting candidate visual words with a simple classifier (e.g. a Euclidean-distance, chi-square-distance, or Bhattacharyya-distance classifier), then computing with a complex classifier (such as a support vector machine) the probability that the local region's feature belongs to each candidate visual word; and generating the multi-feature visual codebook representation after local feature fusion from the probabilities of each kind of feature. The detailed steps are given in Fig. 5 and its explanation below.
S104: performing global fusion and classification on the multi-feature visual codebook representation according to the global fusion parameters and classification parameters obtained by prior training.
If a classifier based on a statistical model is used, the posterior probability that the sample to be recognized belongs to each scene category can be computed, and the category with the largest posterior probability is selected as the classification result. For a classifier based on a decision boundary, the classification result is determined by computing which side of the boundary the sample's features fall on.
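The posterior-probability decision rule amounts to a simple argmax over class probabilities, as sketched below. The class list and probability values are purely illustrative, not outputs of the patent's trained model.

```python
import numpy as np

# Hypothetical posteriors from a statistical classifier for one image
# over the example scene categories named earlier in the document.
classes = ["beach", "forest", "highway", "street", "office", "bedroom"]
posteriors = np.array([0.05, 0.10, 0.55, 0.20, 0.06, 0.04])

# Select the category with the largest posterior probability.
result = classes[int(np.argmax(posteriors))]
```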
In the method of this embodiment, while generating the multi-feature visual codebook representation, several features are extracted from each local image region and a local classifier is trained to estimate the probability that the patch belongs to each candidate visual word. Compared with estimating the probability from a single feature, this local multi-feature fusion can correct errors caused by the limited information in any single feature, and therefore generates more accurate codebook representations. After these more accurate multi-feature representations are generated, they are passed through the global feature fusion to obtain the final recognition result. Compared with using a single codebook representation, or generating single-feature codebook representations for several features and only then fusing them globally, the method of the invention achieves higher recognition accuracy.
Fig. 4 is a flowchart of a method for training the multi-feature codebook provided by an embodiment of the invention, comprising:
S402: generating a training data set from sample images whose categories have been manually labeled.
Specifically, the collected sample images include, but are not limited to, scene images taken manually or downloaded from the internet. In general, about 200 to 300 training samples are needed for each scene class; for indoor scenes with large variations in viewpoint and content, more training samples are needed. The training images are manually labeled with their categories to generate the training data set.
S404: uniformly extracting mutually overlapping local sample patches at multiple scales from the sample images in the training data set.
This step is identical to S1021 above and is not repeated here.
S406: extracting several kinds of features from each local sample patch.
S408: clustering the local features of each kind belonging to each scene category separately, generating a series of visual words.
Specifically, each cluster center is the feature representation of one visual word. The clustering method includes, but is not limited to, K-means clustering, hierarchical clustering, fuzzy K-means clustering, simulated-annealing clustering, and so on.
K-means is a commonly used clustering method: given a cluster count K, it randomly generates K cluster centers and then iteratively updates the centers and the assignment of feature vectors, dividing the feature vectors into K clusters. The other methods are not described in detail here.
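A plain K-means sketch, where the final centroids play the role of the visual words, could look as follows. The fixed iteration count and random initialization are simplifying assumptions.

```python
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Plain K-means: randomly pick k initial centers, then alternately
    assign each feature vector to its nearest center and recompute each
    center as the mean of its assigned vectors. The final centers are
    the visual words."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return centers, labels

# Two well-separated synthetic clusters of local features.
feats = np.vstack([np.zeros((10, 2)), np.full((10, 2), 5.0)])
centers, labels = kmeans(feats, k=2)
```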
S410: grouping the visual words of the different features into separate sets, generating the visual codebook corresponding to each feature.
For instance, with N scene categories and M kinds of features, N x M visual codebooks are obtained.
S412: performing feature fusion on the local regions of the sample images in the training set, generating the representations of the sample images over the different feature codebooks.
S414: learning the global multi-feature fusion, and storing the global fusion parameters and classification parameters.
Specifically, this step can be realized in the following ways:
(1) Concatenate the codebook representations of the different features to obtain the full feature vector of each image in the training set, then train a classifier according to the labeled sample classes, obtaining the global fusion parameters and classification parameters at the same time. The concrete classifier training method is introduced in S507.
(2) Compute the kernel matrix of each feature over the training set from its visual codebook representations, then obtain the linear weighting parameters and classification parameters of the kernel matrices by multiple kernel learning. Multiple kernel learning finds the linear weighting coefficients (fusion parameters) and classification parameters of the kernel matrices by minimizing the error rate on the training samples while minimizing the structural risk (which generally means maximizing the margin between the training samples and the decision boundary).
(3) Train an independent classifier on the feature vectors of each visual codebook, then learn the weighting parameter of each classifier. These weighting parameters can be obtained by minimizing the error rate on the training samples.
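Modes (1) and (3) above reduce to simple vector operations once the per-feature representations and per-feature classifier scores exist. The sketch below uses made-up dimensions, scores, and weights purely for illustration; the weights would in practice be learned as described.

```python
import numpy as np

# Mode (1): concatenate hypothetical per-feature codebook representations
# of one image into a single global feature vector for one classifier.
hog_repr = np.array([0.2, 0.7, 0.1])     # HOG codebook representation
lbp_repr = np.array([0.9, 0.0])          # structured-LBP representation
color_repr = np.array([0.4, 0.3, 0.3])   # color codebook representation
global_vec = np.concatenate([hog_repr, lbp_repr, color_repr])

# Mode (3): weighted combination of three per-feature classifiers' scores
# over 4 scene classes (rows: classifiers, columns: classes).
scores = np.array([[0.1, 0.6, 0.2, 0.1],
                   [0.3, 0.4, 0.2, 0.1],
                   [0.2, 0.5, 0.2, 0.1]])
weights = np.array([0.5, 0.2, 0.3])   # learned by minimising training error
fused = weights @ scores
pred = int(np.argmax(fused))          # index of the winning scene class
```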
A typical application of this embodiment is intelligent vehicle navigation. Following the training method above, scene images of different locations (e.g. street 1, street 2, highway 1, highway 2) are collected with a vehicle-mounted camera and manually labeled for training. During navigation, the vehicle-mounted camera continuously collects pictures of the places the vehicle passes, and the recognition method described below can then determine where the vehicle is currently traveling.
Fig. 5 is a flowchart of another method for local multi-feature fusion provided by a preferred embodiment of the invention, comprising:
S501: extracting one kind of feature from each local region of the image.
S502: selecting several candidate visual words from the visual codebook matching this feature type by means of a simple classifier (for example: Euclidean distance, chi-square distance, Bhattacharyya distance, etc.); executing step S503 if other features still need to be extracted, otherwise executing step S505.
Specifically, compute the distance (Euclidean, chi-square, or Bhattacharyya) between the local feature and each visual word in the codebook matching this feature type, find the minimum distance and its corresponding word, then compute the ratio of each other word's distance to the minimum distance, and select as the candidate visual word set the minimum-distance word together with every word whose ratio is below a certain threshold. Here the Euclidean distance is the square root of the sum of squared differences between the elements of two vectors.
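The distance-ratio candidate selection just described can be sketched as follows, using Euclidean distance; the codebook values and the ratio threshold are assumptions for the example.

```python
import numpy as np

def candidate_words(feature, codebook, ratio_threshold):
    """Select candidate visual words: find the nearest word by Euclidean
    distance, then keep every word whose distance is within the given
    ratio of that minimum (the nearest word is always kept)."""
    dists = np.linalg.norm(codebook - feature, axis=1)
    dmin = dists.min()
    return np.flatnonzero(dists <= ratio_threshold * max(dmin, 1e-12))

# Toy 2-D codebook with three visual words; word 2 is far from the query.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
candidates = candidate_words(np.array([0.2, 0.0]), codebook, ratio_threshold=5.0)
```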
S503: extracting the other types of features from the local image region.
S504: concatenating the feature vectors of the local image region; if the value ranges of the elements in the different feature vectors are not consistent, normalizing them and then executing step S508, otherwise executing step S508 directly.
Here normalization means transforming the value range of each element of a feature vector to between 0 and 1.
S505: obtaining the several feature vectors of the local regions that formed the candidate visual words during clustering.
Specifically, for each candidate visual word, find the local image regions that formed that word during the clustering process, then extract the various features from those regions. The feature extraction methods were described above.
S506: concatenating the several feature vectors of each local region; if the value ranges of the elements in the different feature vectors are not consistent, normalizing them.
Each candidate visual word thus yields a set of feature vectors.
S507: training the local classifier used for local fusion.
Specifically, a separate classifier is trained for each local region to solve the feature-fusion problem of that region. The feature vectors of the local regions corresponding to the different candidate visual words serve as the training samples of the different classes (this classifier is called the local classifier).
The classifier completes the classification task by learning a statistical model of the feature vectors of the different classes, or by learning the decision boundaries between them. A support vector machine may be chosen (but is not required), which learns the decision boundary between the classes of feature vectors (or of their linear or nonlinear mappings); this boundary minimizes the structural risk while minimizing the training error.
S508: estimating with the local classifier the probability that the local region belongs to each candidate word.
Specifically, using the feature vector obtained in S504 and the local classifier obtained in S507, estimate the probability that the local image region belongs to each candidate visual word.
For a classifier based on a statistical model, the probabilities of the different candidate visual words can be obtained by computing the posterior probabilities. For a classifier based on support vector machines, the probabilities can be estimated from the distance between the feature vector and the decision boundary.
S509: generating the feature's visual codebook representation from the probabilities of the candidate words.
Specifically, the visual codebook representation of a feature is generated from the probability that each local image region belongs to each candidate visual word. The codebook representation is a feature vector in which each element records the occurrence probability of one visual word.
First, according to the number of visual words N_w of this feature, generate an N_w-dimensional feature vector and set each element to zero. Then, following S508, compute for each local region of the scene image the probabilities of its candidate visual words; if such a probability is greater than the feature-vector element corresponding to that candidate word, update the value of that element.
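The zero-initialize-then-update rule is effectively a max-pooling of probabilities over local regions, as sketched below. The per-region dict layout and the probability values are assumed for illustration.

```python
import numpy as np

def codebook_representation(patch_candidate_probs, n_words):
    """Build the N_w-dimensional codebook representation: start from a
    zero vector and, for every local region, raise each candidate word's
    element to the region's probability whenever it exceeds the stored
    value (max-pooling over regions)."""
    rep = np.zeros(n_words)
    for candidates in patch_candidate_probs:   # one dict per local region
        for word, prob in candidates.items():
            if prob > rep[word]:
                rep[word] = prob
    return rep

# Hypothetical candidate-word probabilities for two local regions
# over a 5-word codebook.
rep = codebook_representation(
    [{0: 0.7, 2: 0.2}, {2: 0.6, 4: 0.1}], n_words=5)
```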
Repeating S501 to S509 for the different kinds of features generates the visual codebook representations of the different features after local feature fusion.
The scene Recognition apparatus module structural drawing that a kind of many features vision code book that being illustrated in figure 6 as the embodiment of the invention provides merges, this device comprises: training module 10, local Fusion Module 20 and overall Fusion Module 30, wherein:
Training module 10 is used for study and manually carries out the sample image that classification is demarcated, and obtains many features vision code book by sample image being carried out the part fusion, obtains overall fusion parameters and sorting parameter by sample image being carried out overall situation fusion.
Specifically, training module 10 is used for training module and is used for the sample image generating training data collection that basis is manually carried out the classification demarcation; The sample image of concentrating from training data obtains the fractional sample image that overlaps each other under the multiple yardstick uniformly; From each fractional sample image, extract various features; The various fractional sample characteristics of image that belong to different scene classifications are carried out respectively cluster, generate a series of vision words; Different set put in the vision word of different characteristic generate the corresponding vision code book of each feature, training module also is used for: the regional area to the training set sample image carries out Fusion Features, generates the expression of sample image on different characteristic vision code book; Many Fusion Features of the training overall situation, and store overall fusion parameters and sorting parameter.
Local Fusion Module 20 is used for by local classifiers the scene image regional area being carried out many Fusion Features, and the many features vision code book that obtains the scene image regional area is expressed;
Further, local Fusion Module 20 comprises:
Topography's acquiring unit 201 is used for obtaining uniformly topography overlapped under the multiple yardstick from scene image;
Feature extraction unit 202 is used for extracting various features from each topography;
The vision code book is expressed generation unit 203, is used for many features vision code book of obtaining by training in advance the topography of scene image is carried out Fusion Features, and the vision code book of generating scene image under various different characteristics expressed.
Further, vision code book expression generation unit 203 comprises:
Probability calculation subelement 2031 is used for every kind of feature of localized region, uses first the simple classification device to choose candidate's vision word, then uses complex classifier to calculate the probability that local features belongs to candidate's vision word;
The vision code book is expressed computation subunit 2032, is used for expressing according to the many features vision code book behind the probability generation Local Feature Fusion of every kind of feature.
Overall situation Fusion Module 30 is used for the overall fusion parameters that obtains according to training in advance and sorting parameter and many features vision code book is expressed carries out the overall situation and merge and classify.
It should be noted that the technical solution of the foregoing scene recognition method with multi-feature visual codebook fusion can be implemented by the device of this embodiment, and is not repeated here.
In the method and device of the embodiments of the invention, multiple features are extracted from local regions of an image, and classifiers trained on the local regions estimate the probabilities that a local image belongs to candidate visual words; a plurality of feature visual codebook expressions are generated and then fused globally for the final decision. Compared with estimating probabilities from a single feature and generating a single-feature visual codebook expression, or generating separate single-feature visual codebook expressions for multiple features and only then performing global feature fusion, this approach corrects errors caused by insufficient information in any one feature, thereby generating more accurate visual codebook expressions; passing these more accurate multi-feature visual codebook expressions through the global feature fusion decision improves the accuracy of the final scene recognition.
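As an illustration of the two-stage local soft assignment summarized above, the sketch below shortlists candidate visual words with a cheap nearest-centroid search (standing in for the simple classifier) and weights them with a softmax over distances (standing in for the complex classifier). Both stand-ins and the `n_candidates` parameter are assumptions for illustration, not details from the patent:

```python
# Minimal sketch of two-stage soft assignment of local descriptors to
# visual words, accumulated into a normalized codebook expression.
# The nearest-centroid shortlist and softmax weighting are assumed
# stand-ins for the patent's "simple" and "complex" classifiers.
import numpy as np

def codebook_expression(descriptors, codebook, n_candidates=3):
    """Soft-assign each local descriptor to its candidate visual words
    and accumulate the probabilities into a normalized histogram."""
    hist = np.zeros(len(codebook))
    for d in descriptors:
        dist = np.linalg.norm(codebook - d, axis=1)
        cand = np.argsort(dist)[:n_candidates]   # stage 1: shortlist candidates
        scores = np.exp(-dist[cand])             # stage 2: probability per candidate
        hist[cand] += scores / scores.sum()
    return hist / max(hist.sum(), 1e-12)
```

One such expression is produced per feature type; the global fusion module then combines the per-feature expressions.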
The preferred embodiments of the present invention have been described above with reference to the accompanying drawings, but they do not limit the scope of protection of the present invention. Those skilled in the art may implement the present invention through various variants without departing from the scope and spirit of the present invention; for example, a feature of one embodiment may be used in another embodiment to obtain a further embodiment. Any modification, equivalent replacement, or improvement made within the technical concept of the present invention shall fall within the scope of protection of the present invention.
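For concreteness, the global fusion decision can be sketched as a weighted combination of per-feature classifier posteriors followed by a maximum-posterior decision (the classifier-ensemble alternative of the global fusion combined with the posterior-probability rule). The weights and toy posteriors below are illustrative assumptions:

```python
# Hedged sketch of the global fusion/decision stage: per-feature codebook
# expressions are scored by independent per-feature classifiers, and the
# resulting posteriors are combined with learned weights. The weights and
# toy posterior values are illustrative assumptions.
import numpy as np

def fuse_and_classify(posteriors_per_feature, weights):
    """posteriors_per_feature: list of (n_classes,) arrays, one per feature.
    weights: learned reliability weight of each feature's classifier."""
    fused = sum(w * p for w, p in zip(weights, posteriors_per_feature))
    fused = fused / fused.sum()            # renormalize into a posterior
    return int(np.argmax(fused)), fused    # maximum-posterior decision

# e.g. a shape-based classifier favours class 1, a color-based classifier
# favours class 0, but the shape classifier carries more learned weight:
label, fused = fuse_and_classify(
    [np.array([0.2, 0.8]), np.array([0.6, 0.4])], weights=[0.7, 0.3])
```

The same interface could host the other two alternatives (feature-vector concatenation before a single classifier, or multiple kernel learning over per-codebook kernel matrices) by changing how the per-feature scores are produced.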

Claims (12)

1. A scene recognition method with multi-feature visual codebook fusion, characterized in that the method comprises:
performing multi-feature fusion on local regions of a scene image through local classifiers, to obtain a multi-feature visual codebook expression of the local regions of the scene image;
performing global fusion and classification on the multi-feature visual codebook expression according to global fusion parameters and classification parameters obtained through pre-training.
2. The scene recognition method according to claim 1, characterized in that performing multi-feature fusion on the local regions of the image through the local classifiers, to obtain the multi-feature visual codebook expression of the local regions of the image, comprises:
uniformly sampling mutually overlapping local images at multiple scales from the scene image;
extracting multiple kinds of features from each local image;
performing feature fusion on the local images of the scene image by using multi-feature visual codebooks obtained through pre-training, to generate the expressions of the scene image under the various features.
3. The scene recognition method according to claim 2, characterized in that performing feature fusion on the local images of the scene image by using the multi-feature visual codebooks obtained through pre-training, to generate the expressions of the scene image under the various features, comprises:
for each kind of feature in the local region, first selecting candidate visual words with a simple classifier, and then calculating, with a complex classifier, the probability that the local-region feature belongs to each candidate visual word;
generating the fused multi-feature visual codebook expression according to the probabilities of each kind of feature.
4. The scene recognition method according to claim 1, characterized in that performing global fusion and classification on the multi-feature visual codebook expression according to the global fusion parameters and classification parameters obtained through pre-training comprises:
calculating the posterior probabilities that the scene image belongs to different scene categories, and selecting the scene category with the maximum posterior probability as the classification result; or
determining on which side of a decision boundary the scene image lies as the classification result.
5. The scene recognition method according to claim 2, characterized in that the multiple kinds of features comprise: histogram-of-oriented-gradients features, structured local binary pattern features, color features, or structured color features.
6. The scene recognition method according to any one of claims 1-5, characterized in that the method further comprises, beforehand, a step of training on sample images to obtain the multi-feature visual codebooks of the local classifiers, specifically comprising:
generating a training data set from sample images that have been manually labeled with scene categories;
uniformly sampling mutually overlapping partial sample images at multiple scales from the sample images in the training data set;
extracting multiple kinds of features from each partial sample image;
clustering, for each kind of feature, the partial-sample-image features belonging to different scene categories respectively, to generate a series of visual words;
grouping the visual words of different features into different sets to generate a visual codebook for each feature.
7. The scene recognition method according to claim 6, characterized in that, after grouping the visual words of different features into different sets to generate the visual codebook for each feature, the method further comprises a step of obtaining the global fusion parameters and classification parameters, specifically:
performing feature fusion on local regions of the sample images in the training set, to generate the expressions of the sample images on the visual codebooks of the different features;
training the global multi-feature fusion, and storing the global fusion parameters and classification parameters.
8. The scene recognition method according to claim 7, characterized in that training the global multi-feature fusion and storing the global fusion parameters and classification parameters comprises:
concatenating the feature vectors of the multi-feature visual codebooks, and calculating the fusion parameters and classification parameters with a classifier;
or calculating a kernel matrix for the feature vector of each visual codebook respectively, and calculating the weighting parameters of each kernel matrix and the classification parameters through multiple kernel learning;
or training an independent classifier on the feature vector of each visual codebook respectively, and learning the weighting parameter of each classifier.
9. A scene recognition device with multi-feature visual codebook fusion, characterized in that the device comprises:
a local fusion module, used for performing multi-feature fusion on local regions of a scene image through local classifiers, to obtain a multi-feature visual codebook expression of the local regions of the scene image;
a global fusion module, used for performing global fusion and classification on the multi-feature visual codebook expression according to global fusion parameters and classification parameters obtained through pre-training.
10. The scene recognition device according to claim 9, characterized in that the local fusion module comprises:
a local image acquiring unit, used for uniformly sampling mutually overlapping local images at multiple scales from the scene image;
a feature extraction unit, used for extracting multiple kinds of features from each local image;
a visual codebook expression generating unit, used for performing feature fusion on the local images of the scene image by using multi-feature visual codebooks obtained through pre-training, to generate the expressions of the scene image under the various features.
11. The scene recognition device according to claim 10, characterized in that the visual codebook expression generating unit comprises:
a probability calculating subunit, used for, for each kind of feature in the local region, first selecting candidate visual words with a simple classifier, and then calculating, with a complex classifier, the probability that the local-region feature belongs to each candidate visual word;
a visual codebook expression calculating subunit, used for generating the fused multi-feature visual codebook expression according to the probabilities of each kind of feature.
12. The scene recognition device according to any one of claims 9-11, characterized in that the device further comprises a training module, the training module being used for learning from sample images that have been manually labeled with scene categories, obtaining the multi-feature visual codebooks by performing local fusion on the sample images, and obtaining the global fusion parameters and classification parameters by performing global fusion on the sample images.
CN2013102689531A 2013-06-28 2013-06-28 Method and device for identifying scene integrated by multi-feature vision codebook Pending CN103366181A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013102689531A CN103366181A (en) 2013-06-28 2013-06-28 Method and device for identifying scene integrated by multi-feature vision codebook


Publications (1)

Publication Number Publication Date
CN103366181A true CN103366181A (en) 2013-10-23

Family

ID=49367481

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013102689531A Pending CN103366181A (en) 2013-06-28 2013-06-28 Method and device for identifying scene integrated by multi-feature vision codebook

Country Status (1)

Country Link
CN (1) CN103366181A (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101814147A (en) * 2010-04-12 2010-08-25 中国科学院自动化研究所 Method for realizing classification of scene images
CN102567722A (en) * 2012-01-17 2012-07-11 大连民族学院 Early-stage smoke detection method based on codebook model and multiple features
CN102609722A (en) * 2012-02-07 2012-07-25 西安理工大学 Method for fusing local shape feature structure and global shape feature structure of video image


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LI Fengcai: "Research on Scene Image Classification Based on the Codebook Model", China Master's Theses Full-text Database, Information Science and Technology *
SHU Chang et al.: "Face Recognition Method with Local and Global Fusion of Multiple Features", Computer Engineering *

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104361313A (en) * 2014-10-16 2015-02-18 辽宁石油化工大学 Gesture recognition method based on multi-kernel learning heterogeneous feature fusion
CN104361313B (en) * 2014-10-16 2017-10-31 辽宁石油化工大学 A kind of gesture identification method merged based on Multiple Kernel Learning heterogeneous characteristic
CN104318271B (en) * 2014-11-21 2017-04-26 南京大学 Image classification method based on adaptability coding and geometrical smooth convergence
CN104318271A (en) * 2014-11-21 2015-01-28 南京大学 Image classification method based on adaptability coding and geometrical smooth convergence
CN104992180B (en) * 2015-06-26 2019-01-29 武汉大学 A kind of multiple features fusion automobile logo identification method and system towards traffic block port
CN104992180A (en) * 2015-06-26 2015-10-21 武汉大学 Multi-feature fusion car logo recognition method and system for traffic tollgates
CN105426924A (en) * 2015-12-14 2016-03-23 北京工业大学 Scene classification method based on middle level features of images
CN105426924B (en) * 2015-12-14 2018-12-07 北京工业大学 A kind of scene classification method based on image middle level features
CN108604303B (en) * 2016-02-09 2022-09-30 赫尔实验室有限公司 System, method, and computer-readable medium for scene classification
CN108604303A (en) * 2016-02-09 2018-09-28 赫尔实验室有限公司 General image feature from bottom to top and the from top to bottom system and method for entity classification are merged for precise image/video scene classification
CN106156798A (en) * 2016-07-25 2016-11-23 河海大学 Scene image classification method based on annular space pyramid and Multiple Kernel Learning
CN106156798B (en) * 2016-07-25 2019-10-25 河海大学 Scene image classification method based on annular space pyramid and Multiple Kernel Learning
CN106612457A (en) * 2016-11-09 2017-05-03 广州视源电子科技股份有限公司 Method and system for video sequence alignment
CN106612457B (en) * 2016-11-09 2019-09-03 广州视源电子科技股份有限公司 Video sequence alignment schemes and system
CN106599907B (en) * 2016-11-29 2019-11-29 北京航空航天大学 The dynamic scene classification method and device of multiple features fusion
CN106599907A (en) * 2016-11-29 2017-04-26 北京航空航天大学 Multi-feature fusion-based dynamic scene classification method and apparatus
CN107967457A (en) * 2017-11-27 2018-04-27 全球能源互联网研究院有限公司 A kind of place identification for adapting to visual signature change and relative positioning method and system
CN107967457B (en) * 2017-11-27 2024-03-19 全球能源互联网研究院有限公司 Site identification and relative positioning method and system adapting to visual characteristic change
CN112966646B (en) * 2018-05-10 2024-01-09 北京影谱科技股份有限公司 Video segmentation method, device, equipment and medium based on two-way model fusion
CN112966646A (en) * 2018-05-10 2021-06-15 北京影谱科技股份有限公司 Video segmentation method, device, equipment and medium based on two-way model fusion
CN112770875B (en) * 2018-10-10 2022-03-11 美的集团股份有限公司 Method and system for providing remote robot control
CN111553374B (en) * 2019-02-12 2022-07-26 腾讯大地通途(北京)科技有限公司 Road scene dividing method and device, electronic equipment and storage medium
CN111553374A (en) * 2019-02-12 2020-08-18 腾讯大地通途(北京)科技有限公司 Road scene dividing method and device, electronic equipment and storage medium
CN111325290B (en) * 2020-03-20 2023-06-06 西安邮电大学 Traditional Chinese painting image classification method based on multi-view fusion multi-example learning
CN111325290A (en) * 2020-03-20 2020-06-23 西安邮电大学 Chinese painting image classification method based on multi-view fusion and multi-example learning
CN112699855A (en) * 2021-03-23 2021-04-23 腾讯科技(深圳)有限公司 Image scene recognition method and device based on artificial intelligence and electronic equipment
CN114726690A (en) * 2022-04-18 2022-07-08 清华大学 Codebook generation method and device, electronic equipment and storage medium
CN114726690B (en) * 2022-04-18 2024-03-29 清华大学 Codebook generation method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN103366181A (en) Method and device for identifying scene integrated by multi-feature vision codebook
Li et al. Line-cnn: End-to-end traffic line detection with line proposal unit
Chen et al. Object-level motion detection from moving cameras
EP3620980B1 (en) Learning method, learning device for detecting lane by using cnn and testing method, testing device using the same
CN110674874B (en) Fine-grained image identification method based on target fine component detection
US20160070976A1 (en) Image processing apparatus, image processing method, and recording medium
CN104504366A (en) System and method for smiling face recognition based on optical flow features
CN104915949A (en) Image matching algorithm of bonding point characteristic and line characteristic
CN104200228B (en) Recognizing method and system for safety belt
CN105354565A (en) Full convolution network based facial feature positioning and distinguishing method and system
JP2016062610A (en) Feature model creation method and feature model creation device
Lee et al. Place recognition using straight lines for vision-based SLAM
US10275667B1 (en) Learning method, learning device for detecting lane through lane model and testing method, testing device using the same
CN103136504A (en) Face recognition method and device
CN104615986A (en) Method for utilizing multiple detectors to conduct pedestrian detection on video images of scene change
CN104778474A (en) Classifier construction method for target detection and target detection method
CN108921850B (en) Image local feature extraction method based on image segmentation technology
CN112200186B (en) Vehicle logo identification method based on improved YOLO_V3 model
CN104281572A (en) Target matching method and system based on mutual information
CN109063790B (en) Object recognition model optimization method and device and electronic equipment
CN104268552A (en) Fine category classification method based on component polygons
Thubsaeng et al. Vehicle logo detection using convolutional neural network and pyramid of histogram of oriented gradients
CN104318590A (en) Video target tracking method
Vashisth et al. Histogram of Oriented Gradients based reduced feature for traffic sign recognition
Al Mamun et al. Efficient lane marking detection using deep learning technique with differential and cross-entropy loss.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20131023