WO2010111916A1 - Multi-class target detection device and detection method - Google Patents

Multi-class target detection device and detection method

Info

Publication number
WO2010111916A1
Authority
WO
WIPO (PCT)
Prior art keywords
classifier
classifiers
strong
detected
data
Prior art date
Application number
PCT/CN2010/071193
Other languages
English (en)
French (fr)
Inventor
梅树起 (Shuqi Mei)
吴伟国 (Weiguo Wu)
Original Assignee
索尼公司 (Sony Corporation)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 索尼公司 (Sony Corporation)
Priority to JP2012502431A priority Critical patent/JP5500242B2/ja
Priority to US13/257,617 priority patent/US8843424B2/en
Priority to EP10758018A priority patent/EP2416278A1/en
Publication of WO2010111916A1 publication Critical patent/WO2010111916A1/zh

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade

Definitions

  • Multi-class target detection device and detection method
  • the present invention relates to target detection techniques. More particularly, it relates to a detecting apparatus for detecting target data of a plurality of categories and a detecting method thereof.
  • Objects of the same type may be affected by multiple factors such as illumination, viewing angle, and pose, and may therefore appear drastically different in an image, which makes object detection in images very difficult.
  • An object type may accordingly be divided into multiple subclasses for processing, but how to effectively exploit the commonality among the subclasses while accurately distinguishing their differences is still a subject that needs further study.
  • For multi-class object detection in images, reference [1] proposes a feature-sharing technique: the classifiers of multiple object classes are trained jointly so that features are shared among the classes as much as possible, in order to reduce the computational cost. Such feature-sharing multi-class joint training is very effective at reducing computational cost and has achieved good results, but its efficiency is low, and because sharing a feature also means sharing the corresponding weak classifier, feature sharing becomes more and more difficult in the later part of the strong classifiers.
  • Reference [2] further proposed a vector Boosting tree algorithm to detect faces that appear in an image with different viewing angles and poses. However, the algorithm of reference [2] forces feature sharing among all classes, so when one of the classes cannot share features well with the others, this mandatory feature-sharing scheme makes further training of the classifier difficult.
  • An object of the present invention is to provide a detecting apparatus and a detecting method thereof for detecting target data of a plurality of categories, which are different from the above prior art.
  • According to one aspect of the present invention, a training method is provided for a detecting device that detects target data of a plurality of categories, the method comprising: determining, among the plurality of categories, the optimal feature-sharing sample class set for feature sharing and selecting an optimal feature for it by feature traversal; constructing a weak classifier for each category in the optimal feature-sharing sample class set using the selected optimal feature; and obtaining the feature list of the current strong classifiers by iteratively performing the optimal feature selection, thereby also building a set of weak classifiers for each of the plurality of categories and obtaining a detecting device that includes a corresponding plurality of strong classifiers capable of processing the plurality of categories.
  • The training method according to the present invention thus trains the detecting apparatus with samples of the multiple target categories: the feature-sharing sample class set is obtained by determining among which of the categories feature sharing yields the smallest error, and a weak classifier is constructed for each category in that set using the selected optimal feature, so that a detecting device containing these weak classifiers is built.
  • According to another aspect of the present invention, a detecting device and a detecting method are provided. The detecting device comprises: an input unit configured to input the data to be detected; and
  • a joint classifier comprising strong classifiers whose number corresponds to the number of categories and which respectively detect target data of the corresponding category, wherein each strong classifier is obtained by summing a set of weak classifiers and each weak classifier uses one feature to weakly classify the data to be detected; the joint classifier contains a shared feature list, each feature in the shared feature list is shared by one or more weak classifiers belonging to different strong classifiers, and weak classifiers of different strong classifiers that use the same feature have parameter values that differ from each other. In this way, features are shared between the strong classifiers of the individual target categories to reduce computational cost, while classifiers are not shared between categories, so that inter-class differences are preserved.
  • According to a further aspect, a training method is provided for a detecting apparatus that detects target data of r categories, wherein the r categories can be merged step by step, from fine to coarse, into a predetermined multi-layer structure according to a predetermined similarity criterion, the r categories being placed at the lowest layer as the most finely divided categories, and r being a natural number greater than 1. The training method includes:
  • training the corresponding level classifiers, starting from the top-layer categories, according to a coarse-to-fine strategy, each level classifier comprising strong classifiers whose number corresponds to the number of categories it targets, the level classifiers being connected in series to form the detecting apparatus,
  • wherein the training of a level classifier that is to detect m of the categories includes: preparing a positive sample set and a negative sample set for each of the m categories to be processed by that level classifier, where 1 < m ≤ r; determining the optimal set of categories among the m categories for feature sharing and selecting an optimal feature for it by feature traversal; constructing a weak classifier for each category in the optimal feature-sharing sample class set using the selected optimal feature; and obtaining the feature list of the current level's strong classifiers by iteratively performing the optimal feature selection, thereby also building a set of weak classifiers for each of the m categories and obtaining a level classifier that includes m strong classifiers capable of processing the m categories.
  • According to a second aspect of the present invention, a detecting apparatus for detecting target data of a plurality of (r) categories, and a detecting method thereof, are provided, wherein the plurality of categories are merged step by step into a predetermined multi-layer structure according to a similarity criterion and are placed at the lowest layer as the most finely divided categories. The detecting apparatus includes:
  • An input unit configured to input data to be detected
  • a cascade classifier comprising a plurality of level classifiers connected in series, the level classifiers being configured to classify the categories of each layer of the predetermined multi-layer structure according to a coarse-to-fine strategy, each level classifier including strong classifiers whose number corresponds to the number of categories it processes, wherein each strong classifier includes a set of weak classifiers and each weak classifier uses one feature to weakly classify the data to be detected,
  • wherein each level classifier contains a shared feature list, each feature in the shared feature list is shared by one or more weak classifiers belonging to different strong classifiers, and weak classifiers of different strong classifiers that use the same feature have parameter values that differ from each other.
  • Similarly, in this cascade-type detecting device, features are shared between the strong classifiers of the individual target categories to reduce computational cost, but classifiers are not shared between categories, so that inter-class differences are preserved.
  • Moreover, to handle multiple target categories efficiently, the categories are first merged and processed together according to the coarse-to-fine principle, and are then gradually split and refined as training proceeds.
  • Fig. 1 shows a training method of a detecting device for detecting a plurality of types of target data according to a first embodiment of the present invention.
  • Fig. 2 shows a Haar-like feature prototype used in the training method according to the first embodiment of the present invention.
  • Figures 3a and 3b show the structure of the weak classifier and the strong classifier, respectively.
  • Fig. 4 shows a classifier of the detecting device obtained by the training method according to the first embodiment of the present invention.
  • Figures 5a and 5b respectively illustrate the use of the category tree structure CT to represent sample category changes during training.
  • Fig. 6 shows a training method according to a third embodiment of the present invention.
  • Fig. 7 shows a classifier of a detecting device obtained by the training method according to the second or third embodiment of the present invention.
  • Fig. 8 shows the process by which the detection device according to the present invention detects predetermined multi-class targets in an image or video.
  • FIG. 9 is a block diagram showing an exemplary structure of a computer in which the present invention is implemented.
  • The first embodiment uses several vehicle classes (cars, buses, and trucks) as the targets to be detected. It should be appreciated that embodiments of the present invention are not limited to detecting vehicles in images and/or video; they may also detect other objects in images and/or video (e.g., multi-angle faces), and may even perform detection such as intrusion classification on real-time network data or host data.
  • Fig. 1 shows the training method 100 of a detecting device for detecting multi-class target data according to the first embodiment of the present invention.
  • the method begins in step S101 by first preparing a positive sample set and a negative sample set for the plurality of categories.
  • In this embodiment, a certain number of positive and negative samples are prepared for each of the three vehicle classes (cars, buses, and trucks). The positive sample sets are frontal-view vehicle images of the three classes with the size unified to 32x32 (pixels); the negative sample sets are sampled from a background image set (a set of images that do not contain the target objects, with no size requirement) and are uniformly scaled to 32x32 (pixels).
  • Figure 2 shows the Haar-like feature prototype used.
  • the Haar-like feature is a rectangle defined in the image, including two parts in white and black, respectively, in Figure 2.
  • the orientation of the rectangle is divided into two types: upright and 45 degree tilt.
  • A Haar-like feature prototype has four parameters: the position (x, y) of the rectangle in the image and the size of the rectangle (width w and height h). As the position, size, and aspect ratio of the rectangle vary, tens of thousands of specific Haar-like features acting on the image can be generated.
  • The value of a Haar-like feature is a scalar. If the sum of the gray values of all pixels in the white region is denoted Sum_white(f_i) and the sum in the black region Sum_black(f_i), the Haar-like feature value is computed as feature_i = Sum_white(f_i) - Sum_black(f_i).
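  • The following Python sketch illustrates this computation; it is not part of the patent text. The rectangle sums are obtained from an integral image, a standard trick for Haar-like features, and the function names and the particular two-rectangle layout in the example are our own.

```python
import numpy as np

def integral_image(gray):
    """Integral image: ii[y, x] = sum of gray[0:y+1, 0:x+1]."""
    return np.cumsum(np.cumsum(gray, axis=0), axis=1)

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w x h rectangle whose top-left corner is (x, y)."""
    a = ii[y + h - 1, x + w - 1]
    b = ii[y - 1, x + w - 1] if y > 0 else 0
    c = ii[y + h - 1, x - 1] if x > 0 else 0
    d = ii[y - 1, x - 1] if x > 0 and y > 0 else 0
    return a - b - c + d

def haar_feature_value(ii, white_rects, black_rects):
    """feature_i = Sum_white(f_i) - Sum_black(f_i)."""
    white = sum(rect_sum(ii, *r) for r in white_rects)
    black = sum(rect_sum(ii, *r) for r in black_rects)
    return white - black

if __name__ == "__main__":
    img = np.random.randint(0, 256, (32, 32)).astype(np.int64)
    ii = integral_image(img)
    # An upright two-rectangle feature: white half at (4, 8), black half at (10, 8), each 6x10.
    print(haar_feature_value(ii, white_rects=[(4, 8, 6, 10)], black_rects=[(10, 8, 6, 10)]))
```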
  • the training is started from step S102 of Fig. 1.
  • The optimal feature-sharing sample class set, i.e., the subset of categories among which a feature is shared, is determined, and an optimal feature is selected for it by feature traversal. For example, based on the candidate training features, a forward sequential selection method (or a similar method) is used to determine among which of the categories (here, three categories) sharing a feature yields the smallest error; the feature-sharing sample class set S is formed from the determined categories, and the corresponding training feature is selected by feature traversal.
  • After the feature-sharing sample class set S and the corresponding optimal feature have been determined, a weak classifier is constructed for each category in the set using the selected optimal feature (Fig. 1, step S103).
  • The structure of the weak classifier is shown in Fig. 3a. In this embodiment a decision tree is used as the weak classifier: each weak classifier is constructed from one Haar-like feature and, depending on whether the input feature value is above or below a threshold, the classifier produces one of two different outputs.
  • In step S104 of Fig. 1, the feature list of the current strong classifiers is obtained by iteratively performing the optimal feature selection, while a set of weak classifiers is built for each of the categories (here, the three categories), yielding a detecting device that comprises a corresponding plurality of strong classifiers capable of processing the categories.
  • The structure of the strong classifier H(C_i) for each class is shown in Fig. 3b; its output is +1 or -1, and its threshold θ can be adjusted as needed.
  • The features used by the weak classifiers h(C_i) come from the classifier's feature list (shared feature group).
  • Training the classifier is the process of finding each strong classifier H(C_i), that is, searching for multiple weak classifiers h(C_i) for each category and, through iteration, searching for the feature used by each weak classifier, i.e., the feature selection process. This process finally yields a set of shared features.
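  • The following sketch, under the same illustrative assumptions, shows how a per-category strong classifier H(C_i) can be formed by summing its weak classifiers' outputs and comparing the sum with the adjustable threshold θ.

```python
class StrongClassifier:
    def __init__(self, weak_classifiers, theta=0.0):
        self.weak_classifiers = weak_classifiers  # list of objects exposing .predict()
        self.theta = theta                        # threshold, adjustable as needed

    def score(self, feature_values):
        # sum of the weak-classifier outputs for this category
        return sum(w.predict(feature_values) for w in self.weak_classifiers)

    def predict(self, feature_values):
        # +1 if the summed score reaches the threshold, otherwise -1
        return 1 if self.score(feature_values) >= self.theta else -1
```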
  • For the iteration, the number of iterations T can be specified, as is known to those skilled in the art; the sample weights are then adjusted and the next iteration builds further weak classifiers for each of the categories (here, the three categories). After T iterations, the detecting device comprising all the weak classifiers is obtained and the flow ends (step S105).
  • Preferably, a training-termination condition is used to control the iteration: a desired training performance is set for the classifier H(C_i) of each category, and if a category reaches its expected performance during training, that category exits the joint training process of the classifiers H(C_i).
  • For example, for every category C_i belonging to the current feature-sharing sample class set S, the false detection rate f(C_i) = N_FA / N_neg is tested, where N_FA is the number of negative samples that the classifier misclassifies as positive and N_neg is the total number of negative samples. If f(C_i) is below the target rate, category C_i has met the training-termination condition and exits the training of this classifier; if all sample categories satisfy the termination condition, the training of the classifier ends. Otherwise, the weights of the samples belonging to the categories still in S are updated while the weights of the other samples are kept unchanged, all sample weights are normalized to sum to one, and the next iteration begins.
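  • A hedged sketch of this per-category termination test and of a sample re-weighting step is given below. The exact re-weighting formula is not legible in the source text, so the update shown is only a generic boosting-style illustration, not the patent's formula.

```python
import numpy as np

def false_detection_rate(strong_clf, negative_feature_rows):
    """f(C_i) = N_FA / N_neg on the category's negative sample set."""
    n_fa = sum(strong_clf.predict(x) == 1 for x in negative_feature_rows)
    return n_fa / max(len(negative_feature_rows), 1)

def update_weights(weights, labels, weak_outputs, active_mask):
    """Generic boosting-style update: raise the weight of misclassified samples of the
    still-active categories, leave the rest unchanged, then renormalise to sum to one."""
    w = weights.copy()
    w[active_mask] *= np.exp(-labels[active_mask] * weak_outputs[active_mask])
    return w / w.sum()
```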
  • According to the first embodiment, the forward sequential selection method is used during training to determine among which of the participating categories feature sharing yields the smallest overall error, i.e., which categories should form the feature-sharing sample class set S; at the same time, the feature with the best classification performance for the categories in S is selected from the feature pool, and a weak classifier is then constructed for each category in S using this optimal feature.
  • The present invention is not limited to the forward sequential selection method; other sequential selection methods (e.g., backward sequential selection) may be used to select the feature-sharing sample class set consisting of the determined categories.
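  • As an illustration of the idea (not the patent's exact cost function), the sketch below performs forward sequential selection of the feature-sharing class set S. Here joint_error is assumed to train a shared-feature weak classifier for each class of a candidate subset and return the summed error together with the best shared feature; that callback is an assumption of this example.

```python
def select_sharing_set(classes, joint_error):
    """joint_error(subset) -> (total error when the classes in `subset` share one feature,
    best shared feature for that subset). Greedily grow the subset, keep the best one seen."""
    remaining, chosen = set(classes), []
    best_set, best_err, best_feat = None, float("inf"), None
    while remaining:
        # try adding each remaining class and keep the addition with the smallest joint error
        err_of = {c: joint_error(chosen + [c]) for c in remaining}
        c_star = min(err_of, key=lambda c: err_of[c][0])
        chosen.append(c_star)
        remaining.remove(c_star)
        err, feat = err_of[c_star]
        if err < best_err:
            best_set, best_err, best_feat = list(chosen), err, feat
    return best_set, best_feat, best_err
```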
  • In the first embodiment, one strong classifier is trained for each category, and all strong classifiers are trained jointly. The features used by the weak classifiers of the strong classifiers are shared among the classes, but each weak classifier is trained independently within its own class; a feature is not required to be shared by all categories, so a given feature may be shared by all categories or only by some of them.
  • The detecting apparatus obtained by the training method of the first embodiment includes an input unit configured to input the data to be detected, a joint classifier comprising a plurality of strong classifiers, and a discriminating unit configured to determine, from the classification results of the strong classifiers, to which category of target data the data to be detected belongs. Those skilled in the art will understand that the specific criterion and manner of discrimination can be set flexibly according to the requirements of the application, or the discriminating unit may be omitted and the classification result of the joint classifier used directly; all such variations are within the spirit and scope of the present invention.
  • Inside the detecting apparatus, the data to be detected (for example, a sample image) is processed by the strong classifiers of all categories and discriminated by the discriminating unit. The output of more than one strong classifier is therefore allowed to be judged positive; it is not required that only one judgement be positive. There is no mutually exclusive relationship between the strong classifiers of different categories, and a given piece of data to be detected may be discriminated as target data of several categories. As long as the output of at least one strong classifier is judged positive by the discriminating unit, the output of the detecting device is +1; otherwise the output is -1.
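  • The sketch below illustrates this joint discrimination under the assumptions introduced earlier: each shared feature is computed once per input and reused by every category's strong classifier, and more than one category may answer +1. The class layout and names are illustrative only.

```python
class JointClassifier:
    def __init__(self, shared_feature_funcs, strong_classifiers):
        self.shared_feature_funcs = shared_feature_funcs  # shared feature list f1..fn
        self.strong_classifiers = strong_classifiers      # dict: category -> StrongClassifier

    def classify(self, sample):
        # compute each shared feature exactly once, then reuse it across all categories
        feature_values = [f(sample) for f in self.shared_feature_funcs]
        decisions = {cat: clf.predict(feature_values)
                     for cat, clf in self.strong_classifiers.items()}
        overall = 1 if any(d == 1 for d in decisions.values()) else -1
        return overall, decisions   # +1 if at least one category fires; per-category results
```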
  • According to the second embodiment, the detecting device for detecting multi-class target data is designed as a cascade-structure classifier in which a plurality of level classifiers are coupled in series.
  • To this end, the sample categories used to train the level classifiers (SC_k) of the cascade classifier are first organized manually into a predetermined multi-layer structure (the first multi-layer structure of the present invention).
  • The most finely divided categories (for example r categories, r being a natural number greater than 1) are placed at the lowest layer; these categories are then merged, according to a predetermined similarity criterion, into a smaller number of larger classes at the next higher layer, and the merging continues layer by layer up to the highest layer, for example a single large class.
  • Figures 5a and 5b show the use of a category tree structure CT to represent sample category changes during training.
  • In Fig. 5a, samples of 7 object classes take part in the training. These 7 classes are placed at the lowest layer of the tree, Level 3, and are called the "leaf classes"; according to a similarity criterion, some of the 7 sample classes are then merged to obtain the 3 classes of the next higher layer, Level 2, and finally the 3 classes of Level 2 are merged into the single class of the highest layer, Level 1. When samples are used during training, the higher-layer sample classes are used first, starting from Level 1 of CT;
  • that is, the early goal of classifier training is to distinguish target objects from non-target objects as a whole; as training progresses and this overall distinction becomes difficult, the sample categories are split and the 3 Level 2 classes of CT are used, and finally the samples of the 7 leaf classes of CT are used.
  • Fig. 5b again concerns the three classes car, truck, and bus; these three classes are the leaf classes of CT and are merged into the root-node class c_0 of CT. The corresponding training starts from c_0 and is then split, at the appropriate time, into the 3 leaf classes c_i.
  • Of course, after vehicles have been divided into categories such as trucks, cars, and buses, each category could be divided further into more detailed sub-categories.
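  • A minimal sketch of such a category tree CT is shown below; the node layout and names are illustrative only, not taken from the patent figures.

```python
class CategoryNode:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []   # empty list -> leaf ("most finely divided") category

    def leaves(self):
        """All leaf categories under this node (the node itself if it is a leaf)."""
        return [self] if not self.children else [l for c in self.children for l in c.leaves()]

# Fig. 5b style: car, truck and bus are the leaves, merged into one root class c0.
c0 = CategoryNode("vehicle", [CategoryNode("car"), CategoryNode("truck"), CategoryNode("bus")])
print([leaf.name for leaf in c0.leaves()])   # ['car', 'truck', 'bus']
```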
  • According to the second embodiment, the training method for a detection apparatus that detects target data of r categories includes: training the corresponding level classifiers, starting from the top-layer category, according to a coarse-to-fine strategy, each level classifier including strong classifiers whose number corresponds to the number of categories it targets, the level classifiers being connected in series to form the detecting apparatus.
  • The training of a level classifier that is to detect m of the categories includes: preparing a positive sample set and a negative sample set for each of the m categories to be processed by that level classifier, where 1 < m ≤ r; determining the optimal set of categories among the m categories for feature sharing and selecting an optimal feature for it by feature traversal; constructing a weak classifier for each category in the optimal feature-sharing sample class set using the selected optimal feature; and obtaining the feature list of the current level's strong classifiers by iteratively performing the optimal feature selection, thereby also building a set of weak classifiers for each of the m categories and obtaining a level classifier that includes m strong classifiers capable of processing the m categories.
  • It will be understood that a level classifier SC_k of the cascade classifier is trained on the m sample classes to be processed by that level and contains m strong classifiers H(C_i), one for each of the m classes. Each strong classifier H(C_i) is obtained by summing a number of weak classifiers h_j(C_i). The structure of the strong classifier is shown in Fig. 3b, and a weak classifier h(C_i), taking a decision tree as an example, is shown in Fig. 3a.
  • The features used by the weak classifiers h(C_i) come from the shared feature group f of the level classifier SC_k.
  • Training a level classifier SC_k is the process of finding each of its strong classifiers, that is, searching for multiple weak classifiers h(C_i) for each category and, ultimately, searching for the features used by the weak classifiers, i.e., the feature selection process, which yields the shared feature group.
  • As in the first embodiment, any feature in the shared feature group may be used by several categories to construct weak classifiers, i.e., the feature is shared across classes; however, the parameters of each weak classifier are computed separately from the data of its own class, i.e., the weak classifiers themselves are not shared across classes.
  • As described above, training starts with the higher-layer sample categories, and a splitting criterion for the sample categories is set; as training progresses and the criterion is met, the current categories are split into the more detailed categories of the next lower layer and training continues, until the lowest layer is finally reached.
  • the "set sample category splitting criteria" used in the second embodiment may be a sub-supervised division of sub-categories for each level, and mandatory man-made sample category splitting. For example, specify the first level classifier for the top level, the second level and third level classifier for the higher layer, and so on. It is also possible to use an unsupervised method of automatically generating subclasses and continuing the training.
  • Alternatively, and preferably, the second embodiment uses the error on the training set as the splitting criterion. When training proceeds normally, the training-set error keeps decreasing; when this error can hardly be reduced further, it indicates that the intra-class variation of some of the sample categories currently in use is hindering the training, and those sample categories should be split. In this case, because training the level classifiers for every layer except the lowest one may trigger a split, the training follows the coarse-to-fine strategy of training one or more level classifiers for each layer of the predetermined multi-layer structure, but it is possible that no level classifier is actually trained for a particular layer, especially for the highest layer, for example when the intra-class variation is large. The multi-layer category hierarchy actually processed by the level classifiers after training (the second multi-layer structure of the present invention) may therefore differ from the predetermined multi-layer structure defined manually in advance (the first multi-layer structure of the present invention).
  • Specifically, for any level classifier that is to process the categories of a layer other than the lowest one (i.e., 1 ≤ m < r), an effectiveness measure is taken in each iteration after the weak classifiers have been constructed, in order to decide whether the sample categories should be split. The effectiveness measure includes: setting the threshold of the strong classifier formed by the weak classifiers constructed so far to zero and testing the classification error of that strong classifier on the positive and negative samples of the corresponding class; judging whether this classification error decreases gradually from iteration to iteration; and, if the error no longer decreases, decreases only slowly, or oscillates, exiting the training of this level classifier, splitting the sample categories from coarse to fine into the categories of the next lower layer, and restarting the training of the level classifier.
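  • A hedged sketch of such a splitting criterion is given below; the window length and tolerances are assumptions of this example, since the text only requires that the error no longer decreases, decreases slowly, or oscillates.

```python
def should_split(error_history, window=5, min_improvement=1e-3):
    """Return True when the recent training-set errors of a merged category indicate that
    the category should be split into the next, finer layer."""
    if len(error_history) < window + 1:
        return False
    recent = error_history[-(window + 1):]
    improvement = recent[0] - recent[-1]            # total drop over the window
    # oscillation: any noticeable increase from one iteration to the next
    oscillating = any(b > a + min_improvement for a, b in zip(recent, recent[1:]))
    return improvement < min_improvement or oscillating
```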
  • the highest layer of the predetermined multi-layer structure category may be any number of categories, but usually has one category.
  • According to the second embodiment, the training of the level classifier that detects the target data of this single category comprises: preparing a positive sample set and a negative sample set; training weak classifiers for the candidate training features and selecting the weak classifier with the minimum classification error; and iteratively constructing weak classifiers to obtain a first-level classifier composed of the obtained weak classifiers, which is generally used to distinguish target images from non-target images.
  • The number of iterations can be predetermined or determined automatically using a training-termination condition. Similarly, when any other level classifier is trained iteratively, its number of iterations can be predetermined or determined automatically using a training-termination condition. The training-termination condition is as described for the first embodiment and is not described again here.
  • Not only can a training-termination condition be set for any level classifier, but an expected training performance can also be set for each target category as a whole (for example, a total false detection rate can be set for each lowest-layer category); if the training of a category has already reached its desired performance, that category no longer takes part in the training of the subsequent level classifiers.
  • the third embodiment describes a more detailed classification (training) method for the cascade classifier by using cars, buses, and trucks as targets to be detected.
  • Next, the feature pool is prepared: for example, the Haar-like feature prototypes are applied to a 32x32 (pixel) image to obtain hundreds of thousands of specific features.
  • In step S601, a positive sample set is prepared for each of the classes used at this level (one class or three classes): the positive sample set P(C_i) is filtered through the preceding k-1 level classifiers, and the current positive sample set is obtained by removing the samples judged as -1. Each positive sample is given the label +1.
  • Likewise in step S601, a negative sample set is prepared for each positive sample set. A negative sample set can be prepared for each class by cropping, in some order, sub-images of the same size as the positive sample images from the background images.
  • Preferably, for each level classifier from the second level onward, preparing the negative sample set for a category C_i comprises: forming a cascade from the strong classifiers associated with C_i in all previous level classifiers, performing a window-traversal search over the background images with this cascade, and adding every window image misjudged as a positive sample to the negative sample set of C_i. The number of negative samples can be chosen as needed, for example as a fixed proportion of the number of positive samples, and each negative sample is given the label -1.
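  • The sketch below illustrates this bootstrapping of negative samples under assumed window size and stride; prior_strong_clfs stands for the strong classifiers associated with the category in all previous level classifiers, feature_fn for the feature computation on one patch, and the background images are assumed to be 2-D numpy-style grayscale arrays.

```python
def bootstrap_negatives(background_images, prior_strong_clfs, feature_fn,
                        win=32, stride=8, needed=1000):
    """Scan background images and collect patches misjudged as positive by every prior
    strong classifier of the category; these become new negative samples."""
    negatives = []
    for img in background_images:
        h, w = img.shape[:2]
        for y in range(0, h - win + 1, stride):
            for x in range(0, w - win + 1, stride):
                patch = img[y:y + win, x:x + win]
                feats = feature_fn(patch)
                if all(clf.predict(feats) == 1 for clf in prior_strong_clfs):
                    negatives.append(patch)          # false positive -> hard negative
                    if len(negatives) >= needed:
                        return negatives
    return negatives
```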
  • Here, the desired training performance can also be set for each sample category as a whole; if the false detection rate of a category is already below its target total false detection rate, that category no longer takes part in the subsequent training, and if this holds for all categories the whole training process ends. Also in step S601, each sample is given an initial weight 1/M, where M is the total number of samples.
  • In step S601 the exit condition of the level classifier can also be set, for example by specifying the number of iterations T; here a desired minimum detection rate and an expected maximum false detection rate are set for each target class.
  • Feature selection starts from step S602: for each category, multiple weak classifiers h(C_i) are searched for, and the features used by the individual weak classifiers are found through iteration.
  • In step S603, a decision-tree weak classifier is constructed with the selected optimal feature for every class in S; its structure is shown in Fig. 3a.
  • In step S604, the strong classifiers of all sample categories in the feature-sharing sample class set S are updated by adding the newly constructed weak classifiers, and the threshold of each strong classifier is set according to the desired minimum detection rate (i.e., at the current threshold the classifier reaches that detection rate on the current positive sample set). In step S605, the training effectiveness for the classes in S is measured, for example by setting the threshold of each strong classifier to zero and testing its classification error on the positive and negative samples of its class, and step S606 judges whether this error keeps decreasing as training proceeds; if it does not, the training of the level classifier SC_k is exited, the sample categories are split from coarse to fine into the next lower layer (for example as shown in Fig. 5b), and the training of SC_k is restarted (step S607). In step S608, if the decision in step S606 is NO, the training-termination judgement is performed using the expected maximum false detection rate, as in the first embodiment: a category in S that satisfies the termination condition exits the training of this level classifier, and if all sample categories satisfy it, the training of this level classifier ends and the next level is trained with updated sample sets (steps S609 and S610); otherwise, in step S611 the sample weights of the categories in S are updated, the other sample weights are kept unchanged, all weights are normalized, and the next iteration begins.
  • the training of the detecting device does not limit the specific Boosting algorithm, but may be other algorithms such as Gentle-Boosting, Real-Boosting and the like.
  • The detecting device obtained by the training method of the second or third embodiment includes: an input unit configured to input the data to be detected; and a cascade classifier. The cascade classifier, as shown in Fig. 7, consists of multiple (n) level classifiers connected in series.
  • Here the plurality of (r) categories are merged step by step into a predetermined multi-layer structure according to a similarity criterion and are placed at the lowest layer as the most finely divided categories; accordingly, the level classifiers are configured to classify the categories of each layer of the predetermined multi-layer structure according to a coarse-to-fine strategy, and each level classifier includes strong classifiers whose number corresponds to the number of categories it processes.
  • each of the strong classifiers includes a set of weak classifiers, each weak classifier uses a feature to weakly classify the data to be detected, wherein each of the classifiers includes a shared feature list.
  • Each feature in the shared feature list is shared by one or more weak classifiers belonging to different strong classifiers respectively; weak classifiers belonging to different strong classifiers using the same feature have different parameter values from each other.
  • Viewed as a whole, the detecting device is a cascade-structure classifier formed by serially coupling a plurality of level classifiers SC, but it is designed for the simultaneous detection of objects of multiple categories: inside each level classifier, the strong classifiers of the multiple categories are bound together by a shared feature list (i.e., a shared feature group).
  • Taking the detecting device of the third embodiment as an example, the data to be detected is fed sequentially through the level classifiers of the cascade classifier; at each level it is judged in turn by the m strong classifiers of that level, and a strong classifier that outputs +1 is said to pass the data (judging it as a target of the corresponding category), while an output of -1 means the data is rejected by that strong classifier.
  • The discrimination at a level proceeds as follows: the values of all valid features in the feature list of the current level classifier are computed; then, for each of the m strong classifiers of the level, the outputs of its weak classifiers are determined from the already computed feature values according to how the features in the list are shared, and these outputs are summed to obtain the final output of the strong classifier.
  • During the discrimination, if the data to be detected is rejected by a strong classifier that detects category c_i, the strong classifiers in the subsequent level classifiers that detect category c_i and its subclasses no longer discriminate this input data; the data to be detected is then said to be rejected by the leaf-layer categories corresponding to category c_i.
  • Features in a level classifier's feature list that relate only to strong classifiers no longer involved in the discrimination are regarded as invalid features and are no longer computed, which saves computational cost.
  • If the data to be detected is rejected by all leaf-layer categories, the discrimination is aborted and the data is regarded as non-target data.
  • At the end of the discrimination, if the data to be detected is passed by a strong classifier of the last level classifier, the data is determined to have the target-category attribute corresponding to that strong classifier; if it is passed by several strong classifiers of the last level, it is determined to have the corresponding multiple target-category attributes.
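  • The sketch below illustrates this cascade discrimination under simplifying assumptions: each level is keyed by the leaf categories it ultimately distinguishes (a merged category simply repeats the same strong classifier for each of its leaves), only the features still needed by surviving categories are computed, a rejected category stops being evaluated, and the input sample is assumed to expose a feature(i) method. None of these names come from the patent.

```python
def run_cascade(sample, levels, leaf_categories):
    """levels: list of dicts {leaf_category: (strong_classifier, feature_indices_used)}."""
    alive = set(leaf_categories)
    for level in levels:
        # valid features = those used by at least one still-active strong classifier
        needed = sorted({i for cat, (_, idxs) in level.items() if cat in alive for i in idxs})
        feats = {i: sample.feature(i) for i in needed}       # compute each valid feature once
        row = [feats.get(i, 0.0) for i in range(max(needed, default=-1) + 1)]
        for cat, (clf, _) in level.items():
            if cat in alive and clf.predict(row) == -1:
                alive.discard(cat)                           # rejected: drop this leaf category
        if not alive:
            return []                                        # rejected by all leaf categories
    return sorted(alive)                                     # attributes passed at the last level
```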
  • The detecting device according to the present invention can detect various kinds of multi-class target data. When predetermined targets of multiple categories are to be detected in an input image or video, the detecting device may further include a window-traversal component configured to traverse the image or video to be detected with a window, and a post-processing component configured to merge the windows produced by the window-traversal component and to filter the merged windows with a predetermined threshold so as to obtain the final detection result.
  • Fig. 8 shows the process by which the detection device according to the present invention detects predetermined multi-class targets in an image or video.
  • Window-traversal process 810: for any given image to be detected (step S811), or for an image captured from the video to be detected, the image is traversed with a rectangular window (step S812), and window images are obtained one by one in step S813. The order and manner of traversal are arbitrary (left to right and top to bottom, or right to left and bottom to top), and the step size of the window translation during traversal is also arbitrary: it may be pixel by pixel, every several pixels, or proportional to the size of the current window.
  • During the traversal, the cascade classifier is applied to each window obtained in the scanning process: the features of the trained classifier are computed on the window image (step S814) and the classifier is applied to classify it (step S815). If the cascade classifier discriminates the window image as a target category (it may have more than one target-category attribute), the position and size of the window in the original image and all of its target-category attributes are recorded (step S816). After the window traversal is finished, the image is reduced by a certain scale factor and the window traversal and window-image discrimination are performed again; this is repeated until the image becomes too small for the window traversal to proceed (the image height is smaller than the window height, or the image width is smaller than the window width) (steps S817 and S818), and all positive-response windows are mapped back to the original image according to the scale factor between their image and the original image, giving the position and size of every positive response in the original image.
  • Besides the mode WinScanMode1 described above (a fixed-size window traverses the image; after the traversal, the image is scaled down or up by a certain factor and re-traversed with the same fixed-size window), a mode WinScanMode2 may be used, in which the image size is kept unchanged, a window size is chosen for the first traversal, and after each traversal the window size is reduced or enlarged by a certain factor before the original image is re-traversed.
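  • The following sketch illustrates traversal in the WinScanMode1 style under an assumed stride and scale factor; classify stands for the trained cascade applied to one window image and returning the list of target-category attributes it detects (an empty list for non-targets), and the input is assumed to be a 2-D grayscale numpy array.

```python
import numpy as np

def detect_multiscale(image, classify, win=32, stride=4, scale=1.2):
    """Scan with a fixed-size window, shrink the image, rescan, and map detections back."""
    detections, factor = [], 1.0
    img = image.astype(np.float32)
    while img.shape[0] >= win and img.shape[1] >= win:       # stop when the window no longer fits
        h, w = img.shape[:2]
        for y in range(0, h - win + 1, stride):
            for x in range(0, w - win + 1, stride):
                attrs = classify(img[y:y + win, x:x + win])
                if attrs:                                    # record position, size and attributes
                    detections.append((int(x * factor), int(y * factor),
                                       int(win * factor), attrs))
        factor *= scale
        new_h, new_w = int(image.shape[0] / factor), int(image.shape[1] / factor)
        if new_h < 1 or new_w < 1:
            break
        # nearest-neighbour downscale via index sampling (keeps the example dependency-free)
        ys = (np.arange(new_h) * factor).astype(int).clip(0, image.shape[0] - 1)
        xs = (np.arange(new_w) * factor).astype(int).clip(0, image.shape[1] - 1)
        img = image[np.ix_(ys, xs)].astype(np.float32)
    return detections
```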
  • Post-processing flow 820 is performed by the post-processing component and comprises: window merging (step S821) to merge adjacent positive responses, and threshold filtering (step S822) to discard weak responses; the merged results remaining after window merging and threshold filtering are taken as the final detection result (step S830).
  • Specifically, multiple responses are generated near the same target (e.g., a car) in the image, and the adjacent multiple responses are merged into one output response.
  • First, the merging process defines "adjacent" as having neighbouring window centre positions, similar sizes, and the same target-category attribute; it then computes the average centre position and average window size of each adjacent cluster of target windows and uses the number of merged windows as the confidence of the merged result.
  • Next, target-attribute merging is performed on merged results whose centres are adjacent and whose sizes are similar: if several merged results with different target attributes lie near the same position in the image, the count of each target attribute is taken, the most frequent target attribute is chosen as the final target attribute, and the sum of the confidences of the individual target attributes is used as the confidence of the final merged result.
  • After merging, if the confidence of a merged result is greater than or equal to a preset confidence threshold, the merged result is accepted; otherwise it is discarded.
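  • A hedged sketch of this post-processing is given below; the adjacency tolerances and the confidence threshold are assumptions, and for brevity the cross-attribute merging step is reduced to requiring identical attributes within a cluster.

```python
def merge_windows(windows, centre_tol=16, size_ratio_tol=1.3, min_confidence=3):
    """windows: list of (x, y, size, attribute). Cluster adjacent positive windows, average
    their position/size, use the cluster size as confidence, and filter weak responses."""
    clusters = []
    for x, y, s, attr in windows:
        cx, cy = x + s / 2.0, y + s / 2.0
        for cl in clusters:
            mcx, mcy, ms = cl["cx"] / cl["n"], cl["cy"] / cl["n"], cl["s"] / cl["n"]
            close = abs(cx - mcx) <= centre_tol and abs(cy - mcy) <= centre_tol
            similar = max(s, ms) / max(min(s, ms), 1e-6) <= size_ratio_tol
            if close and similar and attr == cl["attr"]:
                cl["cx"] += cx; cl["cy"] += cy; cl["s"] += s; cl["n"] += 1
                break
        else:
            clusters.append({"cx": cx, "cy": cy, "s": s, "n": 1, "attr": attr})
    merged = [{"centre": (c["cx"] / c["n"], c["cy"] / c["n"]),
               "size": c["s"] / c["n"], "attr": c["attr"], "confidence": c["n"]}
              for c in clusters]
    return [m for m in merged if m["confidence"] >= min_confidence]   # threshold filtering
```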
  • According to the embodiments of the present invention, features are shared between the classifiers of multiple target categories, but the weak classifiers associated with the shared features are constructed separately within each category, so that the differences between the target categories can be effectively distinguished. This improves the convergence speed of training and also improves the joint classifier's ability to distinguish between the target categories. Not forcing a feature to be shared by all categories further reduces unnecessary computation.
  • Feature sharing between multi-class classifiers reduces feature computational cost of multi-class classifiers, in accordance with various embodiments of the present invention.
  • An open vehicle test set (2,264 samples of the three vehicle classes, none of which were used in training) and a background image test set (images of varying size, providing about 5,300,000 window images) were defined; the test set was processed in parallel by the cascade classifiers of the three vehicle classes and then by the feature-sharing joint classifier.
  • The test results (detection rate, processing time on the vehicle test set, false detection rate, and processing time on the background images) show that the two schemes have similar classification performance, while the joint classifier has higher detection efficiency; the more complex the computation of the features used, the more pronounced the efficiency advantage of the joint classifier.
  • the joint classifier not only distinguishes between (multi-class) target images and non-target images, but also tries to reflect the differences between the target categories.
  • the use of coarse-to-fine multi-level sample categories allows the joint classifier to prioritize the overall difference between the target and the non-target, and then consider the differences between the target categories, further improving the efficiency of detection.
  • the above series of processing and devices can also be implemented by software and firmware.
  • a program constituting the software is installed from a storage medium or a network to a computer having a dedicated hardware structure, such as the general-purpose computer 900 shown in FIG. 9, which can be installed with various programs. Perform various functions and more.
  • a central processing unit (CPU) 901 executes various processes in accordance with a program stored in a read only memory (ROM) 902 or a program loaded from a storage portion 908 to a random access memory (RAM) 903.
  • data required when the CPU 901 executes various processing or the like is also stored as needed.
  • the CPU 901, the ROM 902, and the RAM 903 are connected to one another via a bus 904.
  • Input/output interface 905 is also coupled to bus 904.
  • the following components are connected to the input/output interface 905: an input portion 906 including a keyboard, a mouse, etc.; an output portion 907 including a display such as a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker And so on; a storage portion 908 including a hard disk or the like; and a communication portion 909 including a network interface card such as a LAN card, a modem, and the like.
  • the communication section 909 performs communication processing via a network such as the Internet.
  • the drive 910 is also connected to the input/output interface 905 as needed.
  • A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 910 as needed, so that the computer program read from it is installed into the storage portion 908 as needed.
  • a program constituting the software is installed from a network such as the Internet or a storage medium such as the removable medium 911.
  • such a storage medium is not limited to the removable medium 911 shown in FIG. 9 in which a program is stored and distributed separately from the device to provide a program to the user.
  • the detachable medium 911 include a magnetic disk (including a floppy disk (registered trademark)), an optical disk (including a compact disk read only memory (CD-ROM) and a digital versatile disk (DVD)), and a magneto-optical disk (including a mini disk (MD) (registered trademark) )) and semiconductor memory.
  • the storage medium may be a ROM 902, a hard disk included in the storage portion 908, or the like, in which programs are stored, and distributed to the user together with the device containing them.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Description

多类目标的检测装置及检测方法
技术领域
[01] 本发明涉及目标检测技术。尤其涉及用于对多个类别的目标数据进行 检测的检测装置及其检测方法。
背景技术
[02] 运用机器学习方法对图像或其它待检测数据进行目标数据的检测 显得越来越重要。 尤其是对图像中的物体检测已成为其中一个重要分支。
[03] 同一类物体受光照、视角、 姿态等多重因素的影响在图像中可能产 生出差异巨大的状态,这给图像中的物 测技术带来很大困难。 同一类 物体因而可能会被划分为多个子类进行处理,但如何既有效利用多个子类 之间的共性而又能准确区分其差别仍然是一个需要进一步研究的课题。
[04] 对多类图像物 测技术来说, 文献【1】提出了一种特征共享技 术,通过将多类物体的分类器进行联合训练,在多类之间尽可能共享特征, 以达到减少运算成本的目的。单纯的特征共享多类联合训练对减少运算成 本十分有效, 取得了良好的效果, 但其效率较低, 且由于共享特征的同时 也共享弱分类器导致在强分类器的后段,特征的共享越来越困难。文献【2】 在此^ I上进一步提出了一种向量 Boosting树算法 测图像中呈现不 同视角和不同姿态的人脸。 但同样文献【2】所提算法强制在各类之间进 行特征共享, 这使得当多类中的某一类不能较好的与其他各类共享特征 时, 强制的特征共享方式给分类器的进一步训练带来了困难。
参考文献
[1] A. Torralba, K.P. Murphy, and W.T. Freeman. Sharing Features: Efficient Boosting Procedures for Multiclass Object Detection. CVPR 2004.
[2] C. Huang, H. Ai, Y. Li, and S. Lao. Vector Boosting for Rotation Invariant Multi-View Face Detection. ICCV 2005.
发明内容
[05] 本发明的目的是提供一种区别于以上现有技术的、用于对多个类别 的目标数据进行检测的检测装置及其检测方法。
[06] 根据本发明的一个方面,提供了一种用于对多个类别的目标数据进 行检测的检测装置的训练方法, 包括:
确定所述多个类别中进行特征共享的最优特征共享样本类别集合,并 通过特征遍历为其挑选最优特征;
使用所述选中的最优特征对所述最优特征共享样本类别集合中的各 个类别分别构建弱分类器; 以及
通过迭代地进行最优特征 得到当前级强分类器的特征列表,同时 也为所述多个类别分别构建一组弱分类器,获得包括能处理所述多个类别 的相应多个强分类器的检测装置。
[07] 根据本发明的用于对多类目标数据进行检测的检测装置的训练方法使用多类目标的样本进行训练,通过确定所述多个类别中在哪些类别之间进行特征共享误差最小来获得特征共享样本类别集合,使用所述选中的最优特征对所述最优特征共享样本类别集合中的各个类别分别构建弱分类器, 由此构建包含弱分类器的检测装置。
[08] 根据本发明的另一个方面,提供了一种用于对多个类别的目标数据进行检测的检测装置和检测方法,其中检测装置包括: 输入单元,被配置成输入待检测数据;联合分类器, 包括数量与所述类别数量相对应并用于分别检测对应类别的目标数据的强分类器,其中,每个所述强分类器都由一组弱分类器相加得到,每个弱分类器使用一个特征对所述待检测数据进行弱分类,其中所述联合分类器内包含共享特征列表,所述共享特征列表中的每个特征被分别属于不同强分类器的一个或多个弱分类器共享使用;使用同一特征的分属不同强分类器的弱分类器具有彼此不同的参数值。这样,在针对各类目标的强分类器之间共享特征以减少计算成本,但各类之间不共享分类器以体现类间差异。
[09] 根据本发明的另一个方面, 提供了一种用于对 r个类别的目标数据 进行检测的检测装置的训练方法, 其中, 所述 r个类别可按预定相似性标 准由细到粗被逐级合并为预定多层结构,并且所述 r个类别作为划分最细 的类别设置在最底层, r为大于 1的自然数, 所述训练方法包括:
按照由粗到细的策略从最顶层类别开始训练相应的级分类器,每个级 - - 分类器包括具有与所针对类别数量相对应的数量的强分类器,所述各级分 类器串联形成所述检测装置,
其中, 针对其中一级准备检测 m个类别的级分类器的训练包括: 为该级分类器准备处理的 m个类别分别准备正样本集和负样本集, 其中 l<m≤r;
确定所述 m个类别中进行特征共享的最优类别集合, 并通过特征遍 历为其挑选最优特征;
使用所述选中的最优特征对所述最优特征共享样本类别集合中的各 个类别分别构建弱分类器; 以及
通过迭代地进行最优特征 得到当前级强分类器的特征列表,同时 也为所述 m个类别分别构建一组弱分类器, 获得包括可处理所述 m个类 别的 m个强分类器的级分类器。
[10] 据本发明第二个方面的用于对多个( r个)类别目标数据进行检测 的检测装置及其检测方法,其中, 所述多个类别按相似性标准被逐级合并 为预定多层结构, 并且所述多个类别作为划分最细的类别设置在最底层, 所述检测装置包括:
输入单元, 被配置成输入待检测数据; 以及
级联分类器, 所述级联分类器包括由多个串联的级分类器,所述多个 级分类器被配置成按照由粗到细的策略对所述预定多层结构中的各层类 别进行分类处理,并且每个级分类器都包括数量与所处理类别数量相对应 的强分类器, 其中, 每个所述强分类器包括一组弱分类器,每个弱分类器 使用一个特征对所述待检测数据进行弱分类,
其中每个所述级分类器包含一个共享特征列表,所述共享特征列表中 的每个特征被分别属于不同强分类器的一个或多个弱分类器共享使用;使 用同一特征的分属不同强分类器的弱分类器具有彼此不同的参数值。
[11] 类似地,根据本发明第二个方面,作为级联式分类器的检测装置在各 类目标的强分类器之间共享特征以减少计算成本,但各类之间不共享分类 器以体现类间差异。 同时为有效处理多类目标,在训练各级分类器的过程 中按照由粗到细的原则先将多个类别合并处理,而后逐渐拆分类别做细化 处理。 - - 附图说明
[12] 结合附图, 通过参考下列详细的示例性实施例的描述, 将会更好地 理解本发明本身、 优选的实施方式以及本发明的目标和优点。
图 1 示出了根据本发明第一实施例的用于对多类目标数据进行检测 的检测装置的训练方法。
图 2示出了根据本发明第一实施例的训练方法所使用的 Haar-like特 征原型。
图 3a和图 3b分别示出了弱分类器和强分类器的结构。
图 4 示出了根据本发明第一实施例的训练方法所获得的检测装置的 分类器。
图 5a和 5b分别列举了使用类别树结构 CT表示训练过程中的样本类 别变化。
图 6示出了根据本发明第三实施例的训练方法。
图 7 示出了根据本发明第二或第三实施例的训练方法所获得的检测 装置的分类器。
图 8 示出了根据本发明的检测装置检测图像或视频中预定多类目标的流程。
图 9是示出其中实现本发明的计算机的示例性结构的框图。
具体实施方式
[13] 下面将结合附图对本发明加以详细说明, 应指出的是, 所描述的实 施例仅旨在便于对本发明的理解, 而对其不起任何限定作用。
第一实施例的训练方法
[14] 第一实施例以多类汽车(轿车、 巴士和卡车)为待检测的目标。 应 了解, 本发明的实施例并不限于对图像和 /或视频中的汽车进行检测, 还 可以对图像和 /或视频中的其它物体(如多角度的人脸)、 甚至可对对实时 网络数据或主机数据进行入侵分类等等进行检测。
[15] 图 1示出了根据本发明第一实施例的用于对多类目标数据进行检测 - - 的检测装置的训练方法 100。
[16] 该方法在步骤 S101开始, 首先为所述多个类别分别准备正样本集 和负样本集。 本实施例中分别为三类汽车(轿车、 巴士和卡车)准备一定 数量的正样本集和负样本集, 其正样本集分别为三类汽车(轿车、 巴士和 卡车)正面视角的相同尺寸的汽车图像集, 尺寸统一为 32x32 (像素); 其负样本集从背景图像集(一组不包含目标物体的图像,尺寸不做任何要 求) 中抽样得到, 尺寸统一缩放到 32x32 (像素)。
[17] 同时准备训练特征池。将 Haar-like特征原形应用于例如 32x32 (像 素)的图像, 得到数十万具体的训练特征。但应了解本发明的实施例并不 限定所使用特征的具体种类, 例如可以是 Haar-like特征, HOG (梯度方 位直方图)特征, LBP (局部二值模式)特征或其他特征。
[18] 在此, 图 2示出了所使用的 Haar-like特征原型。 Haar-like特征为定义在图像中的一个矩形, 包括分别在图 2 中以白色和黑色表示的两部分, 矩形的方位分为直立和 45度倾斜两种。 Haar-like特征原型具有四个参数: 矩形在图像中的位置 (x, y) 和矩形的尺寸 (宽度 w 和高度 h), 随着矩形的位置、尺寸和宽高比的变化,可生成数以万计的具体 Haar-like特征作用于图像。 Haar-like特征的取值为一标量, 定义白色区域内所有像素的灰度值总和为 Sum_white(f_i),黑色区域内为 Sum_black(f_i),则 Haar-like特征值由公式 feature_i = Sum_white(f_i) − Sum_black(f_i) 计算。
[19] 从图 1的步骤 S102开始训练。 确定所述多个类别中进行特征共享 的最优特征共享样本类别集合,并通过特征遍历为其挑选最优特征。例如, 以多个待选训练特征为基础,通过使用前向顺序选择法等方法确定所述多 个类别(在此为 3个类别)中在哪些类别之间进行特征共享是误差最小的, 选择由所确定的类别组成的特征共享样本类别集合 S, 并通过特征遍历选 中相应的训练特征。
[20] 在确定特征共享样本类别集合 S及相应的所选最优特征后,使用所 述选中的最优特征对所述最优特征共享样本类别集合中的各个类别分别 构建弱分类器(图 1中步骤 S103 )。 弱分类器的结构如图 3a所示, 在本 实施例中使用决策树作为弱分类器, 每个弱分类器使用一个 Haar-like特 征构建, 根据输入的特征值与阈值的关系分类器有两个不同的输出。
[21] 图 1的步骤 S104,通过迭代地进行最优特征挑选得到当前级强分类 器的特征列表, 同时也为所述多个类别(在此 3个类别)分别构建一组弱 - - 分类器, 获得包括能处理所述多个类别的相应多个强分类器的检测装置。 针对每一类别的强分类器(H(Q分类器)的结构如图 3b所示, 其输出为 +1或 -1并且其阈值 Θ可以根据需要进行调节。
[22] 弱分类器 h(C,)使用的这些特征来自于分类器的特征列表 (特征组)。 分类器的训练过程就是寻找各 (G)分类器的过程, 也就是对每个类别搜 索多个弱分类器 h(C,)的过程, 最终通过迭代搜索各个弱分类器所使用的 特征的过程, 即特征^ ^过程。 此过程最后得到一组共享特征 。
[23] 迭代的步骤可以如本领域技术人员所了解指定迭代次数 T, 通过调 整样本权重开始下一次迭代分别为所述多个类别(在此 3个类别)再构建 弱分类器, 在满足迭代次数 Τ之后, 获得包括所有弱分类器的检测装置, 结束流程 (步骤 S105 )。
[24] 根据本发明, 优选地可以采用训练终止判断 ^来进行迭代, 对各 个类别的分类器 H(C,)分别设定训练所要达到的期望性能,如果某个类别在 训练过程中达到了其期望性能,则这个类别将退出该分类器的 c,)联合训 练过程。例如,对于所有属于当前特征共享样本类别集合 S的类别( c, ) 测试误检率 (C' H , ( NFA为该分类器将负样本集中样本误检为正样本 的数量, 为负样本的总数量), 如果/ (C,)< /,则类别 C,已经满足训练终止 条件, 退出该分类器的训练; 如果所有样本类别都满足训练终止条件, 则 结束该分类器的训练。 而如果有部分样本不满足训练条件, 则对于属于 S 的 d ( Ct S ), 则更新样 重: = 1 (^( (/ ^ , ^) '),对( , ^则 保持样本权重不变; 同时使所有样 重归一化使得∑H¾ = 1, 进行下一 次迭代。
[25] 根据本发明的第一实施例, 在对分类器的训练过程中, 使用前向顺 序选择法来确定在参与训练的所有类别中哪些类别之间进行特征共享是 整体误差最小的,即由哪些类别组成一个特征共享样本类别集合 S进行特 征共享是最优的,同时在特征库中挑选出对集合 S内的类别来说分类性能 最优的特征,然后对 S中的每个类别使用最优特征分别构建弱分类器。然 而,本发明并不限于前向顺序选择法,而是可以采用其它的顺序选择法(例 如后向顺序选择法)来选择由所确定的类别组成的特征共享样本类别集 合。
[26] 根据第一实施例的检测装置和检测方法
[27] 在第一实施例中对每个类别都训练一个强分类器,其中所有强分类 器的训练是联合进行的,各强分类器中的弱分类器所使用的特征在多类之 间进行共享,但各个弱分类器的训练在各类内部分别独立进行; 并不限定 特征被所有类别共享, 某个特征有可能被所有类别共享,也可能只被某些 类别共享。
[28] 根据本发明第一实施例的训练方法所获得的检测装置包括被配置 成输入待检测数据的输入单元、包括多个强分类器的联合分类器以及判别 单元,判别单元被配置成根据多个强分类器的分类结果,对所述待检测数 据属于哪个类别的目标数据进行判别。本领域的技术人员应当理解,可以 根据实际应用的需求来灵活地设置判别单元的具体判别标准和方式,或者 也可以不设置判别单元而直接得到联合分类器的分类结果,其均应在本发 明的精神和范围之内。
[29] 其中多个强分类器所组成的联合分类器如图 4所示, 包括 m个数 量与所述类别数量 m相对应并用于分别检测对应类别的目标数据(在此 实施例中 m=3 )的强分类器(在第一实施例中是 Boosting分类器 H(G = ∑hj(Cd) ), 其中每个强分类器包括一个或更多个弱分类器(hj(Q ), 其中, 每个所述强分类器都由一组弱分类器相加得到,每个弱分类器使用一个特 征对所述待检测数据进行弱分类; 其中联合分类器包含共享特征列表(即 共享特征组), 共享特征列表中的每个特征 ( ~ η )被分别属于不同强分 类器的一个或多个弱分类器共享使用(例如 /3并不为强分类器 H(G)和强 分类器 H(C¾所使用 );使用同一特征的分属不同强分类器的弱分类器具有 彼此不同的参数值。这样,在针对各类目标的强分类器之间共享特征以减 少计算成本, 但各类之间不共享分类器以体现类间差异。
[30] 在该检测装置内部, 待检测数据(例如样本图像)分别被所有类别 的强分类器进行处理并被判别单元进行判别,因而允许多于一个强分类器 的输出被判断为正, 而不是规定只有一个判断为正; 不同类别的强分类器 之间没有互斥关系, 某个待检测数据可能被判别为多个类别的目标数据。 只要有一个强分类器的输出被判别单元判其为正,则该检测装置的输出为
+1, 否则输出为 -1。
第二实施例
[31] 根据本发明的第二实施例,将用于检测多类目标数据的检测装置设 计成由多个级分类器串行联结的级联结构(Cascade ) 的分类器。 为此, 首先将训练级联分类器的各级分类器(SCJ所使用的样本类别人为设计 为预定多层结构(本发明中的第一多层结构)。 将划分最细的类别 (例如 - - r个类别, r为大于 1的自然数)设置在最底层, 然后根据预定相似性标 准将这些类别合并为较高一层的较少的几个较大的类,而后再逐级次合并 至最高层的例如一个大类为止。
[32] 图 5a和 5b示出了使用类别树结构 CT表示训练过程中的样本类别 变化。 图 5a中, 共有 7类物体的样本参与训练, 将这 7类设置在树的最 底层 Level3并称这 7类为"叶子类 "c," ; 然后根据某种相似性标准将 7类 样本中的某些类合并得到树的较高层 Level2的 3类 c," ; 最后将 Level2的 3类合并为最高层 Levell的 1类 c。";在训练中使用样本时从 CT的 Levell 开始先使用较高层的样本类,即分类器训练的早期目标是整体上区分目标 物体和非目标物体;随着训练的进行当整体区分变得困难时再进行样本的 类别拆分使用 CT的 Level2的 3类样本,最后使用 CT的 7个叶子类的样 本。 图 5b仍针对轿车、 卡车和巴士 3类, 此 3类为 CT的"叶子类,,, 三 类合并后为 CT的根节点类 c"。相应的训练将从 c。"开始然后适时拆分为 3 个叶子类 c,"。 当然在将汽车分为卡车、 轿车、 巴士等等多个类别后, 还 可每个类别再继续划分为更细致的多个子类。
[33] 根据本发明的第二实施例, 用于对 r个类别的目标数据进行检测的 检测装置的训练方法, 包括:按照由粗到细的策略从最顶层类别开始训练 相应的级分类器,每个级分类器包括具有与所针对类别数量相对应的数量 的强分类器, 所述各级分类器串联形成所述检测装置。
其中, 针对其中一级准备检测 m个类别的级分类器的训练包括: 为该级分类器准备处理的 m个类别分别准备正样本集和负样本集, 其中 l<m≤r;
确定所述 m个类别中进行特征共享的最优类别集合, 并通过特征遍 历为其挑选最优特征;
使用所述选中的最优特征对所述最优特征共享样本类别集合中的各 个类别分别构建弱分类器; 以及
通过迭代地进行最优特征^ ^得到当前级强分类器的特征列表,同时 也为所述 m个类别分别构建一组弱分类器, 获得包括可处理所述 m个类 别的 m个强分类器的级分类器。
可以理解,级联分类器的某一级分类器 SCfc是针对此级分类器所要处 理的 m类样本训练得到的, 包含 m个 (G)分类器, 分别对应 m类样本。 同样,其中每一个强分类器 (G)是由多个弱分类器/ G)相加得到的。 - - 分类器的结构如图 3b所示,以决策树为例的弱分类器 h(C,)如图 3a所示。
[34] 弱分类器 h(C,)使用的这些特征来自于级分类器 SCfc的一组共享特 征 f 。 级分类器 8(^的训练过程就是寻找各 分类器的过程, 也就是 对每个类别搜索多个弱分类器 h(C,)的过程, 最终就是搜索各个弱分类器 所使用的特征的过程, 即特征^^过程, 由此得到所述共享特征组 。
[35] 与第一实施例类似,共享特征组中的任意一个特征都可能被多个类 别用于构建弱分类器, 即特征被多类共享;但弱分类器的参数才艮据各类的 数据分别计算得到, 即弱分类器并不在多类间共享。
[36] 如上所述, 在训练中, 先使用较高层的样本类别进行训练, 并设定 样本类别的拆分标准; 随着训练的进行, 当这个标准得到满足时, 将现有 类别拆分为较低层的更细致的样本类别继续训练, 直至最后拆分至最底 层。
[37] 第二实施例所采用的 "设定的样本类别拆分标准" 可以是有监督地 为各级指定子类划分,进行强制的人为样本类别拆分。例如为最顶层指定 第一级分类器, 为较高层指定第二、 第三级分类器等等。 也可以采用无监 督的自动产生子类并延续训练的方法。
[38] 可替代地, 第二实施例优选以训练集内的误差作为样本类别拆分的 判断标准。 即在训练正常进行时, 训练集内误差持续减小, 当集内误差难 以继续降低时,说明当前使用的某些样本类别的类内差异较大阻碍了训练 的继续, 应当进行样本类别拆分。 在这种情况下, 由于在训练除针对最底 层之外的其他各层类别的各级分类器的过程可能会拆分样本,因此尽管训 练时是按照由粗到细的策略针对预定多层结构中的每一层类别分别训练 一个或多个相应的级分类器, 但是例如当类内差异^ f艮大时针对某一层类 别、特别是针对最高层类别有可能并没有训练出对应的级分类器。训练完 成后各级分类器实际处理的多层结构类别层次(本发明中的第二多层结 构)可能与事先人为定义的预定多层结构(本发明中的第一多层结构)有 所区别。
[39] 具体地,针对准备处理除最底层类别之夕卜的其他各层类别的任意一 级分类器(即 l≤m<r ), 则在每次迭代过程中在为所述类别构建弱分类器 后进行有效性度量, 以判断是否进行样本类别拆分。
所述进行有效性度量包括: - - 将由目前所构建弱分类器组成的强分类器的阈值设为零,并测试所述 强分类器对相应类别的正负样本的分类误差;
判断所述分类误差是否随着逐个迭代过程逐渐降低; 和
如果判断所述分类误差不再随着逐个迭代过程逐渐降低、或者降低緩 慢, 或者发生震荡, 则退出该级分类器的训练, 并且将样本类别按从粗到 细拆分成下一层样品类别后重新开始该级分类器的训练。
[40] 如上所述, 所述预定多层结构类别的最高层可以是任意数量的类 别, 但通常具有 1个类别。根据第二实施例, 针对用于检测所述 1个类别 的目标数据的级分类器的训练包括: 准备正样本集和负样本集;对于所述 多个待选训练特征训练弱分类器,选择具有最小分类误差的弱分类器; 以 及通过迭代构建弱分类器, 获得由所获得的弱分类器构成的第一级分类 器, 通常用于区分目标图像和非目标图像。 同样, 迭代的次数可以预定, 也可以通过采用训练终止判断条件来自动判定。
[41] 类似地, 针对其他任意一级分类器迭代训练分类器时, 其迭代的次 数都可以预定,也可以通过采用训练终止判断条件来自动判定。针对训练 终止判断条件的描述如第一实施例中所述, 在此不再赘述。
[42] 不仅针对任意一级分类器可以设定训练终止判断条件,还可以整体 上对目标各类分别设定其期望训练性能(例如针对最底层类别分别设置总 误检率 ), 如果某个类别的训练已经达到了期望性能, 则这个类别不再 参与后续的各级分类器的训练。
第三实施例
[43] 第三实施例以轿车、 巴士和卡车作为待检测的目标, 描述了更详细 的对级联分类器的分类(训练)方法。
[44] 首先, 准备三类正样本集(汽车图像) P ( G ) ( i=l, 2, 3 )分别对应 轿车、 巴士和卡车, 将三类正样本合并为一类正样本集 P ( C。), 样本类 别树的结构如图 5b所示; 训练从 P ( G ) ( =0 )开始, 当需要进行正样本 类别拆分时将 P i=0 )拆分为 P ( G ) ( i= 2, 3 ); 并设定所有各类 的期望训练目标: 检测率 A和总误检率 Fi;
[45] 其次准备特征池, 将例如 Haar-like特征原形应用于 32x32 (像素) 的图像, 得到数十万具体特征。 - -
[46] 然后逐级训练各级分类器 SC7至 S Cn a 如图 6所示, 尤其示出了训 练第 级分类器 SCfc =l, 2, 3,···, n )的步骤:
[47] 在步骤 S601, 针对不同的类分别准备正样本集/ : 对应本级所 使用的正样本类别(一类或三类),使用前 k-l级分类器对正样本集 P( C 进行筛选, 通过去除判别为 -1 的样本得到当前正样本集/ 。 为每个正样 本賦予标记 +1 ©
同样在步骤 S601 ,对应各正样本集/ 分别准备负样本集 Nf 。可以通 过在背景图象中按照某种顺序截取与正样本图像尺寸相同的子图片,为各 类准备负样本集 。
优选地, 针对从第二级分类器开始的各级分类器, 为相关类别 G准 备负样本集包括: 使用前面所有的已有级分类器中的与 G相关的强分类 器组成的级联分类器,在背景图像中做窗口遍历搜索,将误判为正样本的 窗口图像添加到 G的负样本集 Nf 中。 负样本的数量可以根据实际需要确 定,例如可以规定某个类别的负样本的数量与其正样本的样本数目成固定 比例。 为每个负样本赋予标记 = -1。
在此可以整体上对各个样本类别分别设定其期望训练性能。例如定义 最底层类别 G的当前误检率为 FQ = U ( 为搜索得到的负样 量, 为搜索过的所有窗口图像的数目),如果类别 G的误检率 F 已经小于期 望总误检率 , 则类别 G不再参与后续训练。 如果所有类别的误检率都 小于其总误检率, 则退出全部训练过程。
同样在步骤 S601 , 为每个样本设定权重 = I/M (初始权重为 1/M ), M为样本总数。
在步骤 S601还可以设定级分类器的退出条件, 例如指定 T次迭代次 数, 在此是对各类目标设置期望最小检测率 φ和期望最大误检率 fio
[48] 从步骤 S602开始^^特征, 对每个类别搜索多个弱分类器 h(C,)的 过程, 最终通过迭代搜索各个弱分类器所使用的特征。
设定 ί=0, 1, ... , 进行第 个特征的挑选
a) 在步骤 S602, 搜索最优的特征共享样本类别集合 S (在此例如使用 前向顺序选择法确定是哪些类别而不一定是所有类别共享该 t个特征 ): i. 对于所有 和 Nf ,计算当第 ς类不参与特征共享时所引入的误差 = ; - - 对所有各类独立进行弱分类器训练,即在特征池中 ^一个特征能对 当前类的正负样本集做误差最小划分; 记录各类所挑选出的最优特征 及其分类误差 e, 对所有类别计算: ( = ef'+ ( 当不参与时
Cj.c, cj≠c, 的误差), 取 C^argmin^) (使 达到最小值时的 ^的取值)为优先进入特 征共享样本类别集合候选 S的第一类,得到特征共享样本类别集合候选 S1; 将 C;分别与其他各类组合, 进行两类联合弱分类器训练, 记录各种组 合下挑选出的最优特征 f 以及分类误差 e ; 对所有组合计算 eS2( ) = e - + ∑ ,,取 C2*=argmin( )作为特征共享样本类别集合候选 S的第 二类, 得到特征共享样本类别集合候选 。
以此类推, 直到处理完所有类;
在以上所得的所有 中,取特征共享误差最小的集合作为特征共享样 本类别集合 S, 即 S = argmm( ; 记录相应的/ 为最优特征 /,*。
b) 在步骤 S603, ^用 /为5中的所有各类构建决策树弱分类器 W,Q, 其结构如图 3a所示;
c) 在步骤 S604, 为特征共享样本类别集合 S中的所有样本类别更新
H,(C,)分类器: H ^ = HD + f:,Ci 并根据期望最小检测率 确定 分类 器的阈值 (即在当前阈值下, 分类器在当前正样本集的检测率为^); d) 在步骤 S605, 为特征共享样本类别集合 S中各类样本的训练有效性 度量: 例如通过设置各类样本的 (c)分类器的阈值为零, 测试此时的 H(Q 分类器对各类内部的正负样本的分类误差, 并在步骤 S606判断该误差是否 随着训练逐渐降低。 如果这个误差不再降低、 或者降低緩慢、 或者发生震 荡, 则退出第 级分类器 SCfc的训练并将样本类别按从粗到细拆分成下一层 样品类别(例如按图 5b所示进行拆分)后重新开始第 级分类器 SCfc的训练
(见步骤 S607 );
e) 在步骤 S608, 若步骤 S606的判断结果为否, 则进行训练终止判断。 在此采用期望最大误检率 进行判断。 具体地, 对所有属于特征共享样本 类别集合 S的类别 ( c,e5 )测试误检率 /( =
Figure imgf000014_0001
( 为分类器将负样本 - - 集中样本误检为正样本的数量, 为负样本的总数量), 如果/ (C,)< /,则类 别 ς已经满足训练终止条件, 退出 级分类器的训练; 如果所有样本类别 都满足训练终止条件, 则结束第 级分类器的训练, 通过更新样本集合 G:
(P 进行下一级训练(见步骤 S609和 S610 )。
f) 在步骤 S611, 对属于 S的 G ( c^ s ) , 则更新样本权重:
Figure imgf000015_0001
对 则保持样本权重不变; 同时使所有样 本权重归一化使得 Z = 1, 重新开始下一次迭代。
[49] 应了解, 前几级分类器 SCfc =l, 2, 3,···, n )、 尤其是第 1级分 类器 SC7如果仅对 1个类别进行判别,则对于该级分类器的特征挑选则无 需使用例如前向顺序选择法等来搜索特征共享样本类别集合 S o 而对于后 几级分类器 SCfc如果已针对最底层的样品类别进行训练,则无需有效性度 量来判断是否需要样本类别拆分。
[50] 另夕卜,应了解对检测装置的训练并不限定具体的 Boosting算法, 而 可以是 Gentle-Boosting, Real-Boosting等等其它算法。
根据第二和第三实施例的检测装置和检测方法
[51] 根据本发明第二或第三实施例的训练方法所获得的检测装置包括: 输入单元, 被配置成输入待检测数据; 以及级联分类器。 其中级联分类器 如图 7所示, 包括由多个(n个) 串联的级分类器。
[52] 在此所述多个(r个)类别可按相似性标准被逐级合并为预定多层 结构, 并且所述多个类别作为划分最细的类别设置在最底层, 相应地, 所 述多个级分类器被配置成按照由粗到细的策略对所述预定多层结构中的 各层类别进行分类处理,并且每个级分类器都包括数量与所处理类别数量 相对应的强分类器。
[53] 每个所述强分类器包括一组弱分类器,每个弱分类器使用一个特征 对所述待检测数据进行弱分类,其中每个所述级分类器包含一个共享特征 列表,所述共享特征列表中的每个特征被分别属于不同强分类器的一个或 多个弱分类器共享使用;使用同一特征的分属不同强分类器的弱分类器具 有彼此不同的参数值。
[54] 根据第二或第三实施例的检测装置其整体上看是由多个"级分类 器" SC串行联结的一个级联结构分类器,但它是为多个类别物体的同时检 测设计的,在每个级分类器内部多个类别的强分类器被共享特征列表(即 - - 共享特征组 )结合在一起。
[55] 以第三实施例的检测装置为例,待检测数据逐次输入级联分类器的 各级分类器。
[56] 其中, 当待检测数据 某级分类器时,依次被此级所包含的 m个 强分类器判别,如果某个强分类器输出 +1,则此强分类器判别其为属于相 应类别的目标, 称为被此强分类器通过, 否则输出 -1, 判别为非对应类别 的目标, 称为被此强分类器拒绝。
[57] 其中, 所述判别过程如下: 计算当前级分类器的特征列表中的所有 有效特征的值; 对此级的 m个强分类器, 依次按照对特征列表中各特征 的共享情况,根据计算已得的特征值确定各个弱分类器的输出,并相加得 到最终的强分类器的输出。
[58] 所述判别过程中, 如果待检测数据被某个用于检测类别 c,的强分类 器拒绝的情况下, 则后续级分类器中的用于检测类别 c,和其子类的相应强 分类器不再对所述输入的待检测数据继续判别,此时称所述待检测数据被 类别 c,所对应的叶子层类别拒绝。
[59] 所述判别过程中,各级分类器的特征列表中只与所述不再参与判别 过程的各强分类器相关的特征视为无效特征, 不再参与计算, 以节省计算 成本。
所述判别过程中,如果待检测数据被所有叶子层类别拒绝, 则中止判 别过程, 称此待检测数据为非目标数据。 所述判别过程的最后, 如果待检 测数据被最后一级分类器的某个强分类器通过,则判别此待检测数据为具 有所述强分类器所对应的目标类别属性,如果待检测数据被最后一级的多 个强分类器通过, 则判别此待检测数据具有相应的多重目标类别属性。
[60] 根据本发明的检测装置可对各种多类目标数据进行检测, 而在输入 的图像或视频中对多个类别的预定目标进行检测的情况下,根据本发明的 检测装置还可包括:被配置成对待检测图像或视频进行窗口遍历的窗口遍 历部件、和后处理部件,后处理部件被配置成将所述窗口遍历部件产生的 窗口进行合并, 并使用预定阈值对合并窗口进行过滤, 以获得最终的检测 结果。
[61] 图 8示出了根据本发明的检测装置检测图像或视频中预定多类目标的流程。
[62] 窗口遍历过程 810: 对任意给定的待检测图像(步骤 S811 )或从待 - - 检测视频中截取的图像, 使用矩形窗口进行图 ^4历(步骤 S812 ), 在步 骤 S813依次得到窗口图像(其中遍历的顺序和方式任意, 可以是从左到 右、从上到下的, 也可以是从右到左, 从下到上的; 遍历时窗口平移的步 长任意, 可以是逐像素的, 也可以是隔多个像素的, 或者与当前窗口的尺 寸成比例关系)。
[63] 在遍历时, 依次对扫描过程中得到的每个窗口应用所述级联分类 器,使用训练所得的分类器中的特征通过对窗口图像进行特征计算(步骤 S814 )并应用所述分类器进行分类(步骤 S815 )。 如果级联分类器判别此 窗口图像为目标类别 (具有一种以上的目标类别属性), 则记录此窗口在 原始图像中的位置和尺寸, 以及其所具有的所有目标类别属性(步骤 S816 )o 窗口遍历结束后, 按照一定的比例因子将图像缩小, 重新进行上 述窗口遍历和窗口图像判定过程。重复以上过程, 直到当图像缩小到窗口 遍历无法进行(图像的高度小于窗口高度, 或图像的宽度小于窗口宽度) 为止(见步骤 S817和 S818 )。 将所有正响应窗口按照其对应的图像与原 图像的尺寸比例因子映射到原图像,得到所有正响应在原图像中的位置和 尺寸。
[64] 遍历图像时除了采用以上的模式 WinScanModel (即选择固定尺寸 的窗口遍历图像, 遍历结束后, 按一定比例缩小或放大图像的尺寸, 使用 固定尺寸的窗口重新遍历图像), 还可采用模式 WinScanMode2, 其中保 持图像的尺寸不变, 选择第一次遍历时窗口的尺寸, 当遍历结束后, 按一 定比例缩小或放大窗口的尺寸, 重新遍历原图像。
[65] 对每个窗口图像使用训练所得级联式分类器进行判别后,如果分类 结果为 +1, 则: 如果选择 WinScanModel , 记录当前窗口的尺寸和位置, 并按照缩放图像的比例将当前窗口的尺寸和位置映射回原图像坐标空间, 得到当前响应在原图像中的位置和尺寸; 如果选择 WinScanMode2,则直 接记录当前窗口的尺寸和位置。
[66] 后处理流程 820由后处理部件执行, 包括: 步骤 S821窗口合并以 便合并相邻的正响应结果和步骤 S822阈值过滤以便舍弃弱响应, 并将经 过窗口合并和阈值过滤后剩余的合并结果作为最终的检测结果(步骤 S830 )„
具体地, 在图像中的同一目标(汽车)附近会产生多重响应, 将邻近 的多重响应合并为一个输出响应。 - - 首先, 所述合并过程定义 "临近"为具有相邻的窗口中心位置、 相近 的尺寸比例和相同的目标类别属性,然后计算临近的一簇目标窗口的平均 中心位置、平均窗口尺寸,并将合并的窗口的数量作为合并结果的置信度, 其次,所述合并过程对合并后的位置中心相邻和尺寸相近的合并结果 进行目标属性合并,即如果图像中某个位置附近有多个具有不同目标属性 的合并结果,统计各个目标属性的数量,取数量最大的目标属性为最终目 标属性, 取各个目标属性的置信度的和为最终合并结果的置信度,
所述合并过程结束后,当合并窗口的置信度大于或等于预设置信度阈 值时, 接受此合并结果, 否则舍弃此合并结果。
技术效果
1、根据本发明的各个实施例, 多类目标的分类器之间进行特征共享, 但与共享特征相关的弱分类器在各类内部单独构建的方式,使得各类目标 间的差异得以有效区分,提高了训练的收敛速度, 同时也提高了联合分类 器对各类目标间的区分性能。不硬性规定特征在所有类别中进行共享的方 式减少了不必要的运算。
2、 根据本发明的各个实施例, 多类分类器间的特征共享减少了多类 分类器的特征计算成本。
例如在本发明的第三实施例中, 给定三类汽车 (轿车、 卡车和巴士) 的样本共 17000个, 分别训练三个并行的级联分类器和一个特征共享的联 合分类器, 训练所得分类器使用 Haar-like特征的数量如下表所示:
由上表可见本发明实施例的方法可以大大减少所使用特征的数量。 设定开放汽车测试集(集内样本未参与训练, 包含三类汽车样本共
2264个)和背景图像测试集(尺寸不统一, 提供窗口图像约 5300000个)。 并行使用三类汽车的级联分类器处理以上测试集,然后使用特征共享的联 合分类器处理以上测试集, 测试结果如下表所示:
检测率 汽车测试集处理时间 误检率 背景图像处理时间
(表中数值原文以图片形式给出)
由上表可见, 两种方案具有类似的分类性能, ^^合分类器具有更高 的检测效率。分类器所用特征的计算越复杂,联合分类器的检测效率优势 就越明显。
3、 根据本发明的第二和第三实施例, 联合分类器既要区分(多类) 目标图像与非目标图像, 又要尽力体现各目标类别间的差异。 由粗到精的 多层次样本类别使用方式使得联合分类器优先体现目标与非目标之间的 整体差异, 而后考虑目标类别间的差异, 进一步提高了检测的效率。
4、 根据本发明的第二和第三实施例, 在多类联合训练时, 各类独立 使用负样本集的方式便于特征共享式的分类器采用 Cascade 结构形式以 获得更高的检测效率。
其他实施例
[67] In addition, it should be noted that the above series of processes and apparatuses may also be implemented by software and firmware. When implemented by software or firmware, a program constituting the software is installed from a storage medium or a network into a computer having a dedicated hardware structure, for example the general-purpose computer 900 shown in Fig. 9, which is capable of performing various functions when various programs are installed therein.
[68] In Fig. 9, a central processing unit (CPU) 901 performs various processes according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage section 908 into a random access memory (RAM) 903. Data required when the CPU 901 performs the various processes and so on is also stored in the RAM 903 as needed.
[69] The CPU 901, the ROM 902 and the RAM 903 are connected to one another via a bus 904. An input/output interface 905 is also connected to the bus 904.
[70] The following components are connected to the input/output interface 905: an input section 906 including a keyboard, a mouse and the like; an output section 907 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), a speaker and the like; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card, a modem and the like. The communication section 909 performs communication processing via a network such as the Internet.
[71] A drive 910 is also connected to the input/output interface 905 as needed. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory is mounted on the drive 910 as needed, so that a computer program read therefrom is installed into the storage section 908 as needed.
[72] In the case where the above series of processes is implemented by software, the program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 911.
[73] Those skilled in the art should understand that this storage medium is not limited to the removable medium 911 shown in Fig. 9, in which the program is stored and which is distributed separately from the device to provide the program to the user. Examples of the removable medium 911 include a magnetic disk (including a floppy disk (registered trademark)), an optical disk (including a compact disk read-only memory (CD-ROM) and a digital versatile disk (DVD)), a magneto-optical disk (including a MiniDisc (MD) (registered trademark)) and a semiconductor memory. Alternatively, the storage medium may be the ROM 902, the hard disk contained in the storage section 908 or the like, in which the program is stored and which is distributed to the user together with the device containing it.
[74] The preferred embodiments of the present invention have been described above. Those of ordinary skill in the art will appreciate that the scope of protection of the present invention is not limited to the specific details disclosed herein, but may include various changes and equivalents falling within the spirit and scope of the present invention.

Claims

1. A detection apparatus for detecting target data of multiple categories, comprising:
an input unit configured to input data to be detected; and
a joint classifier containing strong classifiers whose number corresponds to the number of the categories and which are respectively used to detect target data of the corresponding categories, wherein each of the strong classifiers is obtained by summing a group of weak classifiers, and each weak classifier performs weak classification on the data to be detected using one feature,
wherein the joint classifier contains a shared feature list, each feature in the shared feature list is shared by one or more weak classifiers belonging respectively to different strong classifiers, and weak classifiers that use the same feature but belong to different strong classifiers have parameter values different from one another.
2. The detection apparatus according to claim 1, further comprising: a discrimination unit configured to discriminate, according to the classification results of the multiple strong classifiers, to which category of target data the data to be detected belongs.
3. A detection apparatus for detecting target data of multiple categories, wherein the multiple categories are merged level by level into a predetermined multi-layer structure according to a similarity criterion and, as the most finely divided categories, are placed at the bottom layer, the detection apparatus comprising:
an input unit configured to input data to be detected; and
a cascaded classifier comprising multiple stage classifiers connected in series, the multiple stage classifiers being configured to classify the categories of the respective layers of the predetermined multi-layer structure according to a coarse-to-fine strategy, each stage classifier comprising strong classifiers whose number corresponds to the number of categories it processes, wherein each of the strong classifiers comprises a group of weak classifiers and each weak classifier performs weak classification on the data to be detected using one feature,
wherein each stage classifier contains a shared feature list, each feature in the shared feature list is shared by one or more weak classifiers belonging respectively to different strong classifiers, and weak classifiers that use the same feature but belong to different strong classifiers have parameter values different from one another.
4. The detection apparatus according to claim 3, wherein each stage classifier is further configured to: compute, for the input data to be detected, the feature values of the valid features in its shared feature list; and, for each strong classifier in the stage classifier, determine the outputs of the weak classifiers of that strong classifier by looking up the computed feature value list according to the features the strong classifier uses, and sum them to obtain the final output of the strong classifier.
5. The detection apparatus according to claim 3, wherein the cascaded classifier is configured to: have the input data to be detected discriminated in turn by the strong classifiers in the respective stage classifiers, and, in the case where the input data to be detected is judged to be non-target data by one of the strong classifiers used to detect a category c_i, make the corresponding strong classifiers in the subsequent stage classifiers that are used to detect the category c_i and/or its subclasses no longer continue to discriminate the input data to be detected.
6. The detection apparatus according to claim 5, wherein the cascaded classifier is configured to: judge, for each stage classifier, whether its shared feature list contains any feature related only to the strong classifiers that no longer take part in the discrimination process, and, if so, mark that feature as an invalid feature whose feature value is no longer computed.
7. The detection apparatus according to claim 3, wherein the cascaded classifier is configured to: terminate the classification processing if the data to be detected is rejected by all the strong classifiers in any one stage classifier, and judge the data to be detected to be non-target data.
8. The detection apparatus according to claim 3, wherein the last stage classifier of the multiple stage classifiers further comprises a discrimination unit configured to: judge, if the data to be detected is passed by a strong classifier, that the data to be detected has the target category attribute corresponding to that strong classifier; and judge, if the data to be detected is passed by multiple strong classifiers of the last stage classifier, that the data to be detected has the corresponding multiple target category attributes.
9. The detection apparatus according to claim 3, for detecting predetermined targets of multiple categories in an input image or video, further comprising: a window traversal component configured to perform window traversal on the image to be detected or on an image captured from the video to be detected,
wherein the cascaded classifier is configured to classify the window images obtained by the window traversal component and, in the case where a window image is judged to belong to a target category, record the position and size of the window in the original image and all the target category attributes it has.
10. The detection apparatus according to claim 9, further comprising a post-processing component configured to perform local neighborhood merging on the windows with target category attributes produced by the window traversal component.
11. The detection apparatus according to claim 10, wherein the post-processing component is further configured to:
for windows having adjacent window center positions, similar size ratios and the same target category attribute, compute the average center position and the average window size of a cluster of neighboring target windows, and take the number of merged windows as the confidence of the merged result;
perform target attribute merging on merged results whose centers are adjacent and whose sizes are similar, that is, if there are several merged results with different target attributes near some position in the image, sum the confidences of each target attribute, take the target attribute with the largest confidence sum as the final target attribute, and take the sum of the confidence sums of the respective target attributes as the confidence of the final merged result; and
accept the final merged result when its confidence is greater than or equal to a preset confidence threshold, and otherwise discard the final merged result.
12. A detection method for detecting target data of multiple categories, comprising:
inputting data to be detected; and
classifying the data to be detected using a joint classifier comprising multiple strong classifiers, wherein the strong classifiers, whose number corresponds to the number of the categories, are respectively used to detect target data of the corresponding categories, the joint classifier contains a shared feature list, each feature in the shared feature list is shared by one or more weak classifiers belonging respectively to different strong classifiers, and weak classifiers that use the same feature but belong to different strong classifiers have parameter values different from one another.
13. The detection method according to claim 12, further comprising: discriminating, according to the respective classification results of the multiple strong classifiers, to which category of target data the data to be detected belongs.
14. A detection method for detecting target data of multiple categories, wherein the multiple categories are merged level by level into a predetermined multi-layer structure according to a similarity criterion and, as the most finely divided categories, are placed at the bottom layer, the detection method comprising:
inputting data to be detected; and
classifying the data to be detected using a cascaded classifier comprising multiple stage classifiers connected in series, wherein the multiple stage classifiers classify the categories of the respective layers of the predetermined multi-layer structure according to a coarse-to-fine strategy, and each stage classifier comprises strong classifiers whose number corresponds to the number of categories it processes,
the step of classifying with the cascaded classifier comprising: having the input data to be detected discriminated in turn by the strong classifiers in the respective stage classifiers, wherein each stage classifier contains a shared feature list, each feature in the shared feature list is shared by one or more weak classifiers belonging respectively to different strong classifiers, and weak classifiers that use the same feature but belong to different strong classifiers have parameter values different from one another.
15. The detection method according to claim 14, wherein the step of having the input data to be detected discriminated in turn by the strong classifiers in the respective stage classifiers comprises: computing, for the input data to be detected, the feature values of the valid features in the shared feature list of the stage classifier; and, for each strong classifier in the stage classifier, determining the outputs of the weak classifiers of that strong classifier by looking up the computed feature value list according to the features the strong classifier uses, and summing them to obtain the final output of the strong classifier.
16. The detection method according to claim 14, wherein the step of having the input data to be detected discriminated in turn by the strong classifiers in the respective stage classifiers comprises: in the case where the input data to be detected is judged to be non-target data by one of the strong classifiers used to detect a category c_i, causing the corresponding strong classifiers in the subsequent stage classifiers that are used to detect the category c_i and/or its subclasses to no longer continue to discriminate the input data to be detected.
17. The detection method according to claim 16, wherein the step of having the input data to be detected discriminated in turn by the strong classifiers in the respective stage classifiers comprises: judging whether the shared feature list of the stage classifier contains any feature related only to the strong classifiers that no longer take part in the discrimination process, and, if so, marking that feature as an invalid feature whose feature value is no longer computed.
18. The detection method according to claim 14, wherein the step of classifying with the cascaded classifier further comprises: terminating the classification processing if the data to be detected is rejected by all the strong classifiers in any one stage classifier, and judging the data to be detected to be non-target data.
19. The detection method according to claim 14, further comprising, after the classification processing by the last stage classifier:
judging, if the data to be detected is passed by a strong classifier, that the data to be detected has the target category attribute corresponding to that strong classifier; and judging, if the data to be detected is passed by multiple strong classifiers of the last stage classifier, that the data to be detected has the corresponding multiple target category attributes.
20. The detection method according to claim 14, for detecting predetermined targets of multiple categories in an input image or video, further comprising: performing window traversal on the image to be detected or on an image captured from the video to be detected,
wherein the step of classifying the data to be detected using the cascaded classifier comprises: classifying, using the cascaded classifier, the window images obtained by the window traversal and, in the case where a window image is judged to belong to a target category, recording the position and size of the window in the original image and all the target category attributes it has.
21. The detection method according to claim 20, further comprising: performing local neighborhood merging on the windows with target category attributes produced by the window traversal.
22. The detection method according to claim 21, wherein the local neighborhood merging step comprises:
for windows having adjacent window center positions, similar size ratios and the same target category attribute, computing the average center position and the average window size of a cluster of neighboring target windows, and taking the number of merged windows as the confidence of the merged result;
performing target attribute merging on merged results whose centers are adjacent and whose sizes are similar, that is, if there are several merged results with different target attributes near some position in the image, summing the confidences of each target attribute, taking the target attribute with the largest confidence sum as the final target attribute, and taking the sum of the confidence sums of the respective target attributes as the confidence of the final merged result; and
accepting the final merged result when its confidence is greater than or equal to a preset confidence threshold, and otherwise discarding the final merged result.
PCT/CN2010/071193 2009-04-01 2010-03-23 多类目标的检测装置及检测方法 WO2010111916A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2012502431A JP5500242B2 (ja) 2009-04-01 2010-03-23 複数クラスの目標の検出装置および検出方法
US13/257,617 US8843424B2 (en) 2009-04-01 2010-03-23 Device and method for multiclass object detection
EP10758018A EP2416278A1 (en) 2009-04-01 2010-03-23 Device and method for multiclass object detection

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200910132668A CN101853389A (zh) 2009-04-01 2009-04-01 多类目标的检测装置及检测方法
CN200910132668.0 2009-04-01

Publications (1)

Publication Number Publication Date
WO2010111916A1 true WO2010111916A1 (zh) 2010-10-07

Family

ID=42804869

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2010/071193 WO2010111916A1 (zh) 2009-04-01 2010-03-23 多类目标的检测装置及检测方法

Country Status (5)

Country Link
US (1) US8843424B2 (zh)
EP (1) EP2416278A1 (zh)
JP (1) JP5500242B2 (zh)
CN (1) CN101853389A (zh)
WO (1) WO2010111916A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011216069A (ja) * 2010-03-16 2011-10-27 Panasonic Corp 物体識別装置、物体識別方法、及び、物体識別装置の学習方法
CN103150903A (zh) * 2013-02-07 2013-06-12 中国科学院自动化研究所 一种自适应学习的视频车辆检测方法
CN107180244A (zh) * 2016-03-10 2017-09-19 北京君正集成电路股份有限公司 一种基于级联分类器的图像检测方法及装置
CN111144478A (zh) * 2019-12-25 2020-05-12 电子科技大学 一种穿帮镜头的自动检测方法
CN111598833A (zh) * 2020-04-01 2020-08-28 江汉大学 一种目标样本瑕疵检测的方法、装置及电子设备

Families Citing this family (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101853389A (zh) * 2009-04-01 2010-10-06 索尼株式会社 多类目标的检测装置及检测方法
US8509526B2 (en) * 2010-04-13 2013-08-13 International Business Machines Corporation Detection of objects in digital images
CN102411716A (zh) * 2010-09-21 2012-04-11 索尼公司 目标检测和分类方法和装置
CA2827122A1 (en) 2011-02-11 2012-08-16 Arizona Board Of Regents For And On Behalf Of Arizona State University Methods, systems, and media for determining carotid intima-media thickness
US9684957B2 (en) 2011-02-11 2017-06-20 Arizona Board Of Regents, A Body Corporate Of The State Of Arizona, Acting For And On Behalf Of Arizona State University Systems methods, and media for detecting an anatomical object in a medical device image using a multi-stage classifier
JP2012243180A (ja) * 2011-05-23 2012-12-10 Sony Corp 学習装置および方法、並びにプログラム
CN102855500A (zh) * 2011-06-27 2013-01-02 东南大学 一种基于Haar和HoG特征的前车检测方法
US9330336B2 (en) * 2011-09-16 2016-05-03 Arizona Board of Regents, a body corporate of the State of Arizona, acting for and on behalf of, Arizona State University Systems, methods, and media for on-line boosting of a classifier
US8649613B1 (en) * 2011-11-03 2014-02-11 Google Inc. Multiple-instance-learning-based video classification
WO2013116865A1 (en) 2012-02-02 2013-08-08 Arizona Board Of Regents, For And On Behalf Of, Arizona State University Systems, methods, and media for updating a classifier
WO2013116867A1 (en) 2012-02-03 2013-08-08 Arizona Board Of Regents, For And On Behalf Of, Arizona State University Systems, methods, and media for monitoring the condition of a patient's heart
JP5780979B2 (ja) * 2012-02-17 2015-09-16 株式会社東芝 車両状態検出装置、車両挙動検出装置及び車両状態検出方法
US9443137B2 (en) * 2012-05-08 2016-09-13 Samsung Electronics Co., Ltd. Apparatus and method for detecting body parts
US9449381B2 (en) 2012-09-10 2016-09-20 Arizona Board Of Regents, A Body Corporate Of The State Of Arizona, Acting For And On Behalf Of Arizona State University Methods, systems, and media for generating and analyzing medical images having elongated structures
WO2015017796A2 (en) * 2013-08-02 2015-02-05 Digimarc Corporation Learning systems and methods
CN105723419B (zh) * 2013-11-19 2019-07-23 哈曼国际工业有限公司 对象追踪
US10387796B2 (en) * 2014-03-19 2019-08-20 Empire Technology Development Llc Methods and apparatuses for data streaming using training amplification
KR102445468B1 (ko) 2014-09-26 2022-09-19 삼성전자주식회사 부스트 풀링 뉴럴 네트워크 기반의 데이터 분류 장치 및 그 데이터 분류 장치를 위한 뉴럴 네트워크 학습 방법
CN105718937B (zh) * 2014-12-03 2019-04-05 财团法人资讯工业策进会 多类别对象分类方法及系统
CN104809435B (zh) * 2015-04-22 2018-01-19 上海交通大学 一种基于视觉一致性约束的图像目标分类方法
CN106295666B (zh) * 2015-05-14 2020-03-03 佳能株式会社 获取分类器、检测对象的方法和装置及图像处理设备
US10157467B2 (en) 2015-08-07 2018-12-18 Arizona Board Of Regents On Behalf Of Arizona State University System and method for detecting central pulmonary embolism in CT pulmonary angiography images
US10180782B2 (en) * 2015-08-20 2019-01-15 Intel Corporation Fast image object detector
WO2017029758A1 (ja) * 2015-08-20 2017-02-23 三菱電機株式会社 学習装置および学習識別システム
US9600717B1 (en) * 2016-02-25 2017-03-21 Zepp Labs, Inc. Real-time single-view action recognition based on key pose analysis for sports videos
CN109155069A (zh) 2016-03-09 2019-01-04 新加坡科技研究局 用于自动光学引线接合检验的自确定检验方法
US9471836B1 (en) * 2016-04-01 2016-10-18 Stradvision Korea, Inc. Method for learning rejector by forming classification tree in use of training images and detecting object in test images, and rejector using the same
CN107341428B (zh) * 2016-04-28 2020-11-06 财团法人车辆研究测试中心 影像辨识系统及自适应学习方法
CN106446832B (zh) * 2016-09-27 2020-01-10 成都快眼科技有限公司 一种基于视频的实时检测行人的方法
CN108072909A (zh) * 2016-11-17 2018-05-25 富士通株式会社 物品检测方法、装置和系统
CN106951899A (zh) * 2017-02-24 2017-07-14 李刚毅 基于图像识别的异常检测方法
CN108931540A (zh) * 2017-05-27 2018-12-04 富士通株式会社 物品检测装置
CN109308480A (zh) * 2017-07-27 2019-02-05 高德软件有限公司 一种图像分类方法及装置
CN109961079B (zh) * 2017-12-25 2021-06-04 北京君正集成电路股份有限公司 图像检测方法及装置
CN110163033B (zh) * 2018-02-13 2022-04-22 京东方科技集团股份有限公司 正样本获取方法、行人检测模型生成方法和行人检测方法
CN108388919B (zh) * 2018-02-28 2021-08-10 大唐高鸿信息通信(义乌)有限公司 车载短距离通信网安全特征的识别和预警方法
CN110414541B (zh) * 2018-04-26 2022-09-09 京东方科技集团股份有限公司 用于识别物体的方法、设备和计算机可读存储介质
CN109190455B (zh) * 2018-07-18 2021-08-13 东南大学 基于高斯混合和自回归滑动平均模型的黑烟车识别方法
CN109359683B (zh) * 2018-10-15 2021-07-27 百度在线网络技术(北京)有限公司 目标检测方法、装置、终端和计算机可读存储介质
TW202018727A (zh) * 2018-11-09 2020-05-16 財團法人工業技術研究院 整體式學習預測方法與系統
US11720621B2 (en) * 2019-03-18 2023-08-08 Apple Inc. Systems and methods for naming objects based on object content
CN110163183B (zh) * 2019-05-30 2021-07-09 北京旷视科技有限公司 目标检测算法的评估方法、装置、计算机设备和存储介质
US11120273B2 (en) * 2019-06-21 2021-09-14 Gfycat, Inc. Adaptive content classification of a video content item
US11132577B2 (en) * 2019-07-17 2021-09-28 Cognizant Technology Solutions India Pvt. Ltd System and a method for efficient image recognition
US11379991B2 (en) * 2020-05-29 2022-07-05 National Technology & Engineering Solutions Of Sandia, Llc Uncertainty-refined image segmentation under domain shift
CN111783876B (zh) * 2020-06-30 2023-10-20 西安全志科技有限公司 自适应智能检测电路及图像智能检测方法
CN112508062A (zh) * 2020-11-20 2021-03-16 普联国际有限公司 一种开集数据的分类方法、装置、设备及存储介质
CN113673576A (zh) * 2021-07-26 2021-11-19 浙江大华技术股份有限公司 图像检测方法、终端及其计算机可读存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1952954A (zh) * 2005-10-09 2007-04-25 欧姆龙株式会社 特定被摄体检测装置及方法
US20070154079A1 (en) * 2005-12-16 2007-07-05 Chao He Media validation
CN101315670A (zh) * 2007-06-01 2008-12-03 清华大学 特定被摄体检测装置及其学习装置和学习方法

Family Cites Families (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5675711A (en) * 1994-05-13 1997-10-07 International Business Machines Corporation Adaptive statistical regression and classification of data strings, with application to the generic detection of computer viruses
EP1049030A1 (en) * 1999-04-28 2000-11-02 SER Systeme AG Produkte und Anwendungen der Datenverarbeitung Classification method and apparatus
US6424960B1 (en) * 1999-10-14 2002-07-23 The Salk Institute For Biological Studies Unsupervised adaptation and classification of multiple classes and sources in blind signal separation
US7565030B2 (en) * 2003-06-26 2009-07-21 Fotonation Vision Limited Detecting orientation of digital images using face detection information
US20050114313A1 (en) * 2003-11-26 2005-05-26 Campbell Christopher S. System and method for retrieving documents or sub-documents based on examples
US7769228B2 (en) * 2004-05-10 2010-08-03 Siemens Corporation Method for combining boosted classifiers for efficient multi-class object detection
KR100682906B1 (ko) * 2004-12-09 2007-02-15 삼성전자주식회사 부스트 알고리즘을 이용한 영상의 얼굴 검출 장치 및 방법
JP4667912B2 (ja) * 2005-03-09 2011-04-13 富士フイルム株式会社 判別器生成装置、判別器生成方法およびそのプログラム
AU2006201849A1 (en) * 2005-05-03 2006-11-23 Tangam Gaming Technology Inc. Gaming object position analysis and tracking
US7817855B2 (en) * 2005-09-02 2010-10-19 The Blindsight Corporation System and method for detecting text in real-world color images
US7756313B2 (en) * 2005-11-14 2010-07-13 Siemens Medical Solutions Usa, Inc. System and method for computer aided detection via asymmetric cascade of sparse linear classifiers
JP4221430B2 (ja) * 2006-09-06 2009-02-12 株式会社東芝 識別器及びその方法
US7840059B2 (en) * 2006-09-21 2010-11-23 Microsoft Corporation Object recognition using textons and shape filters
US7756799B2 (en) * 2006-10-27 2010-07-13 Hewlett-Packard Development Company, L.P. Feature selection based on partial ordered set of classifiers
US7962428B2 (en) * 2006-11-30 2011-06-14 Siemens Medical Solutions Usa, Inc. System and method for joint optimization of cascaded classifiers for computer aided detection
US8031961B2 (en) * 2007-05-29 2011-10-04 Hewlett-Packard Development Company, L.P. Face and skin sensitive image enhancement
US20080298643A1 (en) * 2007-05-30 2008-12-04 Lawther Joel S Composite person model from image collection
US8160322B2 (en) * 2007-08-02 2012-04-17 Siemens Medical Solutions Usa, Inc. Joint detection and localization of multiple anatomical landmarks through learning
US20090161912A1 (en) * 2007-12-21 2009-06-25 Raviv Yatom method for object detection
CN101853389A (zh) * 2009-04-01 2010-10-06 索尼株式会社 多类目标的检测装置及检测方法
US8861842B2 (en) * 2010-02-05 2014-10-14 Sri International Method and apparatus for real-time pedestrian detection for urban driving
CN102147851B (zh) * 2010-02-08 2014-06-04 株式会社理光 多角度特定物体判断设备及多角度特定物体判断方法
US8401250B2 (en) * 2010-02-19 2013-03-19 MindTree Limited Detecting objects of interest in still images
JP2011181016A (ja) * 2010-03-04 2011-09-15 Fujifilm Corp 判別器生成装置および方法並びにプログラム
JP5394959B2 (ja) * 2010-03-23 2014-01-22 富士フイルム株式会社 判別器生成装置および方法並びにプログラム
US8879800B2 (en) * 2011-06-15 2014-11-04 Honeywell International Inc. Quality driven image processing for ocular recognition system
WO2013063765A1 (en) * 2011-11-01 2013-05-10 Intel Corporation Object detection using extended surf features

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1952954A (zh) * 2005-10-09 2007-04-25 欧姆龙株式会社 特定被摄体检测装置及方法
US20070154079A1 (en) * 2005-12-16 2007-07-05 Chao He Media validation
CN101315670A (zh) * 2007-06-01 2008-12-03 清华大学 特定被摄体检测装置及其学习装置和学习方法

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A. TORRALBA, K.P. MURPHY, W.T. FREEMAN: "Sharing Features: Efficient Boosting Procedures for Multiclass Object Detection", CVPR, 2004
ANTONIO TORRALBA ET AL.: "Sharing visual features for multiclass and multiview object detection", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 29, no. 5, 31 May 2007 (2007-05-31), pages 854 - 869, XP011175348 *
C. HUANG, H. AI, Y. LI, S. LAO: "Vector Boosting for Rotation Invariant Multi-View Face Detection", ICCV, 2005
CHANG HUANG, HAIZHOU AI ET AL.: "Vector Boosting for Rotation Invariant Multi-View Face Detection", The IEEE International Conference on Computer Vision (ICCV-05), Beijing, China, 20 October 2005 (20.10.2005), pages 446 - 453, XP010854821 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011216069A (ja) * 2010-03-16 2011-10-27 Panasonic Corp 物体識別装置、物体識別方法、及び、物体識別装置の学習方法
CN103150903A (zh) * 2013-02-07 2013-06-12 中国科学院自动化研究所 一种自适应学习的视频车辆检测方法
CN107180244A (zh) * 2016-03-10 2017-09-19 北京君正集成电路股份有限公司 一种基于级联分类器的图像检测方法及装置
CN107180244B (zh) * 2016-03-10 2020-10-23 北京君正集成电路股份有限公司 一种基于级联分类器的图像检测方法及装置
CN111144478A (zh) * 2019-12-25 2020-05-12 电子科技大学 一种穿帮镜头的自动检测方法
CN111144478B (zh) * 2019-12-25 2022-06-14 电子科技大学 一种穿帮镜头的自动检测方法
CN111598833A (zh) * 2020-04-01 2020-08-28 江汉大学 一种目标样本瑕疵检测的方法、装置及电子设备

Also Published As

Publication number Publication date
EP2416278A1 (en) 2012-02-08
US20120089545A1 (en) 2012-04-12
JP2012523027A (ja) 2012-09-27
JP5500242B2 (ja) 2014-05-21
US8843424B2 (en) 2014-09-23
CN101853389A (zh) 2010-10-06

Similar Documents

Publication Publication Date Title
WO2010111916A1 (zh) 多类目标的检测装置及检测方法
JP7458328B2 (ja) マルチ分解能登録を介したマルチサンプル全体スライド画像処理
Caldelli et al. Fast image clustering of unknown source images
JP5282658B2 (ja) 画像学習、自動注釈、検索方法及び装置
CN105184260B (zh) 一种图像特征提取方法及行人检测方法及装置
CN112101430A (zh) 用于图像目标检测处理的锚框生成方法及轻量级目标检测方法
JP2010514041A (ja) 複数画像レジストレーション装置及び方法
CN112633382A (zh) 一种基于互近邻的少样本图像分类方法及系统
WO2000055811A1 (fr) Processeur de donnees, procede de traitement de donnees, et support d'enregistrement
Chokkadi et al. A Study on various state of the art of the Art Face Recognition System using Deep Learning Techniques
Luo et al. SFA: small faces attention face detector
CN117011563B (zh) 基于半监督联邦学习的道路损害巡检跨域检测方法及系统
WO2015146113A1 (ja) 識別辞書学習システム、識別辞書学習方法および記録媒体
CN113158777A (zh) 质量评分方法、质量评分模型的训练方法及相关装置
Shang et al. Improving training and inference of face recognition models via random temperature scaling
Tiwari et al. Dgsac: Density guided sampling and consensus
CN104123382B (zh) 一种社会媒体下的图像集摘要生成方法
Abayomi-Alli et al. Facial image quality assessment using an ensemble of pre-trained deep learning models (EFQnet)
JP2023029236A (ja) オブジェクト検出モデルを訓練するための方法及びオブジェクト検出方法
Sangineto Statistical and spatial consensus collection for detector adaptation
CN114332523A (zh) 用分类模型进行分类的装置和方法及计算机可读存储介质
CN112766139A (zh) 目标识别方法及装置、存储介质及电子设备
CN113298087B (zh) 图片分类模型冷启动的方法、系统、装置及介质
Xie et al. Siamese network with phash for video vibration detection
CN108304870B (zh) 点线特征融合的错误匹配剔除方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10758018

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2010758018

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2012502431

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 13257617

Country of ref document: US