WO2010109645A1

WO2010109645A1 - Subject identifying method, subject identifying program, and subject identifying device

Info

Publication number: WO2010109645A1
Application number: PCT/JP2009/056230
Authority: WO
Inventors: 亨米澤
Original assignee: グローリー株式会社
Priority date: 2009-03-27
Filing date: 2009-03-27
Publication date: 2010-09-30

Abstract

A subset determining section selects a predetermined number of feature values used for separating subject image samples and non-subject image samples in order of degree of separation of the samples, the subject image sample most separated from the non-subject image samples as the most-separated sample with respect to the selected feature values, extracts a subset including the selected most-separated sample from the subject image samples, and derives a discriminator corresponding to the subset by learning by the LDA array method. A leaf node determining section determines a subset by expanding the subset by using the discriminator, removes the determined subset from the subject image samples, and determines a combination of the subset determined by repeating the feature value selection, the most-separated sample selection, and the subset determination and a discriminator corresponding to the subset to serve as a leaf node. A face image identifying device is constructed to realize the above constitution.

Description

Subject identification method, subject identification program, and subject identification device

The present invention uses a discriminator arranged at each node of the tree structure, and applies the discriminator from the root node, which is the vertex of the tree structure, to the terminal leaf node, so that the subject image and the non-subject image In particular, when identifying a subject while classifying the subject into a plurality of categories, the identification accuracy of the subject is improved and the time required for the identification process is reduced. The present invention relates to a subject identification method, a subject identification program, and a subject identification device.

Conventionally, there is known a face image identification method for automatically identifying whether or not a human face is included in an image captured by a monitoring camera or an authentication camera. A technique such as a subspace method is generally used for such a face image identification method.

For example, as a face image identification method using the Integral Image method, a plurality of rectangular areas are set in an image, and the total value obtained by adding the feature values of all the pixels included in each rectangular area is used. There is a technique for detecting a face image based on this (see Patent Document 1, Patent Document 2, and Non-Patent Document 1).

In order to identify faces in a plurality of directions using these techniques, the face angles are divided into predetermined angle ranges (for example, every 30 degrees), and templates corresponding to the divided ranges (discrimination) are determined. It is common to create a device in advance.

Also, by classifying the created template into a tree structure in advance and tracing from the root node that is the vertex of the tree structure to the leaf node that is the end of the tree structure, it is possible to determine which template the input face image corresponds to. It is necessary to judge.

JP 2007-34723 A JP 2007-109229 A

However, the method of creating a template for each range of a predetermined angle in a so-called decision has a problem that the identification performance of subject identification is greatly influenced by the way of dividing the face angle. Also, how to divide the face angle is determined based on experience values such as experimental data, and it is not clear how the face angle can be divided to obtain better discrimination performance. .

Also, when classifying templates corresponding to multiple face orientations into a tree structure, the number of stages of the tree structure and the way of branching are determined based on empirical values. It is not known whether better identification performance can be obtained if arranged.

The number of templates increases / decreases depending on how the face angle is divided, and the combination of templates that are passed through at the time of identification changes depending on what kind of branch tree structure the template is classified into. Processing time varies. For this reason, there is a problem that the identification performance is sufficient but the processing time is too long, or the processing time is sufficiently short but the identification performance is insufficient.

Furthermore, when a face image is detected by a face image identification method using the Integral Image method, in order to shorten the processing time required for the face image detection process, the areas of the rectangular areas for which the feature value summation value is calculated are compared. It is necessary to set large. However, if the area of the rectangular region is increased, the feature value summation value fluctuates greatly due to the influence of direct sunlight in an image where the direct sunlight hits the face, and the detection accuracy of the face image decreases. Also, when detecting a face image using the subspace method, the subspace method has a large amount of computation, and therefore processing time required for the face image detection process is increased.

For these reasons, the face image identification method, the face image identification program, or the like that can reduce the processing time required for the identification process while improving the identification accuracy of the face image for a plurality of face orientations such as a front face and an oblique face. How to realize a face image identification device is a big problem.

Such a problem occurs not only when identifying facial images by classifying them according to face orientation, but also when identifying facial features by classifying them into categories such as men, women, and races. It is a problem. Such a problem is not a problem that occurs only when a face image is an identification target, but is a problem that similarly occurs when a specific subject is the identification target.

The present invention has been made to solve the above-described problems of the prior art, and when classifying a subject while classifying it into a plurality of categories, the time required for the identification process is improved while improving the identification accuracy of the subject. It is an object to provide a subject identification method, a subject identification program, and a subject identification device that can be shortened.

In order to solve the above-described problems and achieve the object, the present invention uses a discriminator arranged at each node of the tree structure, and from the root node, which is the vertex of the tree structure, toward the terminal leaf node. A subject identification method for identifying a subject image and a non-subject image by applying the discriminator, wherein a plurality of feature quantities used for separation of the subject image sample and the non-subject image sample are separated from each other. Selecting a predetermined number of features from the highest, and selecting the subject image sample that is most separated from the non-subject image sample as the most separated sample for the feature amount selected by the feature amount selection step A most separated sample selection step, and a subset including the most separated sample selected by the most separated sample selection step. A subset determination step of extracting the discriminator corresponding to the subset from the image sample and deriving the discriminator by learning using an LDAArray method, and determining the subset by expanding the subset based on the discriminator. And removing the subset determined by the subset determination step from the subject image sample and then repeating the feature selection step, the most separated sample selection step, and the subset determination step. And a leaf node determination step of determining each of the subsets and the set of discriminators corresponding to the subsets as the leaf nodes.

Further, in the present invention according to the above invention, in the subset determination step, when the subset including the most separated sample is first extracted from the subject image sample, the distance from the most separated sample is short. A predetermined number of the subject image samples are included in the subset and extracted.

Also, the present invention is characterized in that, in the above invention, the subset determining step stops the expansion of the subset when the number of variations of the subset is less than a predetermined threshold.

Further, the present invention is the above-described invention, wherein in the above invention, a branch that represents the mother set composed of the subsets determined as the leaf nodes by the leaf node determination step as a set of node candidates including a predetermined number of the subsets. An evaluation value calculating step for calculating an evaluation value indicating an acceptance rate of the non-subject image sample by the branching plan for each branching plan, and the evaluation value calculated by the evaluation value calculating step is minimized. Each of the node candidates included in the branching plan is determined as a node immediately below the root node, and when the node candidate includes a plurality of the subsets, the number of the subsets included in the node candidates All node decisions that determine all nodes are repeated by repeating the node determination until 1 becomes 1. Characterized in that it further includes a step.

Further, the present invention is the above invention, wherein the evaluation value calculation step is any one of the subset determined as the leaf node by the leaf node determination step and the discriminator corresponding to the subset. The acceptance rate of the non-subject image sample is calculated for every combination that assumes the input of any one of the subsets to the one discriminator, and the evaluation of the branching plan is performed based on the acceptance rate A value is calculated.

Further, the present invention is the above invention, wherein the evaluation value calculating step sets the maximum acceptance rate for all the subsets included in the node candidate for each of the node candidates included in the branching plan. Based on the representative acceptance rate of the node candidate, the evaluation value for the branch plan is calculated based on the representative acceptance rate of the node candidate included in the branch plan.

Also, in the present invention according to the above invention, the all-node determining step may include, when the node candidate determined as the node includes a plurality of the subsets, the non-subject image sample and the node candidate. The discriminator corresponding to the node candidate is derived by learning by the LDAArray method using all the included subsets as inputs.

The present invention further includes an LDAArray step number determining step for determining the number of LDAArray steps in the discriminator so that the number of steps in which the discriminator calculates the non-subject image sample is minimized. Features.

Also, in the present invention according to the above-described invention, the LDAArray stage number determining step may include the LDAArray stage number in the node and the LDAArray in all the subordinate nodes when the node corresponding to the discriminator has a subordinate node. The number of LDAArray stages is determined such that the total number of LDAArray stages, which is the sum of the number of stages, is minimized.

Further, the present invention uses a discriminator arranged at each node of the tree structure, and applies the discriminator from the root node that is the vertex of the tree structure toward the terminal leaf node, thereby subject image Feature selection program for selecting a plurality of feature quantities used for separation of a subject image sample and a non-subject image sample from a higher degree of separation between both samples. A most separated sample selection procedure for selecting the subject image sample most separated from the non-subject image sample as the most separated sample for the feature amount selected by the feature amount selection procedure; and the most separated sample Extracting a subset including the most separated sample selected by the selection procedure from the subject image sample; A subset determination procedure for determining the subset by deriving the classifier corresponding to the subset by learning by the LDAArray method and extending the subset based on the classifier; and the subset determination The subset obtained by repeating the feature amount selection procedure, the most separated sample selection procedure and the subset determination procedure after removing the subset determined by the procedure from the subject image sample and the portion A leaf node determination procedure for determining a set of discriminators corresponding to a set as the leaf nodes is executed by a computer.

Further, the present invention uses a discriminator arranged at each node of the tree structure, and applies the discriminator from the root node that is the vertex of the tree structure toward the terminal leaf node, thereby subject image And a non-subject image discriminating device for selecting a plurality of feature amounts used for separation of a subject image sample and a non-subject image sample from a higher number of the separation degree of both samples. And a most separated sample selecting means for selecting the subject image sample most separated from the non-subject image sample as the most separated sample for the feature quantity selected by the feature quantity selecting means, and the most separated sample A subset including the most separated sample selected by the selection means is extracted from the subject image sample and the unit The classifier corresponding to the set is derived by learning by the LDAArray method, and the subset is determined by expanding the subset based on the classifier, and the subset determining unit After the determined subset is removed from the subject image sample, the feature amount selection unit, the most separated sample selection unit, and the subset determination unit are repeated to obtain the subset and the subset. And a leaf node determining means for determining the corresponding set of discriminators as the leaf nodes.

According to the present invention, a predetermined number of feature quantities to be used for separation of the subject image sample and the non-subject image sample are selected from the ones with a higher degree of separation between both samples, and the non-subject image sample is selected for the selected feature quantity. The most separated subject image sample is selected as the most separated sample, a subset including the selected most separated sample is extracted from the subject image sample, and a discriminator corresponding to this subset is learned by the LDAArray method. Deriving and determining the subset by expanding the subset based on this discriminator, removing the determined subset from the subject image sample, selecting the feature quantity, selecting the most separated sample, and the subset A subset obtained by repeating the decision and a set of discriminators corresponding to this subset are used as leaf nodes. Since it was decided to determine respectively, so dynamically generating identification Yoboku structure by learning, while improving the accuracy of identifying the object, an effect that it is possible to shorten the time required for the discrimination processing.

According to the present invention, when a subset including the most separated sample is first extracted from the subject image sample, a predetermined number of subject image samples are included in the subset from the shortest distance from the most separated sample. Since extraction is performed, there is an effect that the initial member of the subset can be determined by simple processing.

Further, according to the present invention, when the variation number of the subset becomes less than the predetermined threshold, the expansion of the subset is stopped, so by detecting the variation convergence of the number of members of the subset, There is an effect that the expansion of the subset can be stopped at an appropriate timing.

In addition, according to the present invention, with respect to a branching plan in which a mother set composed of subsets determined as leaf nodes is represented as a set of node candidates including a predetermined number of subsets, the acceptance of non-subject image samples by the branching plan is provided. An evaluation value indicating the rate is calculated for each branch plan, and each of the node candidates included in the branch plan that has the smallest calculated evaluation value is determined as a node immediately below the root node, and a plurality of subsets are included in the node candidates. If all nodes are determined by repeating node determination until the number of subsets included in this node candidate is 1, all nodes in the identification tree structure are moved. The effect is that it can be determined automatically.

In addition, according to the present invention, for each of the subsets determined as leaf nodes and the discriminators corresponding to the subsets, all of the assumptions that any one subset is input to any one discriminator Since the acceptance rate of each non-subject image sample is calculated for each combination, and the evaluation value for the branch plan is calculated based on the calculated acceptance rate, the evaluation value taking into account the accuracy and processing amount of each branch plan is calculated. By using it, there is an effect that an appropriate branching plan can be selected.

Further, according to the present invention, for every node candidate included in the branching plan, the maximum acceptance rate for all subsets included in the node candidate is set as the representative acceptance rate of this node candidate, and then the branching plan is included. Since the evaluation value for the branch plan is calculated based on the representative acceptance rate of the included node candidates, the appropriate branch plan is calculated by calculating the evaluation value of the branch plan based on the representative acceptance rate for each node candidate. There is an effect that can be selected.

Further, according to the present invention, when a plurality of subsets are included in a node candidate determined as a node, learning by the LDAArray method using a non-subject image sample and all the subsets included in the node candidate as inputs. Thus, the classifier corresponding to this node candidate is derived, so that an appropriate classifier can be derived for all the nodes other than the leaf nodes.

Also, according to the present invention, the number of LDAArray stages in this discriminator is determined so that the number of stages in which the discriminator calculates the non-subject image sample is minimized, so that by suppressing the processing amount of each discriminator There is an effect that the processing time of the entire identification process can be reduced.

Further, according to the present invention, when a node corresponding to a discriminator has subordinate nodes, the total number of LDAArray stages, which is the sum of the number of LDAArray stages in this node and the number of LDAArray stages in all subordinate nodes, is minimized. Since the LDAArray stage number is determined, it is possible to determine the appropriate LDAArray stage number by appropriately estimating the processing amount of the discriminator corresponding to each node.

FIG. 1 is a diagram showing an outline of a subject identification method according to the present invention. FIG. 2 is a block diagram illustrating the configuration of the face image identification apparatus according to the present embodiment. FIG. 3 is a diagram showing an outline of the subset determination process. FIG. 4 is a diagram illustrating an example of leaf node information. FIG. 5 is a diagram illustrating an example of a branching plan. FIG. 6 is a diagram illustrating a combination of a subset and a discriminator and an acceptance rate. FIG. 7 is a diagram illustrating an example of all node information. FIG. 8 is a diagram showing an outline of the LDAArray stage number determination process. FIG. 9 is a diagram illustrating a relationship between a predetermined number of LDAArray stages and the total number of LDAArray stages for all pixels. FIG. 10 is a diagram showing the face image detection capability. FIG. 11 is a flowchart illustrating a processing procedure of leaf node determination processing. FIG. 12 is a flowchart illustrating a processing procedure of all node determination processing. FIG. 13 is a diagram showing an outline of the LDAArray method. FIG. 14 is a block diagram showing the configuration of the LDAArray unit. FIG. 15 is a diagram illustrating processing for acquiring a feature amount from a sample image. FIG. 16 is a diagram illustrating a process of calculating an aggregate discriminator candidate. FIG. 17 is a diagram illustrating a process for calculating an offset of an aggregation discriminator candidate. FIG. 18 is a diagram illustrating an example of the aggregate discriminator selection. FIG. 19 is a diagram illustrating a process for deriving an aggregation classifier. FIG. 20 is a flowchart illustrating a processing procedure executed by the LDAArray unit. FIG. 21 is a flowchart showing the processing procedure of the aggregate discriminator determination process. FIG. 22 is a diagram showing an outline of the AdaBoost method.

Explanation of symbols

DESCRIPTION OF SYMBOLS 10 Face image identification apparatus 11 Control part 11a Subset determination part 11b Leaf node determination part 11c Branch plan generation part 11d Branch plan determination part 11e Other node determination part 11f LDAArray stage number determination part 11g Identification part 12 Storage part 12a Face image sample 12b Non Face image sample 12c All node information 12ca Leaf node information 12cb Other node information 100 LDAArray unit 111 Control unit 111a Adaboost processing unit 111b Aggregation discriminator derivation unit 111c Aggregation weight coefficient determination unit 111d Sample weight update unit 111e Final discriminator determination unit 112 Storage Part 112a Face image sample 112b Non-face image sample 112c Aggregation discriminator candidate 112d Aggregation discriminator 112e Aggregation weight coefficient

Hereinafter, with reference to the accompanying drawings, preferred embodiments of the subject identification method according to the present invention will be described in detail. In the following, an outline of a subject identification method according to the present invention will be described with reference to FIG. 1, and then an embodiment of a face image identification device to which the subject identification method according to the present invention is applied will be described. Hereinafter, a case where the subject to be identified is a face image will be described.

FIG. 1 is a diagram showing an outline of a subject identification method according to the present invention. Note that (A) in the figure shows a case where facial images are classified and classified by face orientation, and (B) in the figure shows facial features in categories such as male, female, and race. Each case of classification and identification is shown.

As shown in FIG. 1A, in the subject identification method according to the present invention, a “leaf node” that is a terminal node of a tree structure (tree structure) is determined by learning using an “LDAArray method” (FIG. 1A). (Refer to (A-1)), by determining the “inner node” that is each branch of the tree structure and the “root node” that is the vertex of the tree structure based on the determined “leaf node” ( The main feature is that the discrimination tree structure is dynamically generated (see (A-2) in the figure).

Here, the “LDAArray method” is an improved version of the AdaBoost method that is widely used as a boosting learning method, and a predetermined number of unbinarized discriminators are represented by an LDA (Linear Discriminant Analysis) method. An aggregation discriminator is derived by aggregating using, and a final discriminator is derived based on the derived aggregation discriminator. Details of the LDAArray method will be described later with reference to FIG.

Conventionally, when discriminating face images using classifiers placed at each node (root node, internal node, or leaf node) of a discrimination tree structure (tree structure), what sorter is placed at which node Whether to do so was based on experience, so-called decisions.

In other words, in the past, discriminators to be placed at each node of the tree structure are determined in advance by decisions based on experience, and the branching method of the tree structure and the number of stages of the tree structure are also determined within the decisions. It was.

For this reason, when the sample group of face images to be identified is class A and the sample group of non-face images to be excluded is class B, conventionally, the rejection rate of class B that is an exclusion target is sufficiently high. There was no problem. This is because the tree structure determined in advance is not the optimal tree structure for separating class A and class B. In other words, the tree structure is the actual class A and class B. This is probably because the sample distribution was not reflected.

Therefore, in the subject identification method according to the present invention, each node of the tree structure is determined by learning using the above-described class A (face image sample group) and class B (non-face image sample group). In such learning, the above-described LDAArray method is used, so that the discrimination accuracy of each discriminator associated with each node is increased and the elimination rate of class B by the final tree structure itself is increased.

Furthermore, in the subject identification method according to the present invention, the processing amount required for the identification processing is reduced by deriving the discriminator corresponding to each node while taking into account the processing amount required for learning by the LDAArray method.

Specifically, in the subject identification method according to the present invention, as shown in FIG. 1A, first, a “leaf node” that is a terminal node of a tree structure is determined (a1, a2, a3, a4 and a5). Here, each “leaf node” is associated with a discriminator derived by learning by the LDAArray method and a “subset of class A” by which the discriminator can be separated from class B.

Then, determine “other nodes (internal node and root node)” other than “leaf nodes” by evaluating both the identification accuracy and the processing amount for how the determined “leaf nodes” should be combined. (A6, a7, a8 and a9 in the figure). That is, the number of stages of the tree structure and the branching method are not determined in advance, and differ depending on the determination result of “leaf node” and “other node”.

Note that the “subset of class A” associated with leaf node a1 is referred to as “subset a1”, and the subset of class A associated with leaf node a2 is referred to as “subset a2”. The “subset a6” associated with the other node a6 is a direct sum of the “subset a1” and the “subset a2”.

By the way, FIG. 1A shows a case where the face image is classified while being classified according to the orientation of the face. However, the subject identification method according to the present invention can be applied to general identification processing using a tree structure. it can. For example, as shown in FIG. 1B, the subject identification method according to the present invention is also applied to the case where the front face features are classified while being classified into categories such as men, women, and races. Can do.

Specifically, “leaf nodes” are determined by learning using the LDAArray method (b1, b2, b3, b4, b5, and b6 in the figure), and combinations of discriminators corresponding to these “leaf nodes” are determined. By evaluating, "other nodes" are determined (b7, b8 and b9 in the figure). The details of the classifier combination evaluation will be described later with reference to FIGS.

By doing so, as shown in FIG. 1B, the root node is b9, the nodes immediately below the root node are b7, b8, b5 and b6, and the nodes directly below b7 are directly below b1, b2, and b8. A tree structure having nodes b3 and b4 is generated.

Here, each face image shown in FIG. 1B is a “class A subset” corresponding to each node. For example, the leaf node b2 corresponds to a woman, the leaf node b4 corresponds to a deeply carved person, and the leaf node b6 corresponds to a person who has received backlight. The face image shown in the other node b7 is a direct sum of the face image shown in the leaf node b1 (class A subset b1) and the face image shown in the leaf node b2 (class A subset b2). It becomes.

As described above, the subject identification method according to the present invention generates a tree structure for identification based on learning by the LDAArray method, and determines each node while taking the processing amount into account when generating the tree structure. Therefore, the time required for the identification process can be shortened while improving the identification accuracy of the subject. Hereinafter, the subject identification method according to the present invention is referred to as “LDAFlow method”.

Hereinafter, a case where the subject identification method (LDAFlow method) according to the present invention is applied to a face image identification apparatus that identifies a face image and a non-face image (for example, a background image) will be described.

FIG. 2 is a block diagram illustrating the configuration of the face image identification device 10 according to the present embodiment. As shown in the figure, the face image identification device 10 includes an LDAArray unit 100, a control unit 11, and a storage unit 12.

In addition, the control unit 11 identifies the subset determination unit 11a, the leaf node determination unit 11b, the branch plan generation unit 11c, the branch plan determination unit 11d, the other node determination unit 11e, and the LDAArray stage number determination unit 11f. 11g. And the memory | storage part 12 memorize | stores the face image sample 12a, the non-face image sample 12b, and all the node information 12c. Here, the all node information 12c includes leaf node information 12ca corresponding to the leaf node, and other node information 12cb corresponding to another node which is a node other than the leaf node.

The LDAArray unit 100 is a processing unit that performs learning by the LDAArray method described above. Here, the LDAArray unit 100 performs a process of receiving a predetermined face image sample set and a non-face image sample set from the control unit 11 and passing the discriminator derived by learning by the LDAArray method to the control unit 11. The configuration and processing contents of the LDAArray unit 100 will be described later with reference to FIG.

The control unit 11 determines a leaf node in the tree structure for identification, and determines a branch of the tree structure and the number of stages based on the determined leaf node, thereby performing processing for determining all nodes in the tree structure It is. That is, the control unit 11 is a processing unit that performs tree structure determination by the “LDAFlow method”.

Based on the face image sample 12a and the non-face image sample 12b read from the storage unit 12, the subset determination unit 11a temporarily determines a subset corresponding to each leaf node of the tree structure and determines a final subset. It is a processing part which performs the process to perform. Here, the subset refers to a subset of the face image sample 12a that can be separated from the non-face image sample 12b by each leaf node when the entire face image sample 12a is the entire set.

Further, the subset determination unit 11a is also a processing unit that updates the subset temporarily determined using the discriminator received from the leaf node determination unit 11b. Then, the subset determination unit 11a repeats the notification of the tentatively determined subset to the leaf node determination unit 11b and the reception of the discriminator from the leaf node determination unit 11b, so that the final corresponding to each leaf node is obtained. A subset is determined and notified to the leaf node determination unit 11b.

The leaf node determination unit 11b sends the subset (subset of the face image sample 12a) provisionally determined by the subset determination unit 11a and the non-face image sample 12b received via the subset determination unit 11a to the LDAArray unit 100. This is a processing unit that performs a process of notifying and receiving the discriminator derived by the LDAArray unit 100 as a temporary decision discriminator.

Further, the leaf node determination unit 11b notifies the temporarily determined discriminator to the subset determination unit 11a as needed, and repeats the process of receiving the subset determined temporarily by the subset determination unit 11a, thereby corresponding to each leaf node. The process of finally determining the discriminator and the subset to be performed is also performed. Then, the leaf node determination unit 11b registers the finally determined subset and classifier pair in the leaf node information 12ca of the storage unit 12.

Here, an outline of the subset determination process performed by the subset determination unit 11a will be described with reference to FIG. FIG. 3 is a diagram showing an outline of the subset determination process. Note that (A) in the figure shows the sample distribution for each feature amount used to separate class A (a set of face image samples 12a) and class B (a set of non-face image samples 12b). (B-1) to (B-6) in the figure show the procedure of the subset determination process, respectively.

As shown in FIG. 3A, when a predetermined feature amount is selected, the sample distribution for the selected feature amount is represented as a graph in which class A and class B have overlapping portions. The degree of overlap between class A and class B differs for each feature amount. In FIG. 3A, the horizontal axis represents the feature amount, and the vertical axis represents the frequency related to the sample distribution.

Therefore, the subset determination unit 11a selects a predetermined number of feature amounts from the one with the higher degree of separation between the class A and the class B. In the following, a case will be described in which the subset determining unit 11a selects two feature amounts from the one with the higher degree of separation between class A and class B. In the following, a case will be described in which the face image sample 31 belonging to class A and the non-face image sample 32 belonging to class B are arranged on a two-dimensional plane having two feature amounts as the vertical axis and the horizontal axis, respectively.

As shown in FIG. 3B-1, when the face image sample 31 belonging to class A is represented by a black circle and the non-face image sample 32 belonging to class B is represented by a white circle, a distribution map of each sample is obtained. It is done. Here, the subset determination unit 11 a selects a sample (most separated sample 33) farthest from the center of gravity position of the non-face image sample 32 distribution from the face image samples 31.

Subsequently, as shown in FIG. 3B-2, the subset determining unit 11a selects a predetermined number of face image samples 31 from the closest distance (Euclidean distance) from the most separated sample 33. In addition to A1 (see 34 in the figure). At the stage shown in FIG. 3B-2, the number of members of the subset A1 is four including the most separated sample 33.

Then, the subset determination unit 11a notifies the LDAArray unit 100 of the temporarily determined subset A1 (the number of members is 4) via the leaf node determination unit 11b. Then, the LDAArray unit 100 performs learning by the LDAArray method using the subset A1 (34) and class B (a set of non-face image samples 12b) as inputs, and determines the discriminator F1 corresponding to the subset A1 (34). To derive.

Subsequently, as shown in FIG. 3B-3, when the discriminator F1 is received via the leaf node determination unit 11b, the subset determination unit 11a uses the discriminator F1 to set the subset A1 ( The face image samples 31 other than 34) are evaluated.

Then, the face image sample 31 whose value evaluated by the discriminator F1 is equal to or greater than a predetermined value is added to the subset A1 (34) to generate a new subset A1 (35a). In the case shown in FIG. 3B-3, the number of members of the subset A1 (35a) is six including the most separated sample 33.

Furthermore, the subset determining unit 11a notifies the temporarily determined subset A1 (35a) to the LDAArray unit 100 via the leaf node determining unit 11b, and the LDAArray unit 100 includes the subset A1 (35a) and the class B (non-face). Learning is performed by the LDAArray method using a set of image samples 12b as input, and a discriminator F1 corresponding to the subset A1 (35a) is derived.

Subsequently, the subset determining unit 11a uses the discriminator F1 to evaluate the face image samples 31 other than the subset A1 (35a), and the face image sample whose value evaluated by the discriminator F1 is a predetermined value or more. 31 is added to the subset A1 (35a) to generate a new subset A1 (35b).

Then, the reconstruction and learning of the subset A1 shown in 34, 35a, and 35b in the figure are repeated, and the subset A1 is determined when the change in the number of members of the subset A1 becomes less than a predetermined threshold. Also, the discriminator F1 corresponding to the subset A1 is determined. The set of the subset A1 and the discriminator F1 determined in this way corresponds to the first leaf node.

Subsequently, if the subset A1 (35b) is determined as the final subset A1, the subset determination unit 11a sets the subset A1 (35b) as shown in (B-4) of FIG. The face image sample 31 belonging to class A is removed.

Then, as shown in (B-5) of FIG. 3, with respect to the face image sample 31 from which the subset A1 has been removed, the portion based on selection of the most separated sample 36, provisional determination of the subset A2 (37a), and learning The reconstruction of the set A2 (37a) is repeated to determine the final subset A2 (37b). Also, the discriminator F2 corresponding to the subset A2 is determined. The set of the subset A2 and the discriminator F2 determined in this way corresponds to the second leaf node.

Subsequently, as shown in (B-6) of FIG. 3, the reconstruction is repeated for the subset A3 as in the case of the subset A1 and the subset A2 (see 38a and 38b in FIG. 3). The final subset A3 and the discriminator F3 are determined. Hereinafter, each leaf node is determined by repeating the procedures (B-2) to (B-5) in FIG.

FIG. 3 shows the case where two feature amounts are selected from the one with the greater degree of separation between class A and class B. However, the feature amount to be selected is set to a predetermined number (n) of three or more. Also good.

In this case, the face image sample 31 belonging to class A and the non-face image sample 32 belonging to class B may be arranged on an n-dimensional plane having n feature values as axes. As the distance between samples, the Euclidean distance in the n-dimensional plane may be used.

Returning to the description of FIG. 2, the branch plan generation unit 11c will be described. The branch plan generation unit 11c is a processing unit that performs processing to generate a branch plan of a tree structure in which only leaf nodes are determined based on the leaf node information 12ca stored in the storage unit 12 by the leaf node determination unit 11b. The branch plan generation unit 11c also performs a process of notifying all the generated branch plans to the branch plan determination unit 11d.

Specifically, when n leaf nodes are determined, the branch plan generation unit 11c uses all combinations of leaf nodes ( _n C ₂ to _n C _i : 2 ≦ i ≦ n) as branch plans. Generate and notify all the generated branch plans to the branch plan decision unit 11d.

For example, when there are three leaf nodes (α, β, and γ), “Group A (α and β), Group B (only γ)” as branch plan 1 and “Group A (α) as branch plan 2 And γ), group B (only β) ”, branching plan 3 as“ group A (β and γ), group B (α only) ”, and branching plan 4 as“ group A (α, β and γ only) ” Thus, four branch plans are generated.

The branch plan determination unit 11d is a processing unit that performs processing to narrow down all branch plans received from the branch plan generation unit 11c to one branch plan. Specifically, the branching plan determining unit 11d selects any one part for any one classifier with respect to the subsets (A1 to An) and classifiers (F1 to Fn) included in the leaf node information 12ca. The acceptance rate of class B is calculated for every combination that assumes a set input.

Subsequently, the branch plan determination unit 11d calculates a representative acceptance rate that represents the group for each group included in each branch plan, and calculates an evaluation value for evaluating each branch plan based on the calculated representative acceptance rate. calculate. Then, the branch plan with the smallest evaluation value is determined as the branch to be used. Here, the evaluation value refers to the acceptance rate of class B as the entire branch plan.

If a group included in the branch plan determined by the branch plan determination unit 11d includes a plurality of leaf nodes, the branch plan generation unit 11c for the group includes all combinations of leaf nodes ( _n C ₁ to _n C _i ; 1 ≦ i ≦ n) is generated as each branch plan.

Then, the branching plan determining unit 11d calculates the evaluation value of each branching plan, and determines the branching plan having the smallest evaluation value as a branch under the group. Furthermore, the branching plan determining unit 11d repeats these processes until all the groups are composed of only one leaf node.

The other node determination unit 11e receives all branches determined from the branch plan determination unit 11d, that is, all nodes other than leaf nodes (other nodes), and determines a subset and a discriminator corresponding to the received other nodes. It is a processing part which performs the process to perform.

Specifically, when the predetermined other node is a group of a plurality of leaf nodes, the other node determination unit 11e calculates the direct sum of the subsets corresponding to the leaf nodes in the group. As a subset corresponding to. If the predetermined other node is a group of one leaf node, a subset and a discriminator corresponding to the leaf node are employed as they are.

Further, the other node determination unit 11e also performs a process of determining a discriminator corresponding to the determined subset. Specifically, the other node determination unit 11e notifies the LDAArray unit 100 of the determined subset. Then, the LDAArray unit 100 performs learning by the LDAArray method using the notified subset and class B (a set of non-face image samples 12b) as inputs, and derives and derives a discriminator corresponding to the subset. The classifier is returned to the other node determination unit 11e.

In this way, the other node determination unit 11e determines a subset and a set of discriminators for all nodes (other nodes) other than the leaf node already determined by the leaf node determination unit 11b. When the other node determination unit 11e determines a set of subsets and discriminators for all other nodes, the group and the branching relationship of each node are registered in the other node information 12cb.

Note that the face image sample 12a that does not belong to any subset may belong to the subset that is farthest from class B. In addition, when it is considered that the sample is clearly erroneous as a result of visual inspection, the face image sample 12a may be deleted.

Here, FIG. 4 illustrates an example of leaf node information 12ca, FIG. 5 illustrates an example of a branch plan generated by the branch plan generation unit 11c, FIG. 6 illustrates a specific example of the branch determination process performed by the branch plan determination unit 11d, and the like. An example of the node information 12cb will be described using FIG.

First, an example of the leaf node information 12ca will be described with reference to FIG. FIG. 4 is a diagram illustrating an example of the leaf node information 12ca. Note that (A) in the figure shows an example of the leaf node information 12ca, and (B) in the figure shows a branching example using the leaf node information 12ca.

FIG. 4A shows a case where six leaf nodes are determined by the leaf node determination unit 11b. As shown in FIG. 4A, the subset A1 and the discriminator F1 are determined as the first leaf node, the subset A2 and the discriminator F2 are determined as the second leaf node, and so on. In addition, six leaf nodes have been determined.

When six leaf nodes are determined in this way, for example, as shown in FIG. 4B, another node of the tree structure (a node indicated by white letters in the figure) 11e is determined. Here, the level corresponding to the root node of the tree structure is referred to as 0 level, the level immediately below the root node as 1 level, and the level immediately below 1 level as 2 levels.

In the case shown in FIG. 4B, the first-stage internal node that bundles the second-stage subset A1 and subset A2 is determined, and the second-stage subset A3 and subset A4 are bundled. The first internal node is determined. Then, the root node of the 0th stage is determined as a node that bundles all the nodes of the first stage.

Note that the tree structure shown in FIG. 4B is an example, and the number of stages of the tree structure and the way of branching differ depending on the result of the determination process by the other node determination unit 11e. In addition, the description “A1 + A2” shown in the drawing represents a direct sum of the subset A1 and the subset A2.

Next, an example of the branch plan generated by the branch plan generation unit 11c will be described with reference to FIG. FIG. 5 is a diagram illustrating an example of a branching plan. In the figure, an example of a branch plan generated by the branch plan generation unit 11c when six leaf nodes are determined (see FIG. 4A) is shown. In the following description, a node corresponding to the subset A1 is referred to as a node A1.

As shown in FIG. 5, the branch plan generator 11c generates a branch plan 1 for branching into a group 1 consisting of node A1, node A2, node A3 and node A4 and a group 2 consisting of node A5 and node A6. To do. Further, a branch plan 2 is generated that branches into a group 1 composed of the nodes A1, A2 and A3, a group 2 composed of the nodes A4 and A5, and a group 3 composed only of the node A6.

Similarly, the branch plan generation unit 11c generates all grouping patterns for the six leaf nodes. In the figure, a case where the branch plan generation unit 11c generates m grouping patterns, that is, a case where m branch plans are generated is illustrated.

Next, a specific example of the branch determination process performed by the branch plan determination unit 11d will be described with reference to FIG. FIG. 6 is a diagram showing the relationship between the distribution of the subset and the threshold value of the discriminator and the acceptance rate. Here, the acceptance rate in the figure refers to the acceptance rate of the class B image group when the class A threshold obtained by inputting the class A image group to the discriminator Fn is used. And

Note that (A) in the figure shows the relationship between the distribution of the subset and the threshold value of the discriminator, and (B) in the figure shows the acceptance of class B (set of non-face image samples 12b) in each combination. Examples of rates are shown for each. In FIG. 6B, 64 indicates a record for the discriminator F1, and 65 similarly indicates a record for the discriminator F2.

As shown in (A) of FIG. 6, the branching plan determining unit 11d performs all combinations of all subsets An included in the leaf node information 12ca and all discriminators Fn included in the leaf node information 12ca. Consider.

Here, since the discriminator F1 and the subset A1 are originally generated as a set for one leaf node, when the subset A1 and the class B are input to the discriminator F1, the class B is selected from the subset A1. It should be able to be separated efficiently (see 61 in the figure).

That is, in this case, the acceptance rate of class B is a low value. Here, the broken line shown by 61 in the figure corresponds to a predetermined deviation (for example, 3σ or 4σ) in the class A distribution. The ratio of class B distributed on the class A side from the broken line is the acceptance rate of class B.

On the other hand, since the subset A2 is originally generated as a set with the discriminator F2, when the subset A2 and class B are input to the discriminator F1, compared to the case where the subset A1 is input, Class B cannot be separated efficiently (see 62 in the figure). As described above, when a subset An originally generated as a pair with another classifier Fn is input to the classifier F1, the compatibility between the classifier F1 and each subset is determined.

Also, with respect to the discriminator F2, when the subset A1 is input (see 63 in the figure), when the subset A2 is input, the combinations with all the subsets An are considered, Similar processing is repeated for the discriminator Fn.

In this way, as shown in FIG. 6B, the acceptance rate of class B is calculated for each combination of the discriminator and the subset. Based on the acceptance rate, a representative acceptance rate representing each group included in the branch plan is calculated. For example, the representative acceptance rate representing the group 1 shown in the branch plan 2 of FIG. 5 is calculated by the following procedure.

First, the acceptance rate of class B when the classifier F1 and the subset A1 are combined is represented as AR (F1, A1). For example, in FIG. 6B, AR (F1, A1) = 1%.

The group 1 of the branch plan 2 includes three nodes A1 to A3. For this reason, the total number of combinations of classifiers and subsets is nine (3 × 3). In this case, the acceptance rates of the nine combinations are AR (F1, A1), AR (F1, A2), AR (F1, A3), AR (F2, A1), AR (F2, A2), AR, respectively. (F2, A3), AR (F3, A1), AR (F3, A2), and AR (F3, A3).

Here, the branching plan determining unit 11d sets the maximum acceptance rate among the nine class B acceptance rates as the representative acceptance rate representing the group 1 of the branching plan 2. For example, assuming that AR (F1, A2) = 70% is the maximum acceptance rate, the branching plan determining unit 11d calculates the representative acceptance rate in the group 1 of the branching plan 2 as 70%.

In addition, the branch plan determination unit 11d calculates the representative acceptance rate in the same manner for the other groups (group 2 and group 3) of the branch plan 2. Similarly, the branch plan determination unit 11d calculates the representative acceptance rate of the group included in each branch plan for all branch plans (in the case of FIG. 5, branch plan 1 to branch plan m).

Then, the branch plan deciding unit 11d calculates the evaluation value of each branch plan using the formula “evaluation value = Σ ((representative acceptance rate−γ) × number of leaf nodes + 1)”. Here, “Σ” represents the total sum regarding the number of groups, and “γ” represents a predetermined adjustment value. As described above, the branching plan determining unit 11d calculates the evaluation value of each branching plan, and adopts the branching plan having the smallest evaluation value.

In the case of branching plan 2 in FIG. 5, group 1 includes three nodes, and group 2 includes two nodes, so there is a possibility of further branching. For this reason, the branching plan determining unit 11d repeats the evaluation value calculation process until all the nodes included in each group become one. If all branches are determined, the branch plan determining unit 11d registers all the determined nodes and branch relationships in the all node information 12c of the storage unit 12.

Next, an example of all node information 12c will be described with reference to FIG. FIG. 7 is a diagram illustrating an example of all node information 12c. In the figure, the nodes represented by white letters represent the other nodes determined by the branching plan determining unit 11d, and the other nodes represent the leaf nodes determined by the leaf node determining unit 11b. .

For example, the node (A1 + A2) consisting of a set of the subset (A1 + A2) and the discriminator Fβ is determined by the branching plan determining unit 11d as the first stage node. Here, the discriminator Fβ is derived by learning by the LDAArray method by the LDAArray unit 100 having a subset (A1 + A2) and class B (a set of non-face image samples 12b) as inputs.

Further, as the 0th stage node, a branch plan deciding unit 11d decides a node (A1 + A2 + A3 + A4 + A5 + A6) consisting of a set of a subset (A1 + A2 + A3 + A4 + A5 + A6) and a discriminator Fα. Here, the discriminator Fα is derived by learning by the LDAArray method by the LDAArray unit 100 having a subset (A1 + A2 + A3 + A4 + A5 + A6) and class B (a set of non-face image samples 12b) as inputs.

Thus, the all node information 12c is information including a subset and a discriminator corresponding to all the nodes constituting the tree structure. When the other node determination unit 11e registers the other node information 12cb in the storage unit 12, the leaf node information 12ca and the other node information 12cb are combined to complete the entire node information 12c. Therefore, the LDA Array stage number determination unit 11f determines the “LDAArray stage number” of the discriminator corresponding to each node.

Here, “the number of LDAArray stages” refers to the number of aggregate discriminators (K) included in the discriminator derived by learning by the LDAArray method. Note that, by adjusting the number of LDAArray stages from the viewpoint of reducing the processing amount, it is possible to reduce the amount of calculation as a whole.

Referring back to FIG. 2, the LDAArray stage number determination unit 11f will be described. The LDAArray stage number determination unit 11f is a processing unit that performs a process of determining the LDAArray stage number of the discriminator corresponding to each node included in the all node information 12c. Here, the LDAArray stage number determination unit 11f determines the number of LDAArray stages of each discriminator so that the total number of LDAArray stages of class B when each discriminator is used is minimized.

Here, the outline of the LDAArray stage number determination process performed by the LDAArray stage number determination unit 11f will be described with reference to FIG. 8, and the relationship between the predetermined LDAArray stage number and the total LDAArray stage number will be described with reference to FIG.

First, an outline of the LDAArray stage number determination process performed by the LDAArray stage number determination unit 11f will be described with reference to FIG. FIG. 8 is a diagram showing an outline of the LDAArray stage number determination process. Note that (A) in the figure shows the arrangement of the discriminator as a premise for explanation, and (B) to (D) in the figure show the procedure for determining the number of LDAArray stages.

As shown in FIG. 8A, the classifier Fα is located at the root node, the classifiers Fβ, Fγ, F5, and F6 are subordinate to the classifier Fα, and the classifiers F1 and F6 are subordinate to the classifier Fβ. The procedure of the LDAArray stage number determination process when F2 and discriminators F3 and F4 are arranged under the discriminator Fγ will be described below.

As shown in FIG. 8B, the LDAArray stage number determination unit 11f first collates the class B image group with the discriminator Fα, and each pixel of each image is excluded by what number of LDAArray stage numbers. Calculate. Specifically, such a pixel is excluded when the number of pixels to be excluded becomes a predetermined threshold value or less. In the case shown in the figure, the pixel in the upper left corner is in the 5th row, the adjacent pixel is in the 20th row, and the adjacent pixel is in the 30th row. Yes.

If the number of stages of each pixel is calculated in this way, it is assumed that the collation has been stopped at a predetermined number of stages (10 in the figure), and the pixels that can be eliminated up to the predetermined number of stages are masked. The reason why the pixels that can be eliminated up to a predetermined number of stages are masked is that there is no need to further discriminate the subordinate classifier in the tree structure.

Subsequently, as shown in FIG. 8C, the LDAArray stage number determination unit 11f uses the discriminators under the discriminator Fα for pixels other than the pixels masked in FIG. 8B. The number of LDAArray stages to be excluded for each pixel is calculated. Then, the number of stages obtained in (C) of FIG. 8 is added to the number of stages obtained in (B) of FIG.

For example, the number of stages of the second pixel from the upper left corner to the right in FIG. 8B is “20”, and the number of excluded stages when the discriminator Fβ is used for such a pixel is “5”. Hereinafter, similarly, “7” for the discriminator Fγ, “3” for the discriminator F5, and “9” for the discriminator F6. In this case, the total number of LDAArray stages required for such pixel determination is 20 + 5 + 7 + 3 + 9 = 44.

In this manner, the LDAArray stage number determination unit 11f calculates the relationship between the total number of LDAArray stages for each pixel of class B and the predetermined number of LDAArray stages (10 stages in the figure). In the discriminator Fβ and the discriminator Fγ, pixels that could not be eliminated by a predetermined number of LDAArray stages (10 stages in the figure) are further masked (see FIG. 8D), for example, in the classifier F1. The number of exclusion stages is added.

Then, the LDAArray stage number determination unit 11f obtains the total number of LDAArray stages corresponding to the predetermined LDAArray stage number for each pixel while changing the predetermined LDAArray stage number from one stage to a predetermined stage number for each discriminator. Then, when the total number of LDAArray stages for each pixel is added for all pixels, the relationship between the predetermined number of LDAArray stages and the total number of LDAArray stages for all pixels is obtained for each discriminator.

FIG. 9 is a diagram showing the relationship between a predetermined number of LDAArray stages and the total number of LDAArray stages for all pixels. The horizontal axis of the graph shown in the figure represents a predetermined number of LDAArray stages, and the vertical axis represents the total number of LDAArray stages for all pixels.

As shown in FIG. 9, when the curve representing the relationship between the predetermined number of LDAArray steps and the total number of LDAArray steps for all pixels takes a minimum value 91, the LDAArray step number determination unit 11f corresponds to the minimum value 91. The number of LDAArray stages is determined as the number of LDAArray stages of the corresponding discriminator. In the case shown in the figure, the number of LDAArray stages is seven.

Thus, the LDAArray stage number determination unit 11f determines the LDAArray stage number for each discriminator included in the all node information 12c, and registers the determined LDAArray stage number in the all node information 12c.

Returning to the description of FIG. 2, the identification unit 11g will be described. The identification unit 11g is a processing unit that performs input image discrimination processing using the completed tree structure included in the all-node information 12c, the discriminator arranged at each node of the tree structure, and the number of LDAArray stages of each discriminator. .

Specifically, the identification unit 11g uses the completed tree structure (for example, see FIG. 8A) and applies each discriminator from the root node, which is the vertex of the tree structure, to the terminal leaf node. By evaluating, it is determined which node the input image corresponds to. If none of the nodes correspond, the input image is determined to belong to class B (a set of non-face image samples 12b).

The storage unit 12 is a storage unit configured by a storage device such as a nonvolatile memory or a hard disk drive, and stores a face image sample 12a, a non-face image sample 12b, and all node information 12c.

The face image sample 12a is a group of face image samples belonging to class A. The non-face image sample 12b is a sample group of non-face images (for example, background images) belonging to class B. Since all node information 12c has already been described with reference to FIG. 4 or FIG. 7, description thereof is omitted here.

Next, experimental data obtained by calculating the rejection rate of non-face images by experiments using the face image identification device 10 will be described with reference to FIG. FIG. 10 is a diagram showing the face image detection capability.

Note that the horizontal axis of the graph shown in FIG. 10 indicates the number of cases in which a non-face image is misrecognized as a face image, and the vertical axis indicates the ratio in which the face image is correctly recognized as a face image. For comparison, the graph also shows LDAArray method data and AdaBoost method data, in addition to the LDAFlow method data performed by the face image identification device 10. Note that the number of face images and the number of non-face images used in the graph shown in FIG. 10 are both about 10,000.

As shown in FIG. 10, as a result of conducting an experiment on all face image samples 12a stored in the DB (storage unit 12), the positive recognition rate (see the solid curve) by the LDAFlow method is different from that of the other methods. It turned out to be higher.

Specifically, the graph shown in the figure uses the distribution of two populations of a class A face image group and a class B non-face image group, and the number of non-face images shown on the horizontal axis in the figure is represented by a face. When a threshold value is set at a position that is erroneously recognized as an image, a ratio in which the face image is correctly recognized as a face image with the threshold value is plotted in the vertical axis direction.

That is, the smaller the number of non-face images that are mistakenly recognized as face images, the stricter the threshold value, and the threshold value indicated by the broken line in FIG. Conversely, the greater the number of mis-recognized non-face images as face images, the lower the threshold value, and the threshold value indicated by the broken line in FIG.

That is, in the graph shown in FIG. 10, according to the LDAFlow method performed by the face image identification device 10, the face image is correctly recognized as a face image even if the number of non-face images mistakenly recognized as face images increases. It can be seen that the ability to do is good.

Next, a processing procedure of leaf node determination processing performed by the leaf node determination unit 11b and the like will be described with reference to FIG. FIG. 11 is a flowchart illustrating a processing procedure of leaf node determination processing. This figure shows a case where the subset determining unit 11a selects one feature amount in the process of selecting a predetermined number of feature amounts from the one with the greater degree of separation between class A and class B. .

As shown in FIG. 11, in the leaf node determination process, first, a counter i is initialized to 1 (step S101), and a feature quantity that most separates class A and class B is selected (step S102). Then, a sample (MAX) that is most separated from class B with respect to the selected feature quantity is extracted from class A (step S103).

Subsequently, the extracted MAX (most separated sample) and a predetermined number of class A samples within a predetermined distance are added to the subset (Ai) (step S104). Then, the discriminator (Fi) is derived by performing learning using the LDAArray method using the subset (Ai) and class B (step S105).

Then, another sample of class A is evaluated using the unbinarized discriminator (fi) of the discriminator (Fi) (step S106), and the unbinarized discriminator (fi) becomes equal to or greater than the threshold value (α). Other samples are added to Ai (step S107).

Subsequently, it is determined whether or not the change in the number of members of the subset (Ai) is less than the threshold (β) (step S108). If the change is less than the threshold (β) (step S108, Yes), the partial The set (Ai) is confirmed and removed from class A (step S109). On the other hand, when the determination condition of step S108 is not satisfied (step S108, No), the processing after step S105 is repeated.

Then, it is determined whether or not the number of samples remaining in the class A is equal to or less than a predetermined number or cannot be separated (step S110). If the determination condition of step S110 is satisfied (step S110, Yes), the process ends. To do. On the other hand, when the determination condition of step S110 is not satisfied (step S110, No), the counter i is counted up (step S111), and the processes after step S102 are repeated.

FIG. 11 shows the case where the subset determining unit 11a selects one feature amount in the process of selecting a predetermined number of feature amounts from the one with the greater degree of separation between class A and class B. Two or more feature amounts may be selected.

Next, the processing procedure of all node determination processing performed by the other node determination unit 11e will be described with reference to FIG. FIG. 12 is a flowchart illustrating a processing procedure of all node determination processing. In the drawing, a subset extracted from class A is described as “class Ak”. Further, “n” in the figure represents the number of leaf nodes determined by the leaf node determination unit 11b.

As shown in FIG. 12, in the all-node determination process, first, counters i and k are each initialized to 1 (step S201), and a threshold (γ) based on the variance of class Ak when discriminator Fi is used. Is calculated (step S202). Then, the acceptance rate of class B when the threshold value (γ) is used is calculated (step S203).

Subsequently, it is determined whether or not the counter k is equal to the number of leaf nodes n (step S204). If not equal (step S204, No), the counter k is incremented (step S205), and then step S203. The subsequent processing is repeated. On the other hand, when the determination condition of step S204 is satisfied (step S204, Yes), the process proceeds to step S206.

Subsequently, it is determined whether or not the counter i is equal to the number of leaf nodes n (step S206). If the counter i is not equal (step S206, No), the counter i is counted up (step S207), and then step S203. The subsequent processing is repeated. On the other hand, when the determination condition of step S206 is satisfied (step S206, Yes), the process proceeds to step S208.

The branch plan generation unit 11c generates each branch plan (step S208), and the branch plan determination unit 11d calculates a class B acceptance rate for each group included in each branch plan (step S209). Subsequently, the branching plan determining unit 11d calculates an evaluation value of each branching plan (step S210), and determines a branching plan having the smallest evaluation value as a new stage in the tree structure (step S211).

Then, it is determined whether or not the number of members of all the groups has reached 1 until the determined level (level in the tree structure) (step S212). If the number of members has all become 1 (step S212, Yes). ), The process is terminated. On the other hand, when the determination condition of step S212 is not satisfied (step S212, No), the processing after step S208 is repeated.

As described above, in the present embodiment, the subset determination unit selects a predetermined number of feature amounts used for separating the subject image sample and the non-subject image sample from the one with the higher degree of separation between both samples, For the selected feature amount, the subject image sample that is most separated from the non-subject image sample is selected as the most separated sample, and a subset including the selected most separated sample is extracted from the subject image sample and the subset. And the leaf node determination unit determines a subset by expanding the subset based on the discriminator, and the determined subset is determined as a subject image sample. Subsets obtained by repeating feature selection, selection of most separated samples, and subset determination To constitute a face image identification apparatus to determine each and a set of classifiers corresponding to the subset as a leaf node.

Therefore, by dynamically generating the tree structure used for separation of the face image sample and the non-face image sample based on learning, it is possible to improve the identification accuracy of the face image and reduce the time required for the identification process. it can.

Further, the above-described LDAFlow method can be applied not only to the identification of a face image but also to image identification such as banknote identification and currency identification.

In the following, the configuration and processing contents of the LDAArray unit 100 shown in FIG. 2 will be described. In the following description, an AdaBoost method widely used as a boosting learning method will be described with reference to FIG. 22, and an outline of the LDAArray method will be described with reference to FIG. 13, and then an LDAArray portion to which the LDAArray method is applied. 100 will be described.

FIG. 22 is a diagram showing an outline of the AdaBoost method. The AdaBoost method is a learning method for deriving a final discriminator having a high correct answer rate by combining a large number of binarized discriminators that output binarized discrimination results such as YES / NO and positive / negative based on the learning results. It is.

Here, the classifiers to be combined are weak classifiers (hereinafter referred to as “weak classifiers”) whose correct answer rate slightly exceeds 50%. That is, in the AdaBoost method, a final discriminator with a high correct answer rate is derived by combining a number of weak discriminators with a low correct answer rate.

First, the mathematical formula used for the AdaBoost method will be described. In the following, a case will be described in which a sample group of face images is class A, a sample group of non-face images is class B, and class A and class B are discriminated.

In AdaBoost technique, learning frequency of s (1 ≦ s ≦ S) , the weighting coefficient of each feature quantity x, a classifier corresponding to the feature quantity x h- _s (x), classifier _h s (x) _α _{If s} , then the final discriminator H (x) is

It is expressed as in equation (1-1).

Here, the function sign () is a binarization function that is +1 if the value in the parentheses is 0 or more and -1 if the value is less than 0. Further, as shown in the equation (1-2), the discriminator h _s (x) is a binarization discriminator that takes a value of −1 or +1. If it is determined as class B, it takes a value of -1.

In the Adaboost method, the discriminators h _s (x) shown in the equation (1-1) are selected one by one in one learning, and the weighting coefficient α _s corresponding to the selected discriminator h _s (x) is selected. The final discriminator H (x) is derived by repeating the sequential determination process. Hereinafter, the Adaboost method will be described in more detail.

_Assuming that x _i is each feature quantity and y _i is {−1, + 1} (the above-mentioned class A is +1, the above-mentioned class B is −1), the learning sample is {(x ₁ , y ₁ ), ( x ₂ , y ₂ ),..., (x _N , y _N )}. Here, N is the total number of feature quantities to be discriminated.

Also, _assuming that D _s (i) is a sample weight when the s-th learning is performed on the i-th learning sample, the initial value of D _s (i) is an expression “D ₁ (i) = 1. / N ". Then, _assuming that the discriminator corresponding to each feature quantity x _i is h _s (x _i ) and the weighting coefficient of each discriminator is α _s , each formula used in the Adaboost method is

It becomes.

In the following, with reference to FIG. 22, the above equations (2-1) to (2-4) will be described. As shown in (1) of the figure, in the first learning, the sample weight D ₁ (i) is set to 1 / N, and the learning sample distribution for each discriminator h _s is calculated. By doing so, a class A distribution and a class B distribution are obtained as shown in FIG.

Then, as shown in (2) of the figure, the error rate for each discriminator h _s (for example, the probability of misclassifying a sample of class A as class B) ε _s is calculated using equation (2-1). The classifier h _s calculated and having the lowest error rate ε _s , that is, the best discrimination is selected as the best discriminator.

Subsequently, as shown in (3-1) of the same figure, the weighting coefficient α _{s of the} discriminator h _s (the best discriminator selected in (2) of the same figure) using the equation (2-2). To decide. Then, each learning sample weight D _{s + 1} in the next learning is updated using Expression (2-3). Note that Z _s , which is the denominator of Expression (2-3), is expressed by Expression (2-4).

In this way, when the next learning sample weight D _{s + 1} is updated, the learning sample distribution for each discriminator h _s is shown in (1) of the figure, as shown in (4) of the figure. It will be different from the distribution. Then, the number of learning times s is counted up, the distribution shown in (1) in the figure is updated with the distribution calculated in (4) in the figure, and then the processes after (2) in the figure are repeated.

Here, the equation (2-3) represents the next learning sample weight so that the best discriminator selected in (2) in the figure becomes a discriminator having an error rate of 0.5 in the next learning. It shows that D _{s + 1} is determined. In other words, the process of selecting the next best classifier is performed using the learning sample weight that the best classifier is not good at.

In this way, the AdaBoost method repeats learning to perform selection of the discriminator and optimization of the weight coefficient of each discriminator, and finally, a final discriminator having a high correct answer rate can be derived. . However, as shown in the equation (1-2), the discriminator h _s (x) selected by the Adaboost method is a binarization discriminator, and finally the value held in the discriminator is 2 Output after converting to a value. That is, there is a problem in that a decision branch accompanying binary conversion is required, and the amount of calculation is increased.

Note that the RealBoost method uses a multi-value discriminator, so it is possible to avoid the problem of increasing the amount of computation due to the decision branch that occurs in the Adaboost method, but for each of the multi-values held by the multi-value discriminator. Since it is necessary to hold the corresponding weighting coefficient, there is a problem that the memory usage increases.

Therefore, by improving the Adaboost method, the “LDAArray method” was devised, which avoids the problem of increased computational complexity due to decision branching and improves the identification accuracy without requiring a large memory as in the real boost method. Below, the outline | summary of this LDAArray method is demonstrated using FIG.

FIG. 13 is a diagram showing an outline of the LDAArray method. Note that (A) in the figure shows an outline of the Adaboost method described with reference to FIG. 10, and (B) in the figure shows an outline of the LDAArray method. Also, the h _i the binary discriminator shown in the figure (A), f _i shown in the same figure (B) is a function before the h _i is binarized by a predetermined threshold value Each unbinarized discriminator is shown.

As shown in FIG. 13A, in the AdaBoost method, the discriminator having the smallest error rate is determined as h _{1 in the first} learning (see (A-1) in FIG. 13). Then, the weighting factor of h ₁ is determined (see (A-2) in the figure). In the next learning, the sample for each sample is set so that h ₁ becomes a discriminator having an error rate of 0.5. The weight is updated (see (A-3) in the figure).

Then, the final discriminator is derived by repeating selection of the discriminator, determination of the weight coefficient for the selected discriminator, and update of the sample weight.

On the other hand, as shown in FIG. 13B, in the LDAArray method, an aggregation discriminator is derived by aggregating a predetermined number of unbinarized discriminators fi using an LDA (Linear Discriminant Analysis) method, The main feature is that one final discriminator is derived based on one or more derived aggregate discriminators.

Specifically, the unbinarized discriminators are aggregated according to a predetermined procedure (see (B-1) in the figure), and the aggregate discriminators are derived using LDA (see (B-2) in the figure). ). Further, the weight coefficient of the derived aggregation discriminator is determined (see (B-3) in the figure), and the sample weight for each sample is updated (see (B-4) in the figure).

Then, the selection of the aggregate classifier, the determination of the weighting coefficient for the selected aggregate classifier, and the update of the sample weight are repeated to derive one final classifier. In this way, in the LDAArray method, a predetermined number of unbinarized discriminators are linearly combined, so that it is possible to reduce the amount of calculation involved in the discrimination processing.

That is, since the aggregate non-binary discriminator the elimination target (class described above B) until it is possible to some extent separate, wasteful decision branch (h _i shown in FIG. 13 (A) is always (Decision branch accompanying binary conversion to be performed) can be reduced. In addition, since the relationship between the feature amounts not considered in the Adaboost method shown in FIG. 13A can be grasped as a new feature, the discrimination accuracy can be improved.

FIG. 14 is a block diagram showing the configuration of the LDAArray unit 100. As shown in the figure, the LDAArray unit 100 includes a control unit 111 and a storage unit 112. The control unit 111 further includes an Adaboost processing unit 111a, an aggregate discriminator derivation unit 111b, an aggregate weight coefficient determination unit 111c, a sample weight update unit 111d, and a final discriminator determination unit 111e. Then, the storage unit 112 stores a face image sample 112a, a non-face image sample 112b, an aggregate discriminator candidate 112c, an aggregate discriminator 112d, and an aggregate weight coefficient 112e.

14 shows the case where the LDAArray unit 100 includes the control unit 111 and the storage unit 112, each processing unit in the control unit 111 is arranged in the control unit 11 shown in FIG. Each information stored in the storage unit 112 may be stored in the storage unit 12 illustrated in FIG. Further, the face image sample 112a shown in FIG. 14 is the same as the face image sample 12a shown in FIG. 2, and the non-face image sample 112b shown in FIG. 14 and the non-face image sample 12b shown in FIG. May be the same.

The control unit 111 is a processing unit that performs processing for deriving a final discriminator by learning using the above-described LDAArray method.

The Adaboost processing unit 111a is a processing unit that performs processing for executing the Adaboost method already described with reference to FIG. Further, the AdaBoost processing unit 111a repeats learning using the face image sample 112a and the non-face image sample 112b read from the storage unit 112 as samples, and collectively discriminates a set of the selected binarization discriminator and the determined weight coefficient. The processing to be passed to the container derivation unit 111b is also performed.

When the updated sample weight is received from the sample weight update unit 111d, the AdaBoost processing unit 111a updates the sample weight D _s (see FIG. 22) with the received sample weight. Subsequently, the Adaboost processing unit 111a starts the selection of the binarization discriminator from the beginning. That is, after the learning frequency s shown in FIG. 22 is set to 1, the binarization discriminator selection process and the like are repeated.

Here, the face image sample 112a and the non-face image sample 112b used for learning of the AdaBoost processing unit 111a will be described with reference to FIG. FIG. 15 is a diagram illustrating processing for acquiring a feature amount from a sample image.

Note that (A) in the figure shows a flow of processing for acquiring a feature amount from a face image, and (B) in the same drawing shows a flow of processing for acquiring a feature amount from a non-face image such as a background image. Respectively. Also, it is assumed that the size of each face image and each non-face image shown in the figure has been adjusted by a prior enlargement / reduction process.

As shown in (A) of the figure, the face image is divided into blocks of a predetermined size (see (A-1) of the figure), and for each block, the edge direction, its strength (thickness), and overall strength (See (A-2) in the figure).

For example, for the block 161 corresponding to the left eye of the face image, feature quantities such as an upward edge strength 162a, an upper right edge strength 162b, a right edge strength 162c, a right lower edge strength 162d, and an overall strength 162e of the block 161 are extracted. The thickness of the arrows shown at 162a to 162e represents the strength. Further, 162a to 162e shown in the figure are examples of feature amounts, and the types of feature amounts are not limited.

Thus, by repeating the process of extracting feature values for each block for the entire face image, the feature values for one face image are aligned. The face image sample 112a is obtained by performing the same process on other face images.

Also, as shown in (B) of the figure, the non-face image is divided into blocks similar to the face image (see (B-1) of the figure), and the same procedure as the face image is performed for each block. To extract the feature amount (see (B-2) in the figure). For example, with respect to the block 163 at a position corresponding to the face image block 161, feature amounts such as an upward edge strength 164a, an upper right edge strength 164b, a right edge strength 164c, a right lower edge strength 164d, and an overall strength 164e of the block 163 are extracted. Is done.

Thus, by repeating the process of extracting feature values for each block for the entire non-face image, the feature values for one non-face image are aligned. The non-face image sample 112b is obtained by performing the same process on other non-face images.

The aggregation discriminator derivation unit 111b is a processing unit that performs processing for deriving the aggregation discriminator 112d in the LDAArray method described above. Specifically, the aggregate discriminator deriving unit 111b, when a predetermined number of binarization discriminators are selected by the Adaboost processing unit 111a, sets a combination of the selected binarization discriminator and the determined weight coefficient. Is a processing unit that performs processing for deriving an aggregate discriminator by combining these binarization discriminators by LDA.

The aggregate discriminator derivation unit 111b derives an aggregate discriminator candidate 112c, which is an aggregate discriminator candidate, in accordance with the number of binarized discriminators, and one aggregate discriminator among the derived aggregate discriminator candidates 112c. A process for determining the discriminator 112d is also performed.

Here, the LDAArray method will be described using each mathematical expression. Assuming that the aggregation counter representing the number of times of deriving the aggregation discriminator is t (1 ≦ t ≦ T), the feature quantity is x, the aggregation discriminator corresponding to the feature quantity x is K _t (x), and the predetermined offset value is th, The final discriminator F (x)

It is expressed as equation (3-1). Here, the function sign () is a binarization function that is +1 if the value in the parentheses is 0 or more and -1 if the value is less than 0. Note that the offset value th can be calculated in the same procedure as the offset _t calculation procedure described later with reference to FIG.

If the unbinarized discriminator is f _ts (x), the weight of f _ts (x) calculated by LDA is β _ts , and the predetermined offset value is offset _t , the aggregate discriminator K _t (x) Is expressed as in equation (3-2).

The procedure for calculating the offset value offset _t will be described later with reference to FIG. Further, the offset value offset _t in the equation (3-2) is not essential, and the final adjustment may be performed with the offset value th in the equation (3-1) after omitting the offset value offset _t .

Here, the relationship between the unbinarized discriminator f _s (i) and the binarized discriminator h _s (i) is:

It is expressed by equation (4). That is, the binarized discriminator h _s (i) is obtained by binarizing the unbinarized discriminator f _s (i) with the function sign ().

In the LDAarray method, for each aggregation counter t, one aggregation classifier Kt (x) is selected from among a plurality of aggregation classifier candidates, and the weight coefficient α corresponding to the selected aggregation classifier K _t (x) is selected. _The final discriminator F (x) is derived by repeating the process of sequentially determining _t . Hereinafter, the LDAarray method will be described in more detail.

Further, when L _t (i) is a sample weight when the t-th discriminator aggregation is performed on the i-th learning sample, the initial value of Lt (i) is the expression “L ₁ (i) = 1”. / N ". Then, assuming that the aggregate discriminator corresponding to the feature quantity x _i is K _t (x _i ), each formula used in the LDAarray method is

It becomes.

In the LDAarray method, the error rate for each aggregate discriminator K _t (for example, the probability of misclassifying a class A sample as class B) ε _t is calculated using equation (5-1). Then, the weighting factor α _t of the aggregate discriminator K _t is determined using the error rate ε _t calculated by the equation (5-1) and the equation (5-2). Further, each learning sample weight L _{t + 1} in the next aggregation is updated using Expression (5-3). Note that Z _t which is the denominator of Expression (5-3) is a normalization factor for setting L _{t + 1} to “ΣL _{t + 1} (i) = 1”, and is expressed by Expression (5-4).

Here, Expression (5-3) indicates that the aggregation discriminator K _t determines the next learning sample weight L _{t + 1} so that the next discriminator becomes a discriminator having an error rate of 0.5. ing.

Thus, when the learning sample weight L _{t + 1} in the next aggregation is updated, the learning sample weight L _t is copied to the learning sample weight D _s in the Adaboost process in the LDAarray method. Then, in AdaBoost process will be repeated classifier selection processing learning samples weights D _s updated by LDAarray method as an initial value.

Returning to the description of FIG. 14, the description of the aggregate discriminator deriving unit 111b will be continued. The aggregate discriminator deriving unit 111b has two dimension numbers, that is, a minimum LDA dimension number (min_lda_dim) and a maximum LDA dimension number (max_lda_dim). Here, the “dimension number” represents, for example, the number of feature quantities. In addition, as the above two dimension numbers (minimum LDA dimension number and maximum LDA dimension number), values (empirical values) derived from the balance between processing time and accuracy can be used.

Then, when the number (s) of discriminators selected by the Adaboost processing unit 111a is equal to or greater than the minimum LDA dimension number (min_lda_dim), an aggregate discriminator candidate 112c is derived by LDA. Then, the derivation process of the aggregate discriminator candidate 112c is repeated until the number of discriminators (s) becomes equal to the maximum number of LDA dimensions (max_lda_dim).

For example, when the minimum number of LDA dimensions (min_lda_dim) is 2 and the maximum number of LDA dimensions (max_lda_dim) is 5, an aggregate classifier candidate 112c in which two classifiers are aggregated, three classifiers are aggregated. The aggregate discriminator candidate 112c, the aggregate discriminator candidate 112c in which the four discriminators are aggregated, and the aggregate discriminator candidate 112c in which the five discriminators are aggregated are derived, respectively, and one of the derived aggregate discriminator candidates 112c is derived. One aggregation discriminator 112d is selected.

Here, an outline of the aggregate discriminator candidate calculation process performed by the aggregate discriminator derivation unit 111b will be described with reference to FIG. FIG. 16 is a diagram illustrating a process of calculating an aggregate discriminator candidate. In the figure, the minimum LDA dimension number (min_lda_dim) is 4 and the maximum LDA dimension number (max_lda_dim) is 20.

When the number of discriminators (s) selected by the Adaboost processing unit 111a is equal to 4, ie, the minimum number of LDA dimensions (min_lda_dim), the aggregate discriminator deriving unit 111b class A (face image sample 112a) and class B Discriminant analysis by LDA is performed using (non-face image sample 112b). In this way, the aggregate discriminator candidate k _t4 (x) when s is 4 is calculated. The same processing is repeated until s is equal to 20, that is, the maximum number of LDA dimensions (max_lda_dim).

Here, the calculation procedure of each offset value (offset _tn ) shown in FIG. 16 will be described with reference to FIG. FIG. 17 is a diagram illustrating processing for calculating the offset of the aggregation discriminator candidate 112c. 181a, 182a and 183a shown in the figure are graphs representing the probability density distribution of class A (face image sample 112a), and 181b, 182b and 183b shown in the figure are class B (non-face image sample 112b). Graphs representing the probability density distributions of are respectively shown. The horizontal axis shown in the figure the values of the aggregate classifier candidate (k _s), the vertical axis represents probability density shown in the figure, represents respectively.

As shown in FIG. 17, offset _t4 is calculated as a horizontal axis value corresponding to a point where the class A graph 181a and the class B graph 181b intersect. That is, offset _t4 is adjusted so that the probability that a face image is mistakenly recognized as a non-face image is equal to the probability that a non-face image is mistakenly recognized as a face image. Further, the error rate ε _t4 is calculated as the area of the hatched portion shown in FIG.

Note that, as shown in FIG. 17, the value of offset _tn also changes as the LDA dimension number (s) changes. Therefore, the aggregate discriminator deriving unit 111b calculates offset _tn for each LDA dimension number (s).

The aggregate discriminator deriving unit 111b calculates the candidate k _tn (x) of each aggregate discriminator by performing the processing shown in FIG. 16 and FIG. Subsequently, the aggregate discriminator deriving unit 111b performs a process of selecting one aggregate discriminator 112d from the calculated aggregate discriminator candidates 112c. Here, an example of such selection processing will be described with reference to FIG.

FIG. 18 is a diagram illustrating an example of the aggregate discriminator selection. In the figure, the total scan area (for a sample image such as class B) when it is assumed that the LDA function is executed only once between the minimum number of LDA dimensions (min_lda_dim) and the maximum number of LDA dimensions (max_lda_dim). A graph 191 showing a change in the total scan area) is shown. Further, in the figure, the graph 191 illustrates the case where the minimum value 192 is taken when the LDA dimension number (s) is 6.

For example, if the number of LDA dimensions (s) for executing the LDA function is n, the total scan area is n × image area + (max_lda_dim−n) × (area of area that could not be eliminated by n full scans). Become. The relationship between the total scan area calculated in this way and n is, for example, a graph 191.

Here, in the figure, the case where the LDA dimension number (s) is 6 and the minimum value 192 is shown is shown. However, when the aggregation counter t changes, the dimension number at which the total scan area becomes minimum also changes. Therefore, the aggregation discriminator deriving unit 111b performs the determination process shown in FIG. 18 using the aggregation discriminator candidate 112c corresponding to the aggregation counter t, and the LDA dimension number (s) candidate that minimizes the total scan area. k _tn is selected as the aggregate discriminator K _t .

FIG. 18 shows the case where the candidate k _tn having the LDA dimension number (s) that minimizes the total scan area is selected as the aggregate discriminator K _t , but the LDA dimension number (s) is fixed. It is good. By doing so, since the processing load of the LDA processing does not change depending on the aggregation counter t, parallel processing becomes possible. Therefore, the processing time can be shortened.

Returning to the description of FIG. 14, the aggregate weight coefficient determination unit 111 c will be described. Aggregate weighting coefficient determining unit 111c, when the aggregate classifier deriving portion 111b has derived the aggregate classifier _{K t,} and determines the weighting factor for the aggregate classifier _{K t} (aggregate weight coefficient alpha _t), as an aggregate weight factor 112e It is a processing unit that performs processing to be stored in the storage unit 112. The aggregation weighting coefficient α _t is calculated using the above equation (5-2).

Sample weight updating unit 111d, each learning sample weight L in the next aggregated according to the aggregation weight coefficient alpha _t determined by the aggregate classifier K _t and aggregate weighting coefficient determining section 111c derived by aggregating discriminator deriving portion 111b _This is a processing unit that performs a process of updating _{t + 1} (see Expression (5-3)). Further, the sample weight updating unit 111d is a learning sample weight L _t, is also a processing unit performs a process of copying the learning samples weights D _s to be used by the AdaBoost processing unit 111a.

In this way, the aggregation discriminator 112d and the aggregation weighting coefficient 112e corresponding to the aggregation counter t are stored in the storage unit 112 while counting up the aggregation counter t. The final discriminator determining unit 111e then sets the aggregation counter on condition that the correct answer rate of the final discriminator F using the aggregation discriminator 112d (K _t ) and the aggregation weight coefficient 112e (α _t ) is equal to or greater than a predetermined value. End the loop using t. Note that the final discriminator determining unit 111e ends this loop even when there is no binarization discriminator (h _s ) to be aggregated.

Here, the aggregate discriminator derivation process performed by the control unit 111 will be summarized. Figure 19 is a diagram showing a process of deriving the aggregate classifier K _t. As shown in the figure, the control unit 111 performs LDA candidate (aggregate classifier candidate) extraction (see Fig (A)), to determine the aggregate classifiers K ₁ learning first (in the figure (See (B)).

Then, if determined the _{K 1,} followed by, (see FIG. (C)) to start the process of determining the _{K 2,} (see FIG. (D)) to determine the _{K 2.} Further, the process of determining K ₃ is started (see (E) in the figure), and K ₃ and K ₄ are sequentially determined. In the drawing, the LDA dimension number of K ₁ is 4 and the LDA dimension number of K ₂ is 5. However, the LDA dimension number does not always increase as the number of subsequent K increases. Absent.

Returning to the description of FIG. 14, the storage unit 112 will be described. The storage unit 112 is a storage unit configured by a storage device such as a non-volatile memory or a hard disk drive, and includes a face image sample 112a, a non-face image sample 112b, an aggregation discriminator candidate 112c, an aggregation discriminator 112d, and an aggregation discriminator. The weight coefficient 112e is stored. Note that the information stored in the storage unit 112 has already been described in the description of the control unit 111, and thus the description thereof is omitted here.

Next, a processing procedure executed by the LDAArray unit 100 will be described with reference to FIG. FIG. 20 is a flowchart showing a processing procedure executed by the LDAArray unit 100. As shown in the figure, the minimum LDA dimension (min_lda_dim) and the maximum LDA dimension (max_lda_dim) are set (step S301), the aggregation counter (t) is set to 1 (step S302), and the Adaboost counter (s) is set. 1 (step S303). Note that when the discriminator f in FIG. 19 is represented using the aggregation counter (t) and the Adaboost counter (s), ft _−s .

The Adaboost processing unit 111a selects the best discriminator (h _s ) (step S304), calculates the weight coefficient (α _s ) of the best discriminator (h _s ) selected in step S304 (step S305). ), And updates the sample weight (D _s ) for each sample (step S306).

Subsequently, the aggregate discriminator deriving unit 111b determines whether or not the Adaboost counter (s) is equal to or greater than the minimum LDA dimension number (min_lda_dim) (step S307), and the Adaboost counter (s) is the minimum LDA dimension number. When it is less than (min_lda_dim) (No at Step S307), the Adaboost counter (s) is counted up (Step S310), and the processes after Step S304 are repeated.

On the other hand, when the Adaboost counter (s) is equal to or greater than the minimum LDA dimension number (min_lda_dim) (step S307, Yes), LDA is performed on the unbinarized discriminators (f ₁ to f _s ), and the aggregate discriminator. A candidate (k _s ) is calculated (step S308).

Subsequently, it is determined whether or not the Adaboost counter (s) is equal to the maximum LDA dimension number (max_lda_dim) (step S309). If the Adaboost counter (s) is not equal to the maximum LDA dimension number (max_lda_dim), (No at Step S309), the Adaboost counter (s) is counted up (Step S310), and the processes after Step S304 are repeated.

On the other hand, if AdaBoost counter (s) is equal to the maximum LDA dimensionality (max_lda_dim) (step S309, Yes), it performs a process of determining the aggregate classifier _{(K t)} (step S311). The detailed processing procedure of step S311 will be described later with reference to FIG.

Subsequently, the aggregation weight coefficient determination unit 111c determines the weight coefficient (α _t ) of the aggregation discriminator (K _t ) (step S312), and the sample weight update unit 111d updates the sample weight (L _t ) ( Step S313). Then, the final discriminator determining unit 111e either determines whether the class A and the class B are sufficiently separated based on the discrimination result by the final discriminator (F) or there is no unaggregated discriminator. It is determined whether or not the condition is satisfied (step S314).

If the determination condition of step S314 is satisfied (step S314, Yes), the final discriminator (F) is determined and the process ends. On the other hand, when the determination condition of step S314 is not satisfied (step S314, No), the sample weight (L _t ) used by the aggregate discriminator derivation unit 111b is copied to the sample weight (D _s ) used by the Adaboost processing unit 111a. (Step S315). Then, the aggregation counter (t) is counted up (step S316), and the processes after step S303 are repeated.

Next, a detailed processing procedure of the aggregation discriminator determination process shown in step S311 of FIG. 20 will be described with reference to FIG. FIG. 21 is a flowchart showing the processing procedure of the aggregate discriminator determination process. As shown in the figure, the aggregate discriminator deriving unit 111b sets the initial value of the LDA dimension number (s) as the minimum LDA dimension number (min_lda_dim) (step S401), and calculates the total scan area (s × total area). (Step S402).

Subsequently, after setting the area of the area that could not be excluded by s full scans as a remaining area (step S403), a partial scan total area ((max_lda_dim-s) × residual area) is calculated (step S404). Then, the total scan area (total scan total area + partial scan total area) is calculated (step S405).

Subsequently, it is determined whether or not s is equal to the maximum number of LDA dimensions (max_lda_dim) (step S406). If s is not equal to the maximum number of LDA dimensions (max_lda_dim) (step S406, No), s is counted. (Step S407), the process after Step S402 is repeated. On the other hand, when s is equal to the maximum number of LDA dimensions (max_lda_dim) (Yes in step S406), the aggregate discriminator candidate (k _s ) corresponding to the LDA dimension number (s) having the smallest total scan area is the aggregate discriminator. (K _t ) (step S408), and the process is terminated.

Thus, according to the LDAArray method, it is possible to avoid the problem of an increase in the amount of calculation due to the decision branch in the Adaboost method, and to improve the identification accuracy without requiring a large memory as in the real boost method.

As described above, the subject identification method, the subject identification program, and the subject identification device according to the present invention are useful when it is desired to perform processing for identifying a specific subject from a predetermined image with high speed and high accuracy. This is suitable for the case where it is desired to dynamically generate an identification tree structure in which containers are arranged.

Claims

Using discriminators arranged at each node of the tree structure, the subject image and the non-subject image are obtained by applying the discriminator from the root node that is the vertex of the tree structure toward the leaf node at the end. An object identification method for identifying,
A feature amount selection step of selecting a predetermined number of feature amounts used for separation of the subject image sample and the non-subject image sample from the one with the higher degree of separation between both samples;
A most separated sample selection step for selecting the subject image sample most separated from the non-subject image sample as the most separated sample for the feature amount selected by the feature amount selection step;
A subset including the most separated sample selected by the most separated sample selection step is extracted from the subject image sample, and the discriminator corresponding to the subset is derived by learning using the LDAArray method, A subset determining step of determining the subset by expanding the subset based on the method;
The portion obtained by repeating the feature amount selection step, the most separated sample selection step, and the subset determination step after removing the subset determined by the subset determination step from the subject image sample. A leaf node determining step of determining a set and a set of the discriminators corresponding to the subset as the leaf nodes, respectively.
The subset determination step includes:
When the subset including the most separated sample is first extracted from the subject image sample, a predetermined number of the subject image samples are included in the subset and extracted from the shortest distance from the most separated sample. The subject identification method according to claim 1, wherein:
The subset determination step includes:
The subject identification method according to claim 2, wherein the expansion of the subset is stopped when the variation number of the subset is less than a predetermined threshold.
The non-subject image according to the branching plan for a branching plan in which a mother set consisting of the subsets determined as the leaf nodes by the leaf node determining step is represented as a set of node candidates including a predetermined number of the subsets. An evaluation value calculation step for calculating an evaluation value indicating the acceptance rate of the sample for each of the branch plans,
Each of the node candidates included in the branch plan that minimizes the evaluation value calculated by the evaluation value calculating step is determined as a node immediately below the root node, and the node candidates include a plurality of subsets. The node determination step of determining all nodes by repeating the determination of the nodes until the number of subsets included in the node candidate is 1. The subject identification method according to claim 1, 2 or 3.
The evaluation value calculation step includes:
Assume that any one of the subsets is input to any one of the classifiers with respect to the subsets determined as the leaf nodes by the leaf node determination step and the classifiers corresponding to the subsets. 5. The subject identification method according to claim 4, wherein an acceptance rate of each of the non-subject image samples is calculated for every combination, and the evaluation value for the branching plan is calculated based on the acceptance rate. .
The evaluation value calculation step includes:
For each of all the node candidates included in the branching plan, the maximum acceptance rate for all the subsets included in the node candidate is set as the representative acceptance rate of the node candidate, and is included in the branching plan. The subject identification method according to claim 5, wherein the evaluation value for the branch plan is calculated based on the representative acceptance rate of the node candidate.
The all node determination step includes:
When the node candidate determined as the node includes a plurality of the subsets, the node is obtained by learning by the LDAArray method using the non-subject image sample and all the subsets included in the node candidate as inputs. 7. The subject identifying method according to claim 4, wherein the discriminator corresponding to a candidate is derived.
The LDAArray step number determining step of determining an LDAArray step number in the discriminator so that the number of steps in which the discriminator calculates the non-subject image sample is minimized. Subject identification method described in one.
The LDAArray stage number determining step includes:
When the node corresponding to the discriminator has a subordinate node, the number of LDAArray stages is such that the total number of LDAArray stages, which is the sum of the number of LDAArray stages in the node and the number of LDAArray stages in all the subordinate nodes, is minimized. 9. The subject identification method according to claim 8, wherein the subject identification method is determined.
Using discriminators arranged at each node of the tree structure, the subject image and the non-subject image are obtained by applying the discriminator from the root node that is the vertex of the tree structure toward the leaf node at the end. A subject identification program for identifying,
A feature quantity selection procedure for selecting a predetermined number of feature quantities to be used for separating the subject image sample and the non-subject image sample from the one with the higher degree of separation between the two samples;
A most separated sample selection procedure for selecting the subject image sample that is most separated from the non-subject image sample as the most separated sample for the feature amount selected by the feature amount selection procedure;
A subset including the most separated sample selected by the most separated sample selection procedure is extracted from the subject image sample, and the discriminator corresponding to the subset is derived by learning using the LDAArray method. A subset determination procedure for determining the subset by extending the subset based on
The portion obtained by repeating the feature amount selection procedure, the most separated sample selection procedure, and the subset determination procedure after removing the subset determined by the subset determination procedure from the subject image sample. And a leaf node determination procedure for determining a set of classifiers corresponding to the set and the subset as leaf nodes.
Using discriminators arranged at each node of the tree structure, the subject image and the non-subject image are obtained by applying the discriminator from the root node that is the vertex of the tree structure toward the leaf node at the end. A subject identification device for identifying,
A feature amount selection means for selecting a predetermined number of feature amounts used for separation of the subject image sample and the non-subject image sample from a higher degree of separation of both samples;
For the feature amount selected by the feature amount selection unit, a most separated sample selection unit that selects, as the most separated sample, the subject image sample that is most separated from the non-subject image sample;
A subset including the most separated sample selected by the most separated sample selection means is extracted from the subject image sample, and the discriminator corresponding to the subset is derived by learning using the LDAArray method, and is sent to the discriminator. A subset determining means for determining the subset by extending the subset based on the subset;
The portion obtained by repeating the feature amount selecting unit, the most separated sample selecting unit, and the subset determining unit after removing the subset determined by the subset determining unit from the subject image sample. And a leaf node determination means for determining a set of the classifiers corresponding to the set and the subset as the leaf nodes, respectively.