WO2013012093A1

WO2013012093A1 - Information processing system, method of learning recognition dictionary, and information processing program

Info

Publication number: WO2013012093A1
Application number: PCT/JP2012/068747
Authority: WO
Inventors: 利憲細井; 博義宮野
Original assignee: 株式会社Ｎｅｃ情報システムズ
Priority date: 2011-07-19
Filing date: 2012-07-18
Publication date: 2013-01-24
Also published as: JPWO2013012093A1

Abstract

[Problem] To create a recognition dictionary, even when categories to which input vectors to be used for learning belong is not completely known. [Solution] An information processing system is characterized in being provided with: a reference vector storage means for retaining reference vectors; an instance selecting means for selecting one instance from a bag containing a plurality of instances; a reference vector specifying means for specifying a related reference vector that is most related to the selected instance, from among the reference vectors stored in the reference vector storage means; an instance probability calculating means for calculating an instance-correct probability that is the probability that the category of the instance is correct; a bag probability calculating means for calculating a bag-correct probability that is the probability that the category of the bag is correct; and a reference vector modifying means for modifying the related reference vector using the bag correct probability.

Description

Information processing system, recognition dictionary learning method, and information processing program

The present invention relates to an information processing system, a recognition dictionary learning method, and an information processing program, and more particularly to an information processing system, a recognition dictionary learning method, and an information processing program used in pattern recognition.

Recognition of an input vector may be performed by setting a category to which a reference vector closest to the input vector belongs as a recognition result of the input vector based on a distance calculation between a reference vector group called a recognition dictionary and the input vector. is there. In this case, the recognition accuracy varies depending on the value of the reference vector. Therefore, how to set the value of the reference vector is important for improving the recognition accuracy.
Patent Document 3 and Non-Patent Document 1 disclose Learning Vector Quantization (LVQ) as a recognition dictionary learning method using a reference vector. According to LVQ, the learning process is completed in a very short time compared to a statistical pattern recognition method that does not use other reference vectors such as a perceptron type neural network or a support vector machine. In addition, according to LVQ, the number of reference vectors can be freely designed, so that both the learning process and the recognition process can be accelerated only by reducing the number of reference vectors. Furthermore, the recognition process can be realized only by the distance calculation process, and it is easy to make a parallel type calculation. Therefore, it is easy to implement software programming or an IC chip.
Patent Document 1 discloses an example of a technique improved from this LVQ. Non-Patent Document 2 discloses specific effects for one of the methods described in Patent Document 1.
Non-Patent Document 3 discloses an improved method of these methods. Here, the reference vector value is automatically set using a plurality of input vectors as learning data.

Japanese Patent No. 3452160 JP-A-5-124550 JP-A-6-333052

In the methods described in Patent Document 1, Patent Document 3, Non-Patent Document 1, Non-Patent Document 2, and Non-Patent Document 3, it is necessary to correctly assign the category to which all input vectors used for learning belong. . However, it takes time and effort to assign the correct category in advance. Also, there are cases where correct categories cannot be assigned in advance.
And in the technique of patent document 1, patent document 3, non-patent document 1, non-patent document 2, non-patent document 3, when the information prepared in advance is incomplete for the category to which the input vector used for learning belongs, We couldn't make a recognition dictionary from that information.
The objective of this invention is providing the information processing system which solves the above-mentioned subject.

In order to achieve the above object, an information processing system according to the present invention includes a reference vector storage unit that holds a group of reference vectors, an instance selection unit that selects one instance from a bag including a plurality of instances, and the reference vector. Reference vector specifying means for specifying the related reference vector most relevant to the selected instance from the reference vector group stored in the storage means, and instance probability calculating means for calculating the instance correct probability that the category of the instance is correct Bag probability calculation means for calculating a bag correct probability that the bag category is correct using the probability that the instance category included in the bag is correct, and the related reference using the bag correct probability A reference vector correcting means for correcting the vector, It is characterized in.
To achieve the above object, a recognition dictionary learning method according to the present invention selects one instance from a bag including a plurality of instances, and selects the instance selected from a reference vector group stored in a reference vector storage means. A related reference vector that is most relevant to the category, calculates an instance correct probability that the selected category of the instance is correct, and uses the probability that the category of the instance included in the bag is correct, A bag correct probability that is a correct answer is calculated, and the related reference vector is corrected using the bag correct probability.
In order to achieve the above object, an information processing program according to the present invention includes a computer that includes an instance selection unit that selects one instance from a bag including a plurality of instances, and a reference vector group stored in a reference vector storage unit. Included in the bag; a reference vector specifying means for specifying a related reference vector most relevant to the selected instance; an instance probability calculating means for calculating an instance correct probability that the category of the selected instance is correct; Bag probability calculation means for calculating a bag correct probability that the bag category is correct using the probability that the category of the instance is correct, and a reference vector correction that corrects the related reference vector using the bag correct probability It is characterized by operating as a means.

According to the present invention, a recognition dictionary can be created even when the category to which the input vector used for learning belongs is not completely known.

It is a block diagram which shows the structure of 1st Embodiment of this invention. It is a figure for demonstrating the view of the category given to a bag and a bag. It is a figure for demonstrating the category determination of the instance in a bag. It is a block diagram which shows the structure of 2nd Embodiment of this invention. It is a flowchart which shows operation | movement of 2nd Embodiment of this invention.

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the drawings. However, the constituent elements described in the following embodiments are merely examples, and are not intended to limit the technical scope of the present invention only to them.
[First Embodiment]
An information processing system 100 as a first embodiment of the present invention will be described with reference to FIG. The information processing system 100 is a device for learning a recognition dictionary used in pattern recognition.
As illustrated in FIG. 1, the information processing system 100 includes a reference vector storage unit 101, an instance selection unit 102, a reference vector identification unit 103, an instance probability calculation unit 104, a bag probability calculation 105, and a reference vector correction unit 106. Including.
The reference vector storage unit 101 holds a reference vector group. The instance selection unit 102 selects one instance from the bag including a plurality of instances.
Further, the reference vector specifying unit 103 specifies a related reference vector most relevant to the selected instance from the reference vector group stored in the reference vector storage unit 101. The instance probability calculation unit 104 calculates an instance correct probability that the instance category is correct. Then, the bag probability calculation unit 105 calculates the bag correct probability that the category of the bag is correct using the probability that the category of the instance included in the bag is correct. The reference vector correction unit 106 corrects the related reference vector using the bag correct answer probability.
With the above configuration, a recognition dictionary can be created even when the category to which the input vector used for learning belongs is not completely known.
[Second Embodiment]
(Prerequisite technology)
As a second embodiment of the present invention, an information processing system 400 based on a recognition dictionary learning method (LVQ) using reference vectors will be described. First, before entering a description of a specific embodiment, a concept called “Multiple Instance Learning” that defines a rule of how to assign a category in the field of statistical pattern recognition will be described. In this concept, an input vector is called an “instance”. A set of instances is called a “bag”. Furthermore, a category is not assigned to every instance, but only to a bag.
FIG. 2 illustrates the concept of bags and instances. When there are only two types of categories, “positive” and “negative”, if there is at least one positive instance in the bag, the category of the bag is defined as “positive”. If only a negative instance exists in the bag, the category of the bag is defined as “negative”.
Based on the definition of this category, let pij be the probability that the j-th instance in the i-th bag is positive. At this time, the probability that one or more instances in the bag are positive, that is, the probability that the bag is positive, pi, is calculated by the following equation.

For example, when classifying into two categories of “positive” and “negative (non-positive)”, even if there is incomplete information that any one of n sets of instances is “positive”, the information The processing system 400 cannot learn well. In order to obtain the coefficient for updating the reference vector, it is necessary that all instances used for learning and the category to which each instance belongs have a one-to-one correspondence. For this reason, the information processing system 400 cannot apply information of a category that is not complete to the learning process.
That is, before the information processing system 400 performs the learning process, the information regarding the category needs to be completely given to all the input vectors used for learning. However, this work takes time and effort. In addition, when it is difficult to accurately define the category of the input vector, it is difficult to completely give information on the category in the first place. For example, if the input vector is an image pattern and the categories are two types of categories, “Face” and “Non-face”, a pattern that includes only the eyes, nose, and mouth of the face, It is difficult to determine whether the category of the pattern including the shoulder should be defined as “face” or “non-face”. For a pattern that includes only the eyes, nose, and mouth of the face, a pattern that includes the outline of the face, and a group of patterns that includes the neck and shoulders, at least one is a face. It is a natural way of thinking to define.
For this reason, in this embodiment, the information processing system 400 calculates the probability that the instance is the correct category, and further calculates the probability that the bag is the correct category. Then, the information processing system 400 corrects the reference vector using the probability that the bag to which the instance belongs is in the correct category. In this way, the information processing system 400 can perform learning processing (that is, correction of a reference vector) even if the category to which the instance belongs is not completely clear.
(Constitution)
FIG. 3 schematically represents a recognition method using a recognition dictionary by the information processing system 400 of the present embodiment. The recognition dictionary includes reference vectors 311 to 31n belonging to a certain category A and reference vectors 321 to 32n belonging to another category B. When there is an instance 303 for which the category is to be recognized, the information processing system 400 identifies the instance 303 and the closest reference vector 312 among the reference vectors 311 to 31n belonging to the category A. Then, the information processing system 400 calculates the distance d1. Further, the information processing system 400 specifies the instance 303 and the closest reference vector 322 among the reference vectors 321 to 32n belonging to the category B. Then, the information processing system 400 calculates the distance d2. Then, the information processing system 400 determines that the instance 303 also belongs to the category to which the closest reference vector 312 belongs. In the case of FIG. 3, since d1 <d2, the information processing system 400 determines that the instance 303 belongs to category A.
Next, an information processing system 400 according to the present embodiment will be described with reference to FIG. FIG. 4 is a diagram for explaining a functional configuration of the information processing system 400 according to the present embodiment. The information processing system 400 calculates a distance between a group of reference vectors (also referred to as templates, prototypes, and representative vectors) called a recognition dictionary and an input vector. Based on the calculation result, the information processing system 400 sets the category (also referred to as a class or label) to which the reference vector closest to the input vector belongs as the recognition result of the input vector.
Hereinafter, for the sake of explanation, one instance is represented by x and the reference vector is represented by wk (k is 1 to K) (each is a vector, but the bar is omitted here for simplification).
Referring to FIG. 4, the present embodiment includes a data processing device 401 and a storage device 402. The data processing device 401 includes a bag selection unit 411, an instance selection unit 412, a reference vector identification unit 413, an instance probability calculation unit 414, a bag probability calculation unit 415, a probability base correction coefficient calculation unit 416, and a reference vector. A correction unit 417 and an end determination unit 418 are included.
The storage device 402 includes a learning data storage unit 421 and a reference vector storage unit 422. The learning data storage unit 421 stores information related to learning instance groups (bags). The reference vector storage unit 422 stores information on a reference vector group that is a recognition dictionary. Specifically, the reference vector storage unit 422 stores individual reference vectors and category information to which the reference vectors belong. Note that the reference vector is modified during the operation of the present invention, resulting in a learned reference vector.
The bag selection unit 411 selects one bag i from all bags used for learning. The instance selection unit 412 selects one instance (one input vector) j from the bag i selected by the bag selection unit 411.
The reference vector specifying unit 413 determines a reference vector pair related to the instance j selected by the instance selecting unit 412. Specifically, the reference vector specifying unit 413 is closest to the reference vector group in the same category as the instance j from the inter-vector distance between the instance j and each reference vector held in the reference vector storage unit 422. The first reference vector W1 and its first distance d1 are calculated. Further, the reference vector specifying unit 413 further calculates the second reference vector W2 closest to the reference vector group in a category different from the instance j and the second distance d2. Note that it is sufficient for the reference vector specifying unit 413 to search for the closest reference vector. Therefore, the reference vector specifying unit 413 does not necessarily calculate the distances from all the reference vectors.
The instance probability calculation unit 414 estimates a probability pij that the instance is a correct category. Here, the correct category is a category assigned to the bag i to which the instance j belongs. Note that any method may be used as a method of estimating the probability pij. Hereinafter, an example of a method for estimating the probability pij will be described. The probability pij can be expressed as follows.

On the other hand, if the correct answer category likelihood of an instance is μij, and an arbitrary monotonically increasing function with a value range of [0, 1] is R (−), the probability pij is also written as the following equation.

There are various derivation methods for μij. For example, μij can be obtained based on the following equation.

In addition, μij is written by Hiroyoshi Miyano and Nagaki Ishidera, “Learning Vector Quantization Method Using Heavy Distribution Function of Support”, IEICE, IEICE Technical Report, vol. 110, no. 187, PRMU 2010-81, pp. 185-192, September 2010, ηk may be used.
Further, the monotone increasing function R (−) may be, for example, the following expression.

In consideration of the fact that the recognition reliability increases as the reference vector learning is repeated, the monotonically increasing function R (−) may be expressed by the following equation using the reference vector learning frequency t.

Furthermore, the monotonically increasing function R (−) may be the following equation using the reference vector learning count t and arbitrary constants γ0 and γ1.

The bag probability calculation unit 415 calculates a probability qi that the bag is likely to be the correct category from the probability pij that each instance in the bag i is likely to be the correct category. This calculation formula is expressed by the following formula.

The probability-based correction coefficient calculation unit 416 calculates, for each instance, a probability-based correction coefficient uij for updating the reference vector based on the probability pij that the instance is a correct category and the probability qi that the bag to which the instance belongs is the correct category. calculate.
The probability-based correction coefficient calculation unit 416 may obtain the correction coefficient uij so as to maximize the likelihood L, which means the likelihood of the recognition result. The likelihood L may be defined based on the probability qi.
For example, assume that L is defined by the following equation.

In this case, the probability-based correction coefficient calculation unit 416 may calculate the probability-based correction coefficient uij using the following formula.

The reference vector correction unit 417 corrects the first reference vector w1 and the second reference vector w2 stored in the reference vector storage unit 422 based on the probability base correction coefficient uij calculated by the probability base correction coefficient calculation unit 416. To do. More specifically, the reference vector correction unit 417 calculates a value calculated from the function D2 (·) that converts the second distance d2, a probability-based correction coefficient uij, an arbitrary coefficient α1, and an instance x and w1. A vector obtained by multiplying the difference vector is added (or subtracted) to w1 as a vector. By this process, the reference vector correction unit 417 corrects the first reference vector w1.
On the other hand, the reference vector correction unit 417 calculates a value calculated from the function D1 (•) that converts the first distance d1, a probability-based correction coefficient uij, and an arbitrary coefficient α2, and the difference between the instances x and w2. A vector obtained by multiplying the vector is added (or subtracted) as a vector to w2. By this process, the reference vector correction unit 417 corrects the second reference vector w2. This correction calculation can be expressed as in Equation 3 below.

The function D2 (•) includes a power calculation of d2 and a calculation that further divides by the power of the sum of d1 and d2. The function D1 (•) is composed of a power calculation of d1 and a calculation for further dividing by the power of the sum of d1 and d2. The power here refers to the calculation of the y power using an arbitrary real number y of 0 or more. For example, D1 (•) and D2 (•) may be functions represented by the following expressions.

The end determination unit 418 determines the end of learning (correction) of the reference vector.
(Operation)
Next, the overall operation of the present embodiment will be described in detail with reference to the flowchart of FIG. In the following description, the number of bags used for learning is M. The number of instances in the bag i is Ni. Before operation, an instance (input vector) used for learning is prepared, and a bag composed of one or more instances is prepared. As an actual data structure, it is sufficient to associate a bag number with an instance. A category is assigned to all bags.
First, the bag selection unit 411 selects one bag i from the learning bag group stored in the learning data storage unit 421 (S501). Next, the instance selection unit 412 selects one instance in the selected bag (S503). Then, the reference vector specifying unit 413 searches for the reference vector closest to the selected instance (S505). Further, the instance probability calculation unit 414 calculates a probability Pij that the instance is correct (S507). Next, the instance selection unit 412 determines whether or not the processing in steps S503 to S507 has been completed for all instances in the bag (S509), and if not, the processing returns to step S503.
As described above, the information processing apparatus 400 includes the instance selection process (S503) by the instance selection unit 412, the reference vector specification process (S505) by the reference vector specification unit 413, and the instance probability calculation process (S507) by the instance probability calculation unit 414. Is repeatedly executed.
Next, the bag probability calculation unit 415 calculates a probability qi that the selected bag i is correct (step S511). The instance selection unit 412 selects one instance again from the bag. Then, the probability-based correction coefficient calculation unit 416 calculates a probability-based correction coefficient for the selected instance (S513). Further, the reference vector correction unit 417 uses the first reference vector w1 and the second reference vector w2 stored in the reference vector storage unit 422 based on the probability base correction coefficient uij calculated by the probability base correction coefficient calculation unit 415. Is corrected (S515). Next, the instance selection unit 412 determines whether or not the processing of steps S512 to S515 has been completed for all instances in the bag (S516), and if not, returns to step S503.
The information processing apparatus 400 repeatedly executes the processes in steps S511 to S515 for all instances in the selected bag i. Thus, the information processing apparatus 400 corrects all reference vectors stored in the reference vector storage unit 422. Thus, the learning process for each bag is completed.
Next, the bag selection unit 411 determines whether or not the processing of steps S501 to S516 has been completed for all bags (S517), and if not, the processing returns to step S501.
The information processing apparatus 400 repeatedly executes the processes in steps S501 to S517 for all bags. Thus, the information processing apparatus 400 corrects all reference vectors stored in the reference vector storage unit 422.
In step S518, the end determination unit 418 determines whether or not to end the learning of the reference vector. If the end determination unit 418 determines not to end, the process returns to step S501. Then, the information processing apparatus 400 repeatedly executes the processing from step S501 to step S517 for all learning bags.
On the other hand, when the end determination unit 418 determines to end the learning of the reference vector, the information processing apparatus 400 ends the process. For example, the end determination unit 418 may determine to end the learning of the reference vector when the information processing apparatus 400 repeatedly executes the above-described process a predetermined number of times.
As described above, in the present embodiment, the bag probability calculation unit 415 calculates the probability that the bag is the correct category. Then, the reference vector correction unit 417 corrects the reference vector so as to maximize the likelihood L derived therefrom. Therefore, the information processing apparatus 400 can perform learning only with the information of the category assigned to the bag, not the instance.
Next, it will be described that the likelihood L derived from the probability that the bag is the correct category can be maximized by using the probability base correction coefficient uij calculated by the probability base correction coefficient calculation unit 416 used in the present embodiment. . Let us consider maximizing L by the steepest descent method when Pij is defined as Equation 1 and L is defined as Equation 2.
In this case, the function S to be minimized can be defined as follows using a logarithmic function that is a monotone function.

In the steepest descent method, the update formula of the reference vector wk is expressed as follows using a constant α.

The following expression is an expression obtained by expanding the second term on the right side of the above expression.

Further expansion of the above expression leads to the following expression 4 and expression 5 using the constant β.

Equation 4 corresponds to a probability-based correction coefficient. Equation 3 is derived from Equation 4 and Equation 5.
Since the following expression described above can be interpreted as an expression for normalizing the magnitude of the distance value, the same effect can be obtained even if the numerator and the denominator are raised.

The operation of the present embodiment is described using “probability of being correct”. In the case of two categories of “positive” and “negative (non-positive)”, the “probability of incorrect answer” or “probability of positive” or “probability of negative” is obvious. . Therefore, it can be said that this embodiment uses these probabilities.
In the present embodiment, since the reference vector is corrected based on the probability that the bag is correct, a learning process for maximizing the correctness (evaluation scale) of the entire bag to be learned is performed. As a result, even if there is no category information corresponding to all the input vectors used for learning, the present embodiment learns only from the category information corresponding to the set (bag) of input vectors (instances) used for learning. it can. Therefore, this embodiment can create a recognition dictionary. In the present embodiment, the preparation for the learning process can be completed only by adding coarser information before the learning process. In addition, the present embodiment can perform learning even when it is difficult to accurately define the category of each input vector in the first place or when the category can be defined only for a set of input vectors.
The present embodiment can be applied to a use for detecting a specific object from a video or a use for identifying and authenticating a person or object from a video. In particular, when enormous amounts of data are learned and used for recognition, it is difficult to completely assign the correct category of the object. However, since the correct category only needs to be assigned to a set of data, this embodiment can be applied practically.
[Third Embodiment]
The probability base correction coefficient calculation unit 416 according to the second embodiment may calculate the probability base correction coefficient uij by the following formula.

As long as a category is assigned to a bag that is a set of instances, this embodiment can be used. For example, a vector generated by minutely changing a value of a vector component of an instance to be learned (input vector) may be created. In this way, a set in which a plurality of instances that are originally generated from a single instance with a slight variation may be prepared as one bag.
In this way, if the instance is an instance including noise, for example, if one of the instances generated from the instance is reduced in noise, the present embodiment It is possible to execute a learning process that is hardly affected. On the other hand, if there is an instance in which, for example, a category is correctly assigned, one bag only needs to be configured by that one instance.
In the present embodiment, the same effect as in the second embodiment can be obtained. It is assumed that the likelihood L derived from the probability that the bag is the correct category is defined by the following mathematical formula.

Then, the function S to be minimized by the steepest descent method can be obtained as in the following equation.

In this case, the following equation corresponding to the equation used to explain the effects of the second embodiment can be derived.

Therefore, L can be maximized by calculating the probability-based correction coefficient uij. Thereby, this embodiment can perform learning only with the information of the category provided to the bag instead of the instance.
[Other Embodiments]
As mentioned above, although embodiment of this invention was explained in full detail, the system or apparatus which combined the separate characteristic contained in each embodiment how was included in the category of this invention.
Further, the present invention may be applied to a system composed of a plurality of devices. The present invention may be applied to a single device. Furthermore, the present invention can also be applied to a case where an information processing program that implements the functions of the embodiments is supplied directly or remotely to a system or apparatus. Therefore, in order to realize the functions of the present invention with a computer, a program installed in the computer, a medium storing the program, and a WWW (World Wide Web) server that downloads the program are also included in the scope of the present invention. .
The information processing system 100, the information processing system 400, the data processing device 401, and the storage device 402 are respectively a computer and a program that controls the computer, dedicated hardware, or a program that controls the computer and the computer and dedicated hardware. It can be realized by a combination.
Instance selection unit 102, reference vector specification unit 103, instance probability calculation unit 104, bag probability calculation unit 105, reference vector correction unit 106, bag selection unit 411, instance selection unit 412, reference vector specification unit 413, instance probability calculation unit 414 The bag probability calculation unit 415, the probability base correction count calculation unit 416, the reference vector correction unit 417, and the end determination unit 418 are, for example, for realizing the function of each unit read into the memory from the recording medium storing the program. It can be realized by a dedicated program and a processor that executes the program. The reference vector storage unit 101, the learning data storage unit 421, and the reference vector storage unit 422 can be realized by a memory or a hard disk device included in the computer. Alternatively, the reference vector storage unit 101, the instance selection unit 102, the reference vector specification unit 103, the instance probability calculation unit 104, the bag probability calculation unit 105, the reference vector correction unit 106, the bag selection unit 411, the instance selection unit 412, the reference vector specification Unit 413, instance probability calculation unit 414, bag probability calculation unit 415, probability-based correction count calculation unit 416, reference vector correction unit 417, end determination unit 418, learning data storage unit 421, part of reference vector storage unit 422 or All can be realized by a dedicated circuit for realizing the function of each unit.
[Other expressions of embodiment]
A part or all of the above-described embodiment can be described as in the following supplementary notes, but is not limited thereto.
(Appendix 1)
Reference vector storage means for holding a reference vector group;
An instance selection means for selecting one instance from a bag including a plurality of instances;
Reference vector specifying means for specifying a related reference vector most relevant to the selected instance from the reference vector group stored in the reference vector storage means;
Instance probability calculating means for calculating an instance correct probability that the category of the instance is correct;
Bag probability calculation means for calculating a bag correct probability that the category of the bag is correct using a probability that the category of the instance included in the bag is correct;
Reference vector correcting means for correcting the related reference vector using the bag correct answer probability;
An information processing system comprising:
(Appendix 2)
The reference vector specifying means has a first reference vector that is the closest to the instance in the same category as the bag to which the selected instance belongs, and a distance from the instance that is the closest to a category different from the bag to which the instance belongs. The information processing system according to appendix 1, wherein a near second reference vector is specified as the related reference vector.
(Appendix 3)
The reference vector specifying means calculates a first distance indicating a distance between the instance and the first reference vector, and a second distance indicating a distance between the instance and the second reference vector;
The information processing system according to supplementary note 2, wherein the reference vector correcting unit corrects the related reference vector using the first distance and the second distance.
(Appendix 4)
The reference vector correcting means includes
Supplementary note 3 wherein the related reference vector is corrected using a conversion distance value obtained by conversion by a power calculation of the first distance and a calculation by dividing by a power of the sum of the first distance and the second distance. Information processing system described in 1.
(Appendix 5)
The instance probability calculating means calculates the instance correct probability by dividing a difference between the first distance and the second distance by a sum of the first distance and the second distance. The information processing system according to appendix 3 or 4.
(Appendix 6)
Probability-based correction coefficient calculating means for calculating a probability-based correction coefficient for correcting the related reference vector using the bag correct answer probability,
The information processing system according to any one of appendices 1 to 5, wherein the reference vector correcting unit corrects the reference vector based on a probability-based correction coefficient.
(Appendix 7)
The probability-based correction coefficient calculating means is configured to multiply the instance correct answer probability by the bag incorrect answer probability as the probability that the bag to which the instance belongs is an incorrect answer category, and the instance incorrect answer probability as the probability that the instance is an incorrect answer category. 7. The information processing system according to claim 6, wherein a probability-based correction coefficient is calculated by multiplying the bag correct answer probability.
(Appendix 8)
The probability-based correction coefficient calculating means calculates a probability-based correction coefficient based on a value obtained by dividing the product of the instance correct answer probability and the bag incorrect answer probability by the bag correct answer probability, and the instance incorrect answer probability. The information processing system according to 6 or 7.
(Appendix 9)
The supplementary note 1 according to any one of supplementary notes 1 to 8, wherein the bag probability calculating means calculates a probability that a category of at least one instance among the instances included in the bag is correct. The information processing system in any one of thru | or 8.
(Appendix 10)
Select an instance from within a bag containing multiple instances,
Identifying a related reference vector most relevant to the selected instance from the reference vector group stored in the reference vector storage means;
Calculating an instance correct probability that the category of the selected instance is correct;
Using the probability that the category of the instance included in the bag is correct, the bag correct probability that the category of the bag is correct is calculated,
The related reference vector is corrected using the bag correct probability.
A recognition dictionary learning method characterized by that.
(Appendix 11)
Computer
An instance selection means for selecting one instance from a bag including a plurality of instances;
Reference vector specifying means for specifying a related reference vector most relevant to the selected instance from the reference vector group stored in the reference vector storage means;
An instance probability calculating means for calculating an instance correct probability that the category of the selected instance is correct;
Bag probability calculation means for calculating a bag correct probability that the category of the bag is correct using a probability that the category of the instance included in the bag is correct;
Reference vector correcting means for correcting the related reference vector using the bag correct answer probability;
Information processing program.
While the present invention has been described with reference to the embodiments, the present invention is not limited to the above embodiments. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.
This application claims the priority on the basis of Japanese application Japanese Patent Application No. 2011-158339 for which it applied on July 19, 2011, and takes in those the indications of all here.

Claims

Reference vector storage means for holding a reference vector group;
An instance selection means for selecting one instance from a bag including a plurality of instances;
Reference vector specifying means for specifying a related reference vector most relevant to the selected instance from the reference vector group stored in the reference vector storage means;
Instance probability calculating means for calculating an instance correct probability that the category of the instance is correct;
Bag probability calculation means for calculating a bag correct probability that the category of the bag is correct using a probability that the category of the instance included in the bag is correct;
Reference vector correcting means for correcting the related reference vector using the bag correct answer probability;
An information processing system comprising:
The reference vector specifying means has a first reference vector that is the closest to the instance in the same category as the bag to which the selected instance belongs, and a distance from the instance that is closest to a category different from the bag to which the instance belongs. The information processing system according to claim 1, wherein a second reference vector that is close is specified as the related reference vector.
The reference vector specifying means calculates a first distance indicating a distance between the instance and the first reference vector, and a second distance indicating a distance between the instance and the second reference vector;
The information processing system according to claim 2, wherein the reference vector correcting unit corrects the related reference vector using the first distance and the second distance.
The reference vector correcting means includes
The related reference vector is corrected using a conversion distance value obtained by conversion by a power calculation of a first distance and a calculation of dividing by a power of a sum of the first distance and the second distance. 3. The information processing system according to 3.
The instance probability calculating means calculates the instance correct probability by dividing a difference between the first distance and the second distance by a sum of the first distance and the second distance. The information processing system according to claim 3 or 4.
Probability-based correction coefficient calculating means for calculating a probability-based correction coefficient for correcting the related reference vector using the bag correct answer probability,
The information processing system according to claim 1, wherein the reference vector correcting unit corrects the reference vector based on a probability-based correction coefficient.
The probability-based correction coefficient calculating means includes multiplying the instance correct answer probability by the bag incorrect answer probability as the probability that the bag to which the instance belongs is an incorrect answer category, and the instance incorrect answer probability as the probability that the instance is an incorrect answer category. The information processing system according to claim 6, wherein a probability-based correction coefficient is calculated by performing multiplication of the bag correct answer probability.
The probability-based correction coefficient calculation means calculates a probability-based correction coefficient based on a value obtained by dividing a product of an instance correct answer probability and a bag incorrect answer probability by a bag correct answer probability, and the instance incorrect answer probability. Item 8. The information processing system according to Item 6 or 7.
Select an instance from within a bag containing multiple instances,
Identifying a related reference vector most relevant to the selected instance from the reference vector group stored in the reference vector storage means;
Calculating an instance correct probability that the category of the selected instance is correct;
Using the probability that the category of the instance included in the bag is correct, the bag correct probability that the category of the bag is correct is calculated,
The recognition dictionary learning method, wherein the related reference vector is corrected using the bag correct answer probability.
Computer
An instance selection means for selecting one instance from a bag including a plurality of instances;
Reference vector specifying means for specifying a related reference vector most relevant to the selected instance from the reference vector group stored in the reference vector storage means;
An instance probability calculating means for calculating an instance correct probability that the category of the selected instance is correct;
Bag probability calculation means for calculating a bag correct probability that the category of the bag is correct using a probability that the category of the instance included in the bag is correct;
A recording medium for storing an information processing program which is operated as reference vector correcting means for correcting the related reference vector using the bag correct answer probability.