US20200387756A1 - Learning data generation apparatus, learning model generation system, learning data generation method, and non-transitory storage medium - Google Patents
- Publication number: US20200387756A1 (application US17/001,716)
- Authority: United States (US)
- Prior art keywords: label, threshold, learning data, object image, learning
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G06K9/6259
- G06K9/6265
- G06N3/084: Learning methods; backpropagation, e.g. using gradient descent
- G06N3/045: Neural network architectures; combinations of networks
- G06N5/04: Inference or reasoning models
- G06N20/00: Machine learning
- G06F18/2155: Generating training patterns; bootstrap methods characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling
- G06F18/2193: Validation; performance evaluation; active pattern learning techniques based on specific statistical tests
- G06T7/136: Segmentation; edge detection involving thresholding
- G06V10/764: Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
- G06V20/00: Scenes; scene-specific elements
Definitions
- The present application relates to a learning data generation apparatus, a learning model generation system, a learning data generation method, and a non-transitory storage medium.
- Deep learning is a technique for causing a multi-layered neural network to perform machine learning, and accuracy can be improved by performing supervised learning on a large amount of learning data.
- With learning data, it is possible to assign a label to an object, such as an image, and thereby classify the image.
- Japanese Laid-open Patent Publication No. 2017-111731 discloses a system in which, when a verification image is erroneously detected, a similar image is extracted from unlearned image data by an unsupervised image classifier, and the similar image is added as learning data. Further, Japanese Laid-open Patent Publication No. 2006-343791 discloses a method of increasing learning data by tracking a face region in a moving image.
- According to Japanese Laid-open Patent Publication No. 2017-111731, it is possible to improve classification accuracy of the verification image, but there is room for improvement in the collection of learning data suitable for an object to be classified.
- According to Japanese Laid-open Patent Publication No. 2006-343791, it is possible to increase learning data with respect to a known label, but there is room for improvement in increasing learning data suitable for an object not classified as a known label. Therefore, there is a need to appropriately collect learning data that is suitable for an object to be classified.
- Accordingly, a learning data generation apparatus, a learning model generation system, a learning data generation method, and a non-transitory storage medium are disclosed.
- According to one aspect, there is provided a learning data generation apparatus comprising: an object extraction unit configured to extract an object image from an image; a classification evaluation unit configured to evaluate the object image based on a learned model and to thereby calculate reliability indicating a degree of probability that the object image is classified as a candidate label; a classification determination unit configured to, if the reliability is smaller than a first threshold and equal to or larger than a second threshold which is smaller than the first threshold, associate a temporary label different from the candidate label with the object image; and a learning data generation unit configured to generate learning data based on the object image that is associated with the temporary label.
- According to one aspect, there is provided a learning data generation method comprising: extracting an object image from an image; evaluating the object image based on a learned model and calculating reliability indicating a degree of probability that the object image is classified as a candidate label; associating a temporary label different from the candidate label with the object image if the reliability is smaller than a first threshold and equal to or larger than a second threshold which is smaller than the first threshold; and generating learning data based on the object image that is associated with the temporary label.
- According to one aspect, there is provided a non-transitory storage medium that stores a computer program for causing a computer to execute: extracting an object image from an image; evaluating the object image based on a learned model and calculating reliability indicating a degree of probability that the object image is classified as a candidate label; associating a temporary label different from the candidate label with the object image if the reliability is smaller than a first threshold and equal to or larger than a second threshold which is smaller than the first threshold; and generating learning data based on the object image that is associated with the temporary label.
- FIG. 1 is a schematic block diagram of an object classification system according to the present embodiment.
- FIG. 2 is a diagram schematically illustrating a concept of classification by a learned model.
- FIG. 3 is a diagram illustrating an example of a reliability table.
- FIG. 4 is a diagram for explaining determination by a classification determination unit.
- FIG. 5 is a diagram illustrating an example of a label table.
- FIG. 6 is a diagram illustrating an example of a temporary label table.
- FIG. 7 is a diagram illustrating an example of the label table.
- FIG. 8 is a diagram illustrating an example of a learned data table.
- FIG. 9 is a diagram illustrating an example of the learned data table.
- FIG. 10 is a flowchart for explaining a flow of processes performed by a controller of a learning data generation apparatus.
- FIG. 11 is a diagram for explaining determination by a classification determination unit in another example of the present embodiment.
- FIG. 12 is a diagram for explaining determination by the classification determination unit in another example of the present embodiment.
- FIG. 13 is a diagram illustrating an example of a learned data table in another example of the present embodiment.
- FIG. 1 is a schematic block diagram of an object classification system according to the present embodiment.
- An object classification system 1 according to the present embodiment is a system that classifies an object image by assigning a label to the object image based on a learned model. Further, the object classification system 1 improves classification accuracy of the object image by generating learning data and updating the learned model. That is, the object classification system 1 may be referred to as a learning model generation system.
- the object classification system 1 includes a learning data generation apparatus 10 and a learning apparatus 12 .
- the learning data generation apparatus 10 is a terminal that is installed at a predetermined position in the present embodiment.
- the learning data generation apparatus 10 includes an imager 20 , a storage 22 , a communication unit 24 , and a controller 26 .
- the learning data generation apparatus 10 may include, for example, an input unit that allows a user to perform input and an output unit that is able to output information.
- the input unit may be an input device, such as a button, that causes the imager 20 to capture an image, or may be a mouse, a keyboard, a touch panel, or the like.
- the output unit is, for example, a display and is able to display a captured image or the like.
- The imager 20 is an imaging element that captures an image under the control of the controller 26 .
- the imager 20 captures a moving image, but the imager 20 may capture a still image.
- the learning data generation apparatus 10 in the present embodiment is an imaging apparatus that includes the imager 20 .
- the learning data generation apparatus 10 need not always include the imager 20 .
- the learning data generation apparatus 10 may acquire an image from an external apparatus via communication.
- The imager 20 captures an image with an image resolution of 1920×1080 and at a frame rate of 30 frames per second.
- imaging conditions such as the resolution and the frame rate, are not limited to this example.
- the storage 22 is a memory for storing information on calculation contents and programs of the controller 26 , images captured by the imager 20 , and the like.
- the storage 22 includes at least one external storage device, such as a random access memory (RAM), a read only memory (ROM), or a flash memory, for example.
- the communication unit 24 transmits and receives data by performing communication with an external apparatus, e.g., the learning apparatus 12 in this example, under the control of the controller 26 .
- the communication unit 24 is, for example, an antenna, and transmits and receives data to and from the learning apparatus 12 via radio communication using a wireless local area network (LAN), Wi-fi (registered trademark), Bluetooth (registered trademark), or the like.
- the communication unit 24 may be connected to the learning apparatus 12 via a cable, and transmit and receive information via wired communication.
- the controller 26 is an arithmetic device, i.e., a central processing unit (CPU).
- the controller 26 includes an image acquisition unit 30 , an object extraction unit 32 , a learned model acquisition unit 34 , a classification evaluation unit 36 , a classification determination unit 38 , a label assigning unit 40 , a learning data generation unit 42 , and a learning data transmission controller 44 .
- the image acquisition unit 30 , the object extraction unit 32 , the learned model acquisition unit 34 , the classification evaluation unit 36 , the classification determination unit 38 , the label assigning unit 40 , the learning data generation unit 42 , and the learning data transmission controller 44 perform processes described later by reading software (program) stored in the storage 22 .
- the image acquisition unit 30 controls the imager 20 and causes the imager 20 to capture images.
- the image acquisition unit 30 acquires the images captured by the imager 20 .
- the image acquisition unit 30 stores the acquired images in the storage 22 .
- the object extraction unit 32 extracts an object image P from the images acquired by the image acquisition unit 30 .
- the object image P is an image that is included in a partial region in a certain image, and is an image to be classified.
- the object image P is a face image of a person who appears in a certain image.
- the object extraction unit 32 extracts multiple object images P from a single image. That is, if multiple face images are included in a certain image, the object extraction unit 32 extracts each of the face images as the object image P.
- the object images P are not limited to the face images of persons, but may be arbitrary images as long as the object images P are images that are to be classified. Examples of the object images P include animals, plants, buildings, and various devices, such as vehicles.
- the object extraction unit 32 extracts the object images P by detecting feature amounts of the image.
- the object extraction unit 32 is, for example, a device (Haar-like detector) that recognizes a face using Haar-like features, but may be a feature amount detector capable of recognizing a face using a different method. That is, the object extraction unit 32 may adopt an arbitrary extraction method as long as it is possible to extract the object images P.
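The Haar-like features mentioned above can be sketched in a few lines (a simplified illustration of the feature computation only, not the detector used by the apparatus; the array contents and window layout are invented for the example). A Haar-like feature is the difference between pixel sums of adjacent rectangles, and an integral image lets each rectangle sum be read with at most four lookups:

```python
import numpy as np

def integral_image(img):
    # Entry (y, x) holds the sum of all pixels above and to the
    # left of (y, x), inclusive.
    return img.cumsum(axis=0).cumsum(axis=1)

def region_sum(ii, top, left, h, w):
    # Sum of the h-by-w rectangle at (top, left), from the integral
    # image, using at most four corner lookups.
    total = ii[top + h - 1, left + w - 1]
    if top > 0:
        total -= ii[top - 1, left + w - 1]
    if left > 0:
        total -= ii[top + h - 1, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

def two_rect_feature(ii, top, left, h, w):
    # Horizontal two-rectangle Haar-like feature: left-half sum
    # minus right-half sum of the window.
    half = w // 2
    return (region_sum(ii, top, left, h, half)
            - region_sum(ii, top, left + half, h, half))

img = np.arange(16, dtype=np.int64).reshape(4, 4)  # invented "image"
ii = integral_image(img)
```

A real Haar-like face detector evaluates thousands of such features in a cascade of boosted classifiers; only the elementary feature computation is shown here.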
- the object images P extracted by the object extraction unit 32 are stored in the storage 22 .
- the learned model acquisition unit 34 controls the communication unit 24 and acquires a learned model from the learning apparatus 12 .
- the learned model acquisition unit 34 stores the acquired learned model in the storage 22 .
- The learned model according to the present embodiment is configured with a model (configuration information on the neural network) which defines the neural network constituting classifiers that are learned through deep learning, and variables.
- The learned model is able to recreate a neural network that obtains the same classification result if the same input data is input.
- Deep learning is a learning method for causing a deep neural network to perform learning by backpropagation (the error backpropagation algorithm).
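As a minimal illustration of the idea behind learning by backpropagation (a toy one-parameter sketch, not the network described in this application; the input, target, and learning-rate values are invented for the example), the weight is repeatedly stepped against the gradient of a squared error:

```python
# Toy gradient descent on the loss L(w) = (w*x - t)^2 for one sample.
# For this one-parameter model, dL/dw = 2*x*(w*x - t) plays the role of
# the error signal that backpropagation would compute layer by layer.
x, t = 2.0, 6.0    # input and target (invented values)
w, lr = 0.0, 0.05  # initial weight and learning rate

def loss(w):
    return (w * x - t) ** 2

for _ in range(100):
    grad = 2 * x * (w * x - t)  # gradient of the loss at the current w
    w -= lr * grad              # step against the gradient

# w converges toward t / x = 3.0, where the loss is zero
```

In a deep network the same update is applied to every weight, with the chain rule propagating the error backward through the layers.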
- FIG. 2 is a diagram schematically illustrating a concept of classification by the learned model.
- The classification evaluation unit 36 reads the learned model from the storage 22 , and classifies the object image P by using the learned model. More specifically, the classification evaluation unit 36 calculates a probability (the reliability described later) that the object image P is classified as a candidate label, for each of the candidate labels, by using the learned model.
- the classification evaluation unit 36 inputs the object image P as input data to the learned model. Accordingly, the learned model extracts multiple kinds of feature amounts from the object image P, and calculates a degree of probability that the object image P is classified as each of the candidate labels based on the feature amounts.
- the learned model is configured with a multi-layered neural network, and classifies the object image P by extracting different feature amounts for the respective layers. That is, in the learned model, if, for example, classification into two candidate labels is to be performed, as illustrated in FIG. 2 , it is determined whether the object image P is classified as either one of the candidate labels based on whether the extracted feature amounts of the object image P are equal to or larger than a set boundary line L or equal to or smaller than the set boundary line L.
- a convolutional neural network (CNN) is adopted as a model that is used as the learned model, for example.
- a classification model and a classification method of the learned model may be arbitrary as long as the object image P is classified based on the feature amounts of the object image P.
- the deep learning may be implemented by software so as to be operated by a CPU, a GPU, or the like by using a deep learning framework, such as TensorFlow (Registered Trademark).
- the classification evaluation unit 36 evaluates (analyzes) the object image P based on the learned model, and calculates reliability for each of the candidate labels.
- The reliability is an index, in this example a numerical value, that indicates a degree of probability that the object image P is classified as each of the candidate labels. For example, the reliability is equal to or larger than 0 and equal to or smaller than 1, and the sum of the reliability over all of the labels is 1. In this case, the reliability may be represented as a posterior probability.
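The application does not specify how these reliability values are produced; one common way to obtain values in [0, 1] that sum to 1 is to apply a softmax to the classifier's raw output scores (a generic sketch with invented scores, not the apparatus's actual computation):

```python
import math

def softmax(scores):
    # Convert raw classifier scores into reliabilities that are each
    # in [0, 1] and sum to 1, so they can be read as posterior
    # probabilities over the candidate labels.
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Invented raw scores for five candidate labels; the third is largest,
# so it receives the maximum reliability.
reliabilities = softmax([1.2, 0.4, 3.1, -0.5, -2.0])
```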
- the candidate labels are labels that are set in advance in the learned model. Each of the object images P may be different types of face images, and the candidate labels are labels indicating the types of the object images P.
- the classification evaluation unit 36 calculates, as the reliability, probability that the object image P corresponds to each of the types.
- the candidate label is set for each individual (same person). That is, one single candidate label indicates a face of one individual, and another candidate label indicates a face of another individual.
- the candidate labels are not limited to individuals, but may be set arbitrarily as long as the labels indicate the types of the object images P.
- the candidate labels may be ages, genders, races, or the like of a person.
- the candidate labels may be types of animals, types of plants, types of buildings, or types of various devices, such as vehicles.
- FIG. 3 is a diagram illustrating an example of a reliability table.
- the classification evaluation unit 36 calculates the reliability of the object image P for each of the candidate labels, and generates the reliability table as illustrated in FIG. 3 .
- candidate labels F 01 , F 02 , F 03 , F 04 , and F 05 are set as the candidate labels.
- the candidate labels F 01 , F 02 , F 03 , F 04 , and F 05 indicate different individuals.
- The reliability values are 0.05, 0.07, 0.86, 0.02, and 0.00 in order from the candidate label F 01 to the candidate label F 05 .
- The reliability values are set such that the sum of the reliability over all of the candidate labels is 1.
- That is, the classification evaluation unit 36 determines that the probability that the object image P is classified as the candidate label F 01 is 5%, the probability for the candidate label F 02 is 7%, the probability for the candidate label F 03 is 86%, the probability for the candidate label F 04 is 2%, and the probability for the candidate label F 05 is 0%.
- the classification evaluation unit 36 stores the reliability calculated as above in the storage 22 in association with the object image P.
- the classification evaluation unit 36 calculates the reliability for each of the object images P.
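The reliability table of FIG. 3 can be represented as a simple mapping from candidate labels to reliability values (the values are the ones given in the example above; the variable names are illustrative):

```python
# Reliability table for one object image, as in FIG. 3.
reliability_table = {
    "F01": 0.05,
    "F02": 0.07,
    "F03": 0.86,
    "F04": 0.02,
    "F05": 0.00,
}

# The candidate label with the highest reliability is the one whose
# value is later compared against the thresholds ("maximum reliability").
best_label = max(reliability_table, key=reliability_table.get)
max_reliability = reliability_table[best_label]
```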
- FIG. 4 is a diagram for explaining determination by the classification determination unit.
- the classification determination unit 38 determines whether the object image P is classified as the candidate labels based on the reliability calculated by the classification evaluation unit 36 . Specifically, the classification determination unit 38 extracts the candidate label with the highest reliability from among the multiple candidate labels, and determines whether the object image P is classified as the extracted candidate label.
- the reliability of the candidate label with the highest reliability is referred to as a maximum reliability.
- the candidate label with the highest reliability is the candidate label F 03
- the maximum reliability is the reliability of the candidate label F 03 , which is 0.86.
- If the maximum reliability is equal to or larger than a first threshold K 1 , the classification determination unit 38 determines that the object image P is classified as the candidate label with the maximum reliability. It is preferable to set the first threshold K 1 based on the number of classifications and the number of learned images in the learned model. In this example, the first threshold K 1 is set to 0.85, but it is not limited thereto and may be set arbitrarily. For example, it is preferable to reduce the first threshold K 1 as the number of classifications increases, and to increase the first threshold K 1 as the number of learned images in the learned model increases.
- the classification determination unit 38 transmits a determination result indicating that the object image P is classified as said candidate label to the label assigning unit 40 .
- the label assigning unit 40 assigns said candidate label as a label of the object image P. That is, the label assigning unit 40 adopts said candidate label as an official label, and determines that the object image P is classified as this label. For example, if the first threshold K 1 is set to 0.85, the object image P 3 in FIG. 3 is determined as being classified as the candidate label F 03 .
- the label assigning unit 40 assigns the candidate label F 03 as the official label F 03 to the object image P. That is, the object image P is classified as a face image of an individual with the candidate label F 03 (the label F 03 ).
- If the maximum reliability is smaller than the first threshold K 1 , the classification determination unit 38 determines that the object image P is not classified as said candidate label. Further, in that case, the classification determination unit 38 determines whether the maximum reliability is equal to or larger than a second threshold K 2 . As illustrated in FIG. 4 , the second threshold K 2 is smaller than the first threshold K 1 . It is preferable to set the second threshold K 2 based on the number of classifications and the number of learned images in the learned model. In this example, the second threshold K 2 is set to 0.7, but it may be set to an arbitrary value that is smaller than the first threshold K 1 .
- the classification determination unit 38 associates a temporary label with the object image P if the maximum reliability is smaller than the first threshold K 1 and equal to or larger than the second threshold K 2 .
- the temporary label is set for each of the candidate labels, and is a label of a different type from the candidate labels.
- the classification determination unit 38 associates the object image P with the temporary label other than the candidate labels, without classifying the object image P as the candidate labels.
- That is, the classification determination unit 38 temporarily associates the object image P with an unknown new individual other than the individuals indicated by the candidate labels, without classifying the object image P as any of those individuals. For example, if the reliability (maximum reliability) of the candidate label F 03 in FIG. 3 is smaller than the first threshold K 1 and equal to or larger than the second threshold K 2 , the object image P is associated with a temporary label different from the candidate labels F 01 to F 05 .
- If the maximum reliability is smaller than the second threshold K 2 , the classification determination unit 38 associates the object image P with neither the candidate labels nor the temporary label. That is, if the maximum reliability is smaller than the second threshold K 2 , the classification determination unit 38 does not classify the object image P. Further, in this case, the classification determination unit 38 does not use the object image P as learning data, as described later. However, the classification determination unit 38 may set only the first threshold K 1 without setting the second threshold K 2 , and use the object image P as the learning data.
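The two-threshold decision described above can be sketched as a single function (using the example thresholds K 1 = 0.85 and K 2 = 0.7 from the text; the function name and return values are illustrative):

```python
K1 = 0.85  # first threshold: at or above, adopt the candidate label
K2 = 0.70  # second threshold: in [K2, K1), associate a temporary label

def determine(max_reliability):
    # Decide how an object image is handled based on the reliability of
    # its best candidate label.
    #   "official"     : classified as the candidate label itself
    #   "temporary"    : associated with a temporary label instead
    #   "unclassified" : not classified and not used as learning data
    if max_reliability >= K1:
        return "official"
    if max_reliability >= K2:
        return "temporary"
    return "unclassified"
```

With the FIG. 3 example, a maximum reliability of 0.86 clears the first threshold, so the candidate label F 03 is adopted directly.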
- FIG. 5 is a diagram illustrating an example of a label table.
- the classification determination unit 38 performs the determination as described above for each of the object images P.
- the label assigning unit 40 stores the label that is determined as the official label and the object image P in an associated manner in the storage 22 .
- FIG. 5 illustrates one example of the label table as information that indicates associations between the labels and the object images P.
- FIG. 5 illustrates an example in which determination is performed on an object image P 1 to an object image P 20 .
- the object images P 1 , P 2 , P 10 , P 12 , P 13 , and P 16 are the object images P for which the candidate label F 01 is determined as the official label F 01 .
- the object images P 3 and P 19 are the object images P for which the candidate label F 02 is determined as the official label F 02
- the object image P 18 is the object image P for which the candidate label F 03 is determined as the official label F 03
- the object images P 4 , P 9 , P 17 , and P 20 are the object images P for which the candidate label F 04 is determined as the official label F 04
- no object image P is present for which the candidate label F 05 is determined as the official label F 05 .
- Object images P to which no label is assigned are also present among the object images P 1 to P 20 .
- FIG. 6 is a diagram illustrating an example of a temporary label table.
- the classification determination unit 38 stores the temporary labels and the object images P in an associated manner in the storage 22 .
- the candidate label determined as having the maximum reliability may vary for each of the object images P.
- the classification determination unit 38 sets a temporary label for each of the candidate labels that are determined as having the maximum reliability.
- FIG. 6 illustrates one example of the temporary label table as information in which the temporary labels and the object images P are associated.
- For example, in FIG. 6 , the classification determination unit 38 determines a temporary label F 06 as the temporary label that is adopted when the candidate label F 01 is determined as having the maximum reliability, and determines a temporary label F 07 as the temporary label that is adopted when the candidate label F 02 is determined as having the maximum reliability. That is, when the classification determination unit 38 associates the temporary labels, if the candidate labels determined as having the maximum reliability are different, the classification determination unit 38 associates different temporary labels (different individuals). Conversely, if the candidate labels determined as having the maximum reliability are the same, the classification determination unit 38 associates the same temporary label.
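Keeping one temporary label per max-reliability candidate label can be sketched as follows (the helper function is hypothetical; the F06/F07 numbering follows the example, with temporary labels starting after the five candidate labels):

```python
# Map from the candidate label that had the maximum reliability to the
# temporary label used for images that fell between the two thresholds.
temp_label_for_candidate = {}
next_temp_id = 6  # F01..F05 are candidate labels, so temporaries start at F06

def get_temporary_label(candidate_label):
    # Same max-reliability candidate -> same temporary label;
    # a candidate seen for the first time gets a fresh temporary label.
    global next_temp_id
    if candidate_label not in temp_label_for_candidate:
        temp_label_for_candidate[candidate_label] = f"F{next_temp_id:02d}"
        next_temp_id += 1
    return temp_label_for_candidate[candidate_label]
```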
- the object images P 5 , P 8 , P 11 , P 14 , and P 15 are the object images P that are associated with the temporary label F 06
- the object image P 6 is the object image P that is associated with the temporary label F 07 .
- If the number of the object images P associated with the same temporary label becomes equal to or larger than a predetermined number, the classification determination unit 38 determines that those object images P are classified as the temporary label. Then, the classification determination unit 38 transmits a determination result indicating that the object images P are classified as the temporary label to the label assigning unit 40 .
- In this case, the label assigning unit 40 assigns the temporary label as the label of the object images P. That is, the label assigning unit 40 adopts the temporary label as the official label, and determines that the object images P are classified as said label.
- In other words, the label assigning unit 40 assigns a label (the temporary label) different from the already set candidate labels to the object images P.
- The predetermined number is set to 5 in this example, but it is not limited thereto and may be set arbitrarily.
- FIG. 7 is a diagram illustrating an example of the label table. As described above, if the temporary label is adopted as the official label, the number of labels in the label table increases. FIG. 7 illustrates the label table in a case where the temporary label F 06 is added as the official label to the label table in FIG. 5 . As illustrated in FIG. 7 , the temporary label F 06 is set as the official label F 06 , and the object images P 5 , P 8 , P 11 , P 14 , and P 15 are the object images P that are classified as the label F 06 . The temporary label F 07 is not adopted as the official label because the number of associated object images P is smaller than the predetermined number.
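The promotion rule just described can be sketched like this (the table contents mirror FIG. 6, and the threshold of 5 is the example value from the text; the function name is illustrative):

```python
PREDETERMINED_NUMBER = 5

# Temporary label table, as in FIG. 6: F06 has five associated object
# images, F07 only one.
temporary_table = {
    "F06": ["P5", "P8", "P11", "P14", "P15"],
    "F07": ["P6"],
}

def labels_to_promote(table, threshold=PREDETERMINED_NUMBER):
    # Temporary labels with at least `threshold` associated object
    # images are adopted as official labels.
    return [label for label, images in table.items()
            if len(images) >= threshold]
```

Here only F06 reaches the predetermined number, matching FIG. 7, where F06 becomes the official label while F07 remains a temporary label.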
- the learning data generation unit 42 generates learning data for updating the learned model based on the determination result obtained by the classification determination unit 38 . If the number of the object images P that are associated with the temporary label by the classification determination unit 38 becomes equal to or larger than the predetermined number, the learning data generation unit 42 adopts the object images P and the temporary label as the learning data. That is, if the number of the object images P that are associated with the temporary label becomes equal to or larger than the predetermined number and the temporary label is therefore assigned as the official label, the learning data generation unit 42 generates the learning data by associating the temporary label with each of the object images P.
- the learning data indicates supervised data and is data that includes information indicating that the object images P are classified as the temporary label.
- the object images P 5 , P 8 , P 11 , P 14 , and P 15 are associated with the temporary label F 06 , and adopted as the learning data.
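As a minimal sketch, the promotion rule described in the preceding paragraphs might look as follows (the dict-based tables, label names, and image identifiers simply mirror the FIG. 6 and FIG. 7 example; the predetermined number is set to 5 as in the embodiment):

```python
PROMOTION_COUNT = 5  # the "predetermined number" of the embodiment

def promote_temporary_labels(temp_table, label_table):
    """Promote any temporary label whose associated object images reach the
    predetermined number: the label becomes official, the (image, label)
    pairs become learning data, and the used images leave the temp table."""
    learning_data = []
    for temp_label in list(temp_table):
        images = temp_table[temp_label]
        if len(images) >= PROMOTION_COUNT:
            label_table[temp_label] = list(images)
            learning_data.extend((img, temp_label) for img in images)
            del temp_table[temp_label]  # images used as learning data are deleted
    return learning_data

# Mirrors FIG. 6 / FIG. 7: F06 has five object images and is promoted,
# F07 has only one and stays temporary.
temp = {"F06": ["P5", "P8", "P11", "P14", "P15"], "F07": ["P6"]}
official = {label: [] for label in ["F01", "F02", "F03", "F04", "F05"]}
new_data = promote_temporary_labels(temp, official)
```

After the call, F06 appears in the official table, its five images form the learning data, and only F07 with the single image P6 remains in the temporary label table.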
- the learning data transmission controller 44 illustrated in FIG. 1 controls the communication unit 24 and transmits the learning data generated by the learning data generation unit 42 to the learning apparatus 12 .
- the learning data generation unit 42 deletes the object images P that are used as the learning data from the temporary label table stored in the storage 22 . That is, in the example in FIG. 6 , the object images P 5 , P 8 , P 11 , P 14 , and P 15 that are associated with the temporary label F 06 and that are adopted as the learning data are deleted. Therefore, only the object image P 6 associated with the temporary label F 07 remains in the temporary label table. However, if any of the object images P is adopted as the learning data, the learning data generation unit 42 may delete all of the object images P in the temporary label table.
- the object image P 6 associated with the temporary label F 07 is also deleted.
- if only the object images P that are used as the learning data are deleted, the object images P associated with other temporary labels remain. In this case, fewer additional object images P are needed to reach the predetermined number, so that it is possible to generate the learning data more promptly.
- the learning data generation apparatus 10 is configured as described above. Next, the learning apparatus 12 will be described. As illustrated in FIG. 1 , the learning apparatus 12 is an apparatus (server) that is installed at a different position from the learning data generation apparatus 10 .
- the learning apparatus 12 includes a communication unit 50 , a storage 52 , and a controller 54 .
- the learning apparatus 12 may include, for example, an input unit that allows a user to perform input and an output unit that is able to output information.
- the input unit may be a mouse, a keyboard, a touch panel, or the like.
- the output unit is, for example, a display and is able to display a captured image or the like.
- the communication unit 50 transmits and receives data by performing communication with an external apparatus, e.g., the learning data generation apparatus 10 in this example, under a control of the controller 54 .
- the communication unit 50 is, for example, an antenna, and transmits and receives data to and from the learning data generation apparatus 10 via radio communication using a wireless LAN, Wi-fi (registered trademark), Bluetooth (registered trademark), or the like.
- the communication unit 50 may be connected to the learning data generation apparatus 10 via a cable, and transmit and receive information via wired communication.
- the storage 52 is a memory for storing information on calculation contents and programs of the controller 54 .
- the storage 52 includes at least one external storage device, such as a random access memory (RAM), a read only memory (ROM), and a flash memory, for example.
- the controller 54 is an arithmetic device, i.e., a central processing unit (CPU).
- the controller 54 includes a learning data acquisition unit 60 , a learning unit 62 , and a learned model transmission controller 64 .
- the learning data acquisition unit 60 , the learning unit 62 , and the learned model transmission controller 64 perform processes as described later by reading software (program) stored in the storage 52 .
- the learning data acquisition unit 60 controls the communication unit 50 and acquires the learning data generated by the learning data generation unit 42 from the communication unit 24 of the learning data generation apparatus 10 .
- the learning data acquisition unit 60 stores the acquired learning data in the storage 52 .
- the learning unit 62 updates the learned model by learning.
- the learning unit 62 reads the learned model and the learned learning data that are stored in advance in the storage 52 , and reads new learning data that is acquired by the learning data acquisition unit 60 .
- the learning unit 62 updates the learned model by causing the learned model to learn the learned learning data and the new learning data as supervised data.
- FIG. 8 and FIG. 9 are examples of a learned data table.
- FIG. 8 is one example of the learned data table before update.
- the learned data table is supervised data that is stored in the storage 52 , and is information in which learned images are associated with labels. That is, the learned data table is a data group that includes multiple pieces of teaching data indicating the labels into which the learned images are classified.
- the learned model is learned and constructed by using each piece of the data in the learned data table as teaching data.
- FIG. 8 illustrates one example of the learned data before update with the learning data obtained from the learning data generation unit 42 .
- learned images P 101 to P 200 are classified as the label F 01
- learned images P 201 to P 300 are classified as the label F 02
- learned images P 301 to P 400 are classified as the label F 03
- learned images P 401 to P 500 are classified as the label F 04
- learned images P 501 to P 600 are classified as the label F 05 .
- the learned images P 101 to P 600 are face images that are extracted in advance, and are images for which the labels to be classified are set in advance without being classified by the learning data generation apparatus 10 . That is, the learned model is constructed by being supplied with the learned images P 101 to P 600 in advance as supervised data.
- FIG. 9 illustrates one example of the learned data table after update with the learning data obtained from the learning data generation unit 42 .
- the learning unit 62 updates the learned data table by adding the learning data to the learned data table that is not yet updated. That is, the learning unit 62 updates the learned data table by adding the object images P of the learning data as learned images and the temporary label as a new label in the learned data table. That is, the learning data is supervised data that indicates an association between the temporary labels and the object images P.
- the new learned images (object images) P 5 , P 8 , P 11 , P 14 , and P 15 included in the learning data are added to the learned data table, and the new label F 06 is associated with the learned images.
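A hedged sketch of how the learning unit 62 might merge the new learning data into the learned data table (the dict representation and the abbreviated learned-image lists are assumptions for illustration):

```python
def update_learned_data_table(table, learning_data):
    """Add new (learned image, label) pairs to the supervised data table;
    a label not yet present in the table becomes a new classification
    target, as when F06 is added in FIG. 9."""
    for image, label in learning_data:
        table.setdefault(label, []).append(image)
    return table

# Before the update the table holds labels F01 to F05 (FIG. 8, abbreviated
# here); adding the learning data for F06 yields the table of FIG. 9.
table = {"F01": ["P101"], "F02": ["P201"], "F03": ["P301"],
         "F04": ["P401"], "F05": ["P501"]}
update_learned_data_table(
    table, [(p, "F06") for p in ["P5", "P8", "P11", "P14", "P15"]])
```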
- the learning unit 62 updates the learned model by using the learned data table to which the new learning data is added as described above.
- the learning unit 62 stores the updated learned model in the storage 52 .
- the learned model transmission controller 64 controls the communication unit 50 and transmits the updated learned model to the learning data generation apparatus 10 .
- the learned model acquisition unit 34 reads the updated learned model and performs classification of a next object image P and generation of the learning data.
- the learning apparatus 12 is a server that is installed at a position separated from the learning data generation apparatus 10 as described above. However, the learning apparatus 12 may be incorporated in the learning data generation apparatus 10 . That is, the object classification system 1 may be configured such that the learning apparatus 12 is not included but the learning data acquisition unit 60 and the learning unit 62 are incorporated in the controller 26 of the learning data generation apparatus 10 . In this case, the learning data transmission controller 44 is also not needed. Further, in the present embodiment, the learning data generation apparatus 10 causes the classification determination unit 38 to perform determination, but the learning apparatus 12 may include the classification determination unit 38 and perform determination. Furthermore, the learning apparatus 12 may include the classification evaluation unit 36 and calculate the reliability.
- the object classification system 1 may of course be implemented as an apparatus using hardware, such as a CPU, an FPGA, an ASIC, or a memory, and may be implemented by firmware stored in a ROM, a flash memory, or the like, or by software executed by a computer.
- a firmware program and a software program may be provided by being recorded in a medium that is readable by a computer or the like, may be transmitted and received to and from a server via a wired or wireless network, or may be transmitted and received as data broadcasting via terrestrial broadcasting or satellite digital broadcasting.
- FIG. 10 is a flowchart for explaining a flow of processes performed by the controller of the learning data generation apparatus.
- the controller 26 of the learning data generation apparatus 10 causes the image acquisition unit 30 to acquire an image captured by the imager 20 (Step S 10 ), and causes the object extraction unit 32 to extract a single object image P from the acquired image (Step S 12 ).
- the controller 26 causes the classification evaluation unit 36 to calculate the reliability for each of the candidate labels of the object image P based on the learned model (Step S 16 ). Then, the controller 26 causes the classification determination unit 38 to determine whether the maximum reliability is equal to or larger than the first threshold K 1 (Step S 18 ). The maximum reliability is the reliability of the candidate label with the highest reliability among the candidate labels. If the maximum reliability is equal to or larger than the first threshold K 1 (Step S 18 ; Yes), the classification determination unit 38 determines that the object image P is classified as the candidate label with the maximum reliability, and the label assigning unit 40 determines the candidate label with the maximum reliability as the label of the object image P (Step S 20 ). After the candidate label with the maximum reliability is determined as the label, the process proceeds to Step S 31 described later.
- if the maximum reliability is not equal to or larger than the first threshold K 1 (Step S 18 ; No), the classification determination unit 38 determines whether the maximum reliability is equal to or larger than the second threshold K 2 (Step S 22 ). If the maximum reliability is not equal to or larger than the second threshold K 2 (Step S 22 ; No), that is, if the maximum reliability is smaller than the second threshold, the process proceeds to Step S 31 . If the maximum reliability is equal to or larger than the second threshold K 2 (Step S 22 ; Yes), the classification determination unit 38 assigns a temporary label to the object image P (Step S 24 ).
- the classification determination unit 38 determines whether the number of the object images P to which the temporary label is assigned is equal to or larger than a predetermined number (Step S 26 ), and if the number is not equal to or larger than the predetermined number (Step S 26 ; No), the process proceeds to Step S 31 . In contrast, if the number of the object images P to which the temporary label is assigned is equal to or larger than the predetermined number (Step S 26 ; Yes), the label assigning unit 40 determines the temporary label as the label (Step S 28 ), and the learning data generation unit 42 adopts the temporary label and the object images P as the learning data (Step S 30 ).
- the process then proceeds to Step S 31 , at which the object extraction unit 32 determines whether other object images P are present in the image (Step S 31 ). If the other object images P are present (Step S 31 ; Yes), the process returns to Step S 12 , and the object extraction unit 32 extracts one of the other object images P. If the other object images P are not present (Step S 31 ; No), the process proceeds to Step S 32 , and the controller 26 determines whether other images are present (Step S 32 ). If the other images are present (Step S 32 ; Yes), the process returns to Step S 10 . If the other images are not present (Step S 32 ; No), the process is ended. Thereafter, the controller 26 causes the learning data transmission controller 44 to transmit the learning data to the learning apparatus 12 , and the learning apparatus 12 updates the learned model based on the learning data. The controller 26 subsequently performs the above-described process using the updated learned model.
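The decision part of the flowchart (Steps S16 through S30) can be condensed into a sketch for a single object image (the numeric thresholds, the promotion count, and the "temp_" label naming are illustrative assumptions; the flowchart itself fixes none of them):

```python
def step_s16_to_s30(image, reliabilities, temp_table,
                    K1=0.8, K2=0.5, n_promote=5):
    """One pass of Steps S16-S30 of FIG. 10 for a single object image.

    `reliabilities` maps each candidate label to its reliability (S16).
    Returns the label assigned to the image, or None if no label is
    decided yet.
    """
    label, r_max = max(reliabilities.items(), key=lambda kv: kv[1])
    if r_max >= K1:                      # S18 Yes -> S20: classify outright
        return label
    if r_max < K2:                       # S22 No -> skip to S31
        return None
    temp_label = "temp_" + label         # S24: temporary label for the candidate
    temp_table.setdefault(temp_label, []).append(image)
    if len(temp_table[temp_label]) >= n_promote:  # S26 Yes
        return temp_label                # S28: temporary label becomes the label
    return None                          # S26 No: wait for more images

temp_table = {}
confident = step_s16_to_s30("P1", {"F01": 0.9, "F02": 0.1}, temp_table)
# "P1" is classified as F01; a mid-reliability image is only pooled:
pending = step_s16_to_s30("P5", {"F01": 0.6, "F02": 0.2}, temp_table)
```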
- the learning data generation apparatus 10 includes the image acquisition unit 30 that acquires the image, the object extraction unit 32 that extracts the object image P from the image, the classification evaluation unit 36 , the classification determination unit 38 , and the learning data generation unit 42 .
- the classification evaluation unit 36 evaluates the object image P based on the learned model, and calculates the reliability indicating the degree of possibility that the object image P is classified as the candidate label.
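For illustration, such per-candidate-label reliabilities are often obtained by normalizing the raw scores of a neural network classifier; a sketch using a softmax (the normalization choice is an assumption, not stated in the source):

```python
import math

def label_reliabilities(scores):
    """Convert raw per-candidate-label scores from the learned model into
    reliabilities that sum to 1, here via a softmax. The softmax is one
    common choice; the source only requires a degree of possibility per
    candidate label."""
    exps = {label: math.exp(s) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# hypothetical raw scores for three candidate labels
r = label_reliabilities({"F01": 2.0, "F02": 0.5, "F03": -1.0})
```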
- the classification determination unit 38 associates the temporary label that is different from the candidate label with the object image P if the reliability is smaller than the first threshold K 1 and is equal to or larger than the second threshold K 2 that is smaller than the first threshold K 1 .
- the learning data generation unit 42 generates the learning data based on the object image that is associated with the temporary label. Further, the classification determination unit 38 determines whether the object image P is classified as the candidate label by performing the determination based on the reliability. The learning data generation unit 42 generates the learning data for updating the learned model based on the determination result of the classification determination unit 38 . The classification determination unit 38 determines that the object image P is classified as the candidate label if the reliability is equal to or larger than the first threshold K 1 , and associates the temporary label that is set for the candidate label with the object image P if the reliability is smaller than the first threshold K 1 and is equal to or larger than the second threshold K 2 .
- the learning data generation unit 42 adopts each of the object images P and the temporary label as the learning data. With this configuration, it is possible to assign the temporary label and improve the accuracy of the learning data. Further, if the reliability is equal to or larger than the first threshold K 1 , the candidate label and the object image P are not adopted as the learning data, so that it is possible to exclude in advance images that are already reliably classified from the machine learning, and to improve efficiency of the machine learning.
- the learning data generation apparatus 10 analyzes the object image P to be classified, and assigns a temporary label. Then, it is determined that the object image P is classified as the temporary label and adopted as the learning data (supervised data). Therefore, the learning data generation apparatus 10 generates the learning data by using the image to be classified, so that it is possible to appropriately generate the learning data suitable for an object to be classified.
- the learning data generation apparatus 10 is able to increase the number of classifications after update (after relearning). Meanwhile, in the present embodiment, even if the learned model is updated, the values of the first threshold K 1 and the second threshold K 2 for assigning the temporary label or the like are set constant. However, the learning data generation apparatus 10 may change the values of the first threshold K 1 and the second threshold K 2 after the update of the learned model. For example, if the number of the candidate labels exceeds a predetermined number due to the update of the learned model, the learning data generation apparatus 10 may reduce the values of the first threshold K 1 and the second threshold K 2 . With this configuration, even if the number of classifications increases, it is possible to perform the classification in a preferred manner.
- the classification evaluation unit 36 calculates the reliability for the candidate labels of multiple kinds.
- the classification determination unit 38 sets the temporary label for each of the candidate labels, and associates the object image P with the temporary label that is set for the candidate label with the highest reliability.
- the learning data generation apparatus 10 associates the temporary label that is set for the candidate label with the highest reliability among the multiple candidate labels. Therefore, according to the learning data generation apparatus 10 , it is possible to appropriately perform the classification, appropriately assign the temporary label, and improve the accuracy of the learning data.
- the learning data generation apparatus 10 includes the label assigning unit 40 that assigns the label to the object image P. If it is determined that the object image P is classified as the candidate label, the label assigning unit 40 assigns the candidate label as a label of the object image P. If the number of the object images P associated with the temporary label becomes equal to or larger than the predetermined number, the label assigning unit 40 assigns the temporary label as the label of the object images P.
- the learning data generation apparatus 10 as described above determines that a label with the reliability equal to or larger than the first threshold K 1 is the label, and determines, after multiple pieces of data are accumulated, that a label with the reliability smaller than the first threshold K 1 and equal to or larger than the second threshold K 2 is the label. Therefore, it is possible to improve accuracy in setting an unknown label.
- the learning model generation system (the object classification system 1 ) according to the present embodiments includes the learning data generation apparatus 10 , and the learning unit 62 that performs the machine learning based on the learning data generated by the learning data generation unit 42 and updates the learned model.
- the learning model generation system generates the learned model based on the learning data generated by the learning data generation apparatus 10 .
- the learning model generation system updates the learning model based on the learning data, so that it is possible to construct the learning model that is suitable for the object to be classified.
- the learning unit 62 performs the machine learning by deep learning. Therefore, the learning model generation system is able to construct a highly accurate learning model.
- the object classification system 1 includes the learning data generation apparatus 10 and classifies the object image P by using the learned model. Therefore, the object classification system 1 is able to appropriately classify the object image P.
- FIG. 11 is a diagram for explaining the determination performed by the classification determination unit according to another example of the present embodiment.
- the classification determination unit 38 sets a single temporary label for a single candidate label, but it may be possible to set multiple temporary labels for a single candidate label.
- the example in FIG. 11 is an example in which temporary labels A and B are assigned to a single candidate label.
- if the maximum reliability is smaller than the first threshold K 1 and equal to or larger than a threshold K 2 A, the classification determination unit 38 associates the temporary label B with the object image P.
- the threshold K 2 A is smaller than the first threshold K 1 and larger than the second threshold K 2 .
- if the maximum reliability is smaller than the threshold K 2 A and equal to or larger than the second threshold K 2 , the classification determination unit 38 associates the temporary label A with the object image P. In this manner, the classification determination unit 38 may divide a range assigned for the maximum reliability of a single candidate label into multiple divided ranges, and set a temporary label for each of the multiple divided ranges. Meanwhile, it may be possible to set one of the first threshold K 1 and the second threshold K 2 to the same value as the threshold K 2 A.
- in a case where the reliability around the first threshold K 1 is not always dependable, such as when the number of the learned images in the learned model is smaller than a predetermined number, the temporary label need not be assigned to the object image P even if the maximum reliability is smaller than the first threshold K 1 and equal to or larger than the threshold K 2 A.
- if the reliability is smaller than the first threshold K 1 and equal to or larger than an intermediate threshold (the threshold K 2 A) that is smaller than the first threshold K 1 and larger than the second threshold K 2 , the classification determination unit 38 associates a first temporary label (the temporary label B) that is different from the candidate label with the object image P. If the reliability is equal to or larger than the second threshold K 2 and smaller than the intermediate threshold (the threshold K 2 A), the classification determination unit 38 associates a second temporary label (the temporary label A) that is different from the first temporary label with the object image P.
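The two temporary-label bands described above can be sketched as follows (the numeric threshold values are assumptions satisfying K 2 &lt; K 2 A &lt; K 1 ):

```python
def pick_temporary_label(r_max, K1=0.8, K2A=0.65, K2=0.5):
    """Split the band below K1 at an intermediate threshold K2A:
    the first temporary label B covers [K2A, K1), the second
    temporary label A covers [K2, K2A), and no temporary label is
    assigned below K2. Threshold values are illustrative only."""
    if K2A <= r_max < K1:
        return "B"
    if K2 <= r_max < K2A:
        return "A"
    return None  # either classified outright (>= K1) or discarded (< K2)
```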
- FIG. 12 is a diagram for explaining the determination performed by the classification determination unit in another example of the present embodiment.
- in the example in FIG. 12 , depending on the maximum reliability, the classification determination unit 38 either assigns the label without using the object image as the learning data, or uses, as the learning data, the object image to which the label is assigned.
- the classification determination unit 38 assigns, as the official label, a candidate label with the maximum reliability to the object image P if the maximum reliability is equal to or larger than the third threshold K 3 and smaller than 1.
- the classification determination unit 38 uses the object image P and the candidate label with the maximum reliability as the learning data if the maximum reliability is equal to or larger than the first threshold K 1 and smaller than a third threshold K 3 .
- the third threshold K 3 is larger than the first threshold K 1 .
- if the maximum reliability is equal to or larger than the third threshold K 3 , the classification determination unit 38 assigns the candidate label with the maximum reliability as the official label to the object image P, but does not use the object image P and the candidate label as the learning data.
- FIG. 13 is a diagram illustrating an example of the learned data table in another example of the present embodiment.
- FIG. 13 illustrates an example in which the candidate label F 01 is a candidate label with the maximum reliability for the object images P 12 , P 13 , and P 16 , and in which the maximum reliability is equal to or larger than the first threshold K 1 and smaller than the third threshold K 3 .
- the object images P 12 , P 13 , and P 16 are associated with the candidate label F 01 and added, as learning data (supervised data), to the learned data table. Therefore, the learned data table illustrated in FIG. 13 includes data indicating that the learned images P 12 , P 13 , and P 16 are associated with the label F 01 , in addition to the data illustrated in FIG. 9 . Consequently, the learning unit 62 is able to construct the learned model with high classification accuracy by using the learned images P 12 , P 13 , and P 16 .
- if the reliability is equal to or larger than the first threshold K 1 , the classification determination unit 38 associates the object image P with the candidate label.
- the learning data generation unit 42 generates the learning data based on the object image P associated with the temporary label or the candidate label.
- the learning data generation unit 42 may adopt, as the learning data, the candidate label with the reliability that is higher than the first threshold K 1 and lower than the third threshold K 3 that is higher than the first threshold K 1 and the object image P. Further, it may be possible for the learning data generation unit 42 not to adopt the candidate label with the reliability that is equal to or larger than the third threshold K 3 and the object image P as the learning data.
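This adoption rule reduces to a single band check (the numeric threshold values are illustrative assumptions):

```python
def adopt_for_learning(r_max, K1=0.8, K3=0.95):
    """Decide whether an (object image, candidate label) pair becomes
    learning data: only the band [K1, K3) is adopted. At or above K3
    the classification is already so certain that relearning on the
    image adds nothing new."""
    return K1 <= r_max < K3
```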
- if the reliability is adequately high, e.g., equal to or larger than the third threshold K 3 , the learning data generation apparatus 10 determines that the data is not needed for new learning and does not adopt it as learning data. If the reliability is adequate but not as high as the third threshold K 3 , the learning data generation apparatus 10 determines that the data is suitable for improving the accuracy of the learning model and uses it as learning data. Therefore, by using only data that is appropriate for the learning data, the learning data generation apparatus 10 is able to appropriately generate the learning data that is suitable for the object to be classified.
- the classification determination unit 38 may change at least one of the first threshold K 1 and the second threshold K 2 in accordance with the number of images (learned images) that are used for the learned model. More specifically, the classification determination unit 38 may change at least any of the first threshold K 1 , the second threshold K 2 , the threshold K 2 A, and the third threshold K 3 in accordance with the number of images (learned images) that are used for the learned model. With this configuration, it is possible to appropriately change the threshold depending on a change in the number of the images, so that it is possible to appropriately generate the learning data that is suitable for the object to be classified.
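One possible shape of such a threshold adjustment (the linear scaling policy and the predetermined number are assumptions; the source only states that the thresholds may be changed, e.g., reduced when the number of candidate labels grows):

```python
def adjusted_thresholds(n_labels, base_K1=0.8, base_K2=0.5, max_labels=10):
    """Reduce K1 and K2 once the number of candidate labels exceeds a
    predetermined number, since per-label reliabilities tend to spread
    thinner as classifications are added. The scaling rule is a
    hypothetical policy, not taken from the source."""
    if n_labels <= max_labels:
        return base_K1, base_K2
    scale = max_labels / n_labels
    return base_K1 * scale, base_K2 * scale
```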
Abstract
Description
- This application is a Continuation of PCT International Application No. PCT/JP2019/009536 filed on Mar. 8, 2019 which claims the benefit of priority from Japanese Patent Application No. 2018-042291 filed on Mar. 8, 2018, the entire contents of both of which are incorporated herein by reference.
- The present application relates to a learning data generation apparatus, a learning model generation system, a learning data generation method, and a non-transitory storage medium.
- In recent years, with practical application of a graphics processing unit (GPU) or the like, machine learning using deep learning has attracted attention. The deep learning is a technique for causing a multi-layered neural network to perform machine learning, and it is possible to improve accuracy by performing supervised learning on a large amount of learning data. For example, with use of the learning data, it is possible to assign a label to an object, such as an image, and classify the image. Here, to perform the supervised learning with high accuracy, it is necessary to prepare a large amount of learning data to which labels are assigned. For example, Japanese Laid-open Patent Publication No. 2017-111731 discloses a system in which, when a verification image is erroneously detected, a similar image is extracted from unlearned image data by an unsupervised image classifier, and the similar image is added as learning data. Further, Japanese Laid-open Patent Publication No. 2006-343791 discloses a method of increasing learning data by tracking a face region in a moving image.
- However, in Japanese Laid-open Patent Publication No. 2017-111731, it is possible to improve classification accuracy of the verification image, but there is room for improvement in collection of learning data that is suitable for an object to be classified. Further, in Japanese Laid-open Patent Publication No. 2006-343791, it is possible to increase learning data with respect to a known label, but there is room for improvement in an increase in learning data that is suitable for an object not classified as a known label. Therefore, there is a need to appropriately collect learning data that is suitable for an object to be classified.
- A learning data generation apparatus, a learning model generation system, a learning data generation method, and a non-transitory storage medium are disclosed.
- According to one aspect, there is provided a learning data generation apparatus comprising: an object extraction unit configured to extract an object image from an image; a classification evaluation unit configured to evaluate the object image based on a learned model and to thereby calculate reliability indicating a degree of possibility that the object image is classified as a candidate label; a classification determination unit configured to, if the reliability is smaller than a first threshold and equal to or larger than a second threshold which is smaller than the first threshold, associate a temporary label different from the candidate label with the object image; and a learning data generation unit configured to generate learning data based on the object image that is associated with the temporary label.
- According to one aspect, there is provided a learning data generation method comprising: extracting an object image from an image; evaluating the object image based on a learned model and calculating reliability indicating a degree of possibility that the object image is classified as a candidate label; associating a temporary label different from the candidate label with the object image if the reliability is smaller than a first threshold and equal to or larger than a second threshold which is smaller than the first threshold; and generating learning data based on the object image that is associated with the temporary label.
- According to one aspect, there is provided a non-transitory storage medium that stores a computer program for causing a computer to execute: extracting an object image from an image; evaluating the object image based on a learned model and calculating reliability indicating a degree of possibility that the object image is classified as a candidate label; associating a temporary label different from the candidate label with the object image if the reliability is smaller than a first threshold and equal to or larger than a second threshold which is smaller than the first threshold; and generating learning data based on the object image that is associated with the temporary label.
- The above and other objects, features, advantages and technical and industrial significance of this application will be better understood by reading the following detailed description of presently preferred embodiments of the application, when considered in connection with the accompanying drawings.
- FIG. 1 is a schematic block diagram of an object classification system according to a present embodiment.
- FIG. 2 is a diagram schematically illustrating a concept of classification by a learned model.
- FIG. 3 is a diagram illustrating an example of a reliability table.
- FIG. 4 is a diagram for explaining determination by a classification determination unit.
- FIG. 5 is a diagram illustrating an example of a label table.
- FIG. 6 is a diagram illustrating an example of a temporary label table.
- FIG. 7 is a diagram illustrating an example of the label table.
- FIG. 8 is a diagram illustrating an example of a learned data table.
- FIG. 9 is a diagram illustrating an example of the learned data table.
- FIG. 10 is a flowchart for explaining a flow of processes performed by a controller of a learning data generation apparatus.
- FIG. 11 is a diagram for explaining determination by a classification determination unit in another example of the present embodiment.
- FIG. 12 is a diagram for explaining determination by the classification determination unit in another example of the present embodiment.
- FIG. 13 is a diagram illustrating an example of a learned data table in another example of the present embodiment.
- Embodiments of the present application will be described in detail below based on the drawings. The present application is not limited by the embodiments described below.
-
FIG. 1 is a schematic block diagram of an object classification system according to a present embodiment. An object classification system 1 according to the present embodiments is a system that classifies an object image by assigning a label to the object image based on a learned model. Further, the object classification system 1 improves classification accuracy of the object image by generating learning data and updating the learned model. That is, the object classification system 1 may be referred to as a learning model generation system. - As illustrated in
FIG. 1 , the object classification system 1 includes a learning data generation apparatus 10 and alearning apparatus 12. The learning data generation apparatus 10 is a terminal that is installed at a predetermined position in the present embodiment. The learning data generation apparatus 10 includes animager 20, astorage 22, acommunication unit 24, and acontroller 26. Meanwhile, the learning data generation apparatus 10 may include, for example, an input unit that allows a user to perform input and an output unit that is able to output information. In this case, the input unit may be an input device, such as a button, that causes theimager 20 to capture an image, or may be a mouse, a keyboard, a touch panel, or the like. The output unit is, for example, a display and is able to display a captured image or the like. - The
imager 20 is an imaging element that captures an image under the control of the controller 26. In the present embodiment, the imager 20 captures a moving image, but the imager 20 may capture a still image. As described above, the learning data generation apparatus 10 in the present embodiment is an imaging apparatus that includes the imager 20. However, the learning data generation apparatus 10 need not always include the imager 20. In this case, the learning data generation apparatus 10 may acquire an image from an external apparatus via communication. The imager 20 captures an image with an image resolution of 1920×1080 and at a frame rate of 30 frames per second. However, imaging conditions, such as the resolution and the frame rate, are not limited to this example. - The
storage 22 is a memory for storing information on calculation contents and programs of the controller 26, images captured by the imager 20, and the like. The storage 22 includes at least one storage device, such as a random access memory (RAM), a read only memory (ROM), or a flash memory, for example. - The
communication unit 24 transmits and receives data by performing communication with an external apparatus, e.g., the learning apparatus 12 in this example, under the control of the controller 26. The communication unit 24 is, for example, an antenna, and transmits and receives data to and from the learning apparatus 12 via radio communication using a wireless local area network (LAN), Wi-Fi (registered trademark), Bluetooth (registered trademark), or the like. However, the communication unit 24 may be connected to the learning apparatus 12 via a cable, and transmit and receive information via wired communication. - The
controller 26 is an arithmetic device, i.e., a central processing unit (CPU). The controller 26 includes an image acquisition unit 30, an object extraction unit 32, a learned model acquisition unit 34, a classification evaluation unit 36, a classification determination unit 38, a label assigning unit 40, a learning data generation unit 42, and a learning data transmission controller 44. The image acquisition unit 30, the object extraction unit 32, the learned model acquisition unit 34, the classification evaluation unit 36, the classification determination unit 38, the label assigning unit 40, the learning data generation unit 42, and the learning data transmission controller 44 perform processes described later by reading software (a program) stored in the storage 22. - The
image acquisition unit 30 controls the imager 20 and causes the imager 20 to capture images. The image acquisition unit 30 acquires the images captured by the imager 20. The image acquisition unit 30 stores the acquired images in the storage 22. - The
object extraction unit 32 extracts an object image P from the images acquired by the image acquisition unit 30. The object image P is an image that is included in a partial region of a certain image, and is an image to be classified. For example, the object image P is a face image of a person who appears in a certain image. The object extraction unit 32 extracts multiple object images P from a single image. That is, if multiple face images are included in a certain image, the object extraction unit 32 extracts each of the face images as an object image P. Meanwhile, the object images P are not limited to face images of persons, but may be arbitrary images as long as they are images to be classified. Examples of the object images P include images of animals, plants, buildings, and various devices, such as vehicles. - The
object extraction unit 32 extracts the object images P by detecting feature amounts of the image. The object extraction unit 32 is, for example, a device (a Haar-like detector) that recognizes a face using Haar-like features, but may be a feature amount detector capable of recognizing a face using a different method. That is, the object extraction unit 32 may adopt an arbitrary extraction method as long as it is possible to extract the object images P. The object images P extracted by the object extraction unit 32 are stored in the storage 22. - The learned
model acquisition unit 34 controls the communication unit 24 and acquires a learned model from the learning apparatus 12. The learned model acquisition unit 34 stores the acquired learned model in the storage 22. The learned model according to the present embodiment is configured with a model (configuration information on the neural network) that defines the neural network constituting classifiers learned through deep learning, and with its variables. The learned model is able to recreate a neural network that obtains the same classification result if the same input data is input. Deep learning is a learning method for causing a deep neural network to perform learning by back propagation (an error back propagation algorithm). -
FIG. 2 is a diagram schematically illustrating a concept of classification by the learned model. The classification evaluation unit 36 reads the learned model from the storage 22, and classifies the object image P by using the learned model. More specifically, the classification evaluation unit 36 calculates, for each of the candidate labels, the possibility (the reliability described later) that the object image P is classified as that candidate label by using the learned model. The classification evaluation unit 36 inputs the object image P as input data to the learned model. Accordingly, the learned model extracts multiple kinds of feature amounts from the object image P, and calculates a degree of possibility that the object image P is classified as each of the candidate labels based on the feature amounts. More specifically, the learned model is configured with a multi-layered neural network, and classifies the object image P by extracting different feature amounts in the respective layers. That is, in the learned model, if, for example, classification into two candidate labels is to be performed, as illustrated in FIG. 2, it is determined whether the object image P is classified as either one of the candidate labels based on whether the extracted feature amounts of the object image P are equal to or larger than a set boundary line L or equal to or smaller than the set boundary line L. In the present embodiment, a convolutional neural network (CNN) is adopted as the model that is used as the learned model, for example. In the CNN, multiple convolutional layers and multiple pooling layers are set in intermediate layers for extracting feature amounts of the input data, and a fully connected layer is set in a final layer for classifying the input data. However, the classification model and the classification method of the learned model may be arbitrary as long as the object image P is classified based on the feature amounts of the object image P.
Deep learning may be implemented in software so as to run on a CPU, a GPU, or the like by using a deep learning framework such as TensorFlow (registered trademark). - In the present embodiment, the
classification evaluation unit 36 evaluates (analyzes) the object image P based on the learned model, and calculates the reliability for each of the candidate labels. The reliability is an index, in this example a value, that indicates the degree of possibility that the object image P is classified as each of the candidate labels. For example, the reliability is equal to or larger than 0 and equal to or smaller than 1, and the sum of the reliabilities of all of the labels is 1. In this case, the reliability may be represented as a posterior probability. Further, the candidate labels are labels that are set in advance in the learned model. Each of the object images P may be a different type of face image, and the candidate labels are labels indicating the types of the object images P. Therefore, the classification evaluation unit 36 calculates, as the reliability, the possibility that the object image P corresponds to each of the types. In the present embodiment, a candidate label is set for each individual (the same person). That is, one candidate label indicates the face of one individual, and another candidate label indicates the face of another individual. However, the candidate labels are not limited to individuals, but may be set arbitrarily as long as the labels indicate the types of the object images P. For example, the candidate labels may be ages, genders, races, or the like of a person. Further, the candidate labels may be types of animals, types of plants, types of buildings, or types of various devices, such as vehicles. -
FIG. 3 is a diagram illustrating an example of a reliability table. The classification evaluation unit 36 calculates the reliability of the object image P for each of the candidate labels, and generates the reliability table as illustrated in FIG. 3. In the example illustrated in FIG. 3, candidate labels F01, F02, F03, F04, and F05 are set as the candidate labels. In the present embodiment, the candidate labels F01, F02, F03, F04, and F05 indicate different individuals. In the example in FIG. 3, the reliabilities are set to 0.05, 0.07, 0.86, 0.02, and 0.00 in order from the candidate label F01 to the candidate label F05. The reliabilities are set such that the sum of the reliabilities of all of the candidate labels is 1. - Therefore, in the example in
FIG. 3, the classification evaluation unit 36 determines that the possibility that the object image P is classified as the candidate label F01 is 5%, the possibility that the object image P is classified as the candidate label F02 is 7%, the possibility that the object image P is classified as the candidate label F03 is 86%, the possibility that the object image P is classified as the candidate label F04 is 2%, and the possibility that the object image P is classified as the candidate label F05 is 0%. - The
classification evaluation unit 36 stores the reliability calculated as above in the storage 22 in association with the object image P. The classification evaluation unit 36 calculates the reliability for each of the object images P. -
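The specification does not state how the per-label reliabilities are produced inside the learned model, but a softmax over the final-layer scores is a common way to obtain values between 0 and 1 that sum to 1, as the reliability table requires. The following is a minimal sketch under that assumption; the raw scores are hypothetical values chosen merely to approximate the example in FIG. 3:

```python
import math

# Hypothetical raw scores (logits) for one object image P; the label names
# follow the example in FIG. 3, and the numeric values are illustrative only.
raw_scores = {"F01": 1.0, "F02": 1.3, "F03": 3.8, "F04": 0.1, "F05": -4.0}

def reliability_table(scores):
    """Convert raw scores into reliabilities in [0, 1] that sum to 1 (softmax)."""
    exps = {label: math.exp(s) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

table = reliability_table(raw_scores)
# The candidate label with the maximum reliability (F03 for these scores).
max_label = max(table, key=table.get)
```

With these scores, F03 receives a reliability of roughly 0.86, mirroring the reliability table in FIG. 3.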
FIG. 4 is a diagram for explaining determination by the classification determination unit. The classification determination unit 38 determines whether the object image P is classified as a candidate label based on the reliability calculated by the classification evaluation unit 36. Specifically, the classification determination unit 38 extracts the candidate label with the highest reliability from among the multiple candidate labels, and determines whether the object image P is classified as the extracted candidate label. Hereinafter, the reliability of the candidate label with the highest reliability is referred to as the maximum reliability. In the example in FIG. 3, the candidate label with the highest reliability is the candidate label F03, and the maximum reliability is the reliability of the candidate label F03, which is 0.86. - As illustrated in
FIG. 4, if the maximum reliability is equal to or larger than a first threshold K1, the classification determination unit 38 determines that the object image P is classified as the candidate label with the maximum reliability. It is preferable to set the first threshold K1 based on the number of classifications and the number of learned images in the learned model. In this example, the first threshold K1 is set to 0.85, but it is not limited thereto and may be set arbitrarily. For example, it is preferable to reduce the first threshold K1 with an increase in the number of classifications, and to increase the first threshold K1 with an increase in the number of learned images in the learned model. The classification determination unit 38 transmits a determination result indicating that the object image P is classified as said candidate label to the label assigning unit 40. The label assigning unit 40 assigns said candidate label as the label of the object image P. That is, the label assigning unit 40 adopts said candidate label as an official label, and determines that the object image P is classified as this label. For example, if the first threshold K1 is set to 0.85, the object image P in FIG. 3 is determined as being classified as the candidate label F03. The label assigning unit 40 assigns the candidate label F03 as the official label F03 to the object image P. That is, the object image P is classified as a face image of the individual with the candidate label F03 (the label F03). - In contrast, if the maximum reliability is smaller than the first threshold K1, the classification determination unit 38 determines that the object image P is not classified as said candidate label. Further, if the maximum reliability is smaller than the first threshold K1, the classification determination unit 38 determines whether the maximum reliability is equal to or larger than a second threshold K2. As illustrated in
FIG. 4, the second threshold K2 is smaller than the first threshold K1. It is preferable to set the second threshold K2 based on the number of classifications and the number of learned images in the learned model. In this example, the second threshold K2 is set to 0.7, but it may be set to an arbitrary value that is smaller than the first threshold K1. Meanwhile, it is preferable to reduce the interval between the first threshold K1 and the second threshold K2 with an increase in the number of unknown new individuals (classifications), and to increase the interval between the first threshold K1 and the second threshold K2 with an increase in the number of learned images in the learned model. Meanwhile, it is also possible to set the first threshold K1 and the second threshold K2 to the same value. The classification determination unit 38 associates a temporary label with the object image P if the maximum reliability is smaller than the first threshold K1 and equal to or larger than the second threshold K2. The temporary label is set for each of the candidate labels, and is a label of a different type from the candidate labels. That is, if the maximum reliability is smaller than the first threshold K1 and equal to or larger than the second threshold K2, the classification determination unit 38 associates the object image P with a temporary label other than the candidate labels, without classifying the object image P as any of the candidate labels. In other words, the classification determination unit 38 temporarily associates the object image P with an unknown new individual other than the individuals indicated by the candidate labels, without classifying the object image P as any of those individuals. For example, if the reliability (maximum reliability) of the candidate label F03 in FIG. 3 is smaller than the first threshold K1 and equal to or larger than the second threshold K2, the object image P is associated with a temporary label different from the candidate labels F01 to F05. - If the maximum reliability is smaller than the second threshold K2, the classification determination unit 38 does not associate the object image P with any of the candidate labels or the temporary label. That is, if the maximum reliability is smaller than the second threshold K2, the classification determination unit 38 does not classify the object image P. Further, if the maximum reliability is smaller than the second threshold K2, the classification determination unit 38 does not use the object image P as learning data as described later. However, the classification determination unit 38 may set only the first threshold K1 without setting the second threshold K2, and use the object image P as the learning data.
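The two-threshold determination described above can be sketched as a small function. K1 = 0.85 and K2 = 0.7 are the example values of the present embodiment; the function and return-value names are illustrative, not part of the specification:

```python
K1 = 0.85  # first threshold (example value of the present embodiment)
K2 = 0.70  # second threshold (smaller than K1)

def determine(reliabilities, k1=K1, k2=K2):
    """Determine the handling of one object image P from its reliability table.

    "assign"    -> classified as the candidate label with the maximum reliability
    "temporary" -> associated with the temporary label set for that candidate label
    "discard"   -> not classified and not used as learning data
    """
    candidate = max(reliabilities, key=reliabilities.get)
    max_reliability = reliabilities[candidate]
    if max_reliability >= k1:
        return "assign", candidate
    if max_reliability >= k2:
        return "temporary", candidate
    return "discard", candidate
```

For the reliability table in FIG. 3 (maximum reliability 0.86 for F03), this sketch returns ("assign", "F03").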
-
FIG. 5 is a diagram illustrating an example of a label table. The classification determination unit 38 performs the determination as described above for each of the object images P. The label assigning unit 40 stores the label that is determined as the official label and the object image P in an associated manner in the storage 22. FIG. 5 illustrates one example of the label table as information that indicates associations between the labels and the object images P. FIG. 5 illustrates an example in which determination is performed on an object image P1 to an object image P20. In the example in FIG. 5, the object images P1, P2, P10, P12, P13, and P16 are the object images P for which the candidate label F01 is determined as the official label F01. Further, the object images P3 and P19 are the object images P for which the candidate label F02 is determined as the official label F02, and the object image P18 is the object image P for which the candidate label F03 is determined as the official label F03. Furthermore, the object images P4, P9, P17, and P20 are the object images P for which the candidate label F04 is determined as the official label F04, and no object image P is present for which the candidate label F05 is determined as the official label F05. In this manner, object images P to which no labels are assigned are present among the object images P1 to P20. -
FIG. 6 is a diagram illustrating an example of a temporary label table. The classification determination unit 38 stores the temporary labels and the object images P in an associated manner in the storage 22. Here, the candidate label determined as having the maximum reliability may vary for each of the object images P. In this case, the classification determination unit 38 sets a temporary label for each of the candidate labels that are determined as having the maximum reliability. FIG. 6 illustrates one example of the temporary label table as information in which the temporary labels and the object images P are associated. For example, in FIG. 6, the classification determination unit 38 determines a temporary label F06 as the temporary label that is adopted when the candidate label F01 is determined as having the maximum reliability, and determines a temporary label F07 as the temporary label that is adopted when the candidate label F02 is determined as having the maximum reliability. That is, when the classification determination unit 38 associates the temporary labels, if the candidate labels determined as having the maximum reliability are different, the classification determination unit 38 associates different temporary labels (different individuals). Further, when the classification determination unit 38 associates the temporary labels, if the candidate labels determined as having the maximum reliability are the same, the classification determination unit 38 associates the same temporary label. In the example in FIG. 6, the object images P5, P8, P11, P14, and P15 are the object images P that are associated with the temporary label F06, and the object image P6 is the object image P that is associated with the temporary label F07.
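One way to realize the temporary label table in FIG. 6 is a mapping that creates a new temporary label the first time a given candidate label is determined as having the maximum reliability. The sketch below is hypothetical; the naming scheme (F06, F07, and so on) follows the example, and the helper names are not from the specification:

```python
temporary_label_of = {}  # candidate label -> temporary label
temporary_table = {}     # temporary label -> list of object image IDs
_next_number = [6]       # mutable counter; candidate labels F01 to F05 already exist

def associate_temporary(candidate_label, image_id):
    """Associate an object image with the temporary label set for candidate_label,
    creating a new temporary label (F06, F07, ...) on first use."""
    if candidate_label not in temporary_label_of:
        temporary_label_of[candidate_label] = "F{:02d}".format(_next_number[0])
        _next_number[0] += 1
    temp = temporary_label_of[candidate_label]
    temporary_table.setdefault(temp, []).append(image_id)
    return temp
```

Repeated calls with the same candidate label reuse the same temporary label, while a different candidate label yields a different temporary label, matching the behavior described above.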
- If the number of the object images P that are associated with the same temporary label becomes equal to or larger than a predetermined number, the classification determination unit 38 determines that the object images P are classified as the temporary label. Then, the classification determination unit 38 transmits a determination result indicating that the object images P are classified as the temporary label to the label assigning unit 40. The label assigning unit 40 assigns the temporary label as the label of the object images P. That is, the label assigning unit 40 adopts the temporary label as an official label, and determines that the object images P are classified as said label. In this manner, if the number of the object images P that are associated with the temporary label becomes equal to or larger than the predetermined number, the label assigning unit 40 assigns a label (the temporary label) different from the already set candidate labels to the object images P. Meanwhile, the predetermined number in this example is set to 5, for example, but is not limited thereto and may be set arbitrarily.
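Once a temporary label has accumulated the predetermined number of object images (5 in this example), it is adopted as an official label and its images become learning data. A hypothetical sketch of that promotion step, using the temporary label table of FIG. 6 as input:

```python
PREDETERMINED_NUMBER = 5  # example value of the present embodiment

# Temporary label table as in the example of FIG. 6.
temporary_table = {
    "F06": ["P5", "P8", "P11", "P14", "P15"],
    "F07": ["P6"],
}

def promote(table, n=PREDETERMINED_NUMBER):
    """Adopt as official every temporary label with at least n images, remove its
    entries from the table, and return the resulting (image, label) learning data."""
    learning_data = []
    # Materialize the promotable labels first so the table can be modified safely.
    for temp in [t for t, imgs in table.items() if len(imgs) >= n]:
        learning_data += [(img, temp) for img in table.pop(temp)]
    return learning_data

learning_data = promote(temporary_table)
# F06 (five images) is promoted; F07 (one image) remains temporary.
```

In this sketch F07 stays in the table because its image count is below the predetermined number, exactly as in the example of FIG. 7.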
-
FIG. 7 is a diagram illustrating an example of the label table. As described above, if a temporary label is adopted as an official label, the number of labels in the label table increases. FIG. 7 illustrates the label table in a case where the temporary label F06 is added as an official label to the label table in FIG. 5. As illustrated in FIG. 7, the temporary label F06 is set as the official label F06, and the object images P5, P8, P11, P14, and P15 are the object images P that are classified as the label F06. The temporary label F07 is not adopted as an official label because the number of associated object images P is smaller than the predetermined number. - Referring back to
FIG. 1, the learning data generation unit 42 generates learning data for updating the learned model based on the determination result obtained by the classification determination unit 38. If the number of the object images P that are associated with the temporary label by the classification determination unit 38 becomes equal to or larger than the predetermined number, the learning data generation unit 42 adopts the object images P and the temporary label as the learning data. That is, if the number of the object images P that are associated with the temporary label becomes equal to or larger than the predetermined number and the temporary label is therefore assigned as the official label, the learning data generation unit 42 generates the learning data by associating the temporary label with each of the object images P. Here, the learning data indicates supervised data and is data that includes information indicating that the object images P are classified as the temporary label. In the example in FIG. 6, the object images P5, P8, P11, P14, and P15 are associated with the temporary label F06, and adopted as the learning data. - The learning data transmission controller 44 illustrated in
FIG. 1 controls the communication unit 24 and transmits the learning data generated by the learning data generation unit 42 to the learning apparatus 12. The learning data generation unit 42 deletes the object images P that are used as the learning data from the temporary label table stored in the storage 22. That is, in the example in FIG. 6, the object images P5, P8, P11, P14, and P15 that are associated with the temporary label F06 and that are adopted as the learning data are deleted. Therefore, only the object image P6 associated with the temporary label F07 remains in the temporary label table. However, if any of the object images P is adopted as the learning data, the learning data generation unit 42 may delete all of the object images P in the temporary label table. In this case, the object image P6 associated with the temporary label F07 is also deleted. In this case, when the next classification is to be started, it is possible to perform classification and set a temporary label with high accuracy by using a new learned model in which the learning data is reflected. In contrast, if only the object images P that are used as the learning data are deleted, the object images P associated with other temporary labels remain. Therefore, in this case, the number of images needed to reach the predetermined number is small, so that it is possible to generate the learning data more promptly. - The learning data generation apparatus 10 is configured as described above. Next, the
learning apparatus 12 will be described. As illustrated in FIG. 1, the learning apparatus 12 is an apparatus (a server) that is installed at a different position from the learning data generation apparatus 10. The learning apparatus 12 includes a communication unit 50, a storage 52, and a controller 54. Meanwhile, the learning apparatus 12 may include, for example, an input unit that allows a user to perform input and an output unit that is able to output information. In this case, the input unit may be a mouse, a keyboard, a touch panel, or the like. The output unit is, for example, a display and is able to display a captured image or the like. - The
communication unit 50 transmits and receives data by performing communication with an external apparatus, e.g., the learning data generation apparatus 10 in this example, under the control of the controller 54. The communication unit 50 is, for example, an antenna, and transmits and receives data to and from the learning data generation apparatus 10 via radio communication using a wireless LAN, Wi-Fi (registered trademark), Bluetooth (registered trademark), or the like. However, the communication unit 50 may be connected to the learning data generation apparatus 10 via a cable, and transmit and receive information via wired communication. - The
storage 52 is a memory for storing information on calculation contents and programs of the controller 54. The storage 52 includes at least one storage device, such as a random access memory (RAM), a read only memory (ROM), or a flash memory, for example. - The
controller 54 is an arithmetic device, i.e., a central processing unit (CPU). The controller 54 includes a learning data acquisition unit 60, a learning unit 62, and a learned model transmission controller 64. The learning data acquisition unit 60, the learning unit 62, and the learned model transmission controller 64 perform processes as described later by reading software (a program) stored in the storage 52. - The learning data acquisition unit 60 controls the
communication unit 50 and acquires the learning data generated by the learning data generation unit 42 from the communication unit 24 of the learning data generation apparatus 10. The learning data acquisition unit 60 stores the acquired learning data in the storage 52. - The
learning unit 62 updates the learned model by learning. The learning unit 62 reads the learned model and the learned learning data that are stored in advance in the storage 52, and reads the new learning data that is acquired by the learning data acquisition unit 60. The learning unit 62 updates the learned model by causing the learned model to learn the learned learning data and the new learning data as supervised data. -
FIG. 8 and FIG. 9 are examples of a learned data table. FIG. 8 is one example of the learned data table before the update. The learned data table is supervised data that is stored in the storage 52, and is information in which learned images are associated with labels. That is, the learned data table is a data group that includes multiple pieces of teaching data indicating the labels into which the learned images are classified. The learned model is learned and constructed by using each piece of the data in the learned data table as teaching data. -
FIG. 8 illustrates one example of the learned data table before being updated with the learning data obtained from the learning data generation unit 42. In the learned data table illustrated in FIG. 8, learned images P101 to P200 are classified as the label F01, learned images P201 to P300 are classified as the label F02, learned images P301 to P400 are classified as the label F03, learned images P401 to P500 are classified as the label F04, and learned images P501 to P600 are classified as the label F05. The learned images P101 to P600 are face images that are extracted in advance, and are images for which the labels to be classified are set in advance without being classified by the learning data generation apparatus 10. That is, the learned model is constructed by being supplied with the learned images P101 to P600 in advance as supervised data. -
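Assuming the learned data table is held as a simple label-to-images mapping, the update described next for FIG. 9 (adding the object images of the learning data under the new label) could be sketched as follows; the abbreviated lists stand in for the full image ranges above, and all names are illustrative:

```python
# Learned data table before the update (abbreviated: each list stands for the
# full range of learned images, e.g. P101 to P200 for the label F01).
learned_data_table = {
    "F01": ["P101", "P200"],
    "F02": ["P201", "P300"],
    "F03": ["P301", "P400"],
    "F04": ["P401", "P500"],
    "F05": ["P501", "P600"],
}

def update_learned_data_table(table, learning_data):
    """Add each (image, label) pair of the learning data to the table; a label
    not yet present (e.g. the temporary label F06) becomes a new entry."""
    for image, label in learning_data:
        table.setdefault(label, []).append(image)
    return table

# Learning data for the promoted temporary label F06 (example of FIG. 6).
learning_data = [("P5", "F06"), ("P8", "F06"), ("P11", "F06"),
                 ("P14", "F06"), ("P15", "F06")]
update_learned_data_table(learned_data_table, learning_data)
```

After the update, the table contains the new label F06 with its five learned images, corresponding to the learned data table of FIG. 9.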
FIG. 9 illustrates one example of the learned data table after being updated with the learning data obtained from the learning data generation unit 42. As illustrated in FIG. 9, the learning unit 62 updates the learned data table by adding the learning data to the learned data table that has not yet been updated. That is, the learning unit 62 updates the learned data table by adding the object images P of the learning data as learned images and the temporary label as a new label in the learned data table. That is, the learning data is supervised data that indicates an association between the temporary label and the object images P. In the example in FIG. 9, the new learned images (object images) P5, P8, P11, P14, and P15 included in the learning data are added to the learned data table, and the new label F06 is associated with those learned images. The learning unit 62 updates the learned model by using the learned data table to which the new learning data is added as described above. The learning unit 62 stores the updated learned model in the storage 52. - Referring back to
FIG. 1, the learned model transmission controller 64 controls the communication unit 50 and transmits the updated learned model to the learning data generation apparatus 10. In the learning data generation apparatus 10, the learned model acquisition unit 34 reads the updated learned model and performs classification of the next object image P and generation of the learning data. - The
learning apparatus 12 is a server that is installed at a position separated from the learning data generation apparatus 10 as described above. However, the learning apparatus 12 may be incorporated in the learning data generation apparatus 10. That is, the object classification system 1 may be configured such that the learning apparatus 12 is not included but the learning data acquisition unit 60 and the learning unit 62 are incorporated in the controller 26 of the learning data generation apparatus 10. In this case, the learning data transmission controller 44 is also not needed. Further, in the present embodiment, the learning data generation apparatus 10 causes the classification determination unit 38 to perform determination, but the learning apparatus 12 may include the classification determination unit 38 and perform determination. Furthermore, the learning apparatus 12 may include the classification evaluation unit 36 and calculate the reliability. - Moreover, the object classification system 1 may of course be implemented as an apparatus using hardware, such as a CPU, an FPGA, an ASIC, or a memory, and may be implemented by firmware stored in a ROM, a flash memory, or the like, or by software, such as a computer program. A firmware program and a software program may be provided by being recorded in a medium that is readable by a computer or the like, may be transmitted and received to and from a server via a wired or wireless network, or may be transmitted and received as data broadcasting via terrestrial broadcasting or satellite digital broadcasting.
- A processing flow of processes performed by the object classification system 1 as described above will be described using a flowchart.
FIG. 10 is a flowchart for explaining a flow of processes performed by the controller of the learning data generation apparatus. As illustrated inFIG. 10 , thecontroller 26 of the learning data generation apparatus 10 causes theimage acquisition unit 30 to acquire an image captured by the imager 20 (Step S10), and causes theobject extraction unit 32 to extract a single object image P from the acquired image (Step S12). - The
controller 26 causes the classification evaluation unit 36 to calculate the reliability for each of the candidate labels of the object image P based on the learned model (Step S16). Then, the controller 26 causes the classification determination unit 38 to determine whether the maximum reliability is equal to or larger than the first threshold K1 (Step S18). The maximum reliability is the reliability of the candidate label with the highest reliability among the candidate labels. If the maximum reliability is equal to or larger than the first threshold K1 (Step S18; Yes), the classification determination unit 38 determines that the object image P is classified as the candidate label with the maximum reliability, and the label assigning unit 40 determines the candidate label with the maximum reliability as the label of the object image P (Step S20). After the candidate label with the maximum reliability is determined as the label, the process proceeds to Step S31 described later. - If the maximum reliability is not equal to or larger than the first threshold K1 (Step S18; No), that is, if the maximum reliability is smaller than the first threshold K1, the classification determination unit 38 determines whether the maximum reliability is equal to or larger than the second threshold K2 (Step S22). If the maximum reliability is not equal to or larger than the second threshold K2 (Step S22; No), that is, if the maximum reliability is smaller than the second threshold K2, the process proceeds to Step S31. If the maximum reliability is equal to or larger than the second threshold K2 (Step S22; Yes), the classification determination unit 38 assigns a temporary label to the object image P (Step S24).
Then, the classification determination unit 38 determines whether the number of the object images P to which the temporary label is assigned is equal to or larger than a predetermined number (Step S26). If the number is not equal to or larger than the predetermined number (Step S26; No), the process proceeds to Step S31. In contrast, if the number of the object images P to which the temporary label is assigned is equal to or larger than the predetermined number (Step S26; Yes), the label assigning unit 40 determines the temporary label as the label (Step S28), and the learning data generation unit 42 adopts the temporary label and the object images P as the learning data (Step S30). Thereafter, the process proceeds to Step S31, at which the object extraction unit 32 determines whether other object images P are present in the image (Step S31). If other object images P are present (Step S31; Yes), the process returns to Step S12, and the object extraction unit 32 extracts one of the other object images P. If no other object images P are present (Step S31; No), the process proceeds to Step S32, and the controller 26 determines whether other images are present (Step S32). If other images are present (Step S32; Yes), the process returns to Step S10. If no other images are present (Step S32; No), the process is ended. Thereafter, the controller 26 causes the learning data transmission controller 44 to transmit the learning data to the learning apparatus 12, and the learning apparatus 12 updates the learned model based on the learning data. The controller 26 subsequently performs the above-described process using the updated learned model. - As described above, the learning data generation apparatus 10 according to the present embodiments includes the image acquisition unit 30 that acquires the image, the object extraction unit 32 that extracts the object image P from the image, the classification evaluation unit 36, the classification determination unit 38, and the learning data generation unit 42. The classification evaluation unit 36 evaluates the object image P based on the learned model, and calculates the reliability indicating the degree of the possibility that the object image P is classified as the candidate label. The classification determination unit 38 associates a temporary label that is different from the candidate label with the object image P if the reliability is smaller than the first threshold K1 and equal to or larger than the second threshold K2, which is smaller than the first threshold K1. Then, the learning data generation unit 42 generates the learning data based on the object image that is associated with the temporary label. Further, the classification determination unit 38 determines whether the object image P is classified as the candidate label by performing the determination based on the reliability. The learning data generation unit 42 generates the learning data for updating the learned model based on the determination result of the classification determination unit 38. The classification determination unit 38 determines that the object image P is classified as the candidate label if the reliability is equal to or larger than the first threshold K1, and associates the temporary label that is set for the candidate label with the object image P if the reliability is smaller than the first threshold K1 and equal to or larger than the second threshold K2. If the number of the object images P associated with the temporary label becomes equal to or larger than the predetermined number, the learning data generation unit 42 adopts each of the object images P and the temporary label as the learning data. 
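The two-threshold decision summarized above can be sketched in a few lines. This is an illustrative reconstruction, not code from the patent; the function name, the threshold values K1 = 0.9 and K2 = 0.6, and the `tmp_` prefix for temporary labels are all assumptions made for the example.

```python
# Hypothetical sketch of the two-threshold labeling rule (Steps S18-S24).
# Threshold values and naming are illustrative assumptions.
def decide_label(reliabilities, k1=0.9, k2=0.6):
    """Map candidate-label reliabilities to one of three outcomes:
    ('label', name)     -- reliability >= K1: classified, official label
    ('temporary', name) -- K2 <= reliability < K1: temporary label
    ('discard', None)   -- reliability < K2: not adopted
    """
    best_label = max(reliabilities, key=reliabilities.get)
    best = reliabilities[best_label]  # the "maximum reliability"
    if best >= k1:
        return ("label", best_label)
    if best >= k2:
        return ("temporary", "tmp_" + best_label)
    return ("discard", None)
```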
With this configuration, it is possible to assign the temporary label and improve the accuracy of the learning data. Further, if the reliability becomes equal to or larger than the first threshold K1, the candidate label and the object image P are not adopted as the learning data, so that images the learned model already classifies are excluded in advance and the efficiency of the machine learning improves. - Here, when a learned model is constructed by machine learning such as deep learning, supervised data for which a solution is known may be used. In this case, accuracy increases with an increase in the number of pieces of supervised data, so the supervised data needs to be collected appropriately. The learning data generation apparatus 10 according to the present embodiments analyzes the object image P to be classified and assigns a temporary label. Then, the object image P is determined to be classified as the temporary label and adopted as the learning data (supervised data). Because the learning data generation apparatus 10 generates the learning data from the very images to be classified, it can appropriately generate learning data suitable for the object to be classified.
- Furthermore, by assigning a new label as the temporary label and performing supervised learning on the images that could not be classified by the learned model before the update, the learning data generation apparatus 10 is able to increase the number of classifications after the update (after relearning). In the present embodiment, the values of the first threshold K1 and the second threshold K2 for assigning the temporary label are kept constant even when the learned model is updated. However, the learning data generation apparatus 10 may change the values of the first threshold K1 and the second threshold K2 after the update of the learned model. For example, if the number of the candidate labels exceeds a predetermined number due to the update of the learned model, the learning data generation apparatus 10 may reduce the values of the first threshold K1 and the second threshold K2. With this configuration, even if the number of classifications increases, the classification can still be performed in a preferred manner.
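The optional threshold reduction described above can be sketched as follows. The predetermined label count and the scaling factor are assumptions for illustration, not values from the patent.

```python
# Illustrative sketch: reduce K1 and K2 once the number of candidate labels
# exceeds a predetermined number after a model update. The predetermined
# number (10) and the scaling factor (0.9) are assumed values.
def adjust_thresholds(k1, k2, num_candidate_labels, predetermined=10, factor=0.9):
    if num_candidate_labels > predetermined:
        return k1 * factor, k2 * factor
    return k1, k2
```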
- Furthermore, the classification evaluation unit 36 calculates the reliability for candidate labels of multiple kinds. The classification determination unit 38 sets a temporary label for each of the candidate labels, and associates the object image P with the temporary label that is set for the candidate label with the highest reliability among the multiple candidate labels. Therefore, the learning data generation apparatus 10 can appropriately perform the classification, appropriately assign the temporary label, and improve the accuracy of the learning data. - Moreover, the learning data generation apparatus 10 includes the label assigning unit 40 that assigns the label to the object image P. If it is determined that the object image P is classified as the candidate label, the label assigning unit 40 assigns the candidate label as the label of the object image P. If the number of the object images P associated with the temporary label becomes equal to or larger than the predetermined number, the label assigning unit 40 assigns the temporary label as the label of those object images P. The learning data generation apparatus 10 thus adopts a label immediately when its reliability is equal to or larger than the first threshold K1, and adopts a label whose reliability is smaller than the first threshold K1 and equal to or larger than the second threshold K2 only after multiple pieces of data have accumulated. Therefore, it is possible to improve accuracy in setting an unknown label.
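The accumulate-then-promote behavior of the label assigning unit (Steps S26-S28) might be sketched as below. The class name, method names, and promotion count are hypothetical; only the overall rule comes from the description.

```python
from collections import defaultdict

# Hypothetical sketch: collect object images under each temporary label and
# promote the temporary label to an official label once a predetermined
# number of images has accumulated (Steps S26-S28).
class TemporaryLabelPool:
    def __init__(self, predetermined_number=3):
        self.predetermined_number = predetermined_number
        self.pool = defaultdict(list)

    def add(self, temporary_label, object_image):
        """Return the accumulated images when the label is promoted
        (they become learning data), or None while still collecting."""
        self.pool[temporary_label].append(object_image)
        if len(self.pool[temporary_label]) >= self.predetermined_number:
            return self.pool.pop(temporary_label)
        return None
```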
- Furthermore, the learning model generation system (the object classification system 1) according to the present embodiments includes the learning data generation apparatus 10 and the learning unit 62 that performs the machine learning based on the learning data generated by the learning data generation unit 42 and updates the learned model. The learning model generation system generates the learned model based on the learning data generated by the learning data generation apparatus 10. Because the learning model is updated based on this learning data, it is possible to construct a learning model that is suitable for the object to be classified. - Moreover, the learning unit 62 performs the machine learning by deep learning. Therefore, the learning model generation system is able to construct a highly accurate learning model. - Furthermore, the object classification system 1 according to the present embodiment includes the learning data generation apparatus 10 and classifies the object image P by using the learned model. Therefore, the object classification system 1 is able to appropriately classify the object image P.
-
FIG. 11 is a diagram for explaining the determination performed by the classification determination unit according to another example of the present embodiment. In the present embodiment, the classification determination unit 38 sets a single temporary label for a single candidate label, but multiple temporary labels may be set for a single candidate label. The example in FIG. 11 is an example in which temporary labels A and B are assigned to a single candidate label. As illustrated in FIG. 11, if the maximum reliability is smaller than the first threshold K1 and equal to or larger than a threshold K2A, the classification determination unit 38 associates the temporary label B with the object image P. The threshold K2A is smaller than the first threshold K1 and larger than the second threshold K2. Further, if the maximum reliability is smaller than the threshold K2A and equal to or larger than the second threshold K2, the classification determination unit 38 associates the temporary label A with the object image P. In this manner, the classification determination unit 38 may divide the range assigned for the maximum reliability of a single candidate label into multiple divided ranges, and set a temporary label for each of the divided ranges. Meanwhile, one of the first threshold K1 and the second threshold K2 may be set to the same value as the threshold K2A. - Here, for example, when a classification at the first threshold K1 is not always reliable, as in a case where the number of the learned images in the learned model is smaller than a predetermined number, the temporary label need not be assigned to the object image P even if the maximum reliability is smaller than the first threshold K1 and equal to or larger than the threshold K2A.
- In this manner, if the reliability is smaller than the first threshold K1 and equal to or larger than an intermediate threshold (the threshold K2A), which is smaller than the first threshold K1 and larger than the second threshold K2, the classification determination unit 38 associates a first temporary label (the temporary label B) that is different from the candidate label with the object image P. If the reliability is equal to or larger than the second threshold K2 and smaller than the intermediate threshold (the threshold K2A), the classification determination unit 38 associates a second temporary label (the temporary label A) that is different from the first temporary label with the object image P. By setting a temporary label for each of the divided ranges as described above, it is possible to perform the classification with high accuracy.
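The divided temporary-label ranges of FIG. 11 map directly onto interval checks. In this illustrative sketch the threshold values are assumptions; the only requirement from the description is K2 < K2A < K1.

```python
# Illustrative sketch of the divided temporary-label ranges of FIG. 11.
# Threshold values are assumed; the description only requires K2 < K2A < K1.
def temporary_label(reliability, k1=0.9, k2a=0.75, k2=0.6):
    if k2a <= reliability < k1:
        return "B"   # upper divided range -> first temporary label B
    if k2 <= reliability < k2a:
        return "A"   # lower divided range -> second temporary label A
    return None      # outside the temporary-label range
```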
-
FIG. 12 is a diagram for explaining the determination performed by the classification determination unit in another example of the present embodiment. In the present embodiment, if the maximum reliability is larger than the first threshold K1, the classification determination unit 38 assigns the label but does not use the object image as the learning data. However, the classification determination unit 38 may use, as the learning data, the object image to which the label is assigned. In this case, as illustrated in FIG. 12, the classification determination unit 38 assigns, as the official label, the candidate label with the maximum reliability to the object image P if the maximum reliability is equal to or larger than the first threshold K1 and smaller than 1. Furthermore, the classification determination unit 38 uses the object image P and the candidate label with the maximum reliability as the learning data if the maximum reliability is equal to or larger than the first threshold K1 and smaller than a third threshold K3. The third threshold K3 is larger than the first threshold K1. In contrast, if the maximum reliability is equal to or larger than the third threshold K3, the classification determination unit 38 assigns the candidate label with the maximum reliability as the official label to the object image P, but does not use the object image P and the candidate label as the learning data.
FIG. 13 is a diagram illustrating an example of the learned data table in another example of the present embodiment. FIG. 13 illustrates an example in which the candidate label F01 is the candidate label with the maximum reliability for the object images P12, P13, and P16, and in which the maximum reliability is equal to or larger than the first threshold K1 and smaller than the third threshold K3. In this case, the object images P12, P13, and P16 are associated with the candidate label F01 and added, as learning data (supervised data), to the learned data table. Therefore, the learned data table illustrated in FIG. 13 includes data indicating that the learned images P12, P13, and P16 are associated with the label F01, in addition to the data illustrated in FIG. 10. Consequently, the learning unit 62 is able to construct a learned model with high classification accuracy by using the learned images P12, P13, and P16. - In this manner, if the reliability is equal to or larger than the first threshold K1 and smaller than the third threshold K3, which is larger than the first threshold K1, the classification determination unit 38 associates the object image P with the candidate label. Then, the learning data generation unit 42 generates the learning data based on the object image P associated with the temporary label or the candidate label. In other words, the learning data generation unit 42 may adopt, as the learning data, the object image P and a candidate label whose reliability is equal to or larger than the first threshold K1 and smaller than the third threshold K3, and may refrain from adopting, as the learning data, the object image P and a candidate label whose reliability is equal to or larger than the third threshold K3. The learning data generation apparatus 10 determines that data is not needed for new learning and does not adopt it as learning data if the reliability is adequately high, e.g., equal to or larger than the third threshold K3, and determines that data is suitable for improving the accuracy of the learning model and uses it as learning data if the reliability is adequate but not as high as the third threshold K3. Therefore, by using only data that is appropriate for the learning data, the learning data generation apparatus 10 is able to appropriately generate learning data that is suitable for the object to be classified. - Furthermore, the classification determination unit 38 may change at least one of the first threshold K1 and the second threshold K2 in accordance with the number of images (learned images) that are used for the learned model. More specifically, the classification determination unit 38 may change at least any of the first threshold K1, the second threshold K2, the threshold K2A, and the third threshold K3 in accordance with the number of images (learned images) that are used for the learned model. 
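The third-threshold rule of FIG. 12 can be condensed into two comparisons. This is an illustrative sketch; the function name and the threshold values are assumptions, not values from the patent.

```python
# Illustrative sketch of the third-threshold rule of FIG. 12: images with
# reliability in [K1, K3) receive the official label AND are reused as
# learning data; images at or above K3 receive the label only, since
# relearning on them adds little. Threshold values are assumed.
def classify_with_k3(reliability, k1=0.9, k3=0.98):
    assign_official_label = reliability >= k1
    use_as_learning_data = k1 <= reliability < k3
    return assign_official_label, use_as_learning_data
```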
With this configuration, it is possible to appropriately change the threshold depending on a change in the number of the images, so that it is possible to appropriately generate the learning data that is suitable for the object to be classified.
- According to the present application, it is possible to appropriately collect learning data that is suitable for an object to be classified.
- Although the application has been described with respect to specific embodiments for a complete and clear application, the appended claims are not to be thus limited but are to be construed as embodying all modifications and alternative constructions that may occur to one skilled in the art that fairly fall within the basic teaching herein set forth.
Claims (7)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2018-042291 | 2018-03-08 | ||
JP2018042291A JP6933164B2 (en) | 2018-03-08 | 2018-03-08 | Learning data creation device, learning model creation system, learning data creation method, and program |
PCT/JP2019/009536 WO2019172451A1 (en) | 2018-03-08 | 2019-03-08 | Learning data creation device, learning model creation system, learning data creation method, and program |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2019/009536 Continuation WO2019172451A1 (en) | 2018-03-08 | 2019-03-08 | Learning data creation device, learning model creation system, learning data creation method, and program |
Publications (2)
Publication Number | Publication Date |
---|---|
US20200387756A1 true US20200387756A1 (en) | 2020-12-10 |
US11922317B2 US11922317B2 (en) | 2024-03-05 |
Family
ID=67847274
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/001,716 Active 2041-04-03 US11922317B2 (en) | 2018-03-08 | 2020-08-25 | Learning data generation apparatus, learning model generation system, learning data generation method, and non-transitory storage medium |
Country Status (4)
Country | Link |
---|---|
US (1) | US11922317B2 (en) |
JP (3) | JP6933164B2 (en) |
CN (1) | CN111868780B (en) |
WO (1) | WO2019172451A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210264260A1 (en) * | 2020-02-21 | 2021-08-26 | Samsung Electronics Co., Ltd. | Method and device for training neural network |
US20220414368A1 (en) * | 2021-06-25 | 2022-12-29 | International Business Machines Corporation | Emotional response evaluation for presented images |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102349854B1 (en) * | 2019-12-30 | 2022-01-11 | 엘아이지넥스원 주식회사 | System and method for tracking target |
KR20220018469A (en) * | 2020-08-01 | 2022-02-15 | 센스타임 인터내셔널 피티이. 리미티드. | Target object recognition method and device |
WO2022249572A1 (en) * | 2021-05-26 | 2022-12-01 | ソニーグループ株式会社 | Image processing device, image processing method, and recording medium |
CN117916582A (en) | 2021-09-07 | 2024-04-19 | 明答克株式会社 | Defect classification system |
TWI793865B (en) | 2021-11-18 | 2023-02-21 | 倍利科技股份有限公司 | System and method for AI automatic auxiliary labeling |
WO2023145164A1 (en) * | 2022-01-28 | 2023-08-03 | 株式会社Jvcケンウッド | Image classification device, image classification method, and image classification program |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110243426A1 (en) * | 2010-03-04 | 2011-10-06 | Yi Hu | Method, apparatus, and program for generating classifiers |
US20140079297A1 (en) * | 2012-09-17 | 2014-03-20 | Saied Tadayon | Application of Z-Webs and Z-factors to Analytics, Search Engine, Learning, Recognition, Natural Language, and Other Utilities |
US20170083791A1 (en) * | 2014-06-24 | 2017-03-23 | Olympus Corporation | Image processing device, endoscope system, and image processing method |
US20180121765A1 (en) * | 2016-11-02 | 2018-05-03 | Canon Kabushiki Kaisha | Information processing apparatus, information processing method, and storage medium |
US20190311301A1 (en) * | 2018-04-10 | 2019-10-10 | Ebay Inc. | Dynamically generated machine learning models and visualization thereof |
US20200012899A1 (en) * | 2017-03-21 | 2020-01-09 | Nec Corporation | Image processing device, image processing method, and storage medium |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4591215B2 (en) * | 2005-06-07 | 2010-12-01 | 株式会社日立製作所 | Facial image database creation method and apparatus |
CN101295305B (en) * | 2007-04-25 | 2012-10-31 | 富士通株式会社 | Image retrieval device |
JP4623387B2 (en) * | 2008-09-04 | 2011-02-02 | ソニー株式会社 | Learning device and method, recognition device and method, and program |
JP5406705B2 (en) * | 2009-12-28 | 2014-02-05 | キヤノン株式会社 | Data correction apparatus and method |
JP5214762B2 (en) * | 2011-03-25 | 2013-06-19 | 株式会社東芝 | Recognition device, method and program |
JP2013125322A (en) * | 2011-12-13 | 2013-06-24 | Olympus Corp | Learning device, program and learning method |
JP6188400B2 (en) * | 2013-04-26 | 2017-08-30 | オリンパス株式会社 | Image processing apparatus, program, and image processing method |
JP6110281B2 (en) * | 2013-11-19 | 2017-04-05 | 日本電信電話株式会社 | Moving unit prediction model generation apparatus, moving unit prediction model generation method, and moving unit prediction model generation program |
JP2016099668A (en) * | 2014-11-18 | 2016-05-30 | キヤノン株式会社 | Learning method, learning device, image recognition method, image recognition device and program |
WO2016103651A1 (en) * | 2014-12-22 | 2016-06-30 | 日本電気株式会社 | Information processing system, information processing method and recording medium |
US20170039469A1 (en) * | 2015-08-04 | 2017-02-09 | Qualcomm Incorporated | Detection of unknown classes and initialization of classifiers for unknown classes |
JP6489005B2 (en) * | 2015-12-18 | 2019-03-27 | キヤノンマーケティングジャパン株式会社 | Information processing system, information processing method, and program |
JP6874757B2 (en) * | 2016-02-24 | 2021-05-19 | 日本電気株式会社 | Learning equipment, learning methods and programs |
JP6364037B2 (en) * | 2016-03-16 | 2018-07-25 | セコム株式会社 | Learning data selection device |
2018
- 2018-03-08 JP JP2018042291A patent/JP6933164B2/en active Active
2019
- 2019-03-08 CN CN201980017058.8A patent/CN111868780B/en active Active
- 2019-03-08 WO PCT/JP2019/009536 patent/WO2019172451A1/en active Application Filing
2020
- 2020-08-25 US US17/001,716 patent/US11922317B2/en active Active
2021
- 2021-08-17 JP JP2021132656A patent/JP7239853B2/en active Active
2023
- 2023-02-28 JP JP2023029459A patent/JP2023065548A/en active Pending
Non-Patent Citations (4)
Title |
---|
Fabian Gieseke, "Convolutional neural networks for transient candidate vetting in large-scale surveys," Aug. 23, 2017, MNRAS 472, pages 3102-3110. *
Junwei Han, "Object Detection in Optical Remote Sensing Images Based on Weakly Supervised Learning and High-Level Feature Learning," May 23, 2014, IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 6, June 2015, pages 3325-3332. *
Junwei Han, "Object Detection in Optical Remote Sensing Images Based on Weakly Supervised Learning and High-Level Feature Learning," May 23, 2014, IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 6, June 2015, pages 3325-3332. *
Yang Long, "Accurate Object Localization in Remote Sensing Images Based on Convolutional Neural Networks," Jan. 19, 2017, IEEE Transactions on Geoscience and Remote Sensing, vol. 55, no. 5, May 2017, pages 2486-2495. *
Also Published As
Publication number | Publication date |
---|---|
JP2019159499A (en) | 2019-09-19 |
US11922317B2 (en) | 2024-03-05 |
CN111868780A (en) | 2020-10-30 |
WO2019172451A1 (en) | 2019-09-12 |
JP7239853B2 (en) | 2023-03-15 |
JP2021184299A (en) | 2021-12-02 |
JP6933164B2 (en) | 2021-09-08 |
CN111868780B (en) | 2023-07-28 |
JP2023065548A (en) | 2023-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11922317B2 (en) | Learning data generation apparatus, learning model generation system, learning data generation method, and non-transitory storage medium | |
CN108229509B (en) | Method and device for identifying object class and electronic equipment | |
US9251588B2 (en) | Methods, apparatuses and computer program products for performing accurate pose estimation of objects | |
US8750573B2 (en) | Hand gesture detection | |
US8792722B2 (en) | Hand gesture detection | |
CN112183153A (en) | Object behavior detection method and device based on video analysis | |
CN103605972A (en) | Non-restricted environment face verification method based on block depth neural network | |
US11107231B2 (en) | Object detection device, object detection method, and object detection program | |
US11893727B2 (en) | Rail feature identification system | |
CN110633643A (en) | Abnormal behavior detection method and system for smart community | |
CN111414946A (en) | Artificial intelligence-based medical image noise data identification method and related device | |
JP2020135551A (en) | Object recognition device, object recognition method and object recognition program | |
CN111488850A (en) | Neural network-based old people falling detection method | |
Iosifidis et al. | Neural representation and learning for multi-view human action recognition | |
JP2019215728A (en) | Information processing apparatus, information processing method and program | |
US11715032B2 (en) | Training a machine learning model using a batch based active learning approach | |
CN113076963B (en) | Image recognition method and device and computer readable storage medium | |
CN111178134B (en) | Tumble detection method based on deep learning and network compression | |
KR101556696B1 (en) | Method and system for recognizing action of human based on unit operation | |
KR101503398B1 (en) | Method and Apparatus for classifying the moving objects in video | |
CN111553202A (en) | Training method, detection method and device of neural network for detecting living body | |
Cheng et al. | Automatic Data Cleaning System for Large-Scale Location Image Databases Using a Multilevel Extractor and Multiresolution Dissimilarity Calculation | |
US20230368575A1 (en) | Access control with face recognition and heterogeneous information | |
CN117133053A (en) | Machine vision-based Gao Tieke station scene downlink man multi-feature fall detection method | |
CN116980976A (en) | Data transparent transmission method based on 4G communication module |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment | Owner name: JVCKENWOOD CORPORATION, JAPAN; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TAKEHARA, HIDEKI;REEL/FRAME:053582/0553; Effective date: 20200817
FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS
STPP | Information on status: patent application and granting procedure in general | Free format text: AWAITING TC RESP., ISSUE FEE NOT PAID
STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS
STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED
STCF | Information on status: patent grant | Free format text: PATENTED CASE