US20240119723A1 - Information processing device, and selection output method - Google Patents

Information processing device, and selection output method

Info

Publication number
US20240119723A1
Authority
US
United States
Prior art keywords
learning data
pieces
object detection
unlabeled
unlabeled learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US18/273,278
Inventor
Jia Qu
Shoichi Shimizu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mitsubishi Electric Corp
Original Assignee
Mitsubishi Electric Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mitsubishi Electric Corp filed Critical Mitsubishi Electric Corp
Assigned to MITSUBISHI ELECTRIC CORPORATION reassignment MITSUBISHI ELECTRIC CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: QU, Jia, SHIMIZU, SHOICHI
Publication of US20240119723A1 publication Critical patent/US20240119723A1/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/091Active learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/87Arrangements for image or video recognition or understanding using pattern recognition or machine learning using selection of the recognition techniques, e.g. of a classifier in a multiple classifier system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/7753Incorporation of unlabelled data, e.g. multiple instance learning [MIL]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776Validation; Performance evaluation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/70Labelling scene content, e.g. deriving syntactic or semantic representations

Definitions

  • FIG. 7 is a block diagram showing functions of an information processing device in the second embodiment. Each component in FIG. 7 that is the same as a component shown in FIG. 1 is assigned the same reference character as in FIG. 1 .
  • the information processing device 100 relearns the learned models 200 a and 200 b . Details of the relearning will be described later.
  • FIG. 8 is a flowchart showing an example of the process executed by the information processing device in the second embodiment.
  • Step S 11 The acquisition unit 120 acquires the labeled learning data.
  • the data amount of the labeled learning data may be small.
  • the learning units 130 a and 130 b generate the learned models 200 a and 200 b by executing the object detection learning in methods different from each other by using the labeled learning data.
  • Step S 12 The acquisition unit 120 acquires a plurality of pieces of unlabeled learning data.
  • the object detection unit 140 executes the object detection by using the plurality of pieces of unlabeled learning data and the learned models 200 a and 200 b.
  • Step S 13 The calculation unit 150 calculates a plurality of information amount scores corresponding to the plurality of pieces of unlabeled learning data based on a plurality of object detection results.
  • Step S 14 The selection output unit 160 selects unlabeled learning data having great learning effect from the plurality of pieces of unlabeled learning data based on the plurality of information amount scores.
  • Step S 15 The selection output unit 160 outputs the selected unlabeled learning data (i.e., selected images). For example, the selection output unit 160 outputs the selected images as illustrated in FIG. 5 or FIG. 6.
  • the labeling worker executes the labeling by using the selected images.
  • labeled learning data is generated.
  • the labeled learning data includes the selected images, at least one region of an object as a detection target in the images, and a label indicating the type of the object.
  • the labeled learning data may be stored in the first storage unit 111 .
  • the labeling work may also be executed by an external device.
  • Step S 16 The acquisition unit 120 acquires the labeled learning data.
  • the acquisition unit 120 acquires the labeled learning data from the first storage unit 111 , for example.
  • the acquisition unit 120 acquires the labeled learning data from the external device, for example.
  • Step S 17 The learning units 130 a and 130 b relearn the learned models 200 a and 200 b by using the labeled learning data.
  • Step S 18 The information processing device 100 judges whether a termination condition of the learning is satisfied or not.
  • the termination condition has been stored in the nonvolatile storage device 103 , for example.
  • When the termination condition is satisfied, the process ends.
  • When the termination condition is not satisfied, the process advances to the step S 12.
  • the information processing device 100 is capable of increasing the object detection accuracy of the learned models by repeating the addition of labeled learning data and the relearning.
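  • As a rough illustration, the repeated cycle of FIG. 8 (steps S 11 to S 18) can be strung together as in the following Python sketch; every *_fn callable is a hypothetical stand-in for the corresponding stage (learning, object detection, score calculation, selection, labeling and the termination judgment), none of which is implemented here.

```python
def active_learning_cycle(labeled_data, unlabeled_pool, algorithms,
                          train_fn, detect_fn, score_fn, select_fn, label_fn,
                          terminate_fn, n_select=10):
    """Sketch of the relearning cycle of FIG. 8; all *_fn arguments are assumed helpers."""
    # Step S 11: learn detectors with methods (algorithms) different from each other
    models = [train_fn(algorithm, labeled_data) for algorithm in algorithms]
    while True:
        # Steps S 12-S 13: object detection on each piece of unlabeled learning data,
        # then one information amount score per piece
        scored = [(image, score_fn([detect_fn(model, image) for model in models]))
                  for image in unlabeled_pool]
        # Steps S 14-S 15: select and output the pieces having great learning effect
        selected = select_fn(scored, n_select)
        # Step S 16: labeling work (worker or labeling tool) yields new labeled data
        labeled_data = labeled_data + label_fn([image for image, _ in selected])
        # Step S 17: relearn the learned models with the enlarged labeled data set
        models = [train_fn(algorithm, labeled_data) for algorithm in algorithms]
        # Step S 18: end when the termination condition of the learning is satisfied
        if terminate_fn(models):
            return models
```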
  • 100 information processing device
  • 101 processor
  • 102 volatile storage device
  • 103 nonvolatile storage device
  • 111 first storage unit
  • 112 second storage unit
  • 120 acquisition unit
  • 130 a , 130 b learning unit
  • 140 object detection unit
  • 150 calculation unit
  • 160 selection output unit, 200 a , 200 b learned model.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

An information processing device includes an acquisition unit that acquires learned models for executing object detection by methods different from each other and a plurality of pieces of unlabeled learning data as a plurality of images including an object, an object detection unit that performs the object detection on each of the plurality of pieces of unlabeled learning data by using the learned models, a calculation unit that calculates a plurality of information amount scores indicating values of the plurality of pieces of unlabeled learning data based on a plurality of object detection results, and a selection output unit that selects a predetermined number of pieces of unlabeled learning data from the plurality of pieces of unlabeled learning data based on the plurality of information amount scores and outputs the selected unlabeled learning data.

Description

    TECHNICAL FIELD
  • The present disclosure relates to an information processing device, a selection output method and a selection output program.
  • BACKGROUND ART
  • In general, to realize excellent performance of a device that uses a learned model, the device executes deep learning by using a great amount of training data (referred to also as a learning data set, for example). For example, when a learned model for detecting an object in an inputted image is generated, the training data includes a region of the object as the detection target in the image and a label indicating the type of the object. The training data is generated by a labeling worker, and this generating work is referred to as labeling. The labeling places a large load on the labeling worker. In such a circumstance, active learning has been devised in order to lighten the load on the labeling worker. In the active learning, images that have been labeled and that have great learning effect are used as the training data.
  • Here, a technology for selecting data to be used for the active learning has been proposed (see Patent Reference 1). An active learning device calculates a classification score in regard to unlabeled learning data by using a classifier that has been learned by using labeled learning data. The active learning device generates a plurality of clusters by clustering the unlabeled learning data. The active learning device selects learning data to be used for the active learning from the unlabeled learning data based on the plurality of clusters and the classification score.
  • PRIOR ART REFERENCE Patent Reference
    • Patent Reference 1: Japanese Patent Application Publication No. 2017-167834
    SUMMARY OF THE INVENTION Problem to be Solved by the Invention
  • In the above-described technology, the learning data is selected by using unlabeled learning data and a classifier obtained by executing learning by a certain method using labeled learning data. Incidentally, the classifier is hereinafter referred to as a learned model. The selected learning data has great learning effect when the learning is executed by the certain method. In contrast, when a learned model is generated by a different method, the selected learning data cannot necessarily be regarded as learning data having great learning effect. Therefore, methods using the above-described technology cannot necessarily be considered desirable. Thus, how to select learning data having great learning effect is an important issue.
  • An object of the present disclosure is to select learning data having great learning effect.
  • Means for Solving the Problem
  • An information processing device according to an aspect of the present disclosure is provided. The information processing device includes an acquisition unit that acquires a plurality of learned models for executing object detection by methods different from each other and a plurality of pieces of unlabeled learning data as a plurality of images including an object, an object detection unit that performs the object detection on each of the plurality of pieces of unlabeled learning data by using the plurality of learned models, a calculation unit that calculates a plurality of information amount scores indicating values of the plurality of pieces of unlabeled learning data based on a plurality of object detection results, and a selection output unit that selects a predetermined number of pieces of unlabeled learning data from the plurality of pieces of unlabeled learning data based on the plurality of information amount scores and outputs the selected unlabeled learning data.
  • Effect of the Invention
  • According to the present disclosure, learning data having great learning effect can be selected.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a block diagram showing functions of an information processing device in a first embodiment.
  • FIG. 2 is a diagram showing hardware included in the information processing device in the first embodiment.
  • FIGS. 3(A) and 3(B) are diagrams for explaining IoU in the first embodiment.
  • FIG. 4 is a diagram showing a relationship among Precision, Recall and AP in the first embodiment.
  • FIGS. 5(A) and 5(B) are diagrams (No. 1) showing examples of output of selected images.
  • FIGS. 6(A) and 6(B) are diagrams (No. 2) showing examples of the output of the selected images.
  • FIG. 7 is a block diagram showing functions of an information processing device in a second embodiment.
  • FIG. 8 is a flowchart showing an example of a process executed by the information processing device in the second embodiment.
  • MODE FOR CARRYING OUT THE INVENTION
  • Embodiments will be described below with reference to the drawings. The following embodiments are just examples and a variety of modifications are possible within the scope of the present disclosure.
  • First Embodiment
  • FIG. 1 is a block diagram showing functions of an information processing device in a first embodiment. The information processing device 100 is a device that executes a selection output method. The information processing device 100 includes a first storage unit 111, a second storage unit 112, an acquisition unit 120, learning units 130 a and 130 b, an object detection unit 140, a calculation unit 150 and a selection output unit 160.
  • Here, hardware included in the information processing device 100 will be described below.
  • FIG. 2 is a diagram showing the hardware included in the information processing device in the first embodiment. The information processing device 100 includes a processor 101, a volatile storage device 102 and a nonvolatile storage device 103.
  • The processor 101 controls the whole of the information processing device 100. The processor 101 is a Central Processing Unit (CPU), a Field Programmable Gate Array (FPGA) or the like, for example. The processor 101 can also be a multiprocessor. Further, the information processing device 100 may include processing circuitry. The processing circuitry may be either a single circuit or a combined circuit.
  • The volatile storage device 102 is main storage of the information processing device 100. The volatile storage device 102 is a Random Access Memory (RAM), for example. The nonvolatile storage device 103 is auxiliary storage of the information processing device 100. The nonvolatile storage device 103 is a Hard Disk Drive (HDD) or a Solid State Drive (SSD), for example.
  • Returning to FIG. 1 , the functions of the information processing device 100 will be described below.
  • The first storage unit 111 and the second storage unit 112 may also be implemented as storage areas reserved in the volatile storage device 102 or the nonvolatile storage device 103.
  • Part or all of the acquisition unit 120, the learning units 130 a and 130 b, the object detection unit 140, the calculation unit 150 and the selection output unit 160 may be implemented by the processing circuitry. Further, part or all of the acquisition unit 120, the learning units 130 a and 130 b, the object detection unit 140, the calculation unit 150 and the selection output unit 160 may be implemented as modules of a program executed by the processor 101. For example, the program executed by the processor 101 is referred to also as a selection output program. The selection output program has been recorded in a record medium, for example.
  • The information processing device 100 generates learned models 200 a and 200 b. A process until the learned models 200 a and 200 b are generated will be described below.
  • First, the first storage unit 111 will be described. The first storage unit 111 may store labeled learning data. The labeled learning data includes an image, at least one region of an object as a detection target in the image, and a label indicating the type of the object. Incidentally, information including the region of the object and the label is referred to also as label information. When the image is an image including a road, for example, the type is four-wheel vehicle, two-wheel vehicle, truck, or the like.
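  • For illustration only, one piece of labeled learning data might be represented as in the following sketch; the file name, dictionary keys and (x, y, w, h) region format are assumptions and not a format defined in this disclosure.

```python
# Hypothetical sketch of one piece of labeled learning data: an image together with
# its label information (a region and a type label for each detection target).
labeled_example = {
    "image": "road_scene_0001.png",                                    # assumed file name
    "label_info": [
        {"type": "four-wheel vehicle", "region": (120.0, 80.0, 60.0, 40.0)},  # (x, y, w, h)
        {"type": "two-wheel vehicle",  "region": (300.0, 95.0, 25.0, 30.0)},
    ],
}
```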
  • The acquisition unit 120 acquires the labeled learning data. The acquisition unit 120 acquires the labeled learning data from the first storage unit 111, for example. Alternatively, the acquisition unit 120 acquires the labeled learning data from an external device (e.g., cloud server), for example.
  • The learning units 130 a and 130 b generate the learned models 200 a and 200 b by executing object detection learning in methods different from each other by using the labeled learning data. For example, each of these methods can be Faster R-CNN (Regions with Convolutional Neural Networks), You Only Look Once (YOLO), Single Shot MultiBox Detector (SSD), or the like. Incidentally, each method can be referred to also as an algorithm.
  • As above, by the learning units 130 a and 130 b, the learned models 200 a and 200 b for executing object detection by methods different from each other are generated. For example, the learned model 200 a is a learned model for executing the object detection by using Faster R-CNN. For example, the learned model 200 b is a learned model for executing the object detection by using YOLO.
  • In this example, two learning units are shown in FIG. 1 . The number of learning units is not limited to two. The same number of learned models as the learning units are generated. Thus, the number of learned models is not limited to two. Further, each learned model may be referred to also as a detector or detector information.
  • The learned models 200 a and 200 b generated may be stored in the volatile storage device 102 or the nonvolatile storage device 103 or stored in an external device.
  • Next, a process executed by the information processing device 100 after the generation of the learned models 200 a and 200 b will be described below.
  • First, the second storage unit 112 will be described. The second storage unit 112 may store a plurality of pieces of unlabeled learning data. Each of the plurality of pieces of unlabeled learning data does not include the label information. The plurality of pieces of unlabeled learning data are a plurality of images. Each of the plurality of images includes an object. The object is a human, an animal or the like, for example.
  • The acquisition unit 120 acquires a plurality of pieces of unlabeled learning data. The acquisition unit 120 acquires the plurality of pieces of unlabeled learning data from the second storage unit 112, for example. Alternatively, the acquisition unit 120 acquires the plurality of pieces of unlabeled learning data from an external device, for example.
  • The acquisition unit 120 acquires the learned models 200 a and 200 b. The acquisition unit 120 acquires the learned models 200 a and 200 b from the volatile storage device 102 or the nonvolatile storage device 103, for example. Alternatively, the acquisition unit 120 acquires the learned models 200 a and 200 b from an external device, for example.
  • The object detection unit 140 performs the object detection on each of the plurality of pieces of unlabeled learning data by using the learned models 200 a and 200 b. For example, when the number of pieces of unlabeled learning data is two, the object detection unit 140 performs the object detection on first unlabeled learning data, as one of the plurality of pieces of unlabeled learning data, by using the learned models 200 a and 200 b. In other words, the object detection unit 140 executes the object detection by using the first unlabeled learning data and the learned models 200 a and 200 b. Further, for example, the object detection unit 140 performs the object detection on second unlabeled learning data, as one of the plurality of pieces of unlabeled learning data, by using the learned models 200 a and 200 b.
  • As above, the object detection unit 140 performs the object detection on each of the plurality of pieces of unlabeled learning data by using the learned models 200 a and 200 b.
  • First, a case where the object detection is executed by using one piece of unlabeled learning data and the learned models 200 a and 200 b will be described below. Further, a method for calculating an information amount score corresponding to the one piece of unlabeled learning data will also be described below.
  • The object detection unit 140 executes the object detection by using the one piece of unlabeled learning data and the learned models 200 a and 200 b. The object detection unit 140 executes the object detection by using the unlabeled learning data and the learned model 200 a, for example. Further, the object detection unit 140 executes the object detection by using the unlabeled learning data and the learned model 200 b, for example. Accordingly, the object detection is executed by methods different from each other. An object detection result is outputted in regard to each learned model. The object detection result is represented as Di. Incidentally, i is an integer from 1 to N. Further, the object detection result Di is referred to also as a reasoning label Ri. The reasoning label Ri is expressed as “(c, x, y, w, h)”. The parameter c indicates the type of the object. The parameters x and y indicate coordinates (x, y) of an image region center of the object. The parameter w indicates width of the object. The parameter h indicates height of the object.
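  • A minimal sketch of this reasoning label format is shown below; the class name ReasoningLabel and the corners() helper are illustrative assumptions, not names used in this disclosure.

```python
from typing import NamedTuple

class ReasoningLabel(NamedTuple):
    """Reasoning label Ri = (c, x, y, w, h) as described above."""
    c: str      # type of the object
    x: float    # x coordinate of the image-region center
    y: float    # y coordinate of the image-region center
    w: float    # width of the object region
    h: float    # height of the object region

    def corners(self):
        """(x_min, y_min, x_max, y_max) form, convenient for the IoU computation below."""
        return (self.x - self.w / 2.0, self.y - self.h / 2.0,
                self.x + self.w / 2.0, self.y + self.h / 2.0)

# An object detection result Di is then simply a list of reasoning labels:
d_1 = [ReasoningLabel("four-wheel vehicle", 120.0, 80.0, 60.0, 40.0)]
```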
  • The calculation unit 150 calculates the information amount score by using the object detection result Di. The information amount score indicates the value of the unlabeled learning data. Thus, a larger value of the information amount score indicates that the unlabeled learning data has greater value as learning data. In other words, the information amount score becomes large when the type results differ between the learned models for image regions having high similarity, or when the image regions differ greatly between the learned models for results of the same type.
  • A method for calculating the information amount score will be described below. In the calculation of the information amount score, the mean Average Precision (mAP) @0.5 is used as a detection accuracy index that takes into consideration the similarity of the image region of each object and the difference in the type result of each object. Incidentally, "0.5" represents a threshold value of the Intersection over Union (IoU) which will be described later.
  • When there are two learned models, the information amount score is calculated by using expression (1). Here, the object detection result outputted from the learned model 200 a is represented as D1. The object detection result outputted from the learned model 200 b is represented as D2.

  • Information amount score (N = 2) = 1 − mAP@0.5(D1, D2)    (1)
  • Further, the mAP@0.5 is one of the evaluation methods in the object detection, and the IoU is known as a concept used for the evaluation. When the object detection has been executed by using labeled learning data, the IoU is represented by using expression (2). The character Rgt represents a true value region. The character Rd represents a detection region. The character A represents an area.
  • IoU(Rgt, Rd) = A(Rgt ∩ Rd) / A(Rgt ∪ Rd)    (2)
  • A concrete example of the true value region Rgt and the detection region Rd will be described below.
  • FIGS. 3(A) and 3(B) are diagrams for explaining the IoU in the first embodiment. FIG. 3(A) shows a concrete example of the true value region Rgt and the detection region Rd. Further, FIG. 3(A) shows how much the true value region Rgt and the detection region Rd overlap with each other.
  • Here, the unlabeled learning data includes no label. Thus, there is no true value. Accordingly, the IoU cannot be represented by directly using the expression (2). Therefore, the IoU is represented as follows: A region represented by one object detection result is defined as the true value region. Then, a region represented by another object detection result is defined as the detection region. For example, in FIG. 3(B), a detection region Rgt1 represented by the object detection result D1 is defined as the true value region. A detection region Rd1 represented by the object detection result D2 is defined as the detection region. When the example of FIG. 3(B) is used, the IoU is represented by using expression (3).
  • IoU(Rgt1, Rd1) = A(Rgt1 ∩ Rd1) / A(Rgt1 ∪ Rd1)    (3)
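  • As a concrete illustration of expressions (2) and (3), the IoU of two axis-aligned boxes can be computed as in the following sketch, assuming boxes are given in (x_min, y_min, x_max, y_max) form (for example, via ReasoningLabel.corners() from the earlier sketch).

```python
def iou(box_a, box_b):
    """Expressions (2)/(3): intersection over union of two boxes
    given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0.0 else 0.0
```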
  • True Positive (TP), False Positive (FP) and False Negative (FN) are calculated by using the IoU.
  • Incidentally, when the IoU of the detection region Rgt1 with respect to the detection region Rd1 is greater than or equal to a threshold value, the TP indicates that the learned model detected an object existing in the image of the unlabeled learning data. In other words, it indicates that the learned model detected a true value since the detection region Rd1 and the detection region Rgt1 are situated substantially at the same position.
  • When the IoU of the detection region Rgt1 with respect to the detection region Rd1 is less than the threshold value, the FP indicates that the learned model detected an object not existing in the image of the unlabeled learning data. In other words, it indicates that the learned model made false detection since the detection region Rgt1 is situated at a deviated position.
  • When the IoU of the detection region Rd1 with respect to the detection region Rgt1 is less than the threshold value, the FN indicates that the learned model did not detect an object existing in the image of the unlabeled learning data. In other words, it indicates that the learned model did not make the detection since the detection region Rgt1 is situated at a deviated position.
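  • The following sketch counts TP, FP and FN for one class by matching one learned model's detection regions against the other learned model's regions treated as true values, using the iou() helper above; the greedy one-to-one matching strategy is an assumption, since the disclosure only specifies the IoU threshold.

```python
def match_counts(truth_boxes, det_boxes, iou_threshold=0.5):
    """Count TP, FP and FN for one class; truth_boxes are the regions treated as
    true values, det_boxes are the regions treated as detections."""
    matched = set()
    tp = 0
    for det in det_boxes:
        best_j, best_iou = None, 0.0
        for j, truth in enumerate(truth_boxes):
            if j in matched:
                continue
            value = iou(truth, det)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_j is not None and best_iou >= iou_threshold:
            matched.add(best_j)
            tp += 1               # detection overlaps a "true value" region: TP
    fp = len(det_boxes) - tp      # detections with no sufficiently overlapping region: FP
    fn = len(truth_boxes) - tp    # "true value" regions left undetected: FN
    return tp, fp, fn
```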
  • Further, Precision is represented by using the TP and the FP. Specifically, the Precision is represented by using expression (4). Incidentally, the Precision indicates a ratio of data that are actually positive among data that were estimated to be positive. Incidentally, the Precision is referred to also as a precision ratio.
  • Precision = TP / (TP + FP)    (4)
  • Recall is represented by using the TP and the FN. Specifically, the Recall is represented by using expression (5). Incidentally, the Recall indicates a ratio of data that were estimated to be positive among data that are actually positive. Incidentally, the Recall is referred to also as a recall ratio.
  • Recall = TP / (TP + FN)    (5)
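  • Expressed in code, expressions (4) and (5) are simply the following; the guard against an empty denominator is an assumption added for this illustration.

```python
def precision(tp, fp):
    """Expression (4): ratio of actually positive data among data estimated to be positive."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

def recall(tp, fn):
    """Expression (5): ratio of data estimated to be positive among actually positive data."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0
```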
  • An example of a relationship among the Precision, the Recall and AP will be shown below.
  • FIG. 4 is a diagram showing the relationship among the Precision, the Recall and the AP in the first embodiment. The vertical axis represents the Precision. The horizontal axis represents the Recall. The Average Precision (AP) is calculated by using the Precision and the Recall. Specifically, the area of “AP” in FIG. 4 is calculated as the AP.
  • For example, when a plurality of objects exist in the image of the unlabeled learning data, the calculation unit 150 calculates the TP, the FP and the FN of each of the plurality of objects. The calculation unit 150 calculates the Precision and the Recall of each of the plurality of objects by using the expression (4) and the expression (5). The calculation unit 150 calculates the AP of each object (i.e., class) based on the Precision and the Recall of each of the plurality of objects. For example, when the plurality of objects are a cat and a dog, the AP “0.4” of the cat and the AP “0.6” of the dog are calculated. The calculation unit 150 calculates the average of the APs of the objects as the mAP. For example, when the AP of the cat is “0.4” and the AP of the dog is “0.6”, the calculation unit 150 calculates the mAP “0.5”. Incidentally, when only one object exists in the image of the unlabeled learning data, one AP is calculated. Then, the one AP serves as the mAP.
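  • A deliberately simplified per-image mAP@0.5 sketch based on the helpers above is shown below; because no confidence scores are modelled here, each class's AP is approximated by the single Precision × Recall operating point, which is an assumption made only for this illustration and not how AP is defined in general.

```python
def simplified_map(truth_by_class, dets_by_class, iou_threshold=0.5):
    """Rough per-image mAP@0.5 between two detection results.
    truth_by_class / dets_by_class: dicts mapping a type (class) to a list of boxes,
    one learned model's result being treated as the true value regions and the
    other's as the detection regions."""
    classes = set(truth_by_class) | set(dets_by_class)
    aps = []
    for cls in classes:
        tp, fp, fn = match_counts(truth_by_class.get(cls, []),
                                  dets_by_class.get(cls, []),
                                  iou_threshold)
        # Rectangle under the single (Recall, Precision) point stands in for the AP.
        aps.append(precision(tp, fp) * recall(tp, fn))
    return sum(aps) / len(aps) if aps else 0.0
```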
  • The mAP is calculated as above. The calculation unit 150 calculates the information amount score by using the mAP and the expression (1). Namely, the calculation unit 150 calculates the information amount score by “1−mAP”. The information amount score is calculated as above.
  • When there are N (i.e., 3 or more) learned models, the information amount score is calculated by using expression (6). Namely, the calculation unit 150 generates a plurality of combinations of two learned models by using the N learned models, calculates a value for each combination by using the expression (1), and calculates the information amount score by dividing the sum total of the calculated values by N.
  • Information amount score (N > 2) = (1/N) × Σ_{i,j ∈ (1,N), i ≠ j} (1 − mAP@0.5(Di, Dj))    (6)
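  • Expressions (1) and (6) can then be combined for one unlabeled image as in the following sketch; map_fn is an assumed callable returning mAP@0.5 for a pair of detection results (for example, simplified_map from the earlier sketch).

```python
from itertools import permutations

def information_amount_score(results, map_fn):
    """results: object detection results [D1, ..., DN] for one unlabeled image,
    one per learned model. map_fn(d_i, d_j) returns mAP@0.5 with d_i treated as
    the true value regions and d_j as the detection regions."""
    n = len(results)
    if n == 2:
        return 1.0 - map_fn(results[0], results[1])            # expression (1)
    total = sum(1.0 - map_fn(d_i, d_j)
                for d_i, d_j in permutations(results, 2))      # all pairs with i != j
    return total / n                                           # expression (6): sum / N
```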
  • As above, the calculation unit 150 calculates the information amount score corresponding to the one piece of unlabeled learning data. Then, the information processing device 100 (i.e., the object detection unit 140 and the calculation unit 150) performs the same process also on each of the plurality of pieces of unlabeled learning data. By this, the information processing device 100 is capable of obtaining the information amount score of each of the plurality of pieces of unlabeled learning data. In other words, the information processing device 100 is capable of obtaining a plurality of information amount scores corresponding to the plurality of pieces of unlabeled learning data. As above, the information processing device 100 calculates the plurality of information amount scores based on a plurality of object detection results. Specifically, the information processing device 100 calculates the plurality of information amount scores by using the mAPs and the plurality of object detection results.
  • The selection output unit 160 selects a predetermined number of pieces of unlabeled learning data from the plurality of pieces of unlabeled learning data based on the plurality of information amount scores. In other words, the selection output unit 160 selects unlabeled learning data having great learning effect from the plurality of pieces of unlabeled learning data corresponding to the plurality of information amount scores based on the plurality of information amount scores. This sentence may also be expressed as follows: The selection output unit 160 selects unlabeled learning data that is/are expected to contribute to the learning from the plurality of pieces of unlabeled learning data.
  • An example of the method of the selection will be described below. In the first place, the information amount score is a value in a range from 0 to 1. When the information amount score is “0”, the detection results by the learned models 200 a and 200 b substantially coincide with each other. Therefore, unlabeled learning data corresponding to the information amount score “0” is considered to have low usefulness since the degree of necessity of appropriating the unlabeled learning data for learning data is low. In contrast, when the information amount score is “1”, the detection results by the learned models 200 a and 200 b greatly differ from each other. However, unlabeled learning data corresponding to the information amount score “1” can be regarded also as a special example that is extremely difficult to detect. Therefore, adding a lot of special examples to the learning data at a stage when the amount of learning data is small is considered not to contribute to improvement in the detection performance. Thus, the selection output unit 160 excludes such unlabeled learning data corresponding to the information amount score “0” or “1” from the plurality of pieces of unlabeled learning data corresponding to the plurality of information amount scores. After the exclusion, the selection output unit 160 selects top n (n is a positive integer) pieces of unlabeled learning data from the plurality of pieces of unlabeled learning data as unlabeled learning data having great learning effect.
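  • A minimal sketch of this selection rule (excluding scores of 0 and 1 and then taking the top n) could look as follows; the (image_id, score) pair format is an assumption for this example.

```python
def select_learning_data(scored_images, n):
    """scored_images: list of (image_id, information_amount_score) pairs.
    Excludes scores of exactly 0 or 1, then returns the top-n remaining pieces
    of unlabeled learning data in descending score order."""
    candidates = [(image_id, score) for image_id, score in scored_images
                  if 0.0 < score < 1.0]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return candidates[:n]
```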
  • The selection output unit 160 outputs the selected unlabeled learning data. It is also possible for the selection output unit 160 to output object detection results, as results of performing the object detection on the selected unlabeled learning data (hereinafter referred to as selected images), as the reasoning labels. Here, examples of the output of the selected images will be described below.
  • FIGS. 5(A) and 5(B) are diagrams (No. 1) showing examples of the output of the selected images. FIG. 5(A) shows a case where the selected images are outputted to the volatile storage device 102 or the nonvolatile storage device 103. For example, the labeling worker performs the labeling on the selected images by using the information processing device 100.
  • FIG. 5(B) shows a case where the selected images and the reasoning labels are outputted to the volatile storage device 102 or the nonvolatile storage device 103. For example, the labeling worker performs the labeling on the selected images by using the information processing device 100 and the reasoning labels. Further, by the outputting of the reasoning labels, the labeling workload on the labeling worker is lightened.
  • FIGS. 6(A) and 6(B) are diagrams (No. 2) showing examples of the output of the selected images. FIG. 6(A) shows a case where the selected images are outputted to a labeling tool. Since the selected images are outputted to the labeling tool as above, the labeling workload on the labeling worker is lightened.
  • FIG. 6(B) shows a case where the selected images and the reasoning labels are outputted to the labeling tool. The labeling worker performs the labeling on the selected images while correcting the reasoning labels by using the labeling tool.
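  • As one possible realization of the output illustrated in FIG. 6(B), the sketch below writes the selected images and their reasoning labels into a simple JSON file that a labeling tool could pre-load. The directory layout and field names are assumptions for illustration; no particular labeling tool format is implied.

    import json
    import shutil
    from pathlib import Path

    def export_for_labeling(selected, output_dir):
        """selected: list of (image_path, reasoning_labels) pairs, where each
        reasoning label is a dict such as
        {"bbox": [x_min, y_min, x_max, y_max], "class": "car", "score": 0.87}."""
        out = Path(output_dir)
        out.mkdir(parents=True, exist_ok=True)
        annotations = []
        for image_path, labels in selected:
            image_path = Path(image_path)
            shutil.copy(image_path, out / image_path.name)  # copy the selected image
            annotations.append({"image": image_path.name, "pre_labels": labels})
        # The labeling worker corrects these pre-labels instead of labeling from scratch.
        (out / "reasoning_labels.json").write_text(json.dumps(annotations, indent=2))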
  • Here, the images selected by the selection output unit 160 are images selected by using learned models that detect an object by methods different from each other. Therefore, the selected images are not only suitable as learning data used when executing the learning by a certain method but also suitable as learning data used when executing the learning by a different method. Thus, the selected images can be regarded as learning data having great learning effect. According to the first embodiment, the information processing device 100 is capable of selecting learning data having great learning effect.
  • Further, the learning data having great learning effect are automatically selected by the information processing device 100. Therefore, the information processing device 100 is capable of efficiently selecting the learning data having great learning effect.
  • Second Embodiment
  • Next, a second embodiment will be described below. The description will be given mainly of features different from those in the first embodiment, and the description of features in common with the first embodiment is omitted.
  • FIG. 7 is a block diagram showing functions of an information processing device in the second embodiment. Each component in FIG. 7 that is the same as a component shown in FIG. 1 is assigned the same reference character as in FIG. 1.
  • The information processing device 100 relearns the learned models 200 a and 200 b. Details of the relearning will be described later.
  • Next, a process executed by the information processing device 100 will be described below by using a flowchart.
  • FIG. 8 is a flowchart showing an example of the process executed by the information processing device in the second embodiment.
  • (Step S11) The acquisition unit 120 acquires the labeled learning data. Incidentally, the data amount of the labeled learning data may be small.
  • The learning units 130 a and 130 b generate the learned models 200 a and 200 b by executing the object detection learning, by methods different from each other, using the labeled learning data.
  • (Step S12) The acquisition unit 120 acquires a plurality of pieces of unlabeled learning data.
  • The object detection unit 140 executes the object detection by using the plurality of pieces of unlabeled learning data and the learned models 200 a and 200 b.
  • (Step S13) The calculation unit 150 calculates a plurality of information amount scores corresponding to the plurality of pieces of unlabeled learning data based on a plurality of object detection results.
  • (Step S14) The selection output unit 160 selects unlabeled learning data having great learning effect from the plurality of pieces of unlabeled learning data based on the plurality of information amount scores.
  • (Step S15) The selection output unit 160 outputs the selected unlabeled learning data (i.e., selected images). For example, the selection output unit 160 outputs the selected images as illustrated in FIG. 5 or FIG. 6.
  • Here, the labeling worker executes the labeling by using the selected images. By this labeling, labeled learning data is generated. The labeled learning data includes the selected images, at least one region of an object as a detection target in the images, and a label indicating the type of the object. The labeled learning data may be stored in the first storage unit 111. Incidentally, the labeling work may also be executed by an external device.
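  • The record produced by this labeling work could, for illustration, be represented as follows; the field names are assumptions and not a structure prescribed by this disclosure.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class LabeledSample:
        """One piece of labeled learning data generated from a selected image."""
        image_path: str
        # Regions of detection-target objects: (x_min, y_min, x_max, y_max).
        regions: List[Tuple[float, float, float, float]] = field(default_factory=list)
        # One type label per region, e.g. "car" or "pedestrian".
        labels: List[str] = field(default_factory=list)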
  • (Step S16) The acquisition unit 120 acquires the labeled learning data. The acquisition unit 120 acquires the labeled learning data from the first storage unit 111, for example. Alternatively, the acquisition unit 120 acquires the labeled learning data from the external device, for example.
  • (Step S17) The learning units 130 a and 130 b relearn the learned models 200 a and 200 b by using the labeled learning data.
  • (Step S18) The information processing device 100 judges whether a termination condition of the learning is satisfied or not. Incidentally, the termination condition has been stored in the nonvolatile storage device 103, for example. When the termination condition is satisfied, the process ends. When the termination condition is not satisfied, the process advances to the step S12.
  • According to the second embodiment, the information processing device 100 is capable of increasing the object detection accuracy of the learned models by repeating the addition of labeled learning data and the relearning.
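  • The overall flow of FIG. 8 can be sketched as the following loop. The helper names (train_model, detect, request_labels, termination_condition_met) are placeholders for the components described above, and information_amount_score and select_informative_samples refer to the illustrative sketches given in the first embodiment; none of these are APIs defined by this disclosure.

    def relearning_loop(labeled_data, unlabeled_pool, n_select):
        # Step S11: learn two models with object detection methods different from each other.
        model_a = train_model(labeled_data, method="method_a")
        model_b = train_model(labeled_data, method="method_b")

        while True:
            # Step S12: perform object detection on every piece of unlabeled learning data.
            results = [(x, detect(model_a, x), detect(model_b, x)) for x in unlabeled_pool]

            # Step S13: calculate an information amount score per piece of data.
            scored = [(x, information_amount_score(da, db)) for x, da, db in results]

            # Steps S14-S15: select and output the data having great learning effect.
            selected = select_informative_samples(scored, n_select)

            # Step S16: the labeling worker (or an external device) labels the
            # selected images; the new labeled learning data is acquired here.
            labeled_data += request_labels(selected)
            unlabeled_pool = [x for x in unlabeled_pool if x not in selected]

            # Step S17: relearn the learned models 200a and 200b.
            model_a = train_model(labeled_data, method="method_a")
            model_b = train_model(labeled_data, method="method_b")

            # Step S18: end when the termination condition is satisfied.
            if termination_condition_met(model_a, model_b, labeled_data):
                return model_a, model_b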
  • Features in the embodiments described above can be appropriately combined with each other.
  • DESCRIPTION OF REFERENCE CHARACTERS
  • 100: information processing device, 101: processor, 102: volatile storage device, 103: nonvolatile storage device, 111: first storage unit, 112: second storage unit, 120: acquisition unit, 130 a, 130 b: learning unit, 140: object detection unit, 150: calculation unit, 160: selection output unit, 200 a, 200 b: learned model.

Claims (6)

1. An information processing device comprising:
acquiring circuitry to acquire a plurality of learned models for executing object detection by methods different from each other and a plurality of pieces of unlabeled learning data as a plurality of images including an object;
object detecting circuitry to perform the object detection on each of the plurality of pieces of unlabeled learning data by using the plurality of learned models;
calculating circuitry to calculate a plurality of information amount scores indicating values of the plurality of pieces of unlabeled learning data based on a plurality of object detection results; and
selection outputting circuitry to select a predetermined number of pieces of unlabeled learning data from the plurality of pieces of unlabeled learning data based on the plurality of information amount scores and output the selected unlabeled learning data.
2. The information processing device according to claim 1, wherein the selection outputting circuitry outputs object detection results, as results of performing the object detection on the selected unlabeled learning data, as reasoning labels.
3. The information processing device according to claim 1, wherein the calculating circuitry calculates the plurality of information amount scores by using mean Average Precision and the plurality of object detection results.
4. The information processing device according to claim 1, further comprising a plurality of learning circuitry, wherein
the acquiring circuitry acquires labeled learning data including the selected unlabeled learning data, and
the plurality of learning circuitry relearn the plurality of learned models by using the labeled learning data.
5. A selection output method performed by an information processing device, the selection output method comprising:
acquiring a plurality of learned models for executing object detection by methods different from each other and a plurality of pieces of unlabeled learning data as a plurality of images including an object;
performing the object detection on each of the plurality of pieces of unlabeled learning data by using the plurality of learned models;
calculating a plurality of information amount scores indicating values of the plurality of pieces of unlabeled learning data based on a plurality of object detection results;
selecting a predetermined number of pieces of unlabeled learning data from the plurality of pieces of unlabeled learning data based on the plurality of information amount scores; and
outputting the selected unlabeled learning data.
6. An information processing device comprising:
a processor to execute a program; and
a memory to store the program which, when executed by the processor, performs processes of,
acquiring a plurality of learned models for executing object detection by methods different from each other and a plurality of pieces of unlabeled learning data as a plurality of images including an object;
performing the object detection on each of the plurality of pieces of unlabeled learning data by using the plurality of learned models;
calculating a plurality of information amount scores indicating values of the plurality of pieces of unlabeled learning data based on a plurality of object detection results;
selecting a predetermined number of pieces of unlabeled learning data from the plurality of pieces of unlabeled learning data based on the plurality of information amount scores; and
outputting the selected unlabeled learning data.
US18/273,278 2021-02-05 2021-02-05 Information processing device, and selection output method Pending US20240119723A1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/004388 WO2022168274A1 (en) 2021-02-05 2021-02-05 Information processing device, selection and output method, and selection and output program

Publications (1)

Publication Number Publication Date
US20240119723A1 (en) 2024-04-11

Family

ID=82742068

Family Applications (1)

Application Number Title Priority Date Filing Date
US18/273,278 Pending US20240119723A1 (en) 2021-02-05 2021-02-05 Information processing device, and selection output method

Country Status (4)

Country Link
US (1) US20240119723A1 (en)
CN (1) CN116802651A (en)
DE (1) DE112021006984T5 (en)
WO (1) WO2022168274A1 (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5167596B2 (en) * 2006-05-10 2013-03-21 NEC Corporation Data set selection device and experimental design system
JP6364037B2 2016-03-16 2018-07-25 Secom Co., Ltd. Learning data selection device
US10769500B2 (en) * 2017-08-31 2020-09-08 Mitsubishi Electric Research Laboratories, Inc. Localization-aware active learning for object detection

Also Published As

Publication number Publication date
CN116802651A (en) 2023-09-22
DE112021006984T5 (en) 2023-11-16
WO2022168274A1 (en) 2022-08-11
JPWO2022168274A1 (en) 2022-08-11

Similar Documents

Publication Publication Date Title
US10540572B1 (en) Method for auto-labeling training images for use in deep learning network to analyze images with high precision, and auto-labeling device using the same
Garcia-Fidalgo et al. Hierarchical place recognition for topological mapping
US10474713B1 (en) Learning method and learning device using multiple labeled databases with different label sets and testing method and testing device using the same
US11341424B2 (en) Method, apparatus and system for estimating causality among observed variables
EP3355244A1 (en) Data fusion and classification with imbalanced datasets
US10262214B1 (en) Learning method, learning device for detecting lane by using CNN and testing method, testing device using the same
US9683836B2 (en) Vehicle classification from laser scanners using fisher and profile signatures
US20170032276A1 (en) Data fusion and classification with imbalanced datasets
EP3686791B1 (en) Learning method and learning device for object detector based on cnn to be used for multi-camera or surround view monitoring using image concatenation and target object merging network, and testing method and testing device using the same
GB2537681A (en) A method of detecting objects within a 3D environment
US11410388B1 (en) Devices, systems, methods, and media for adaptive augmentation for a point cloud dataset used for training
KR20200091331A (en) Learning method and learning device for object detector based on cnn, adaptable to customers' requirements such as key performance index, using target object merging network and target region estimating network, and testing method and testing device using the same to be used for multi-camera or surround view monitoring
CN113129335B (en) Visual tracking algorithm and multi-template updating strategy based on twin network
EP3620958A1 (en) Learning method, learning device for detecting lane through lane model and testing method, testing device using the same
CN111783844A (en) Target detection model training method and device based on deep learning and storage medium
Xiong et al. Contrastive learning for automotive mmWave radar detection points based instance segmentation
CN110909588B (en) CNN-based method and device for lane line detection
CN117689693A (en) Abnormal local track detection method and device based on graph comparison self-supervision learning
CN116206275B (en) Knowledge distillation-based recognition model training method and device
US10713815B1 (en) Method and device for supporting administrators to processes of object detectors to provide logical driving
US20240119723A1 (en) Information processing device, and selection output method
Ali et al. A life-long SLAM approach using adaptable local maps based on rasterized LIDAR images
US9208402B2 (en) Face matching for mobile devices
US11023776B1 (en) Methods for training auto-labeling device and performing auto-labeling by using hybrid classification and devices using the same
CN111414804B (en) Identification frame determining method, identification frame determining device, computer equipment, vehicle and storage medium

Legal Events

Date Code Title Description
AS Assignment

Owner name: MITSUBISHI ELECTRIC CORPORATION, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:QU, JIA;SHIMIZU, SHOICHI;SIGNING DATES FROM 20230424 TO 20230425;REEL/FRAME:064321/0179

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION