WO2023002648A1 - Information processing method and information processing system - Google Patents
Information processing method and information processing system
- Publication number
- WO2023002648A1 (PCT/JP2022/003897)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- learning data
- learning
- information processing
- model
- data
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Z—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
- G16Z99/00—Subject matter not provided for in other main groups of this subclass
Definitions
- This disclosure relates to an information processing method and an information processing system that perform processing related to the learning of a machine learning model.
- Artificial intelligence can analyze and make estimates from huge amounts of data, and is used, for example, for image recognition, voice recognition, and natural language processing.
- Artificial intelligence is realized by training a machine learning model composed of a neural network or the like. By performing deep learning with a huge number of training data sets, it is possible to obtain artificial intelligence whose inference exceeds human ability.
- However, the process by which artificial intelligence reaches an inference result is a black box, making it difficult to understand the grounds for its judgments.
- Gradient-weighted Class Activation Mapping (Grad-CAM) has been developed as a technology for visualizing the basis of decisions made by deep-learning machine learning models.
- An analysis program has been proposed that uses the Grad-CAM method to generate a map indicating the degree of attention paid to each part of an incorrectly inferred image during inference (see Patent Document 1).
- An object of the present disclosure is to provide an information processing method and an information processing system for processing learning data used for learning a machine learning model.
- The present disclosure has been made in consideration of the above problems. Its first aspect is an information processing method for processing learning data used for learning a machine learning model, comprising: a determining step of determining characteristics of each piece of learning data based on the inference results of the machine learning model for that learning data; and a presentation step of presenting an evaluation result of the learning data based on the determined characteristics.
- In the determining step, the physical characteristics of the object corresponding to each piece of learning data are determined based on the expected value for each label output by the machine learning model for that learning data, and a physics simulation calculation is performed among the objects having the determined physical characteristics. Specifically, in the determining step, the mass of the object corresponding to the learning data is determined based on the magnitude of the expected value of the correct label; the attractive and repulsive forces acting between objects are determined based on whether their high-expectation or low-expectation labels match; and the motion information of each object is calculated by physics simulation based on these physical characteristics. Then, in the presenting step, each object displayed on the screen of the display device is moved based on the motion information calculated in the determining step.
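The determining step above can be sketched in a few lines. The mapping from expected values to mass and the force rule below (function names, and the 0.2 "low" cutoff) are illustrative assumptions, not values specified by the disclosure:

```python
# Sketch of the determining step: mass from the correct-label expected
# value, attraction/repulsion from agreement of the highest-scoring label.
# The 0.2 threshold for a "low" expectation is an assumption.

def determine_mass(expected, correct_label):
    """Heavier mass for a higher expected value on the correct label."""
    return expected[correct_label]  # 0.0 (light) .. 1.0 (heavy)

def determine_force(expected_a, expected_b):
    """+1.0 (attraction) if both samples peak on the same label,
    -1.0 (repulsion) if one scores the other's top label very low,
    0.0 otherwise."""
    top_a = max(expected_a, key=expected_a.get)
    top_b = max(expected_b, key=expected_b.get)
    if top_a == top_b:
        return 1.0
    if expected_b.get(top_a, 0.0) < 0.2:
        return -1.0
    return 0.0

e1 = {1: 0.5, 2: 0.2, 3: 0.1, 4: 0.1, 5: 0.1}  # early epoch (cf. FIG. 3, E1)
e3 = {1: 0.0, 2: 0.1, 3: 0.8, 4: 0.1, 5: 0.0}  # later epoch (E3)
print(determine_mass(e1, 3), determine_mass(e3, 3))  # 0.1 0.8
```

As the expected value for the correct label rises with training, the same sample's mass grows and its attractions shift toward samples sharing its correct label.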
- the information processing method may further include an input step of inputting a user's operation on the object displayed on the screen of the display device.
- The learning data corresponding to an object on which a deletion operation was performed in the input step can be excluded from the learning targets of the machine learning model. In this way, a custom data set can be created for each user.
- A second aspect of the present disclosure is an information processing system that performs processing related to learning data used for learning a machine learning model, comprising: a determination unit that determines characteristics of each piece of learning data based on the inference results of the machine learning model for that learning data; and a presentation unit that presents an evaluation result of the learning data based on the determined characteristics.
- The term "system" here refers to a logical assembly of multiple devices (or functional modules that implement specific functions); it does not matter whether the devices or functional modules are in a single housing. In other words, both a single device consisting of multiple parts or functional modules and an assembly of multiple devices correspond to a "system."
- The determination unit determines the physical characteristics of the object corresponding to each piece of learning data based on the expected value for each label output by the machine learning model for that learning data, performs a physics simulation calculation among the objects having the determined physical characteristics, and calculates the motion information of each object. The presentation unit then moves each object displayed on the screen of the display device based on the calculated motion information.
- the information processing system is composed of one or more devices.
- The information processing system may include a first device including the determination unit and a second device including the presentation unit.
- the second device may include a display device for displaying an evaluation result of learning data based on the determined characteristics on a screen, and an input unit for inputting a user's operation on the screen.
- the information processing system may further include a third device including a model updating unit that updates the machine learning model by learning using learning data.
- FIG. 1 is a diagram showing a functional configuration example of a learning system 100.
- FIG. 2 is a diagram showing a configuration example of the machine learning model 200.
- FIG. 3 is a diagram exemplifying the transition of inference results according to the number of learning times (the number of epochs) of the machine learning model 200.
- FIG. 4 is a diagram showing the physical properties determined for the inference results for the learning data illustrated in FIG. 3.
- FIG. 5 is a diagram exemplifying a dynamic model composed of objects corresponding to each learning data.
- FIG. 6 is a diagram showing a configuration example of a GUI screen that displays evaluation results of learning data used for learning a machine learning model.
- FIG. 7 is a diagram showing a GUI operation being performed on the GUI screen.
- FIG. 8 is a diagram showing a GUI operation being performed on the GUI screen.
- FIG. 9 is a diagram showing how the evaluation result of the learning data changes on the GUI screen during learning of the machine learning model.
- FIG. 10 is a diagram showing an example of a judgment base image with a heat map display calculated based on the Grad-CAM algorithm.
- FIG. 11 is a diagram showing another example of a judgment basis image with heat map display calculated based on the Grad-CAM algorithm.
- FIG. 12 is a diagram showing an example of GUI operation for displaying detailed information of learning data.
- FIG. 13 is a diagram showing an example of GUI operation for displaying detailed information of learning data.
- FIG. 14 is a flowchart showing a processing procedure performed by the learning data evaluation unit 120.
- FIG. 15 is a diagram showing a hardware configuration example of an information processing system 1500.
- Artificial intelligence is built from models of various types, such as neural networks, support vector regression, and Gaussian process regression. For convenience, this specification focuses on embodiments using a neural network model, but the present disclosure is not limited to a specific model type and is equally applicable to models other than neural networks.
- The use of artificial intelligence consists of a "learning phase" in which models are trained and an "inference phase" in which inferences are made using the trained models. Inference includes recognition processing, such as image recognition and voice recognition, and prediction processing for estimating and predicting events.
- In the learning phase, artificial intelligence uses a data set consisting of combinations of data input to the model (hereinafter also referred to as "input data") and the labels that the model should estimate for that input data; the model is trained by a learning algorithm such as error backpropagation so that it outputs the correct label corresponding to each input data.
- a model that has been trained in the learning phase (hereinafter also referred to as a "trained model”) outputs appropriate labels for input data.
- the present disclosure proposes a method and system for evaluating training data during model training and presenting evaluation results to a user.
- a user here is specifically a developer of a machine learning model.
- In the evaluation of learning data, the ranking of individual pieces of learning data is calculated and the relationships between pieces of learning data are evaluated, and such evaluation results are presented to the user on a computer GUI (Graphical User Interface) screen. Through the GUI screen, the user can recognize during learning that there is a problem with some of the learning data used for the machine learning model, select the problematic learning data, and thereby reduce the time lost to re-learning. That is, the user can proceed with the learning of the machine learning model while visually confirming the influence of the learning data.
- FIG. 1 shows a functional configuration example of a learning system 100 to which the present disclosure is applied.
- the illustrated learning system 100 is used by being mounted on an edge device, for example, but some or all of the functions of the learning system 100 may be built on a cloud or an arithmetic device capable of large-scale computation.
- the learning system 100 will be described as learning a machine learning model that mainly performs image classification, such as object recognition and face recognition.
- the present disclosure is not limited to this, and the learning system 100 may learn a machine learning model that performs inference other than image classification.
- the illustrated learning system 100 includes a learning data storage unit 101, a model update unit 102, a model parameter storage unit 103, an inference unit 111, a data input unit 112, and an input data processing unit 113.
- The learning data holding unit 101, the model updating unit 102, and the model parameter holding unit 103 operate in the learning phase of the machine learning model, while the inference unit 111, the data input unit 112, and the input data processing unit 113 operate in the inference phase using the trained model.
- The learning system 100 further includes a learning data providing unit 130 that provides the learning data used for learning the machine learning model, and a learning data evaluation unit 120 that evaluates the learning data used by the model updating unit 102 for learning the machine learning model.
- The learning data evaluation unit 120 includes a physics simulation calculation unit 121, an evaluation result presentation unit 122, and a judgment basis calculation unit 123.
- The learning data evaluation unit 120 may be the same system as the learning system 100 or may be a system configured independently from the learning system 100.
- a system that implements the learning data evaluation unit 120 is used by being mounted on, for example, an edge device, but part or all of the functions of this system may be built on a cloud or a computing device capable of large-scale computation.
- the learning data providing unit 130 supplies learning data that the model updating unit 102 uses for model learning.
- Learning data basically consists of a data set (x, y) that combines input data x to be input to a model to be learned and correct label y that is the correct answer for input data x.
- In the case of a digital camera, for example, the learning data providing unit 130 provides learning data consisting of combinations of a photographed image and a correct label (what the subject of the image is). For example, learning data composed of images captured by a large number of digital cameras is provided to the learning system 100 via a wide area network such as the Internet.
- the learning data holding unit 101 accumulates learning data that the model updating unit 102 uses for model learning.
- Each piece of learning data consists of a data set combining input data to be input to a model to be learned and correct labels to be inferred by the model.
- The learning data holding unit 101 stores data sets provided by the learning data providing unit 130, but may also store data sets obtained from other sources. When the model updating unit 102 performs deep learning, a large number of data sets are accumulated in the learning data holding unit 101.
- a custom data set can be generated at the discretion of the user.
- The learning data holding unit 101 may associate a data set customized for each user with, for example, identification information for that user, and hold it separately from the general data sets provided by the learning data providing unit 130 or acquired from other sources.
- the model updating unit 102 sequentially reads the learning data from the learning data holding unit 101, performs learning of the machine learning model to be learned, and updates the machine learning model.
- the machine learning model is composed of, for example, a neural network, but may be a model using support vector regression, Gaussian process regression, or the like.
- A machine learning model consisting of a neural network is composed of an input layer that receives data such as images (explanatory variables), an output layer that outputs labels (objective variables) as inference results for the input data, and one or more intermediate layers (or hidden layers) between the input layer and the output layer. Each layer consists of a plurality of nodes corresponding to neurons. The connections between nodes in adjacent layers have weights, and the data input to the input layer is transformed in value as it passes from layer to layer.
- The model updating unit 102 calculates a loss function defined based on the error between the label output from the machine learning model for the input data and the known correct label corresponding to that input data, and trains the machine learning model while updating model parameters (such as the weight coefficients between nodes) by error backpropagation so that the loss function is minimized. Since the learning process of a machine learning model requires a huge amount of computation, distributed learning may be performed using multiple GPUs (Graphics Processing Units) or multiple computation nodes.
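The update loop described above can be sketched with a single logistic neuron standing in for the neural network; this is a minimal illustration of gradient-descent training against a loss, not the actual implementation of the model updating unit 102:

```python
# Minimal sketch of the model updating unit's loop: compute a loss against
# the correct label and adjust the model parameters by gradient descent.
# A single logistic neuron stands in for the neural network.
import math

def train(dataset, lr=0.5, epochs=200):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in dataset:          # (input data, correct label 0/1)
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # forward pass
            grad = p - y              # dLoss/dz for cross-entropy loss
            w -= lr * grad * x        # backpropagated parameter updates
            b -= lr * grad
    return w, b

w, b = train([(0.0, 0), (1.0, 1)])
p1 = 1.0 / (1.0 + math.exp(-(w * 1.0 + b)))
print(p1 > 0.9)  # True: the trained model scores the correct label highly
```

The same principle scales to the multi-layer case, where error backpropagation distributes the gradient of the loss across all weight coefficients between nodes.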
- a model parameter is a variable element that defines a model, and is, for example, a connection weighting factor given between nodes of a neural network model.
- the inference unit 111, the data input unit 112, and the input data processing unit 113 perform the inference phase of the trained model.
- the data input unit 112 inputs sensor information acquired by a sensor provided in the edge device.
- the input data processing unit 113 processes the data input from the data input unit 112 into a data format that can be input to a model (for example, a neural network model), and inputs the processed data to the inference unit 111 .
- the inference unit 111 outputs a label inferred from input data using a model in which model parameters read from the model parameter storage unit 103 are set, that is, a trained model.
- the learning data evaluation unit 120 evaluates each learning data used for learning the machine learning model in the model updating unit 102 .
- the learning data evaluation unit 120 includes a physics simulation calculation unit 121 and an evaluation result presentation unit 122 .
- the physics simulation calculation unit 121 determines the physical characteristics of each piece of learning data based on the inference result of the learning data by the model during learning. Specifically, the physics simulation calculation unit 121 determines the force acting on the learning data based on the inference result of each learning data by the machine learning model.
- The force here includes the mass (gravity) of the learning data, buoyancy, and the attraction or repulsion acting between pieces of learning data. The physics simulation calculation unit 121 may also determine physical quantities such as the size (volume) and shape of the learning data in addition to the acting forces.
- The physics simulation calculation unit 121 performs a physics simulation calculation based on physical quantities, such as the magnitude of the forces acting on each piece of learning data, determined according to the inference results of the model during learning, and determines the movement of each piece of learning data.
- An example of the physical simulation calculation is the FD (Force-Directed) method.
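A single step of a force-directed (FD) simulation of the kind the physics simulation calculation unit 121 could run is sketched below; the 1-D layout, time step, and force matrix are illustrative assumptions for brevity:

```python
# One force-directed (FD) integration step: objects attract or repel
# pairwise and positions are advanced each frame. 1-D for brevity.

def fd_step(positions, forces, dt=0.1):
    """forces[i][j] > 0 attracts object i toward j, < 0 repels it."""
    new_positions = []
    for i, x in enumerate(positions):
        net = 0.0
        for j, other in enumerate(positions):
            if i != j:
                direction = 1.0 if other > x else -1.0
                net += forces[i][j] * direction
        new_positions.append(x + dt * net)
    return new_positions

# Objects 0 and 1 attract each other; both repel object 2.
pos = [0.0, 1.0, 2.0]
f = [[0, 1, -1], [1, 0, -1], [-1, -1, 0]]
pos = fd_step(pos, f)
print(pos)  # [0.0, 0.8, 2.2]
```

Iterating such steps lets mutually attracted objects cluster and repelled objects drift apart, which is the visual ranking the evaluation result presentation unit 122 displays.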
- the evaluation result presentation unit 122 presents a GUI screen in which a plurality of objects corresponding to each learning data are arranged and visually ranked based on the motion information determined based on the physics simulation calculation. On this GUI screen, it is possible to perform GUI operations on objects corresponding to each learning data, such as dragging and dropping.
- the evaluation result presentation unit 122 may include a display device that displays a GUI screen, and an input device (mouse, touch panel, keyboard, etc.) for performing user operations on the GUI screen.
- the judgment basis calculation unit 123 calculates the judgment basis for inference on learning data by the machine learning model being learned in the model updating unit 102 .
- Examples of such algorithms include Grad-CAM (Gradient-weighted Class Activation Mapping), LIME (Local Interpretable Model-agnostic Explanations), SHAP (SHapley Additive exPlanations), which is an advanced form of LIME, and TCAV (Testing with Concept Activation Vectors) (see, for example, Non-Patent Document 3). The judgment basis calculation unit 123 calculates the basis for the machine learning model's inferences using one or more such XAI algorithms.
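The core weighting rule of Grad-CAM, as one example of the XAI algorithms named above, can be sketched numerically: each feature map is weighted by its globally averaged gradient, the weighted maps are summed, and a ReLU keeps only positively contributing regions. The tiny 2x2 maps below are stand-ins for real CNN activations:

```python
# Hedged sketch of the Grad-CAM weighting rule: alpha_k is the global
# average of the gradient for feature map k; the heat map is
# ReLU(sum_k alpha_k * A_k). 2x2 maps stand in for CNN activations.

def grad_cam(feature_maps, gradients):
    # alpha_k: global average pooling of each feature map's gradient
    alphas = [sum(sum(row) for row in g) / 4.0 for g in gradients]
    h, w = 2, 2
    heat = [[0.0] * w for _ in range(h)]
    for a, fmap in zip(alphas, feature_maps):
        for i in range(h):
            for j in range(w):
                heat[i][j] += a * fmap[i][j]
    # ReLU: keep only regions that positively support the class
    return [[max(0.0, v) for v in row] for row in heat]

maps = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(maps, grads))  # [[1.0, 0.0], [0.0, 0.0]]
```

The resulting heat map, upsampled to the input image size, is what figures such as FIG. 10 and FIG. 11 visualize as the judgment basis.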
- The evaluation result presentation unit 122 may present the grounds for the machine learning model's inference on a piece of learning data in response to an operation (for example, a mouse-over or mouse button press) on the object corresponding to that learning data on the GUI screen.
- The judgment basis calculation using an XAI algorithm such as Grad-CAM, LIME/SHAP, or TCAV may be performed in the learning system 100 instead of the learning data evaluation unit 120, with the calculation result passed from the learning system 100 to the learning data evaluation unit 120.
- the physics simulation calculation unit 121 determines the physical characteristics of each piece of learning data and performs the physics simulation calculation each time learning is performed, and the evaluation result presentation unit 122 updates the GUI screen.
- the physics simulation calculation unit 121 determines the physical characteristics of each piece of learning data based on the inference results of the learning data by the model during learning, and further calculates the motion information of each piece of learning data on a two-dimensional plane or three-dimensional space by physics simulation calculation.
- the evaluation result presenting unit 122 presents a GUI screen in which objects corresponding to each learning data are arranged based on the motion information determined based on the physical simulation calculation.
- the processing implemented in the learning data evaluation unit 120 will be described in detail.
- the model updating unit 102 learns the machine learning model 200 as shown in FIG.
- the machine learning model 200 is configured by, for example, a neural network, and performs learning using the learning data read from the learning data holding unit 101 .
- In this example, the machine learning model 200 performs image classification. That is, the input to the machine learning model 200 is an image, and the model infers which of the predefined labels 1 to 5 the subject in the image corresponds to, outputting an expected value (or likelihood) for each label. For example, label 1 is horse, label 2 is cat, label 3 is dog, label 4 is cow, and label 5 is bird.
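One common way to obtain such per-label expected values, assumed here for illustration rather than stated in the disclosure, is a softmax over the output layer's logits:

```python
# Illustrative: converting output-layer logits into one expected value
# per label (labels 1-5 as in FIG. 2). The logit values are made up.
import math

LABELS = {1: "horse", 2: "cat", 3: "dog", 4: "cow", 5: "bird"}

def softmax(logits):
    m = max(logits.values())                       # for numerical stability
    exps = {k: math.exp(v - m) for k, v in logits.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

expected = softmax({1: 2.0, 2: 0.5, 3: 3.0, 4: 0.0, 5: -1.0})
top = max(expected, key=expected.get)
print(LABELS[top])  # dog
```

The expected values sum to 1, so a value near the uniform baseline (0.2 for five labels) can be read as a "neutral" score for that label.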
- The inference results of the machine learning model 200 for the same learning data change according to the number of times of learning (the number of epochs). For example, when the number of learning iterations is small and learning has not progressed much, the machine learning model 200 outputs a low expected value for the correct label and a high expected value for an incorrect label; as learning progresses, the model gradually transitions to outputting a higher expected value for the correct label.
- FIG. 3 exemplifies the transition of the inference results of the machine learning model 200 according to the number of times of learning (the number of epochs) for the same learning data.
- The learning data in this example is an input image whose correct answer is label 3, and the figure shows the inference results of the machine learning model 200 for it. (For simplicity of the drawing, specific expected values are shown only for learning counts E1, E2, and E3; details for the other counts are omitted.)
- At learning count E1, when learning has not progressed, the machine learning model 200 outputs a low expected value of "0.1" for the correct label 3, while outputting a high expected value of "0.5" for the incorrect label 1. After that, as the learning count increases to E2 and E3 and learning progresses, increasingly high expected values such as "0.5" and "0.8" are output for the correct label 3, and conversely, low expected values such as "0.1" and "0.0" are output for the incorrect label 1.
- The physics simulation calculation unit 121 determines the physical properties of each piece of learning data based on the inference results of the machine learning model being trained by the model updating unit 102. Since the inference result for the same learning data changes according to the number of learning iterations, the physics simulation calculation unit 121 determines the physical characteristics of each piece of learning data based on the inference results of the machine learning model 200, for example, each time the model parameters are updated using a predetermined number of pieces of learning data (or each epoch).
- the physics simulation calculation unit 121 determines the physical properties of the learning data according to the inference results of the machine learning model 200 .
- The inference result for a piece of learning data by the machine learning model 200 consists of the expected value of each label for the input data. Based on the expected value of the correct label, the physics simulation calculation unit 121 determines physical quantities such as mass (gravity), buoyancy, the attraction or repulsion acting between pieces of learning data, and size (volume) and shape. The evaluation result presentation unit 122 in the later stage can thus represent each piece of learning data as an object having the physical properties determined by the physics simulation calculation unit 121.
- The physics simulation calculation unit 121 determines a light mass and a small size for learning data with a low expected value for the correct label, and a heavy mass and a large size for learning data with a high expected value for the correct label. The inference result of the learning data can thus be expressed using the property that heavy objects tend to sink and light objects tend to float. Also, for learning data whose expected value for the correct label gradually increases with the number of learning iterations, a light mass is determined at first but is later updated to a heavy mass.
- the buoyancy may be determined according to the size of the object, or may be determined based only on the expected value for the correct label without depending on the size of the object.
- Attractive and repulsive forces acting between pieces of learning data are determined such that an attractive force acts between learning data whose high-expectation labels match, and a repulsive force acts between learning data for which the machine learning model 200 outputs a high expected value for a label and learning data for which it outputs a low expected value for the same label. Pieces of learning data with high expected values for the same label are therefore drawn together by attraction.
- the inference result of the training data can be expressed by utilizing the property that training data with a high expectation value and training data with a low expectation value for the same label tend to separate due to repulsive force.
- For learning data whose expected value for the correct label gradually increases with the number of learning iterations, the data is at first attracted to learning data of an incorrect label, but later comes to be attracted to learning data having the same correct label.
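The per-label force rules above, including the "around-average" case where neither force acts, can be sketched as follows; the neutral baseline of 0.2 (one over five labels) and the 0.05 tolerance are assumptions made for this illustration:

```python
# Illustrative per-label force rule: attraction if this sample scores a
# label above average, repulsion if below, none if roughly average.

AVG = 0.2  # 1 / number of labels (5), treated as the neutral expectation

def pairwise_force(expected, other_label):
    """Force between this sample and another sample that scores high on
    other_label."""
    v = expected[other_label]
    if abs(v - AVG) < 0.05:
        return "none"
    return "attract" if v > AVG else "repel"

e1 = {1: 0.5, 2: 0.2, 3: 0.1, 4: 0.1, 5: 0.1}  # cf. FIG. 4, epoch E1
print([pairwise_force(e1, k) for k in (1, 2, 3)])
# ['attract', 'none', 'repel']
```

This reproduces the E1 behavior walked through below: attraction toward label-1 data, no force with respect to label-2 data, and repulsion from data scoring high on the remaining labels.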
- FIG. 4 shows the physical properties determined for the inference results, at each learning count (epoch number), of the learning data illustrated in FIG. 3.
- When learning has not progressed and the learning count is E1, the machine learning model 200 infers this learning data and outputs a low expected value of "0.1" for the correct label 3. The physics simulation calculation unit 121 therefore determines a light mass and a small size for this learning data, because the expected value output for the correct label is low at this point. The physics simulation calculation unit 121 may further determine that a large buoyant force acts on this learning data. In addition, since the machine learning model 200 outputs its highest expected value of "0.5" for the incorrect label 1, the physics simulation calculation unit 121 determines attractive forces acting between this learning data and other learning data with high expected values for label 1.
- Furthermore, since the machine learning model 200 outputs lower-than-average expected values for labels 3 to 5, the physics simulation calculation unit 121 determines the repulsive forces acting between this training data and other training data having high expected values for labels 3 to 5.
- Since the machine learning model 200 outputs an average expected value of "0.2" for label 2, the physics simulation calculation unit 121 determines that neither attractive force nor repulsive force acts between this learning data and other learning data having a high expected value for label 2.
- Next, when the number of training iterations is E2, the machine learning model 200 infers this learning data and outputs an intermediate expected value of "0.5" for the correct label 3. Therefore, because the expected value output for the correct label is not yet sufficiently high at this point, the physics simulation calculation unit 121 determines an intermediate mass and size for this learning data. The physics simulation calculation unit 121 may also reduce the buoyancy acting on this learning data. In addition, since the machine learning model 200 outputs the highest expected value of "0.5" for label 3, which is the correct answer, the physics simulation calculation unit 121 determines the attractive force acting between this learning data and other learning data having high expected values for label 3.
- The machine learning model 200 also outputs expected values of "0.1" for incorrect labels 1 and 4 and "0.0" for incorrect label 5, which are lower than the average value.
- the physics simulation calculation unit 121 determines the repulsive forces acting between this learning data and other learning data having high expected values for each of the labels 1, 4, and 5.
- Meanwhile, since the machine learning model 200 outputs an expected value of "0.3" for label 2, which is an incorrect answer and higher than the average value, the physics simulation calculation unit 121 determines the attractive force acting between this learning data and other learning data having a high expected value for label 2.
- Then, when the number of training iterations is E3, the machine learning model 200 infers this learning data and outputs the highest expected value of "0.8" for label 3, which is the correct answer. Therefore, because the expected value output for the correct label is the highest, the physics simulation calculation unit 121 determines a heavy mass and a large size for this learning data. The physics simulation calculation unit 121 may make the buoyant force acting on this learning data extremely small. In addition, since the machine learning model 200 outputs the highest expected value of "0.8" for label 3, which is the correct answer, the physics simulation calculation unit 121 determines the attractive force acting between this learning data and other learning data having high expected values for label 3.
- the physics simulation calculation unit 121 determines the repulsive force acting between this learning data and other learning data having high expected values for each of the labels 1, 2, 4, and 5.
- a sufficiently high expected value can be output for label 3, which is the correct answer, so a heavy mass is given to this learning data.
- Thus, with respect to label 2, an attractive force acts at training iteration E2, but a repulsive force acts at E3.
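As a rough sketch, the mapping described above from an inference result (the expected value per label) to physical properties and interaction signs could look like the following; the linear scalings, the 0.5 threshold, and the function names are illustrative assumptions, not the patented implementation:

```python
def determine_properties(expected, correct_label):
    """Map a model's per-label expected values (one inference result) to the
    physical properties of the object representing one piece of learning data.
    The linear scaling here is an assumption for illustration."""
    p = expected[correct_label]        # expected value of the correct label
    return {
        "mass": p,                     # low expected value -> light mass
        "size": p,                     # low expected value -> small size
        "buoyancy": 1.0 - p,           # low expected value -> large buoyancy
    }


def interaction(expected_i, expected_j, label, average=0.2):
    """Sign of the force between two data items with respect to one label:
    +1 (attract) when item i's expected value for the label is above average,
    -1 (repel) when it is below average, 0 when it equals the average.
    Item j is assumed to have a high expected value for the label."""
    if expected_j[label] < 0.5:        # j is not strongly associated with label
        return 0
    if expected_i[label] > average:
        return +1
    if expected_i[label] < average:
        return -1
    return 0
```

With the E1 inference result ("0.5", "0.2", "0.1", …) and correct label 3, this yields a light, small, buoyant object that is attracted toward data strongly associated with label 1 and repelled from data strongly associated with label 3.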
- The physics simulation calculation unit 121 treats each learning data as an object having the determined physical properties, such as attractive force, repulsive force, mass, and size, and calculates, by physics simulation, the motion information of the object corresponding to each learning data on a two-dimensional plane or in three-dimensional space.
- One example of such a physics simulation calculation is the force-directed (FD) method.
- Let (x_i, y_i, z_i) be the position of the object D_i corresponding to the i-th learning data. Based on the inference result of the machine learning model 200 for the i-th learning data, the physics simulation calculation unit 121 determines the mass M_i of the object D_i, its size S_i, and the buoyancy B_i, which is determined based on the expected value of the correct label.
- Between the object D_i corresponding to the i-th learning data and the object D_j corresponding to the j-th learning data, an attractive or repulsive force G_ij represented by the following equation (1) acts.
- In equation (1), k is a constant (for example, the universal gravitational constant), and r_ij is the distance between the object D_i corresponding to the i-th learning data and the object D_j corresponding to the j-th learning data.
- λ_ij takes the value 1, 0, or −1 depending on whether the label for which the machine learning model 200 infers the highest expected value (or the lowest expected value) for the i-th learning data matches the label with the highest expected value for the j-th learning data.
- λ_ij = 1 means that the labels with the highest expected values for the i-th learning data and the j-th learning data match, so an attractive force acts between the objects D_i and D_j corresponding to those learning data.
- As shown in equation (2), the force F_i acting on the object D_i corresponding to the i-th learning data is the resultant of the attractive or repulsive forces G_ij received from the other objects, the gravitational force M_i·g depending on the mass M_i of the object D_i, and the buoyant force B_i of the object D_i.
- The physics simulation calculation unit 121 sets up a dynamic model in which the force shown in equation (2) above acts on the object corresponding to each learning data used for training the machine learning model 200. The physics simulation calculation unit 121 then calculates two-dimensional or three-dimensional motion information for each object by physics simulation calculation.
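Since the bodies of equations (1) and (2) are not reproduced in this text, the following sketch assumes a gravitational-style pairwise force G_ij = λ_ij·k·M_i·M_j/r_ij², combined with gravity and buoyancy as the description of equation (2) indicates; the exact force law and all names here are assumptions for illustration:

```python
import math

def pairwise_force(pos_i, pos_j, m_i, m_j, lam, k=1.0):
    """Force G_ij exerted on object D_i by object D_j, assuming the
    gravitational-style form lam * k * m_i * m_j / r**2.
    lam = +1 attracts, -1 repels, 0 means no interaction."""
    dx = [b - a for a, b in zip(pos_i, pos_j)]
    r = math.sqrt(sum(d * d for d in dx)) or 1e-9   # guard against r = 0
    mag = lam * k * m_i * m_j / (r * r)
    return [mag * d / r for d in dx]                # points toward D_j if lam > 0

def total_force(i, positions, masses, buoyancies, lambdas, g=9.8):
    """F_i = sum_j G_ij + gravity + buoyancy, following the textual
    description of equation (2): gravity pulls down (-y) with magnitude
    m_i * g, and buoyancy pushes up (+y)."""
    f = [0.0, 0.0]
    for j in range(len(positions)):
        if j == i:
            continue
        gij = pairwise_force(positions[i], positions[j],
                             masses[i], masses[j], lambdas[i][j])
        f[0] += gij[0]
        f[1] += gij[1]
    f[1] += -masses[i] * g + buoyancies[i]
    return f
```

Heavy objects (high expected value for the correct label) thus experience a strong downward pull, while light, buoyant objects drift upward, which is the behavior the GUI mapping below relies on.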
- FIG. 5 illustrates a dynamic model consisting of three objects D_i, D_j, and D_k, each corresponding to training data. In this example, an attractive force acts between objects D_i and D_j, and a repulsive force acts between objects D_i and D_k. The gravity and buoyancy of each object D_i, D_j, and D_k are omitted from the figure. Such a dynamic model can be represented as a spring system in which the bodies D_i, D_j, and D_k are connected by springs, where each connecting spring has a restoring force in either the direction of compression or extension.
- the position of an object corresponding to each learning data can be calculated so that the potential energy in such a spring system is minimized.
- a physics simulation calculation such as the Force-Directed method can be performed to calculate the position of each object so that the potential energy is minimized.
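The potential-energy minimization in such a spring system can be sketched as a simple iterative relaxation; the spring constants, step size, and iteration count below are illustrative assumptions rather than the specific force-directed variant used by the physics simulation calculation unit 121:

```python
import math

def relax_spring_system(positions, springs, rest=1.0, k=0.1, steps=1000, lr=0.05):
    """Relax a 2-D spring system toward a minimum of its potential energy.
    positions: list of [x, y]; springs: list of (i, j, sign), where sign=+1
    is an attracting spring and sign=-1 a repelling one."""
    pos = [p[:] for p in positions]
    for _ in range(steps):
        forces = [[0.0, 0.0] for _ in pos]
        for i, j, sign in springs:
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            r = math.hypot(dx, dy) or 1e-9
            # Attracting springs restore toward the rest length;
            # repelling ones push apart with a 1/r falloff.
            mag = k * (r - rest) if sign > 0 else -k / r
            fx, fy = mag * dx / r, mag * dy / r
            forces[i][0] += fx; forces[i][1] += fy
            forces[j][0] -= fx; forces[j][1] -= fy
        for p, f in zip(pos, forces):
            p[0] += lr * f[0]
            p[1] += lr * f[1]
    return pos
```

An attracted pair initially placed far apart settles near the rest length, while a repelled pair drifts apart, mirroring how objects with matching high-expected-value labels cluster and mismatched ones separate.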
- The evaluation result presentation unit 122 presents the evaluation result of using each learning data for training the machine learning model 200, based on the physical characteristics that the physics simulation calculation unit 121 determined from the inference results of the machine learning model 200.
- As described above, the physics simulation calculation unit 121 determines the physical properties of the learning data according to the inference results of the machine learning model 200 and treats each learning data as an object having the determined physical properties. It then calculates, by physics simulation, the motion information of the object corresponding to each learning data on a two-dimensional plane or in three-dimensional space.
- The evaluation result presenting unit 122 presents a GUI screen in which the objects corresponding to each learning data are arranged according to the motion information determined by the physics simulation calculation.
- The object for each learning data is displayed with a size determined according to the inference result of the machine learning model 200 and is moved according to the motion information calculated by the physics simulation.
- FIG. 6 shows a display example of a GUI screen in which the evaluation result presentation unit 122 maps objects representing each learning data based on the result of the physics simulation calculation in the physics simulation calculation unit 121.
- the objects corresponding to each learning data are all shown as circles or spheres for the sake of simplification of the drawing, but the objects may be other shapes such as square blocks or cubes.
- each piece of learning data may be represented by an object having a different shape.
- the objects may be displayed in different colors according to the correct label of the learning data.
- Objects corresponding to learning data with high expected values for their correct labels are heavy, so they tend to sink toward the bottom of the GUI screen shown in FIG. 6.
- objects in learning data whose labels with high expected values match are attracted to each other by a strong attractive force, so that they are mapped closer on the same GUI screen.
- On the other hand, an object corresponding to learning data with a low expected value for the correct label becomes lighter and tends to float toward the top of the same GUI screen.
- Since the attractive force acting on a light object is small, it is not strongly attracted to objects corresponding to other learning data whose high-expected-value labels match, and it may be mapped at a location distant from the other objects. The GUI screen presented by the evaluation result presentation unit 122 can therefore be said to visually rank and display the evaluation results of each learning data used for training.
- For example, the object indicated by reference number 601 corresponds to learning data for which a low expected value is output for the correct label; it is light and small, and is mapped at a location distant from the other objects.
- the inference result of the machine learning model 200 for the learning data corresponding to the object 601 has a low expected value for the correct label.
- A user (for example, a developer of the machine learning model 200) can therefore judge that this learning data is problematic and remove it through a GUI operation.
- a sound effect may be output in accordance with a GUI operation such as moving the object 601 within the GUI screen or excluding the object 601 from the area.
- the user can prevent the learning data deleted through the GUI operation from being used for learning of the machine learning model 200 thereafter, thereby reducing time loss due to re-learning.
- a custom data set can be generated at the user's discretion, excluding one or more learning data deleted through a GUI operation that moves an object out of the area.
- The learning data holding unit 101 may hold a data set customized for each user in association with, for example, identification information of each user, distinguishing it from the general data set provided by the learning data providing unit 130 or acquired from another source.
- FIG. 8 shows another GUI operation example in which the user deletes learning data on the GUI screen shown in FIG.
- Since the physics simulation calculation unit 121 determines a light mass and a small size for an object corresponding to learning data with a low expected value for the correct label, such an object is displayed small or floats toward the top of the GUI screen due to buoyancy.
- On the GUI screen shown in FIG. 8, a threshold line 801 representing the threshold set by the user is displayed.
- the user may directly specify the position of the threshold line 801 by performing a drag operation or the like on the GUI screen.
- An object that floats above the threshold line 801 corresponds to learning data that is problematic to use for training the machine learning model 200 and is subject to automatic deletion. Instead of the user adjusting the position of the threshold line to their own liking, the position of the threshold line 801 may be determined from a threshold preset by the system, and the learning data corresponding to objects beyond the threshold line 801 may be subject to automatic deletion. Also, by setting the threshold line 801, one or more learning data at positions beyond the threshold line 801 can be excluded, and a custom data set can be generated based on the user's judgment.
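The threshold-line behavior reduces to a simple filter over object positions; the field names and the convention that larger y means higher on the screen are assumptions for illustration:

```python
def apply_threshold_line(objects, threshold_y):
    """Split objects at the threshold line: objects floating above it
    (i.e., with a low expected value for the correct label) are removed,
    and the remainder forms the user's custom data set."""
    kept = [o for o in objects if o["y"] <= threshold_y]
    removed = [o for o in objects if o["y"] > threshold_y]
    return kept, removed
```

Whether `threshold_y` comes from the user's drag operation or from a system preset, the same filter yields both the custom data set and the list of automatically deleted learning data.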
- In this way, learning data that is problematic for use in training the machine learning model 200 can be deleted at the user's request or deleted automatically. The user can also visually confirm through the GUI screen which learning data will be deleted. The model updating unit 102 can then proceed with training the machine learning model while excluding the learning data for which a delete operation was performed on the GUI screen.
- The evaluation result presentation unit 122 can visually rank and display, through a GUI screen like the one shown in FIG. 9, the evaluation results of each learning data, which change dynamically during training of the machine learning model 200. Each time the machine learning model 200 under training infers the learning data in the model updating unit 102, the physical characteristics are determined and the physics simulation calculation is performed to update the GUI screen, so the objects appear to move on the screen. A user (for example, a developer of the machine learning model 200) can appropriately select each learning data while observing, through such a GUI screen, how the evaluation result of each learning data changes with the number of training iterations. For example, in the GUI screen shown in FIG. 9, the expected value of the correct label for the learning data corresponding to the object 601 increases each time training is performed.
- When the forces acting on the objects obtained by the physics simulation calculation do not reach an equilibrium state, multiple objects may aggregate densely or be displayed along the edge of the screen. In that case, a minimum distance may be set between objects, or between an object and the edge of the screen, so that the user can properly recognize the object corresponding to each learning data.
- When an object undergoes cyclical motion, its average position may be calculated and displayed on the GUI screen. It may also be possible to input a command that temporarily stops the movement of objects on the GUI screen, and an icon corresponding to such a command may be displayed on the screen.
- The judgment basis calculation unit 123 calculates the basis for the machine learning model 200's inference on the learning data, and the evaluation result presentation unit 122 further presents the judgment basis for the machine learning model's inference on the learning data.
- The judgment basis calculation unit 123 calculates the basis for the inference of the machine learning model 200 using various XAI (eXplainable AI) algorithms such as Grad-CAM, LIME, SHAP (an advanced form of LIME), and TCAV.
- the judgment basis calculation unit 123 calculates the judgment basis using one or a plurality of XAI algorithms for the inference label for which the machine learning model 200 outputs the highest expected value.
- the determination basis calculation unit 123 may further calculate the judgment basis for the labels with the second and subsequent highest expected values.
- Grad-CAM is an algorithm that traces gradients backward from the label output as the classification result in the output layer (calculating the contribution of each feature map to the classification and back-propagating with those weights) to estimate the locations in the input image data that contributed to the classification; those locations can be visualized as a heat map.
- In a convolutional neural network, the positional information of the pixels of the input image data is retained up to the final convolutional layer, so by obtaining the degree of influence of that positional information on the final discrimination output, the strongly influential parts of the original input image may be displayed as a heat map.
- For a machine learning model composed of a neural network, the method of calculating the judgment basis based on the Grad-CAM algorithm (the method of generating a heat map) when image recognition is performed on an input image and class c is output is explained below.
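Given the final convolutional layer's feature maps and the gradients of the class-c score with respect to them, the heat-map computation of the published Grad-CAM algorithm can be sketched as follows (a generic NumPy illustration, not the patent's specific implementation):

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """feature_maps, gradients: arrays of shape (C, H, W) holding the final
    convolutional layer's activations and d(score_c)/d(activation).
    Returns an (H, W) heat map normalized to [0, 1]."""
    # Global-average-pool the gradients: one importance weight per channel.
    weights = gradients.mean(axis=(1, 2))                        # shape (C,)
    # Weighted sum of the feature maps, then ReLU to keep positive evidence.
    cam = np.maximum((weights[:, None, None] * feature_maps).sum(axis=0), 0.0)
    if cam.max() > 0:
        cam /= cam.max()
    return cam
```

The resulting map is typically upsampled to the input resolution and overlaid on the image, as with the heat map 1001 in FIG. 10.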
- FIG. 10 shows an example of a judgment basis image with a heat map display calculated by the judgment basis calculation unit 123 based on the Grad-CAM algorithm.
- a heat map 1001 is superimposed on a portion of input image data 1000 that is the basis for the inference label for which the machine learning model 200 outputs the highest expected value.
- the original input image is an image in which a dog and a cat are photographed together, and is used for learning of the machine learning model 200 as learning data with a correct label of "dog (label 3)".
- The user refers to image data with a heat map display as shown in FIG. 10 and can determine whether there is any problem in using the original input image as learning data, based on whether the area where the heat map is displayed represents the correct label.
- the correct label "dog” is correctly displayed as a heat map, so the user can determine that there is no problem in using this input image as learning data.
- FIG. 11 shows another example of a judgment basis image with heat map display calculated by the judgment basis calculation unit 123 for the same input image 1100 as in FIG. 10 based on the Grad-CAM algorithm.
- the inference label for which the machine learning model 200 outputs the highest expected value is “dog”, but the heat map 1101 is displayed in the input image 1100 in the area of the cat instead of the dog.
- By referring to image data in which the heat map is displayed in a region different from the inference label, as shown in FIG. 11, the user can determine that there is a problem in using this input image as learning data.
- LIME estimates that if the output of the neural network is reversed or fluctuates greatly when a specific input data item (feature) is changed, that item is highly important to the decision. For example, the judgment basis calculation unit 123 generates another model (a basis model) that locally approximates the machine learning model being trained by the model updating unit 102, in order to indicate the reason (basis) for its inference. The judgment basis calculation unit 123 generates a basis model that locally approximates the combination of the input image and the corresponding output result. Then, using the basis model, the judgment basis calculation unit 123 generates basis information for the inference label for which the machine learning model under training outputs the highest expected value, and a basis image like the one shown in FIG. 10 can be generated similarly to the Grad-CAM algorithm.
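The local approximation at the heart of LIME can be sketched as: perturb the input, query the model, weight the samples by proximity, and fit a weighted linear surrogate whose coefficients indicate feature importance (a generic illustration of the idea; the basis model described in the text need not be linear):

```python
import numpy as np

def lime_coefficients(model_fn, x, n_samples=500, sigma=1.0, seed=0):
    """Fit a local linear surrogate around the 1-D feature vector x.
    model_fn maps a batch of inputs (n, d) to scores for the target label.
    Returns one coefficient per feature; a large |w| marks a feature that
    strongly sways the decision near x."""
    rng = np.random.default_rng(seed)
    perturbed = x + rng.normal(scale=0.5, size=(n_samples, x.size))
    y = model_fn(perturbed)                          # target-label scores
    # Proximity kernel: samples closer to x count more in the fit.
    d2 = ((perturbed - x) ** 2).sum(axis=1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    # Weighted least squares with a bias column.
    X = np.hstack([np.ones((n_samples, 1)), perturbed])
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta[1:]                                  # drop the bias term
```

For image data, the features would typically be superpixel on/off masks rather than raw pixels, but the fitting step is the same.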
- TCAV is an algorithm that calculates the importance of Concepts (concepts that can be easily understood by humans) to the predictions of a trained model.
- Specifically, the determination basis calculation unit 123 generates a plurality of pieces of input information by duplicating or modifying the input information (pathological image data), inputs each of them to the model for which basis information is to be generated (the description target model), and obtains from the description target model a plurality of pieces of output information corresponding to each input.
- Then, the determination basis calculation unit 123 trains a basis model using combinations (pairs) of each of the plurality of pieces of input information and the corresponding output information as learning data, generating a basis model that locally approximates the description target model around the target input information with another interpretable model. When a label is output from the machine learning model being trained by the model updating unit 102, the determination basis calculation unit 123 uses the basis model to generate basis information related to the output label, and a basis image like the one shown in FIG. 10 can be generated in the same way.
- The judgment basis calculation unit 123 may also calculate the basis for the output label of the machine learning model being trained by the model updating unit 102 based on algorithms other than the Grad-CAM, LIME/SHAP, and TCAV described above.
- As described above, in this embodiment, the evaluation result presentation unit 122 presents the user with a GUI screen in which objects representing each learning data are mapped, based on the physical characteristics determined by the physics simulation calculation unit 121 for each learning data and on the results of the physics simulation calculation using those characteristics. Such a GUI screen visually ranks and displays the evaluation result of each learning data, and the user can intuitively grasp through the GUI screen whether there is a problem with the learning data. However, when judging whether to exclude learning data, the user may wish to check more detailed information about the learning data, even if it is visually ranked low.
- learning data has detailed information such as images (image file names), correct labels, and inference results (expected values for each label) by the machine learning model during learning.
- the judgment basis calculation unit 123 calculates the judgment basis using one or a plurality of XAI algorithms for the inference label for which the machine learning model 200 outputs the highest expected value.
- another method may be used to evaluate each piece of learning data.
- FIG. 12 shows an example of GUI operation for displaying detailed information of learning data.
- the user performs, for example, a mouse-over operation, a mouse-button press operation, a touch operation, or the like on an object of interest to the user on the GUI screen displaying the evaluation results of each learning data.
- a pop-up balloon 1201 describing detailed information of the image data corresponding to the object to be operated is displayed.
- FIG. 13 more specifically shows the detailed information of the learning data displayed on the GUI screen in response to the mouse operation on the object.
- In the balloon 1301, the file name of the image serving as input data, the correct label corresponding to the input data, the inference result (the expected value for each label) by the machine learning model being trained, the input image, and the judgment bases calculated using one or more XAI algorithms for the inference label output with the highest expected value are displayed.
- a slider bar 1302 may be provided, for example, at the right edge of the balloon 1301 so that the display range can be moved.
- FIG. 14 shows the processing procedure performed in the learning data evaluation section 120 in the form of a flow chart.
- It is assumed that each time the machine learning model 200 under training infers the learning data in the model updating unit 102, the physical characteristics are determined, the physics simulation calculation is performed, and the GUI screen (see FIGS. 6 to 9) is updated.
- In the learning system 100, each time the model updating unit 102 updates the model parameters using learning data, it notifies the learning data evaluation unit 120 that the machine learning model has been updated.
- When the learning data evaluation unit 120 is notified by the learning system 100 that the machine learning model has been updated (step S1401), it starts the subsequent learning data evaluation processing.
- the learning data evaluation unit 120 basically evaluates all the learning data used for learning the machine learning model in the model update unit 102 .
- part of the learning data used for learning the machine learning model may be subject to evaluation, or part of the used learning data may be excluded from the subject of evaluation.
- While unevaluated learning data remains (Yes in step S1402), the learning data evaluation unit 120 selects one of them as target data (step S1403) and calculates the inference result for that target data (step S1404).
- In step S1404, the machine learning model under training may be used to calculate a forward pass on the target data, or the inference result of the machine learning model under training may be acquired from the model updating unit 102.
- the inference result of the machine learning model consists of the expected value for each output label of the machine learning model for the target data.
- Next, based on the inference result of the machine learning model for the target data, the physics simulation calculation unit 121 determines physical properties such as the mass and size of the target data and the forces (attractive or repulsive) acting between it and other learning data (step S1405).
- When the physics simulation calculation unit 121 has determined the physical properties of all the target data (No in step S1402), it performs the physics simulation calculation on the objects corresponding to each target data (step S1406).
- For example, the force-directed method is used for the physics simulation calculation, computing the motion of the object corresponding to each target data so that the potential energy is minimized.
- Then, the evaluation result presentation unit 122 presents a GUI screen in which the objects corresponding to each target data are arranged and visually ranked based on the motion information determined by the physics simulation calculation in step S1406 (step S1407). Each time the machine learning model 200 under training infers the learning data in the model updating unit 102, the physical characteristics are determined and the physics simulation calculation is performed to update the GUI screen, so the objects appear to move on the screen (see, e.g., FIG. 9).
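Steps S1401 to S1407 can be summarized as the following loop; the callable arguments stand in for the units described in the text and are purely illustrative:

```python
def evaluate_learning_data(dataset, infer, determine_properties, simulate, present):
    """One pass of the learning data evaluation unit, triggered each time
    the model is updated (S1401)."""
    properties = []
    for target in dataset:                                # S1402/S1403: next target
        result = infer(target)                            # S1404: expected values
        properties.append(determine_properties(result))   # S1405: mass, size, forces
    motion = simulate(properties)                         # S1406: physics simulation
    present(motion)                                       # S1407: update the GUI
    return motion
```

Each model update re-runs this pass, which is why the objects on the GUI screen appear to move as training progresses.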
- FIG. 15 shows an example of the hardware configuration of an information processing system 1500 that operates as the learning data evaluation unit 120.
- the information processing system 1500 is configured using, for example, a personal computer.
- the learning data evaluation unit 120 includes functions such as a physics simulation calculation unit 121, an evaluation result presentation unit 122, and a judgment basis calculation unit 123.
- the information processing system 1500 may be the same system as the learning system 100 or may be a system configured independently from the learning system 100 .
- The illustrated information processing system 1500 includes a CPU (Central Processing Unit) 1501, a ROM (Read Only Memory) 1502, a RAM (Random Access Memory) 1503, a host bus 1504, a bridge 1505, an expansion bus 1506, an interface unit 1507, an input device 1508, an output device 1509, a storage device 1510, a drive 1511, and a communication device 1513.
- The CPU 1501 functions as an arithmetic processing device and a control device, and controls the overall operation of the information processing system 1500 according to various programs. The information processing system 1500 may further include a GPU or a GPGPU (General-Purpose computing on Graphics Processing Units) as an arithmetic processing unit in addition to the CPU 1501.
- the ROM 1502 nonvolatilely stores programs (basic input/output system, etc.) used by the CPU 1501, operation parameters, and the like.
- the RAM 1503 is used to load programs used in the execution of the CPU 1501 and to temporarily store parameters such as work data that change as appropriate during program execution.
- the programs loaded into the RAM 1503 and executed by the CPU 1501 are, for example, various application programs and an operating system (OS).
- the CPU 1501, ROM 1502 and RAM 1503 are interconnected by a host bus 1504 comprising a CPU bus or the like.
- the CPU 1501 can implement various functions and services by executing various application programs under the execution environment provided by the OS through cooperative operations of the ROM 1502 and the RAM 1503 .
- In particular, the cooperative operation of the CPU 1501, the ROM 1502, and the RAM 1503 realizes the function of the learning data evaluation unit 120: determining the physical characteristics of the learning data used for training the machine learning model, performing the physics simulation calculation, presenting the GUI screen of the learning data evaluation results based on the calculation results, calculating the basis for the machine learning model's inference on the learning data, and so on.
- a host bus 1504 is connected to an expansion bus 1506 via a bridge 1505 .
- the expansion bus 1506 is, for example, PCI standardized by PCI-SIG (Peripheral Component Interconnect Special Interest Group) or PCIe (PCI Express).
- the interface 1507 connects external devices or peripheral devices such as the input device 1508, the output device 1509, the storage device 1510, the drive 1511, and the communication device 1513 according to the standard of the expansion bus 1506.
- Note that not all of the external devices and peripheral devices shown in FIG. 15 are essential, and the information processing system 1500 may further include devices other than those shown.
- the input device 1508 is composed of an input control circuit and the like that generates an input signal based on an input from the user and outputs the signal to the CPU 1501 .
- the input device 1508 is, for example, at least one of a mouse, keyboard, touch panel, button, microphone, switch, and lever.
- the input device 1508 is used, for example, by a user (machine learning model developer) to operate an object corresponding to learning data on a GUI screen (see FIG. 7) or input other instructions. Used.
- The output device 1509 includes a display device such as a liquid crystal display (LCD) device, an organic EL (Electro-Luminescence) display device, or an LED (Light Emitting Diode) display, and displays various data such as video data as images or text.
- The output device 1509 also includes an audio output device such as a speaker or headphones, which converts audio data into sound and outputs it.
- the storage device 1510 is composed of a large-capacity storage device such as an SSD (Solid State Drive) or HDD (Hard Disk Drive).
- the storage device 1510 stores files such as programs executed by the CPU 1501 and various data.
- the removable storage medium 1512 is a cartridge type storage medium such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
- The drive 1511 performs read and write operations on the removable storage medium 1512 installed therein.
- the drive 1511 outputs data read from the removable recording medium 1512 to the RAM 1503 and writes data on the RAM 1503 to the removable recording medium 1512 .
- the drive 1511 may be built into the housing of the information processing system 1500 or externally attached.
- the communication device 1513 is a device for connecting to an external network such as a LAN (Local Area Network) or the Internet, and is composed of a network interface card (NIC), for example.
- In this specification, the present disclosure has been described mainly as applied to a learning system that trains a machine learning model for image classification, but the gist of the present disclosure is not limited to this.
- The present disclosure can likewise be used to evaluate learning data for machine learning models that perform various inferences such as speech recognition, character recognition, and data generation.
- Furthermore, the machine learning model is not limited to one configured as a neural network, and may be a model such as support vector regression or Gaussian process regression.
- An information processing method for processing learning data used for training a machine learning model, comprising: a determining step of determining characteristics of each learning data based on inference results of the machine learning model for the learning data; and a presentation step of presenting an evaluation result of the learning data based on the determined characteristics.
- the determining step determines the characteristics of each learning data, and the presenting step presents an evaluation result of the learning data.
- the information processing method according to any one of (1) to (7) above.
- An information processing system for processing learning data used for training a machine learning model, comprising: a determination unit that determines the characteristics of each piece of learning data based on the inference results of the machine learning model for the learning data; and a presentation unit that presents an evaluation result of the learning data based on the determined characteristics;
- Information processing system including;
- the determination unit determines physical properties of objects corresponding to each learning data based on the inference results of the machine learning model, and performs physical simulation calculations between objects having the determined physical properties.
- the presentation unit presents an object corresponding to each learning data based on the result of the physics simulation calculation.
- the determination unit determines the physical characteristics of the object corresponding to the learning data based on the expected value for each label output by the machine learning model for the learning data.
- the information processing system according to (10) above.
- the determination unit determines the mass, buoyancy, or size of the object corresponding to the learning data based on the expected value of the correct label.
- the determining unit makes the determination such that an object corresponding to learning data with a large expected value of the correct label is heavy or large, and an object corresponding to learning data with a small expected value of the correct label is light or small. The information processing system according to (12) above.
- the determining unit determines at least one of an attractive force and a repulsive force acting between objects corresponding to each piece of learning data based on the expected value of the correct label.
- the information processing system according to either (11) or (12) above.
- the determining unit determines an attractive force acting between objects corresponding to learning data whose labels with high expected values match.
- the determining unit determines a repulsive force acting between an object corresponding to learning data for which a low expected value was output and an object corresponding to learning data for which a high expected value was output for the same label. The information processing system according to either (13) or (13-1) above.
- the determination unit calculates movement information of each object by the physics simulation calculation based on the physical characteristics determined for the object corresponding to each learning data,
- the presentation unit moves and displays each object on the screen of the display device based on the movement information calculated by the determination unit.
- the information processing system according to any one of (10) to (13) above.
- the presenting unit further presents detailed information on learning data corresponding to the object in response to a predetermined operation performed on the object displayed on the screen through the input unit.
- the information processing system according to (15) above.
- (16-1) further comprising a calculation unit that calculates the grounds for the inference judgment of the machine learning model for the learning data;
- the presentation unit presents the detailed information including the judgment basis calculated by the calculation unit.
- the determination unit determines characteristics of each learning data, and the presentation unit presents an evaluation result of the learning data.
- the information processing system according to any one of (9) to (16) above.
- a first device including the determining unit; a second device including the presentation unit; including, The information processing system according to any one of (9) to (17) above.
- the second device includes a display device for displaying an evaluation result of learning data based on the determined characteristics on a screen, and an input unit for inputting a user's operation on the screen.
- (20) further comprising a third device including a model update unit that updates the machine learning model by learning using the learning data;
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- Image Analysis (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
a determining step of determining the characteristics of each piece of learning data based on the inference results of the machine learning model for the learning data; and
a presenting step of presenting an evaluation result of the learning data based on the determined characteristics.
An information processing method comprising the above steps.
a determination unit that determines the characteristics of each piece of learning data based on the inference results of the machine learning model for the learning data; and
a presentation unit that presents an evaluation result of the learning data based on the determined characteristics.
An information processing system comprising the above.
B. System Configuration
C. Evaluation of Learning Data
C-1. Determination of Physical Properties for Learning Data
C-2. Presentation of Evaluation Results for Learning Data
C-3. Clarification of the Grounds for Inference Judgments by the Machine Learning Model
C-4. Method of Presenting Detailed Information
C-5. Processing Procedure
D. System Configuration
Artificial intelligence consists of models using types such as neural networks, support vector regression, and Gaussian process regression. In this specification, for convenience, the description focuses on embodiments using a neural-network-type model, but the present disclosure is not limited to a specific model type and is equally applicable to models other than neural networks. The use of artificial intelligence consists of a "learning phase", in which a model is trained, and an "inference phase", in which inference is performed using the trained model. Inference includes recognition processing such as image recognition and speech recognition, and prediction processing that estimates or predicts events. The following mainly describes embodiments in which artificial intelligence is applied to classification problems such as image classification.
FIG. 1 shows an example of the functional configuration of a learning system 100 to which the present disclosure is applied. The illustrated learning system 100 is used, for example, mounted on an edge device, but some or all of the functions of the learning system 100 may be built on a cloud or on a computing device capable of large-scale computation. In the following, the learning system 100 is described as training a machine learning model that mainly performs image classification, such as object recognition or face recognition. However, the present disclosure is not limited to this, and the learning system 100 may train a machine learning model that performs inference other than image classification.
The learning data evaluation unit 120 evaluates the learning data that the model update unit 102 used for training the machine learning model. In this embodiment, the learning data evaluation unit 120 includes a physics simulation calculation unit 121 and an evaluation result presentation unit 122. The physics simulation calculation unit 121 determines the physical properties of each piece of learning data based on the inference results of the model being trained for the learning data, and further calculates, by physics simulation calculation, motion information for each piece of learning data on a two-dimensional plane or in three-dimensional space. The evaluation result presentation unit 122 then presents a GUI screen on which objects corresponding to the respective pieces of learning data are arranged, based on the motion information determined by the physics simulation calculation. This section C describes in detail the processing realized in the learning data evaluation unit 120.
The physics simulation calculation unit 121 determines the physical properties of each piece of learning data based on the inference results for the learning data by the machine learning model trained in the model update unit 102. Since the inference results for the same learning data change with the number of training iterations, the physics simulation calculation unit 121 determines the physical properties of each piece of learning data based on the inference results of the machine learning model 200, for example, each time the model parameters are updated using a predetermined number of pieces of learning data (or at every epoch).
The evaluation result presentation unit 122 presents an evaluation result for using each piece of learning data in training the machine learning model 200, based on the physical properties that the physics simulation calculation unit 121 determined for each piece of learning data from the inference results of the machine learning model 200. As described in section C-1 above, the physics simulation calculation unit 121 determines the physical properties of each piece of learning data according to the inference results of the machine learning model 200, treats each piece of learning data as an object having the determined physical properties, and calculates, by physics simulation calculation, motion information for the object corresponding to each piece of learning data on a two-dimensional plane or in three-dimensional space. The evaluation result presentation unit 122 then presents a GUI screen on which the objects corresponding to the respective pieces of learning data are arranged, based on the motion information determined by the physics simulation calculation. On this GUI screen, the object for each piece of learning data is displayed with a size determined according to the inference result of the machine learning model 200, and is moved according to the motion information calculated by the physics simulation calculation.
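As a non-authoritative illustration of the mechanism described above, the following sketch maps the expected value of the correct label to a mass and a display size, and advances a toy attraction-only two-dimensional simulation by one step. Every function name and constant here is hypothetical and is not part of the disclosure; a real implementation would use a physics engine and the model's actual softmax outputs.

```python
import math

def physical_properties(expected_correct, scale=1.0):
    """Hypothetical mapping: the expected value of the correct label (0..1)
    becomes mass and display size, so high-confidence samples are heavy/large."""
    mass = scale * expected_correct
    size = 10.0 + 40.0 * expected_correct  # pixels, arbitrary range
    return mass, size

def step(positions, masses, dt=0.1, k_attract=0.5):
    """One overdamped step of a toy 2D simulation in which every pair of
    objects attracts with a force proportional to the product of masses
    and inversely proportional to squared distance."""
    n = len(positions)
    new_positions = []
    for i in range(n):
        fx = fy = 0.0
        xi, yi = positions[i]
        for j in range(n):
            if i == j:
                continue
            xj, yj = positions[j]
            dx, dy = xj - xi, yj - yi
            dist = math.hypot(dx, dy) + 1e-6  # avoid division by zero
            f = k_attract * masses[i] * masses[j] / dist ** 2
            fx += f * dx / dist
            fy += f * dy / dist
        # displacement proportional to force over mass (no inertia kept)
        new_positions.append((xi + dt * fx / max(masses[i], 1e-6),
                              yi + dt * fy / max(masses[i], 1e-6)))
    return new_positions
```

Repeatedly calling `step` and redrawing each object at its new position with its mapped size would produce the kind of animated GUI screen the embodiment describes.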
There is a problem that the process by which artificial intelligence arrives at an inference result is a black box, making the grounds for its judgment difficult to understand. Therefore, in this embodiment, the judgment ground calculation unit 123 calculates the grounds for the inference judgments of the machine learning model 200 on the learning data, and the evaluation result presentation unit 122 further presents the grounds for the machine learning model's inference judgments on the learning data.
Given the gradient of the score y^c for class c with respect to the feature-map activations A^k, the neuron importance weights are given as shown in Equation (3) below.
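Equation (3) is referred to but not reproduced in this text. In the Grad-CAM formulation cited in the non-patent literature, the neuron importance weight is the global-average-pooled gradient of the class score over the spatial locations of the feature map, and the class-discriminative localization map is a ReLU-gated weighted sum of the feature maps (Z denotes the number of spatial locations):

```latex
\alpha_k^c = \frac{1}{Z}\sum_{i}\sum_{j}\frac{\partial y^c}{\partial A_{ij}^{k}}
\qquad
L^{c}_{\mathrm{Grad\text{-}CAM}} = \mathrm{ReLU}\!\left(\sum_{k}\alpha_k^c\, A^{k}\right)
```

This is the published Grad-CAM definition; the specification's Equation (3) is assumed to correspond to the first expression.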
LIME estimates that an input data item (feature) has high importance in the judgment if the output of the neural network flips or fluctuates greatly when that item is varied. For example, the judgment ground calculation unit 123 generates another model (ground model) that locally approximates the machine learning model being trained by the model update unit 102, in order to indicate the reason (grounds) for its inference. The judgment ground calculation unit 123 generates a ground model that locally approximates the combination of an input image and the output result corresponding to that input. Then, using the ground model, the judgment ground calculation unit 123 can generate ground information about the inference label for which the machine learning model being trained output the highest expected value, and can generate a ground image such as that shown in FIG. 10, in the same way as with the Grad-CAM algorithm.
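The local approximation described above can be sketched as follows. This is a simplified LIME-style surrogate fit (weighted linear least squares around a single input), shown for illustration only; it is not the full algorithm of the cited paper, and the function name and constants are hypothetical.

```python
import numpy as np

def lime_style_weights(predict_fn, x, n_samples=500, sigma=0.25, seed=0):
    """Fit a local linear surrogate ('ground model') around input x.

    predict_fn maps a batch of flattened inputs to the expected value of one
    label. Perturbed copies of x are weighted by proximity to x, and the
    surrogate's coefficients indicate which features most sway the
    prediction locally."""
    rng = np.random.default_rng(seed)
    X = x + rng.normal(scale=sigma, size=(n_samples, x.size))
    y = predict_fn(X)
    # proximity kernel: nearby perturbations count more
    d = np.linalg.norm(X - x, axis=1)
    w = np.exp(-(d ** 2) / (2 * (sigma ** 2) * x.size))
    sw = np.sqrt(w)
    # weighted least squares with an intercept column
    A = np.hstack([X, np.ones((n_samples, 1))]) * sw[:, None]
    coef, *_ = np.linalg.lstsq(A, y * sw, rcond=None)
    return coef[:-1]  # per-feature local importance (intercept dropped)
```

Rendering the returned per-feature weights over the input image would yield a ground image of the kind shown in FIG. 10.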
TCAV is an algorithm that computes the importance of a Concept (a concept that humans can easily understand) to the predictions of a trained model. For example, the judgment ground calculation unit 123 generates a plurality of input information items by duplicating or modifying the input information (pathology image data), inputs each of them to the model for which ground information is to be generated (the model to be explained), and causes the model to be explained to output a plurality of output information items corresponding to the respective inputs. Next, the judgment ground calculation unit 123 trains a ground model using the combinations (pairs) of each input information item and its corresponding output information item as training data, thereby generating a ground model that locally approximates the behavior around the target input information with another, interpretable model. Then, when a label is output from the machine learning model being trained by the model update unit 102, the judgment ground calculation unit 123 can use the ground model to generate ground information about that output label, and can similarly generate a ground image such as that shown in FIG. 10.
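The core of TCAV as published can be sketched in two steps: fit a linear separator between activations of concept examples and random examples (its normal vector is the concept activation vector, CAV), then measure what fraction of inputs have a class-score gradient with a positive component along the CAV. The sketch below is a minimal illustration under those assumptions; the names and the plain gradient-descent classifier are hypothetical simplifications.

```python
import numpy as np

def concept_activation_vector(concept_acts, random_acts, lr=0.1, epochs=200):
    """Fit a logistic-regression separator (plain gradient descent) between
    concept-example and random-example activations; its unit normal is the CAV."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid
        g = p - y                               # logistic-loss gradient
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w / np.linalg.norm(w)

def tcav_score(grads, cav):
    """Fraction of inputs whose class-score gradient (taken with respect to
    the layer activations) has a positive component along the CAV."""
    return float(np.mean(grads @ cav > 0))
```

A score near 1.0 suggests the concept consistently pushes the class score up for these inputs; a score near 0.5 suggests the concept is irrelevant to the prediction.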
As already described with reference to FIG. 6, the evaluation result presentation unit 122 presents to the user a GUI screen on which objects representing the respective pieces of learning data are mapped, based on the physical properties that the physics simulation calculation unit 121 determined for each piece of learning data and on the result of the physics simulation calculation based on those properties. Such a GUI screen visually ranks and displays the evaluation results for each piece of learning data, and through it the user can intuitively grasp whether there are issues with the learning data. However, when deciding whether to exclude a piece of learning data, the user may wish to check more detailed information about it even if it is visually ranked low.
FIG. 14 shows, in flowchart form, the processing procedure performed in the learning data evaluation unit 120. The processing procedure shown in FIG. 14 assumes that each time the machine learning model 200 being trained in the model update unit 102 performs inference on the learning data, the physical properties are determined and the physics simulation calculation is performed, and the GUI screen (see FIGS. 6 to 9) is updated.
FIG. 15 shows an example of the hardware configuration of an information processing system 1500 that operates as the learning data evaluation unit 120. The information processing system 1500 is configured using, for example, a personal computer; functionally, the learning data evaluation unit 120 includes functional modules such as the physics simulation calculation unit 121, the evaluation result presentation unit 122, and the judgment ground calculation unit 123. The information processing system 1500 may be the same system as the learning system 100, or may be a system configured independently of the learning system 100.
a determining step of determining the characteristics of each piece of learning data based on the inference results of the machine learning model for the learning data; and
a presenting step of presenting an evaluation result of the learning data based on the determined characteristics.
An information processing method comprising the above steps.
In the presenting step, an object corresponding to each piece of learning data is presented based on the result of the physics simulation calculation.
The information processing method according to (1) above.
The information processing method according to (2) above.
The information processing method according to (3) above.
The information processing method according to (4) above.
The information processing method according to either (3) or (4) above.
The information processing method according to (5) above.
The information processing method according to either (5) or (5-1) above.
In the presenting step, each object is moved and displayed on the screen of a display device based on the motion information calculated in the determining step.
The information processing method according to any one of (2) to (5) above.
The information processing method according to (6) above.
The information processing method according to (7) above.
The information processing method according to any one of (1) to (7) above.
a determination unit that determines the characteristics of each piece of learning data based on the inference results of the machine learning model for the learning data; and
a presentation unit that presents an evaluation result of the learning data based on the determined characteristics.
An information processing system comprising the above.
The presentation unit presents an object corresponding to each piece of learning data based on the result of the physics simulation calculation.
The information processing system according to (9) above.
The information processing system according to (10) above.
The information processing system according to (11) above.
The information processing system according to (12) above.
The information processing system according to either (11) or (12) above.
The information processing system according to (13) above.
The information processing system according to either (13) or (13-1) above.
The presentation unit moves and displays each object on the screen of a display device based on the motion information calculated by the determination unit.
The information processing system according to any one of (10) to (13) above.
The information processing system according to (14) above.
The information processing system according to (15) above.
The information processing system according to (15) above.
The presentation unit presents the detailed information including the judgment grounds calculated by the calculation unit.
The information processing system according to (16) above.
The information processing system according to any one of (9) to (16) above.
a second device including the presentation unit; and
the system including the above,
The information processing system according to any one of (9) to (17) above.
The information processing system according to (18) above.
The information processing system according to either (18) or (19) above.
103...Model parameter holding unit, 111...Inference unit
112...Data input processing unit, 120...Learning data evaluation unit
121...Physics simulation calculation unit, 122...Evaluation result presentation unit
123...Judgment ground calculation unit, 130...Learning data providing unit
1500...Information processing system, 1501...CPU, 1502...ROM
1503...RAM, 1504...Host bus, 1505...Bridge,
1506...Expansion bus, 1507...Interface unit
1508...Input device, 1509...Output device, 1510...Storage device
1511...Drive, 1512...Removable recording medium
1513...Communication device
Claims (20)
- An information processing method for performing processing relating to learning data used for training a machine learning model, the method comprising:
a determining step of determining the characteristics of each piece of learning data based on the inference results of the machine learning model for the learning data; and
a presenting step of presenting an evaluation result of the learning data based on the determined characteristics.
An information processing method comprising the above steps. - In the determining step, the physical properties of an object corresponding to each piece of learning data are determined based on the inference results of the machine learning model, and a physics simulation calculation is performed between the objects each having the determined physical properties, and
in the presenting step, an object corresponding to each piece of learning data is presented based on the result of the physics simulation calculation.
The information processing method according to claim 1. - In the determining step, the physical properties of the object corresponding to the learning data are determined based on an expected value for each label output by the machine learning model for the learning data.
The information processing method according to claim 2. - In the determining step, the mass, buoyancy, or size of the object corresponding to the learning data is determined based on the expected value of the correct label.
The information processing method according to claim 3. - In the determining step, at least one of an attractive force and a repulsive force acting between the objects corresponding to the respective pieces of learning data is determined based on the expected value of the correct label.
The information processing method according to claim 3. - In the determining step, motion information of each object is calculated by the physics simulation calculation based on the physical properties determined for the object corresponding to each piece of learning data, and
in the presenting step, each object is moved and displayed on the screen of a display device based on the motion information calculated in the determining step.
The information processing method according to claim 2. - The method further comprises an input step of inputting a user's operation on an object displayed on the screen of the display device.
The information processing method according to claim 6. - Each time the machine learning model is updated, the characteristics of each piece of learning data are determined in the determining step, and the evaluation result of the learning data is presented in the presenting step.
The information processing method according to claim 1. - An information processing system for performing processing relating to learning data used for training a machine learning model, the system comprising:
a determination unit that determines the characteristics of each piece of learning data based on the inference results of the machine learning model for the learning data; and
a presentation unit that presents an evaluation result of the learning data based on the determined characteristics.
An information processing system comprising the above. - The determination unit determines the physical properties of an object corresponding to each piece of learning data based on the inference results of the machine learning model, and performs a physics simulation calculation between the objects each having the determined physical properties, and
the presentation unit presents an object corresponding to each piece of learning data based on the result of the physics simulation calculation.
The information processing system according to claim 9. - The determination unit determines the physical properties of the object corresponding to the learning data based on an expected value for each label output by the machine learning model for the learning data.
The information processing system according to claim 10. - The determination unit determines the mass, buoyancy, or size of the object corresponding to the learning data based on the expected value of the correct label.
The information processing system according to claim 11. - The determination unit determines at least one of an attractive force and a repulsive force acting between the objects corresponding to the respective pieces of learning data based on the expected value of the correct label.
The information processing system according to claim 11. - The determination unit calculates motion information of each object by the physics simulation calculation based on the physical properties determined for the object corresponding to each piece of learning data, and
the presentation unit moves and displays each object on the screen of a display device based on the motion information calculated by the determination unit.
The information processing system according to claim 10. - The system further comprises an input unit for inputting a user's operation on an object displayed on the screen of the display device.
The information processing system according to claim 14. - The presentation unit further presents detailed information about the learning data corresponding to the object in response to a predetermined operation being performed on the object displayed on the screen through the input unit.
The information processing system according to claim 15. - Each time the machine learning model is updated, the determination unit determines the characteristics of each piece of learning data, and the presentation unit presents the evaluation result of the learning data.
The information processing system according to claim 9. - a first device including the determination unit; and
a second device including the presentation unit,
wherein the system includes the above,
The information processing system according to claim 9. - The second device includes a display device that displays on a screen the evaluation result of the learning data based on the determined characteristics, and an input unit for inputting a user's operation on the screen.
The information processing system according to claim 18. - The system further comprises a third device including a model update unit that updates the machine learning model by learning using the learning data.
The information processing system according to claim 18.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22845589.5A EP4375890A1 (en) | 2021-07-23 | 2022-02-01 | Information processing method and information processing system |
CN202280050196.8A CN117730332A (zh) | 2021-07-23 | 2022-02-01 | 信息处理方法和信息处理系统 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021121122 | 2021-07-23 | ||
JP2021-121122 | 2021-07-23 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023002648A1 true WO2023002648A1 (ja) | 2023-01-26 |
Family
ID=84979072
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/003897 WO2023002648A1 (ja) | 2021-07-23 | 2022-02-01 | 情報処理方法及び情報処理システム |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP4375890A1 (ja) |
CN (1) | CN117730332A (ja) |
WO (1) | WO2023002648A1 (ja) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2018194919A (ja) | 2017-05-12 | 2018-12-06 | 富士通株式会社 | 学習プログラム、学習方法及び学習装置 |
JP2020140226A (ja) * | 2019-02-26 | 2020-09-03 | 三菱Ufj信託銀行株式会社 | 汎用人工知能装置及び汎用人工知能プログラム |
JP2020197875A (ja) | 2019-05-31 | 2020-12-10 | 富士通株式会社 | 解析プログラム、解析装置及び解析方法 |
JP2021060692A (ja) * | 2019-10-03 | 2021-04-15 | 株式会社東芝 | 推論結果評価システム、推論結果評価装置及びその方法 |
2022
- 2022-02-01: CN application CN202280050196.8 filed (status: pending)
- 2022-02-01: WO application PCT/JP2022/003897 filed (application filing)
- 2022-02-01: EP application EP22845589.5 filed (status: pending)
Non-Patent Citations (4)
Title |
---|
"Why Should I Trust You?", EXPLAINING THE PREDICTIONS OF ANY CLASSIFIER, Retrieved from the Internet <URL:https://arxiv.org/abs/1602.04938> |
GRAD-CAM: VISUAL EXPLANATIONS FROM DEEP NETWORKS VIA GRADIENT-BASED LOCALIZATION, Retrieved from the Internet <URL:https://arxiv.org/abs/1610.02391> |
GUMELAR AGUSTINUS BIMO: "An Anatomy of Machine Learning Data Visualization", 2019 INTERNATIONAL SEMINAR ON APPLICATION FOR TECHNOLOGY OF INFORMATION AND COMMUNICATION (ISEMANTIC), 21 September 2019 (2019-09-21), pages 1 - 6, XP033651025, DOI: 10.1109/ISEMANTIC.2019.8884340 * |
INTERPRETABILITY BEYOND FEATURE ATTRIBUTION: QUANTITATIVE TESTING WITH CONCEPT ACTIVATION VECTORS (TCAV), Retrieved from the Internet <URL:https://arxiv.org/pdf/1711.11279.pdf> |
Also Published As
Publication number | Publication date |
---|---|
EP4375890A1 (en) | 2024-05-29 |
CN117730332A (zh) | 2024-03-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22845589 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 202280050196.8 Country of ref document: CN |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2022845589 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2022845589 Country of ref document: EP Effective date: 20240223 |