WO2022195691A1 - Information processing device, information processing method and information processing program - Google Patents
Information processing device, information processing method and information processing program
- Publication number
- WO2022195691A1 (PCT/JP2021/010452)
- Authority
- WO
- WIPO (PCT)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/0895—Weakly supervised learning, e.g. semi-supervised or self-supervised learning
Definitions
- The present invention relates to an information processing device, an information processing method, and an information processing program.
- One is a technique for acquiring feature extraction ability, which is the ability to extract the image features used for recognition from unlabeled data. Specifically, features are extracted from unlabeled data using a deep learning model, the data are grouped based on the extracted features and divided into multiple clusters, labels are assigned to the clusters, and learning is performed to acquire the feature extraction ability.
- The other is a technique that gives a deep learning model a feature extraction ability acquired in advance and then performs learning with labeled data, restricted to the ability to identify data based on the extracted features.
- This technique is called transfer learning.
- In these techniques, the feature extraction ability and the discrimination ability of the deep learning model are learned and optimized separately. That is, learning with unlabeled data optimizes the feature extraction ability, while learning restricted to identification based on the extracted features optimizes the discrimination ability. When these steps are performed in sequence, it becomes difficult to tune the feature extraction ability to match the discrimination ability, and training falls into a local optimum. As a result, the overall recognition performance of the deep learning model is low.
- The disclosed technology has been made in view of the above, and aims to provide an information processing device, an information processing method, and an information processing program that improve the recognition performance of a deep learning model.
- The storage unit stores a plurality of labeled data in which a label representing a correct answer is associated with target data, a plurality of unlabeled data, which are target data without labels, and a deep learning model.
- A pseudo label generation unit generates pseudo labels based on the unlabeled data and the deep learning model.
- The loss calculation unit calculates the loss incurred when the deep learning model identifies the unlabeled data and the labeled data, based on the pseudo labels and the labels included in the labeled data.
- The updating unit updates the deep learning model based on the loss calculated by the loss calculation unit.
- According to the information processing device, information processing method, and information processing program disclosed in the present application, it is possible to improve the recognition performance of the deep learning model.
- FIG. 1 is a block diagram of a learning device according to the first embodiment.
- FIG. 2 is a diagram for explaining the learning method according to the first embodiment.
- FIG. 3 is an overall flowchart of learning processing according to the first embodiment.
- FIG. 4 is a flow chart of simultaneous learning using labeled data and unlabeled data.
- FIG. 5 is a block diagram of a learning device according to the second embodiment.
- FIG. 6 is a diagram for explaining the learning method according to the second embodiment.
- FIG. 7 is a diagram showing an example of learning data used in the third embodiment.
- FIG. 8 is a diagram illustrating an example of a hardware configuration of a learning device.
- FIG. 1 is a block diagram of the learning device according to the first embodiment.
- A learning device 1, which is an information processing device according to the present embodiment, learns a deep learning model 110 that recognizes image data.
- Image data is, specifically, data represented as a set of RGB (Red Green Blue) values of each pixel displayed on the screen.
- The learning device 1 includes a storage unit 11, a pseudo label generation unit 12, a model output unit 13, a loss calculation unit 14, and an update unit 15, as shown in FIG. 1.
- The storage unit 11 stores a deep learning model 110, an unlabeled DB (Data Base) 111, and a labeled DB 112.
- The deep learning model 110 is a learning model that performs image recognition in this embodiment.
- The deep learning model 110 has a feature amount extraction layer for extracting features of image data and an identification layer for identifying an object appearing in the image data from the feature amounts.
- The unlabeled DB 111 is a database that stores unlabeled data 201, which are image data.
- The unlabeled DB 111 stores unlabeled data 201 input by a user using an external terminal device or the like.
- The unlabeled data 201 are learning data that do not have a correct label indicating what the object in the image data is.
- The labeled DB 112 is a database that stores labeled data 202, which are image data.
- The labeled DB 112 stores labeled data 202 input by a user using an external terminal device or the like.
- The labeled data 202 are learning data with correct labels.
- The pseudo label generation unit 12 acquires the deep learning model 110 stored in the storage unit 11. The pseudo label generation unit 12 also reads a plurality of unlabeled data 201 from the unlabeled DB 111. At this time, the pseudo label generation unit 12 preferably reads all of the unlabeled data 201. Next, the pseudo label generation unit 12 inputs each unlabeled data 201 included in the read image group to the deep learning model 110 and acquires the output corresponding to each unlabeled data 201.
- The pseudo label generation unit 12 groups the unlabeled data 201 included in the read image group according to the output values from the deep learning model 110 and divides them into a predetermined number of clusters. For example, the pseudo label generation unit 12 performs clustering using k-means clustering.
- The pseudo label generation unit 12 assigns a pseudo label, a pseudo correct answer, to each cluster. For example, if there are k classes, the pseudo label generation unit 12 assigns pseudo labels such as class #1, class #2, class #3, ..., class #k. After that, the pseudo label generation unit 12 outputs the information of the unlabeled data 201 included in each cluster and the pseudo label assigned to each cluster to the loss calculation unit 14.
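The clustering and pseudo-labeling steps above can be sketched in a few lines of numpy; the `kmeans` helper, the toy feature vectors, and the cluster count are illustrative assumptions, not the patent's implementation (which only requires some clustering method, k-means being one example).

```python
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Plain k-means: returns one cluster index per sample."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Assign each sample to its nearest center.
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned samples.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = features[labels == j].mean(axis=0)
    return labels

# Two well-separated groups of "features" stand in for model outputs.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9]])
pseudo_labels = kmeans(feats, k=2)
```

Each cluster index returned here plays the role of a pseudo label such as class #1, ..., class #k.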
- The model output unit 13 acquires the output of the deep learning model 110 for the unlabeled data 201 and the labeled data 202, respectively.
- The model output unit 13 has a first model output unit 131 and a second model output unit 132.
- The loss calculation unit 14 compares the output values from the deep learning model 110 with the pseudo labels or the labels given to the labeled data 202, and calculates each loss.
- The loss calculation unit 14 has a first loss calculation unit 141 and a second loss calculation unit 142. Details of the operations of the model output unit 13 and the loss calculation unit 14 are described below.
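The split between the two model output units can be pictured as one shared feature extraction layer f feeding two separate discrimination layers h_unsup and h_sup. The sketch below stands in single linear layers for the deep learning model; every shape and weight is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Shared feature extraction layer f and two separate identification layers.
W_f = rng.normal(size=(8, 4))      # feature extraction layer f (one linear layer here)
W_unsup = rng.normal(size=(4, 3))  # discrimination layer for unlabeled data (3 clusters)
W_sup = rng.normal(size=(4, 5))    # discrimination layer for labeled data (5 classes)

def f(x):
    return np.tanh(x @ W_f)

def h_unsup(feat):
    return softmax(feat @ W_unsup)  # y_u = h_unsup(f(x_u))

def h_sup(feat):
    return softmax(feat @ W_sup)    # y_i = h_sup(f(x_i))

x = rng.normal(size=(2, 8))         # two dummy input vectors
y_u = h_unsup(f(x))
y_i = h_sup(f(x))
```

Both heads consume the same features f(x), which is what later allows their losses to be minimized jointly.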
- The first model output unit 131 acquires the deep learning model 110 stored in the storage unit 11.
- The first model output unit 131 also reads, from the unlabeled DB 111, a plurality of unlabeled data 201 used for learning the deep learning model 110.
- The first model output unit 131 inputs each unlabeled data 201 included in the read image group to the feature amount extraction layer of the deep learning model 110, and obtains the output of the deep learning model 110 for each unlabeled data 201. For example, when the read image group is D_u and unlabeled data 201 included in D_u is x_u, the first model output unit 131 obtains the output y_u of the deep learning model 110 using the following formula (1): y_u = h_unsup(f(x_u)) ... (1)
- Here, f represents the feature amount extraction layer of the deep learning model 110. That is, f(x_u) represents the output from the feature amount extraction layer.
- h_unsup represents the discrimination layer for unlabeled data of the deep learning model 110. That is, h_unsup(f(x_u)) is the output obtained by inputting the output from the feature amount extraction layer to the discrimination layer.
- The first model output unit 131 outputs the output value of the deep learning model 110 for each unlabeled data 201 to the first loss calculation unit 141 of the loss calculation unit 14.
- That is, the first model output unit 131 outputs y_u, the output of the deep learning model 110, to the first loss calculation unit 141.
- The first loss calculation unit 141 calculates the loss when the unlabeled data 201 are used. In the following, the loss may be referred to as Loss.
- The first loss calculation unit 141 receives from the first model output unit 131 the output values of the deep learning model 110 for the unlabeled data 201. The first loss calculation unit 141 also receives from the pseudo label generation unit 12 the pseudo label of each cluster created by clustering the unlabeled data 201, together with the information of the unlabeled data 201 included in each cluster.
- The first loss calculation unit 141 compares the acquired output values with the pseudo labels and calculates the Loss when the unlabeled data 201 are used, which is the error between the estimation result of the deep learning model 110 and the pseudo label treated here as the correct answer.
- For example, the first loss calculation unit 141 calculates L_unsup, the Loss when the unlabeled data 201 are used, from the acquired output value y_u using the following formula (2): L_unsup = CE(y_u, t_u) ... (2)
- Here, CE represents the general cross-entropy loss, and t_u denotes the pseudo label assigned to x_u.
- The first loss calculation unit 141 outputs the calculated Loss when the unlabeled data 201 are used to the update unit 15.
- That is, the first loss calculation unit 141 outputs the calculated L_unsup to the update unit 15.
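A minimal sketch of the Loss computation in the first loss calculation unit, assuming the model outputs are probability vectors and the pseudo labels are integer cluster indices; the concrete numbers are invented for illustration.

```python
import numpy as np

def cross_entropy(probs, labels):
    """Mean cross-entropy CE between predicted probabilities and integer labels."""
    n = len(labels)
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()

# Model outputs y_u for three unlabeled samples (each row sums to 1)
# and the pseudo labels assigned to their clusters.
y_u = np.array([[0.7, 0.2, 0.1],
                [0.1, 0.8, 0.1],
                [0.3, 0.3, 0.4]])
pseudo = np.array([0, 1, 2])
L_unsup = cross_entropy(y_u, pseudo)
```

The closer each output row is to a one-hot vector on its pseudo label, the smaller L_unsup becomes.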
- The second model output unit 132 acquires the deep learning model 110 stored in the storage unit 11.
- The second model output unit 132 also reads, from the labeled DB 112, the labeled data 202 used for learning the deep learning model 110.
- The second model output unit 132 inputs each labeled data 202 included in the read image group to the feature amount extraction layer of the deep learning model 110, and obtains the output of the deep learning model 110 for each labeled data 202. For example, when the read image group is D_i and labeled data 202 included in D_i is x_i, the second model output unit 132 obtains the output y_i of the deep learning model 110 using the following formula (3): y_i = h_sup(f(x_i)) ... (3)
- Here, f represents the feature amount extraction layer of the deep learning model 110. That is, f(x_i) represents the output from the feature amount extraction layer.
- h_sup represents the discrimination layer for labeled data of the deep learning model 110. That is, h_sup(f(x_i)) is the output obtained by inputting the output from the feature amount extraction layer to the discrimination layer.
- The discrimination layer for the unlabeled data 201 and the discrimination layer for the labeled data 202 are learned individually.
- The second model output unit 132 outputs the output value of the deep learning model 110 for each labeled data 202 to the second loss calculation unit 142.
- That is, the second model output unit 132 outputs y_i, the output of the deep learning model 110, to the second loss calculation unit 142.
- The second loss calculation unit 142 receives from the second model output unit 132 the output values of the deep learning model 110 for the labeled data 202. The second loss calculation unit 142 also acquires from the labeled DB 112 the label assigned to each labeled data 202 read by the model output unit 13.
- The second loss calculation unit 142 compares the acquired output values with the label assigned to each labeled data 202, and calculates the Loss when the labeled data 202 are used, which is the error between the estimation result of the deep learning model 110 and the correct label. For example, the second loss calculation unit 142 calculates L_sup, the Loss when the labeled data 202 are used, from the acquired output value y_i using the following formula (4): L_sup = CE(y_i, t_i) ... (4)
- Here, CE represents the general cross-entropy loss, and t_i denotes the label assigned to x_i.
- The second loss calculation unit 142 outputs the calculated Loss when the labeled data 202 are used to the update unit 15.
- That is, the second loss calculation unit 142 outputs the calculated L_sup to the update unit 15.
- The update unit 15 receives from the first loss calculation unit 141 the Loss when the unlabeled data 201 are used.
- The update unit 15 also receives from the second loss calculation unit 142 the Loss when the labeled data 202 are used.
- The update unit 15 applies predetermined weights to the Loss when the unlabeled data 201 are used and the Loss when the labeled data 202 are used, and calculates the final Loss.
- For example, the update unit 15 calculates the final overall Loss L_total using the following formula (5): L_total = α·L_sup + (1 − α)·L_unsup ... (5)
- Here, α is a parameter for adjusting the balance between L_sup and L_unsup, a constant that weights each of them. α takes a value greater than 0 and less than 1. The larger the value of α, the greater the influence on learning of the estimation result obtained using the labeled data 202.
- The update unit 15 then obtains the parameters of the feature amount extraction layer of the deep learning model 110, the parameters of the discrimination layer for the unlabeled data 201, and the parameters of the discrimination layer for the labeled data 202. The update unit 15 updates the deep learning model 110 held by the model output unit 13 with the obtained parameters of the feature amount extraction layer and of the discrimination layer for the unlabeled data 201, and likewise with the obtained parameters of the feature amount extraction layer and of the discrimination layer for the labeled data 202. For example, the update unit 15 updates the deep learning model 110 held by the model output unit 13 with f, h_sup, and h_unsup that minimize L_total.
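The final weighting step is a one-line computation; this sketch assumes the α-weighted form of formula (5) described above, with made-up loss values.

```python
def total_loss(L_sup, L_unsup, alpha):
    """Formula (5): weighted sum of the two losses, with 0 < alpha < 1."""
    assert 0.0 < alpha < 1.0
    return alpha * L_sup + (1.0 - alpha) * L_unsup

# With alpha closer to 1, the labeled-data loss dominates the update signal.
L = total_loss(L_sup=0.2, L_unsup=0.8, alpha=0.75)
```

Minimizing this single scalar couples the two discrimination layers through the shared feature extraction layer, which is what lets the feature extraction ability be tuned to match the discrimination ability.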
- In this way, the deep learning model 110 for the unlabeled data 201 and the deep learning model 110 for the labeled data 202 are trained individually and simultaneously.
- The deep learning model 110 for the unlabeled data 201 and the deep learning model 110 for the labeled data 202 share the same feature extraction layer and have different identification layers. In the recognition phase after learning, unknown image data is recognized using the trained deep learning model 110 for the labeled data 202 held by the model output unit 13.
- FIG. 2 is a diagram for explaining the learning method according to the first embodiment. Next, the overall flow of learning in this embodiment will be described with reference to FIG. 2.
- A plurality of unlabeled data 201 and a plurality of labeled data 202 are prepared and stored in the unlabeled DB 111 and the labeled DB 112, respectively.
- The unlabeled data 201 are not given correct answers, while the labeled data 202 are given labels such as flower, car, and fish.
- The first model output unit 131 and the second model output unit 132 perform feature extraction on the unlabeled data 201 and the labeled data 202, respectively, using the feature extraction layer of the deep learning model 110 (step S1).
- The pseudo label generation unit 12 performs classification by clustering and assigns pseudo labels.
- The first loss calculation unit 141, the second loss calculation unit 142, and the update unit 15 simultaneously perform learning on the unlabeled data 201 using the pseudo labels and learning on the labeled data 202 using the labels (steps S2 and S3).
- That is, the feature extraction layer of the deep learning model 110, the discrimination layer for the unlabeled data 201, and the discrimination layer for the labeled data 202 are learned simultaneously.
- FIG. 3 is an overall flowchart of the learning process according to the first embodiment. Next, the overall flow of the learning process according to the first embodiment will be described with reference to FIG. 3.
- The learning device 1 acquires the unlabeled data 201 and stores them in the unlabeled DB 111. The learning device 1 also acquires the labeled data 202 and stores them in the labeled DB 112 (step S11).
- The update unit 15 acquires a count threshold input from an external terminal device or the like (step S12).
- The update unit 15 initializes the learning count to 0 (step S13).
- The pseudo label generation unit 12 reads a plurality of unlabeled data 201 from the unlabeled DB 111, classifies them, and generates and assigns a pseudo label for each class (step S14).
- The first model output unit 131 and the second model output unit 132, the first loss calculation unit 141 and the second loss calculation unit 142, and the update unit 15 execute simultaneous learning using the labeled data 202 and the unlabeled data 201 (step S15).
- The update unit 15 determines whether the learning count exceeds the count threshold (step S16). When the learning count is equal to or less than the count threshold (step S16: No), the update unit 15 increments the learning count by 1 (step S17), and the learning process returns to step S14.
- On the other hand, when the learning count exceeds the count threshold (step S16: Yes), the update unit 15 terminates the learning process in the learning device 1.
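The flow of FIG. 3 can be sketched as a simple loop; `run_epoch` is a hypothetical stand-in for steps S14-S15, and the exact off-by-one behavior of the count check is an assumption based on the description above.

```python
def train(count_threshold, run_epoch):
    """Overall flow of FIG. 3: repeat pseudo-labeling and simultaneous
    learning until the learning count exceeds the threshold."""
    count = 0                        # step S13: initialize the learning count
    while count <= count_threshold:  # step S16: continue while count <= threshold
        run_epoch()                  # steps S14-S15: pseudo-labeling + simultaneous learning
        count += 1                   # step S17: increment the learning count
    return count

calls = []
train(count_threshold=3, run_epoch=lambda: calls.append(1))
```

Note that pseudo labels are regenerated inside the loop (step S14), so the clusters track the improving feature extraction layer across iterations.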
- FIG. 4 is a flowchart of the simultaneous learning using labeled data and unlabeled data. Next, the flow of the simultaneous learning using labeled data and unlabeled data will be described with reference to FIG. 4. Each process shown in FIG. 4 corresponds to an example of the process executed in step S15 of FIG. 3.
- The second model output unit 132 reads a plurality of labeled data 202 from the labeled DB 112, inputs the read labeled data 202 to the feature amount extraction layer of the deep learning model 110, and then acquires the output from the deep learning model 110 (step S101).
- The second loss calculation unit 142 acquires from the labeled DB 112 the label given to each labeled data 202 read by the second model output unit 132. The second loss calculation unit 142 then compares the output value corresponding to each labeled data 202, acquired from the second model output unit 132, with the label given to that labeled data 202, and calculates the Loss when the labeled data 202 are used (step S102).
- The first model output unit 131 reads a plurality of unlabeled data 201 from the unlabeled DB 111, inputs each read unlabeled data 201 to the feature amount extraction layer of the deep learning model 110, and then acquires the output from the deep learning model 110 (step S103).
- The first loss calculation unit 141 compares the output value corresponding to each unlabeled data 201, acquired from the first model output unit 131, with the pseudo label acquired from the pseudo label generation unit 12, and calculates the Loss when the unlabeled data 201 are used (step S104).
- The update unit 15 acquires the Loss when the labeled data 202 are used from the second loss calculation unit 142, and the Loss when the unlabeled data 201 are used from the first loss calculation unit 141. The update unit 15 then calculates the overall Loss by weighting the Loss when the labeled data 202 are used and the Loss when the unlabeled data 201 are used (step S105).
- The update unit 15 updates the deep learning models 110 of the first model output unit 131 and the second model output unit 132 so as to minimize the overall Loss (step S106).
- As described above, the learning device according to the first embodiment divides unlabeled data into a plurality of clusters, assigns a pseudo label to each cluster, and trains the deep learning model using the labeled data, the unlabeled data, and the pseudo labels.
- The learning device can thus learn the feature amount extraction layer and the identification layer of the deep learning model simultaneously, using both labeled data and unlabeled data. Therefore, even when learning is performed with a large amount of unlabeled data and a small amount of labeled data, optimal recognition performance can be obtained, and the recognition performance of the deep learning model can be improved.
- FIG. 5 is a block diagram of a learning device according to the second embodiment.
- The learning device 1 according to the present embodiment differs from the first embodiment in that it performs single-task learning using one discrimination layer.
- Descriptions of the units with the same functions as in the first embodiment are omitted.
- The pseudo label generation unit 12 performs clustering on the unlabeled data 201 in the same manner as in the first embodiment and divides the unlabeled data 201 into a plurality of clusters. At this time, the pseudo label generation unit 12 divides the unlabeled data 201 into the same number of clusters as the number of labels represented by the labeled data 202. The pseudo label generation unit 12 then assigns a pseudo label to each cluster and outputs the generated pseudo labels to the loss calculation unit 14.
- The model output unit 13 reads a plurality of unlabeled data 201 from the unlabeled DB 111.
- The model output unit 13 also reads a plurality of labeled data 202 from the labeled DB 112.
- The model output unit 13 integrates the read unlabeled data 201 and labeled data 202 into integrated data.
- The model output unit 13 inputs the integrated data to the deep learning model 110 to obtain the output.
- For example, the model output unit 13 acquires the output y of the deep learning model 110 represented by the following formula (6): y = h(f(x)) ... (6)
- Here, f represents the feature amount extraction layer of the deep learning model 110. That is, f(x) is the output from the feature amount extraction layer.
- h represents the discrimination layer of the deep learning model 110. That is, h(f(x)) is the output obtained by inputting the output value from the feature amount extraction layer to the discrimination layer.
- The model output unit 13 outputs the output value for each integrated data to the loss calculation unit 14.
- The loss calculation unit 14 receives from the model output unit 13 the output value of the deep learning model 110 for each integrated data. The loss calculation unit 14 also acquires from the labeled DB 112 the label representing each labeled data 202 stored therein, and receives the pseudo label of each class from the pseudo label generation unit 12.
- The loss calculation unit 14 generates integrated labels by integrating the labels acquired from the labeled DB 112 and the pseudo labels. For example, since the number of labels acquired from the labeled DB 112 and the number of pseudo labels are the same, the loss calculation unit 14 replaces each pseudo label with the label determined to indicate the same class, thereby generating the integrated labels.
- The loss calculation unit 14 compares the output value of the deep learning model 110 for each integrated data with the integrated label corresponding to that integrated data, and calculates the Loss when the integrated data are used.
- For example, the loss calculation unit 14 calculates L, the Loss when the integrated data are used, using the following formula (7): L = CE(y, t) ... (7)
- Here, CE is the general cross-entropy loss, and t denotes the integrated label corresponding to the integrated data x.
- The loss calculation unit 14 outputs the calculated Loss to the update unit 15.
- That is, the loss calculation unit 14 outputs to the update unit 15 the L calculated using formula (7).
- The update unit 15 receives the Loss input from the loss calculation unit 14.
- The update unit 15 determines the parameters of the deep learning model 110 that minimize the Loss, and then updates the deep learning model 110 of the model output unit 13 using the determined parameters.
- For example, when the update unit 15 acquires L, the Loss when the integrated data are used, from the loss calculation unit 14, the update unit 15 updates the feature amount extraction layer f and the identification layer h so as to minimize L. That is, in this embodiment, both the unlabeled data 201 and the labeled data 202 are trained with one deep learning model 110 having a common feature extraction layer and a common identification layer.
- FIG. 6 is a diagram for explaining the learning method according to the second embodiment. Details of the learning device 1 according to the present embodiment will be described with reference to FIG. 6.
- The model output unit 13 reads the unlabeled data 201 and the labeled data 202 and generates the integrated data. Next, the model output unit 13 inputs the integrated data to the deep learning model 110 and acquires the output of the deep learning model 110 for each integrated data (step S201).
- The pseudo label generation unit 12 performs clustering on the unlabeled data 201 in the same manner as in the first embodiment and divides the unlabeled data 201 into the same number of clusters as the labels representing the labeled data 202 stored in the labeled DB 112. The pseudo label generation unit 12 then assigns a pseudo label to each cluster (step S202).
- The loss calculation unit 14 integrates the pseudo labels and the labels representing the labeled data 202 stored in the labeled DB 112 to generate the integrated labels. The loss calculation unit 14 then compares the output value corresponding to each integrated data with the integrated label to calculate the Loss.
- The update unit 15 performs learning by updating the feature amount extraction layer and the identification layer of the deep learning model 110 of the model output unit 13 so as to minimize the Loss calculated by the loss calculation unit 14 (step S203).
- As described above, the learning device according to the second embodiment divides unlabeled data into the same number of clusters as the number of labels representing the labeled data. The learning device then generates integrated data by integrating the labeled data and the pseudo-labeled unlabeled data, generates integrated labels by integrating the labels of the labeled data and the pseudo labels, and performs learning using the integrated data and the integrated labels.
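A compact sketch of the second embodiment's single-loss training target: labeled data and pseudo-labeled unlabeled data are concatenated and scored with one cross-entropy loss, as in formula (7). The toy inputs, the identity-weight "model", and the pseudo-label mapping are all illustrative assumptions.

```python
import numpy as np

def cross_entropy(probs, labels):
    n = len(labels)
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Labeled data with class labels, and unlabeled data whose pseudo labels have
# been mapped onto the same label set (the "integrated label").
x_labeled = np.array([[1.0, 0.0], [0.0, 1.0]])
t_labeled = np.array([0, 1])
x_unlabeled = np.array([[0.9, 0.1]])
pseudo_mapped = np.array([0])   # pseudo label replaced by its matching class label

# Integrate data and labels, then compute one loss over the union.
x_all = np.concatenate([x_labeled, x_unlabeled])
t_all = np.concatenate([t_labeled, pseudo_mapped])

W = np.eye(2)                   # single model h(f(x)) collapsed to one linear map
y_all = softmax(x_all @ W)
L = cross_entropy(y_all, t_all)
```

Because there is only one discrimination layer, minimizing L updates the same parameters for both kinds of data, which is the single-task variant described above.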
- Next, a third embodiment will be described.
- In the first and second embodiments, the case of using image data as learning data has been described as an example, but learning using unlabeled data and labeled data can be performed in the same way for other types of data.
- For example, the learning device 1 can also learn the deep learning model 110 using moving images as learning data.
- A moving image is a time series of the sets of RGB values of each pixel on the screen. In that case, using the trained deep learning model 110 makes it possible to identify the type of an unknown moving image.
- FIG. 7 is a diagram showing an example of learning data used in the third embodiment.
- Joint data are data representing the spatial positions of joints of the human body, such as wrists and elbows, as represented by the points in the image 300 of FIG. 7. For example, in a three-dimensional space they are represented by xyz coordinates, and on a two-dimensional plane by xy coordinates.
- Sensor data, such as acceleration information at each point when the person moves and information from a gyro sensor, may also be added. In that case, using the trained deep learning model 110 makes it possible to identify what kind of motion a person is making.
- As described above, the learning device can learn a deep learning model using data other than image data, such as moving image data and joint data. Even when data other than image data are used, optimal recognition performance can be obtained by learning with a large amount of unlabeled data and a small amount of labeled data, and the recognition performance of the deep learning model can be improved.
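As a sketch of how joint data could enter the same pipeline, one plausible minimal representation is a mapping from joint names to xyz coordinates, flattened into a single feature vector; the joint names and coordinate values here are hypothetical.

```python
import numpy as np

# Hypothetical joint data for one pose sample: joint name -> xyz coordinates.
joints = {
    "wrist_l": (0.12, 0.80, 0.30),
    "elbow_l": (0.10, 1.05, 0.28),
    "wrist_r": (0.55, 0.82, 0.31),
}

# Flatten the coordinates into one feature vector so the same learning
# pipeline used for image data can consume it.
x = np.array([c for pos in joints.values() for c in pos])
```

Additional channels such as per-joint acceleration or gyro readings could simply be appended to the same vector before it is fed to the feature extraction layer.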
- FIG. 8 is a diagram illustrating an example of the hardware configuration of the learning device.
- The learning device 1 shown in FIGS. 1 and 5 is implemented by the computer 90 shown in FIG. 8.
- The computer 90 is a server.
- The computer 90 has a processor 901, a main storage device 902, an auxiliary storage device 903, an input device 904, an output device 905, a media drive device 906, an input/output interface 907, and a communication control device 908.
- The components of the computer 90 are connected to each other by a bus 909.
- The processor 901 is, for example, a CPU (Central Processing Unit).
- The computer 90 may have multiple processors 901.
- The computer 90 may also have a GPU (Graphics Processing Unit) or the like as the processor 901.
- The processor 901 loads a program into the main storage device 902 and executes the program.
- The main storage device 902 is, for example, a RAM (Random Access Memory).
- The auxiliary storage device 903 is a non-volatile storage device such as an HDD (Hard Disk Drive) or an SSD (Solid-State Drive).
- The auxiliary storage device 903 implements the functions of the storage unit 11 in FIGS. 1 and 5.
- The input device 904 is, for example, a keyboard, a pointing device, or a combination thereof.
- The pointing device may be, for example, a mouse, a touch pad, or a touch screen.
- The output device 905 is a display, speakers, or a combination thereof.
- The display may be a touch screen.
- The input/output interface 907 is connected to a PCIe (Peripheral Component Interconnect express) device or the like, and transmits and receives data to and from the connected device.
- the communication control device 908 is, for example, a wired LAN (Local Area Network) interface, a wireless LAN interface, or a combination thereof.
- the computer 90 is connected to a network such as a wireless LAN or wired LAN via a communication control device 908 .
- the communication control device 908 may be an external NIC (Network Interface Card) or an onboard network interface controller.
- the storage medium 91 is an optical disc such as a CD (Compact Disc) or a DVD (Digital Versatile Disk), a magneto-optical disc, a magnetic disc, a semiconductor memory card such as a flash memory, or the like.
- the media drive device 906 is a device that writes data to and reads data from the inserted storage medium 91 .
- the program executed by the processor 901 may be installed in the auxiliary storage device 903 in advance.
- the program may be provided by being stored in the storage medium 91, read from the storage medium 91 by the media drive device 906, copied to the auxiliary storage device 903, and then loaded into the main storage device 902.
- the program may be downloaded to the computer 90 via the network and communication control device 908 from a program provider on the network and installed.
- the processor 901 implements the functions of the pseudo-label generation unit 12, the model output unit 13, the loss calculation unit 14, and the update unit 15 illustrated in FIGS. 1 and 5 by executing programs.
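As a rough sketch only, the division of labor among the units that the processor 901 implements could be mirrored in code as follows; the class and method names are illustrative assumptions, not the patent's API:

```python
import numpy as np

class PseudoLabelGenerator:
    """Corresponds to the pseudo-label generation unit 12."""
    def generate(self, model, x_unlabeled):
        # Use the current model's most confident class as the pseudo-label.
        return model(x_unlabeled).argmax(axis=1)

class ModelOutput:
    """Corresponds to the model output unit 13."""
    def forward(self, model, x):
        return model(x)

class LossCalculator:
    """Corresponds to the loss calculation unit 14."""
    def cross_entropy(self, logits, labels):
        e = np.exp(logits - logits.max(axis=1, keepdims=True))
        p = e / e.sum(axis=1, keepdims=True)
        return -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()

class Updater:
    """Corresponds to the update unit 15."""
    def apply(self, params, grads, lr=0.1):
        return {k: v - lr * grads[k] for k, v in params.items()}

# Minimal wiring: a linear "model" and one pass through the units.
weights = np.array([[1.0, -1.0], [0.5, 2.0]])
model = lambda x: x @ weights
x_u = np.array([[1.0, 0.0], [0.0, 1.0]])
pseudo = PseudoLabelGenerator().generate(model, x_u)
loss = LossCalculator().cross_entropy(ModelOutput().forward(model, x_u), pseudo)
```

In this toy wiring the pseudo-labels feed back into the loss calculation, which is the data flow the units above describe.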
Description
FIG. 8 is a diagram illustrating an example of the hardware configuration of the learning device. The learning device 1 shown in FIGS. 1 and 5 is implemented by the computer 90 of FIG. 8. For example, the computer 90 is a server.
11 storage unit
12 pseudo-label generation unit
13 model output unit
14 loss calculation unit
15 update unit
131 first model output unit
132 second model output unit
141 first loss calculation unit
142 second loss calculation unit
Claims (7)
- An information processing device comprising: a storage unit that stores a plurality of labeled data in which a label representing a correct answer is associated with target data, a plurality of unlabeled data that are target data with no associated correct answer, and a deep learning model; a pseudo-label generation unit that generates pseudo-labels based on the unlabeled data and the deep learning model; a loss calculation unit that calculates, based on the pseudo-labels and the labels included in the labeled data, losses for the case where the unlabeled data are classified using the deep learning model and the case where the labeled data are classified; and an update unit that updates the deep learning model based on the losses calculated by the loss calculation unit.
- The information processing device according to claim 1, wherein the deep learning model has a feature extraction layer and an identification layer, the device further comprising: a first model output unit that acquires the deep learning model, holds it as a first deep learning model, and inputs the unlabeled data into the first deep learning model to obtain a first output value; and a second model output unit that acquires the deep learning model, holds it as a second deep learning model, and inputs the labeled data into the second deep learning model to obtain a second output value, wherein the loss calculation unit has a first loss calculation unit that calculates a first loss using the first output value obtained by the first model output unit and the pseudo-labels, and a second loss calculation unit that calculates a second loss using the second output value obtained by the second model output unit and the labels, and wherein the update unit performs, based on both the first loss and the second loss, a first update on the first deep learning model and a second update on the second deep learning model.
- The information processing device according to claim 2, wherein, as the first update and the second update, the update unit applies the same update to the respective feature extraction layers included in the first deep learning model and the second deep learning model, and applies different updates to a first identification layer included in the first deep learning model and a second identification layer included in the second deep learning model.
- The information processing device according to claim 1, wherein the pseudo-label generation unit classifies the plurality of unlabeled data into a predetermined number of clusters based on output values obtained by inputting them into the deep learning model, and assigns a pseudo-label to each cluster.
- The information processing device according to claim 1, further comprising a model output unit that integrates the unlabeled data and the labeled data to create integrated data, and inputs the integrated data into the deep learning model to obtain output values, wherein the loss calculation unit integrates the labels included in the labeled data and the pseudo-labels to generate integrated labels, and calculates the loss based on the output values obtained by the model output unit and the integrated labels.
- An information processing method for performing learning using a plurality of labeled data in which a label representing a correct answer is associated with target data, a plurality of unlabeled data that are target data with no associated correct answer, and a deep learning model, the method comprising: generating pseudo-labels based on the unlabeled data and the deep learning model; calculating, based on the pseudo-labels and the labels included in the labeled data, losses for the case where the unlabeled data are classified using the deep learning model and the case where the labeled data are classified; and updating the deep learning model based on the calculated losses.
- An information processing program for causing a computer to execute learning using a plurality of labeled data in which a label representing a correct answer is associated with target data, a plurality of unlabeled data that are target data with no associated correct answer, and a deep learning model, the program causing the computer to execute a process comprising: generating pseudo-labels based on the unlabeled data and the deep learning model; calculating, based on the pseudo-labels and the labels included in the labeled data, losses for the case where the unlabeled data are classified using the deep learning model and the case where the labeled data are classified; and updating the deep learning model based on the calculated losses.
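One way to realize the cluster-based pseudo-label assignment of claim 4 is plain k-means over the model's output values, with each cluster index serving as a pseudo-label. The sketch below is an assumption-laden illustration: the function name, toy features, and fixed iteration count are not from the patent.

```python
import numpy as np

def cluster_pseudo_labels(features, k, iters=20, seed=0):
    """Group model output values into k clusters; the cluster index is the pseudo-label."""
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct samples.
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Assign each sample to its nearest center; that index is its pseudo-label.
        d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Move each center to the mean of its cluster (skip empty clusters).
        for j in range(k):
            if (labels == j).any():
                centers[j] = features[labels == j].mean(axis=0)
    return labels

# Two blobs standing in for deep-model output values of unlabeled data.
feats = np.vstack([np.random.default_rng(1).normal(loc=m, size=(20, 2)) for m in (0.0, 5.0)])
pseudo = cluster_pseudo_labels(feats, k=2)
```

Each cluster index then plays the role of a label when the loss over the unlabeled data is calculated.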
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21931445.7A EP4310734A4 (en) | 2021-03-15 | 2021-03-15 | INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD AND INFORMATION PROCESSING PROGRAM |
JP2023506415A JPWO2022195691A1 (ja) | 2021-03-15 | 2021-03-15 | |
PCT/JP2021/010452 WO2022195691A1 (ja) | 2021-03-15 | 2021-03-15 | 情報処理装置、情報処理方法及び情報処理プログラム |
US18/458,363 US20230409911A1 (en) | 2021-03-15 | 2023-08-30 | Information processing device, information processing method, and non-transitory computer-readable recording medium storing information processing program |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2021/010452 WO2022195691A1 (ja) | 2021-03-15 | 2021-03-15 | 情報処理装置、情報処理方法及び情報処理プログラム |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/458,363 Continuation US20230409911A1 (en) | 2021-03-15 | 2023-08-30 | Information processing device, information processing method, and non-transitory computer-readable recording medium storing information processing program |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022195691A1 true WO2022195691A1 (ja) | 2022-09-22 |
Family
ID=83320061
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/010452 WO2022195691A1 (ja) | 2021-03-15 | 2021-03-15 | 情報処理装置、情報処理方法及び情報処理プログラム |
Country Status (4)
Country | Link |
---|---|
US (1) | US20230409911A1 (ja) |
EP (1) | EP4310734A4 (ja) |
JP (1) | JPWO2022195691A1 (ja) |
WO (1) | WO2022195691A1 (ja) |
-
2021
- 2021-03-15 WO PCT/JP2021/010452 patent/WO2022195691A1/ja active Application Filing
- 2021-03-15 EP EP21931445.7A patent/EP4310734A4/en active Pending
- 2021-03-15 JP JP2023506415A patent/JPWO2022195691A1/ja active Pending
-
2023
- 2023-08-30 US US18/458,363 patent/US20230409911A1/en active Pending
Non-Patent Citations (5)
Title |
---|
DONG-HYUN LEE: "Pseudo-Label : The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks", ICML 2013 WORKSHOP : CHALLENGES IN REPRESENTATION LEARNING (WREPL), ATLANTA, GEORGIA, USA, 16 June 2013 (2013-06-16), Atlanta, Georgia, USA , XP055716966, Retrieved from the Internet <URL:http://deeplearning.net/wp-content/uploads/2013/03/pseudo_label_final.pdf> [retrieved on 20200721] * |
SAITO KUNIAKI, USHIKU YOSHITAKA, HARADA TATSUYA: "Asymmetric Tri-training for Unsupervised Domain Adaptation", ARXIV.ORG, 13 May 2017 (2017-05-13), pages 1 - 12, XP055972209, Retrieved from the Internet <URL:https://arxiv.org/pdf/1702.08400.pdf> [retrieved on 20221018] * |
See also references of EP4310734A4 |
- XIAOMENG XIN, JINJUN WANG, RUI XIE, SANPING ZHOU, WENLI HUANG, NANNING ZHENG: "Semi-supervised person re-identification using multi-view clustering", PATTERN RECOGNITION, vol. 88, 19 November 2019 (2019-11-19), pages 285 - 297, XP055972201 * |
- YUKI M. ASANO, CHRISTIAN RUPPRECHT, ANDREA VEDALDI: "Self-labelling via simultaneous clustering and representation learning", ICLR 2020, 20 August 2020 (2020-08-20)
Also Published As
Publication number | Publication date |
---|---|
EP4310734A1 (en) | 2024-01-24 |
JPWO2022195691A1 (ja) | 2022-09-22 |
EP4310734A4 (en) | 2024-05-01 |
US20230409911A1 (en) | 2023-12-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10452899B2 (en) | Unsupervised deep representation learning for fine-grained body part recognition | |
US11023806B2 (en) | Learning apparatus, identifying apparatus, learning and identifying system, and recording medium | |
- CN112990054B (zh) | Compact language-free facial expression embedding and novel triplet training scheme | |
EP3198373B1 (en) | Tracking hand/body pose | |
CA2843343C (en) | Systems and methods of detecting body movements using globally generated multi-dimensional gesture data | |
- CN113139628B (zh) | Sample image identification method, apparatus, and device, and readable storage medium | |
- CN110619059B (zh) | Building calibration method based on transfer learning | |
- CN114841257B (zh) | Few-shot object detection method based on self-supervised contrastive constraints | |
- JPWO2018167900A1 (ja) | Neural network learning device, method, and program | |
- JPWO2004008392A1 (ja) | Image matching system using a three-dimensional object model, image matching method, and image matching program | |
US20210110215A1 (en) | Information processing device, information processing method, and computer-readable recording medium recording information processing program | |
WO2016095068A1 (en) | Pedestrian detection apparatus and method | |
- JP2011154501A (ja) | Learning device, learning method, identification device, identification method, program, and information processing system | |
- WO2019146057A1 (ja) | Learning device, generation system for a real-image classification device, generation device for a real-image classification device, learning method, and program | |
- CN104915673A (zh) | Object classification method and system based on a bag-of-visual-words model | |
US11954755B2 (en) | Image processing device and operation method thereof | |
- CN111161314A (zh) | Method and apparatus for determining the position region of a target object, electronic device, and storage medium | |
- CN115222007A (zh) | Improved particle swarm parameter optimization method for a glioma multi-task integrated network | |
- CN114298122A (zh) | Data classification method, apparatus, device, storage medium, and computer program product | |
- CN111797705A (zh) | Action recognition method based on person-relationship modeling | |
- CN115063664A (zh) | Model learning method, training method, and system for industrial visual inspection | |
- CN113902989A (zh) | Live-streaming scene detection method, storage medium, and electronic device | |
- CN108537253A (zh) | Adaptive semi-supervised dimensionality reduction method based on probabilistic pairwise constraints | |
- WO2022195691A1 (ja) | Information processing device, information processing method, and information processing program | |
- CN107341189A (zh) | Method and system for assisting manual screening, classification, and storage of images | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21931445 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2023506415 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2021931445 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2021931445 Country of ref document: EP Effective date: 20231016 |