WO2021240636A1 - Distributed deep learning system - Google Patents

Distributed deep learning system

Info

Publication number
WO2021240636A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer group
client terminal
input layer
weight
output
Application number
PCT/JP2020/020708
Other languages
French (fr)
Japanese (ja)
Inventor
Kenji Tanaka
Takeshi Ito
Yuki Arikawa
Ken Sakamoto
Original Assignee
Nippon Telegraph and Telephone Corporation
Application filed by Nippon Telegraph and Telephone Corporation
Priority to JP2022527309A (JP7464118B2)
Priority to PCT/JP2020/020708 (WO2021240636A1)
Publication of WO2021240636A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks

Definitions

  • the present invention relates to a distributed deep learning system that executes deep learning in a distributed and coordinated manner on a plurality of nodes.
  • Deep learning requires a large number of matrix operations to be performed.
  • these matrix operations are typically executed by a dedicated arithmetic unit called an accelerator.
  • however, a computer equipped with such accelerators cannot be easily introduced by a general user, because both its purchase cost and its power consumption are extremely high.
  • among deep learning methods, supervised learning is known as one that can achieve high accuracy.
  • Supervised learning is a method in which learning data with a label indicating a correct answer is given to a computer for learning.
  • it is difficult to achieve high accuracy when the amount of learning data is insufficient, and tens of thousands of data items or more are now considered necessary to train difficult tasks.
  • the first problem is that labeling data requires human resources with knowledge of the subject area. An example that requires specialized knowledge is the medical field.
  • the second problem is that learning data and labels may contain personal information, so methods that risk leaking information to an unspecified number of people, such as uploading the data to a cloud server, are not acceptable.
  • previous research has proposed a method of separating deep learning between an edge device (client terminal) and a cloud server (see Non-Patent Document 1). This method focuses on the facts that the inference stage of deep learning can be performed with fewer computational and data resources than the learning stage, and that learning data cannot be reproduced from the weights of a trained model.
  • FIG. 21 shows the configuration of the distributed deep learning system disclosed in Non-Patent Document 1.
  • the cloud server 100 has an initial model 1000.
  • the cloud server 100 distributes the model 1000 to the client terminals 101-A, 101-B, and 101-C.
  • Each client terminal 101-A, 101-B, 101-C deploys the model provided by the cloud server 100 on the terminal.
  • a client terminal that has sufficient computational resources and data resources, such as the client terminal 101-C, learns the model 1000-C in its own environment and updates it.
  • the client terminal 101-C that has updated the model 1000-C returns to the cloud server 100 the difference between the weights of each layer of the model distributed from the cloud server 100 and those of the updated model 1000-C.
  • the cloud server 100 averages the model differences sent from the client terminals 101-A, 101-B, and 101-C, updates its own model 1000, and distributes the updated model 1000 to the client terminals 101-A, 101-B, and 101-C again.
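  • as a rough illustration of this conventional flow, the following sketch shows only the server-side averaging step of Non-Patent Document 1; it is not part of the patent text, and the dictionary layout and function name are assumptions.

```python
import copy

def server_round(global_weights, client_deltas):
    # global_weights: dict mapping layer name -> list of weights.
    # client_deltas: one dict per client with the same layout, holding the
    # per-layer weight differences each client returned after local training.
    updated = copy.deepcopy(global_weights)
    for name, weights in updated.items():
        # Average, element-wise, the differences reported by all clients.
        avg_delta = [sum(d[name][i] for d in client_deltas) / len(client_deltas)
                     for i in range(len(weights))]
        updated[name] = [w + dw for w, dw in zip(weights, avg_delta)]
    return updated
```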
  • however, the method of Non-Patent Document 1 has a problem: when the tendency of the learning data that can be acquired differs from client terminal to client terminal, a model specialized for a particular client terminal cannot be created.
  • the present invention has been made to solve the above problems, and its purpose is to provide a distributed deep learning system that can create a model specialized for a client terminal while requiring fewer computational resources of the client terminal than the conventional method.
  • the distributed deep learning system of the present invention includes a client terminal and a cloud server connected to the client terminal via a network. The client terminal includes: a first calculation unit configured to calculate the output value obtained by inputting sample data into the input layer group of a model; a second calculation unit configured to input the output value of the intermediate layer group calculated by the cloud server into the output layer group of the model and calculate the output value of the model; a third calculation unit configured to calculate, when the model is trained, the error function of the weights of the output layer group based on the output value of the model and the label of the sample data; a fourth calculation unit configured to calculate, when the model is trained, the error function of the weights of the input layer group based on the error function of the weights of the intermediate layer group calculated by the cloud server; a first model update unit configured to update the weights of the input layer group based on the error function calculated by the fourth calculation unit and to update the weights of the output layer group based on the error function calculated by the third calculation unit; a first transmission unit configured to transmit the output value of the input layer group and the error function of the weights of the output layer group to the cloud server; and a first reception unit configured to receive the output value of the intermediate layer group calculated by the cloud server and the error function of the weights of the intermediate layer group.
  • the cloud server includes: a fifth calculation unit configured to calculate the output value obtained by inputting the output value of the input layer group calculated by the client terminal into the intermediate layer group; a sixth calculation unit configured to calculate, when the model is trained, the error function of the weights of the intermediate layer group based on the error function of the weights of the output layer group calculated by the client terminal; a second model update unit configured to update the weights of the intermediate layer group; a second transmission unit configured to transmit the output value of the intermediate layer group and the error function of the weights of the intermediate layer group to the client terminal; and a second reception unit configured to receive the output value of the input layer group calculated by the client terminal and the error function of the weights of the output layer group.
  • according to the present invention, it is possible to realize a distributed deep learning system that can create a model specialized for the client terminal while requiring fewer computational resources of the client terminal than the conventional method.
  • FIG. 1 is a diagram showing a configuration of a distributed deep learning system according to a first embodiment of the present invention.
  • FIG. 2 is a block diagram showing a configuration of a client terminal of the distributed deep learning system according to the first embodiment of the present invention.
  • FIG. 3 is a block diagram showing a configuration of a cloud server of the distributed deep learning system according to the first embodiment of the present invention.
  • FIG. 4 is a flowchart illustrating the inference operation of the client terminal of the distributed deep learning system according to the first embodiment of the present invention.
  • FIG. 5 is a flowchart illustrating the inference operation of the cloud server of the distributed deep learning system according to the first embodiment of the present invention.
  • FIG. 6 is a flowchart illustrating a learning operation of a client terminal of the distributed deep learning system according to the first embodiment of the present invention.
  • FIG. 7 is a flowchart illustrating the learning operation of the cloud server of the distributed deep learning system according to the first embodiment of the present invention.
  • FIG. 8 is a diagram showing a configuration of a distributed deep learning system according to a second embodiment of the present invention.
  • FIG. 9 is a block diagram showing a configuration of a cloud server of the distributed deep learning system according to the second embodiment of the present invention.
  • FIG. 10 is a diagram showing a configuration of a distributed deep learning system according to a third embodiment of the present invention.
  • FIG. 11 is a block diagram showing a configuration of a client terminal of the distributed deep learning system according to the third embodiment of the present invention.
  • FIG. 12 is a block diagram showing a configuration of a cloud server of the distributed deep learning system according to the third embodiment of the present invention.
  • FIG. 13 is a flowchart illustrating a learning operation of a client terminal of a distributed deep learning system according to a third embodiment of the present invention.
  • FIG. 14 is a diagram showing a configuration of a distributed deep learning system according to a fourth embodiment of the present invention.
  • FIG. 15 is a block diagram showing a configuration of a client terminal of a distributed deep learning system according to a fourth embodiment of the present invention.
  • FIG. 16 is a flowchart illustrating a learning operation of a client terminal of a distributed deep learning system according to a fourth embodiment of the present invention.
  • FIG. 17 is a diagram showing a configuration of a distributed deep learning system according to a fifth embodiment of the present invention.
  • FIG. 18 is a block diagram showing a configuration of a client terminal of the distributed deep learning system according to the fifth embodiment of the present invention.
  • FIG. 19 is a flowchart illustrating a learning operation of a client terminal of a distributed deep learning system according to a fifth embodiment of the present invention.
  • FIG. 20 is a block diagram showing a configuration example of a computer that realizes a client terminal according to the first to fifth embodiments of the present invention.
  • FIG. 21 is a diagram showing a configuration of a conventional distributed deep learning system.
  • FIG. 1 is a diagram showing a configuration of a distributed deep learning system according to a first embodiment of the present invention.
  • the distributed deep learning system includes a client terminal 1 and a cloud server 2 connected to the client terminal 1 via a network.
  • the model (neural network model) used in this embodiment is divided into three groups: an input layer group 200, an output layer group 202, and an intermediate layer group 201 between the input layer group 200 and the output layer group 202.
  • the input layer group 200, the intermediate layer group 201, and the output layer group 202 are each composed of one or more layers.
  • the input layer group 200 and the output layer group 202 are mounted on the client terminal 1, and the intermediate layer group 201 is mounted on the cloud server 2.
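  • purely as an illustration of this split, the three layer groups can be viewed as three sub-networks; the following PyTorch sketch uses arbitrary layer sizes, since the patent only requires that each group contain one or more layers.

```python
import torch.nn as nn

# Input layer group 200: held on the client terminal 1.
input_layer_group = nn.Sequential(nn.Linear(784, 256), nn.ReLU())

# Intermediate layer group 201: held on the cloud server 2.
intermediate_layer_group = nn.Sequential(
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
)

# Output layer group 202: held on the client terminal 1.
output_layer_group = nn.Sequential(nn.Linear(128, 10))
```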
  • FIG. 2 is a block diagram showing the configuration of the client terminal 1, and FIG. 3 is a block diagram showing the configuration of the cloud server 2.
  • the client terminal 1 includes a storage unit 10, a data acquisition unit 11, a calculation unit 12 (first calculation unit), a transmission unit 13 (first transmission unit), a reception unit 14 (first reception unit), a calculation unit 15 (second calculation unit), a calculation unit 16 (third calculation unit), a calculation unit 17 (fourth calculation unit), and a model update unit 18 (first model update unit).
  • the storage unit 10 stores the data of the input layer group 200 and the output layer group 202, from which these layer groups are constructed. The construction of the input layer group 200 and the output layer group 202 is performed by the CPU (not shown) of the client terminal 1.
  • the cloud server 2 includes a storage unit 20, a reception unit 21 (second reception unit), a calculation unit 22 (fifth calculation unit), a transmission unit 23 (second transmission unit), a calculation unit 24 (sixth calculation unit), and a model update unit 25 (second model update unit).
  • the data of the intermediate layer group 201 is stored in the storage unit 20, and the intermediate layer group 201 is constructed.
  • the construction of the intermediate layer group 201 is performed by the CPU (not shown) of the cloud server 2.
  • FIG. 4 is a flowchart explaining the inference operation of the client terminal 1 of the distributed deep learning system of this embodiment, and FIG. 5 is a flowchart explaining the inference operation of the cloud server 2.
  • the data acquisition unit 11 of the client terminal 1 acquires the sample data input by the user (step S100 in FIG. 4).
  • the calculation unit 12 of the client terminal 1 calculates the result of inputting the sample data acquired by the data acquisition unit 11 into the input layer group 200 (step S101 in FIG. 4).
  • the transmission unit 13 of the client terminal 1 receives the calculation result of the output value of the input layer group 200 from the calculation unit 12, and transmits this calculation result to the cloud server 2 (step S102 in FIG. 4).
  • the receiving unit 21 of the cloud server 2 receives the output value of the input layer group 200 from the client terminal 1 (step S200 in FIG. 5).
  • the calculation unit 22 of the cloud server 2 calculates the result of inputting the output value of the input layer group 200 into the intermediate layer group 201 (FIG. 5, step S201).
  • the transmission unit 23 of the cloud server 2 receives the calculation result of the output value of the intermediate layer group 201 from the calculation unit 22, and transmits this calculation result to the client terminal 1 (step S202 of FIG. 5).
  • the receiving unit 14 of the client terminal 1 receives the output value of the intermediate layer group 201 from the cloud server 2 (step S103 in FIG. 4).
  • the calculation unit 15 of the client terminal 1 calculates the result of inputting the output value of the intermediate layer group 201 into the output layer group 202 (FIG. 4, step S104).
  • in this way, the output value of the output layer group 202, that is, the output value of the model, can be calculated.
  • this sequence of steps is called forward propagation.
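  • a minimal sketch of this forward-propagation sequence follows, reusing the layer groups sketched above and abstracting the network transfer of steps S102, S103, S200, and S202 as ordinary function calls; the function names are illustrative assumptions.

```python
import torch

def cloud_forward(h_in, intermediate_layer_group):
    # Steps S200 to S202: the server receives the input-layer output, runs
    # the intermediate layer group, and returns the result to the terminal.
    return intermediate_layer_group(h_in)

def client_infer(sample, input_layer_group, output_layer_group,
                 intermediate_layer_group):
    h_in = input_layer_group(sample)                       # steps S100 and S101
    h_mid = cloud_forward(h_in, intermediate_layer_group)  # steps S102 and S103
    return output_layer_group(h_mid)                       # step S104: model output

# Example: one inference pass on a random 784-dimensional sample.
y = client_infer(torch.randn(1, 784), input_layer_group,
                 output_layer_group, intermediate_layer_group)
```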
  • FIG. 6 is a flowchart explaining the learning operation of the client terminal 1 of the distributed deep learning system of this embodiment, and FIG. 7 is a flowchart explaining the learning operation of the cloud server 2.
  • the data acquisition unit 11 of the client terminal 1 acquires sample data (learning data) with a label input by the user (step S300 in FIG. 6).
  • the operation of the client terminal 1 in steps S301 to S304 of FIG. 6 is as described in steps S101 to S104.
  • the operation of the cloud server 2 in steps S400 to S402 of FIG. 7 is as described in steps S200 to S202.
  • the calculation unit 16 of the client terminal 1 calculates the gradient of the error function with respect to each layer weight in the output layer group 202, based on the output value of the model and the label attached to the sample data (step S305 in FIG. 6).
  • the transmission unit 13 of the client terminal 1 receives the calculation result of the gradient of the error function from the calculation unit 16 and transmits this calculation result to the cloud server 2 (step S306 in FIG. 6).
  • the receiving unit 21 of the cloud server 2 receives the calculation result of the gradient of the error function from the client terminal 1 (step S403 in FIG. 7).
  • the calculation unit 24 of the cloud server 2 calculates the gradient of the error function for each of the weights of the layers in the intermediate layer group 201 based on the gradient of the error function received from the client terminal 1 (step S404 of FIG. 7).
  • the transmission unit 23 of the cloud server 2 receives the calculation result of the gradient of the error function from the calculation unit 24, and transmits this calculation result to the client terminal 1 (step S405 of FIG. 7).
  • the model update unit 25 of the cloud server 2 updates the weights of the layers in the intermediate layer group 201 based on the gradient of the error function calculated by the calculation unit 24 (step S406 of FIG. 7).
  • the receiving unit 14 of the client terminal 1 receives the calculation result of the gradient of the error function from the cloud server 2 (step S307 in FIG. 6).
  • the calculation unit 17 of the client terminal 1 calculates the gradient of the error function for each of the weights of the layers in the input layer group 200 based on the gradient of the error function received from the cloud server 2 (step S308 in FIG. 6).
  • the model update unit 18 of the client terminal 1 updates the weights of the layers in the input layer group 200 based on the gradient of the error function calculated by the calculation unit 17, and updates the weights of the layers in the output layer group 202 based on the gradient of the error function calculated by the calculation unit 16 (step S309 in FIG. 6).
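  • the following is a minimal sketch of one such training step, using PyTorch autograd to stand in for the gradient messages exchanged in steps S302 to S309 and S400 to S406; the loss function and the optimizers are illustrative assumptions.

```python
import torch.nn.functional as F

def train_step(x, label, input_group, intermediate_group, output_group,
               opt_client, opt_server):
    # Client: forward through the input layer group (S301) and "send" the
    # activation to the server as a detached tensor (S302).
    h_in = input_group(x)
    h_in_srv = h_in.detach().requires_grad_()

    # Server: forward through the intermediate layer group (S401) and
    # "return" the activation to the client (S402).
    h_mid = intermediate_group(h_in_srv)
    h_mid_cli = h_mid.detach().requires_grad_()

    # Client: forward through the output layer group (S304), compute the error
    # function from the label (S305), and backpropagate through the output
    # layer group; h_mid_cli.grad is the gradient sent to the server (S306).
    loss = F.cross_entropy(output_group(h_mid_cli), label)
    loss.backward()

    # Server: continue backpropagation through the intermediate layer group
    # (S404); h_in_srv.grad is the gradient returned to the client (S405).
    h_mid.backward(h_mid_cli.grad)

    # Client: finish backpropagation through the input layer group (S308).
    h_in.backward(h_in_srv.grad)

    # Server updates the intermediate layers (S406); the client updates the
    # input and output layers (S309).
    opt_server.step(); opt_server.zero_grad()
    opt_client.step(); opt_client.zero_grad()
    return loss.item()
```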
  • in this embodiment, the calculation of the intermediate layer group 201 is executed by the cloud server 2, so the client terminal 1 requires fewer computational resources than in the existing method. Further, since the input layer group 200 and the output layer group 202 are learned on the client terminal 1, a model specialized for the client terminal 1 can be created.
  • furthermore, the sample data are not sent to the cloud server 2, and labels are attached to the sample data on the client terminal 1, so the information contained in the sample data and the labels can be protected.
  • FIG. 8 is a diagram showing a configuration of a distributed deep learning system according to a second embodiment of the present invention.
  • the distributed deep learning system of this embodiment is composed of client terminals 1a-A and 1a-B, and a cloud server 2a connected to client terminals 1a-A and 1a-B via a network.
  • each client terminal has a separate input layer group and output layer group.
  • the input layer group 200a-A and the output layer group 202a-A of the first model are mounted on the client terminals 1a-A, and the intermediate layer group 201a of the first model is mounted on the cloud server 2a.
  • the input layer group 200a-B and the output layer group 202a-B of the second model are mounted on the client terminals 1a-B, and the intermediate layer group 201a of the second model is mounted on the cloud server 2a.
  • the first model and the second model share the intermediate layer group 201a.
  • FIG. 9 is a block diagram showing the configuration of the cloud server 2a.
  • the cloud server 2a includes a storage unit 20a, a reception unit 21, a calculation unit 22a, a transmission unit 23, a calculation unit 24a, and a model update unit 25a.
  • the data of the intermediate layer group 201a is stored in the storage unit 20a, and the intermediate layer group 201a is constructed.
  • the construction of the intermediate layer group 201a is performed by the CPU (not shown) of the cloud server 2a.
  • the inference operation flow of each of the client terminals 1a-A and 1a-B is the same as that of the client terminal 1 of the first embodiment, and the inference operation flow of the cloud server 2a is the same as that of the cloud server 2 of the first embodiment, so the inference operation of this embodiment will be described with reference to the reference numerals of FIGS. 4 and 5.
  • the client terminals 1a-A and 1a-B execute the process of FIG. 4 for the acquired sample data, respectively.
  • the difference from the first embodiment in the inference operation of this embodiment is that, when data arrive from the client terminal 1a-A and the client terminal 1a-B at the same time, the two terminals share the intermediate layer group 201a by time division. That is, the cloud server 2a processes the data from the client terminal 1a-A and the data from the client terminal 1a-B in a time-division manner, as sketched below.
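  • as a rough picture of this time division, requests from the two terminals can be serialized through a single queue in front of the shared intermediate layer group; the queue and the names below are assumptions, not part of the patent text.

```python
from queue import Queue

requests = Queue()  # holds (terminal_id, activation, reply_queue) tuples

def serve_intermediate(intermediate_layer_group):
    # Requests are dequeued and processed one at a time, so the shared
    # intermediate layer group serves the terminals in arrival order (S201),
    # and each result is returned to the requesting terminal (S202).
    while True:
        _terminal_id, h_in, reply_q = requests.get()
        reply_q.put(intermediate_layer_group(h_in))
```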
  • when the calculation unit 22a of the cloud server 2a receives the output values of the input layer groups 200a-A and 200a-B from the client terminals 1a-A and 1a-B, it first calculates, for example, the result of inputting the output value of the input layer group 200a-A of the client terminal 1a-A into the intermediate layer group 201a (step S201 in FIG. 5).
  • the transmission unit 23 of the cloud server 2a returns the calculation result of the calculation unit 22a to the client terminal 1a-A, the transmission source of the output value of the input layer group 200a-A (step S202 in FIG. 5).
  • next, the calculation unit 22a calculates the result of inputting the output value of the input layer group 200a-B of the client terminal 1a-B into the intermediate layer group 201a (step S201).
  • the transmission unit 23 returns the calculation result of the calculation unit 22a to the client terminal 1a-B, the transmission source of the output value of the input layer group 200a-B (step S202).
  • the flow of the learning operation of each of the client terminals 1a-A and 1a-B is the same as that of the client terminal 1 of the first embodiment, and the flow of the learning operation of the cloud server 2a is the same as that of the cloud server 2 of the first embodiment, so the learning operation of this embodiment will be described with reference to the reference numerals of FIGS. 6 and 7.
  • the client terminals 1a-A and 1a-B execute the process of FIG. 6 for the acquired sample data with labels.
  • the difference from the first embodiment in the learning operation of this embodiment is that, when data arrive from the client terminal 1a-A and the client terminal 1a-B at the same time, the cloud server 2a processes the data from the client terminal 1a-A and the data from the client terminal 1a-B in a time-division manner.
  • the time division processing of the cloud server 2a in steps S401 and S402 of FIG. 7 is the same as the processing described in steps S201 and S202 of this embodiment.
  • the calculation unit 24a of the cloud server 2a calculates the gradient of the error function with respect to each layer weight in the intermediate layer group 201a, based on the gradient of the error function received from the client terminal 1a-A (step S404 in FIG. 7).
  • the transmission unit 23 of the cloud server 2a returns the calculation result of the calculation unit 24a to the client terminal 1a-A, the transmission source of the gradient of the error function (step S405 in FIG. 7).
  • next, the calculation unit 24a calculates the gradient of the error function with respect to each layer weight in the intermediate layer group 201a, based on the gradient of the error function received from the client terminal 1a-B (step S404).
  • the transmission unit 23 returns the calculation result of the calculation unit 24a to the client terminal 1a-B, the transmission source of that gradient of the error function (step S405).
  • the model update unit 25a of the cloud server 2a calculates, for each layer weight in the intermediate layer group 201a, the average of the calculation result of the calculation unit 24a based on the gradient of the error function received from the client terminal 1a-A and the calculation result of the calculation unit 24a based on the gradient of the error function received from the client terminal 1a-B, and updates the weights of the layers in the intermediate layer group 201a based on the calculated average (step S406 in FIG. 7), as in the sketch below.
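  • a plain-Python sketch of this averaging update follows; the dictionary layout and the learning rate are assumptions, and only the two-terminal case of this embodiment is shown.

```python
def update_intermediate(weights, grads_from_a, grads_from_b, lr=0.01):
    # weights and the two gradient dicts map layer name -> list of floats.
    for name, w in weights.items():
        # Step S406: average, weight by weight, the gradients derived from
        # the two client terminals, then take one gradient-descent step.
        avg = [(ga + gb) / 2.0
               for ga, gb in zip(grads_from_a[name], grads_from_b[name])]
        weights[name] = [wi - lr * gi for wi, gi in zip(w, avg)]
    return weights
```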
  • generally, the sample data are biased depending on the environment surrounding each client terminal.
  • in the conventional method, the bias of the data of the client terminal 1a-A and the bias of the data of the client terminal 1a-B are averaged together, so the difference in data between the client terminals may adversely affect inference and learning; in this embodiment, learning specialized for each client terminal can be carried out.
  • further, since the client terminal 1a-B can utilize the intermediate layers learned by the client terminal 1a-A, the client terminal 1a-B only needs to acquire data, and the labeling cost of the client terminal 1a-B can be kept low.
  • FIG. 10 is a diagram showing a configuration of a distributed deep learning system according to a third embodiment of the present invention.
  • the distributed deep learning system of this embodiment is composed of client terminals 1b-A and 1b-B, and a cloud server 2b connected to client terminals 1b-A and 1b-B via a network.
  • in this embodiment, each of the plurality of client terminals has its own input layer groups and output layer group; there are a plurality of sample data types (for example, image data and audio data), and each client terminal has an input layer group for each type of sample data.
  • the input layer group 200b-A-α of the first model for the sample data α and the output layer group 202b-A are mounted on the client terminal 1b-A, and the intermediate layer group 201b of the first model is mounted on the cloud server 2b.
  • the input layer group 200b-A-β of the second model for the sample data β and the output layer group 202b-A are mounted on the client terminal 1b-A, and the intermediate layer group 201b of the second model is mounted on the cloud server 2b.
  • the first model and the second model share the intermediate layer group 201b and the output layer group 202b-A.
  • the input layer group 200b-B-α of the third model for the data α and the output layer group 202b-B are mounted on the client terminal 1b-B, and the intermediate layer group 201b of the third model is mounted on the cloud server 2b.
  • the input layer group 200b-B-β of the fourth model for the data β and the output layer group 202b-B are mounted on the client terminal 1b-B, and the intermediate layer group 201b of the fourth model is mounted on the cloud server 2b.
  • the third model and the fourth model share the intermediate layer group 201b and the output layer group 202b-B.
  • FIG. 11 is a block diagram showing the configuration of the client terminals 1b-A and 1b-B, and FIG. 12 is a block diagram showing the configuration of the cloud server 2b.
  • the client terminals 1b-A and 1b-B each include a storage unit 10b, a data acquisition unit 11, calculation units 12b, 15b, 16b, and 17b, a transmission unit 13, a reception unit 14, a model update unit 18b, a transmission unit 19, and a reception unit 30.
  • the storage unit 10b of the client terminal 1b-A stores the data of the input layer groups 200b-A-α and 200b-A-β and the output layer group 202b-A, from which these layer groups are constructed. Their construction is performed by the CPU (not shown) of the client terminal 1b-A.
  • the storage unit 10b of the client terminal 1b-B stores the data of the input layer groups 200b-B-α and 200b-B-β and the output layer group 202b-B, from which these layer groups are constructed. Their construction is performed by the CPU (not shown) of the client terminal 1b-B.
  • the cloud server 2b includes a storage unit 20b, a reception unit 21, calculation units 22b and 24b, a transmission unit 23, and a model update unit 25b.
  • the data of the intermediate layer group 201b is stored in the storage unit 20b, and the intermediate layer group 201b is constructed.
  • the construction of the intermediate layer group 201b is performed by the CPU (not shown) of the cloud server 2b.
  • the inference operation flow of each of the client terminals 1b-A and 1b-B is the same as that of the client terminal 1 of the first embodiment, and the inference operation flow of the cloud server 2b is the same as that of the cloud server 2 of the first embodiment, so the inference operation of this embodiment will be described with reference to the reference numerals of FIGS. 4 and 5.
  • the client terminals 1b-A and 1b-B each execute the processing of FIG. 4 for the acquired sample data.
  • the calculation unit 12b of the client terminal 1b-A calculates the result of inputting the data α acquired by the data acquisition unit 11 into the input layer group 200b-A-α (step S101 in FIG. 4).
  • the transmission unit 13 of the client terminal 1b-A receives the calculation result of the output value of the input layer group 200b-A-α from the calculation unit 12b, and transmits this calculation result to the cloud server 2b (step S102 in FIG. 4).
  • the calculation unit 12b of the client terminal 1b-A calculates the result of inputting the data β acquired by the data acquisition unit 11 into the input layer group 200b-A-β (step S101).
  • the transmission unit 13 of the client terminal 1b-A transmits the calculation result of the output value of the input layer group 200b-A-β to the cloud server 2b (step S102).
  • the calculation unit 12b of the client terminal 1b-B calculates the result of inputting the data α acquired by the data acquisition unit 11 into the input layer group 200b-B-α (step S101).
  • the transmission unit 13 of the client terminal 1b-B transmits the calculation result of the output value of the input layer group 200b-B-α to the cloud server 2b (step S102).
  • the calculation unit 12b of the client terminal 1b-B calculates the result of inputting the data β acquired by the data acquisition unit 11 into the input layer group 200b-B-β (step S101).
  • the transmission unit 13 of the client terminal 1b-B transmits the output value of the input layer group 200b-B-β to the cloud server 2b (step S102).
  • the type of data can be easily identified by, for example, an identifier attached to the data.
  • the cloud server 2b processes the data from the client terminal 1b-A and the data from the client terminal 1b-B in a time-division manner.
  • that is, the cloud server 2b inputs into the intermediate layer group 201b, in a time-division manner, the calculation result of the output value of the input layer group 200b-A-α (computed from the data α), that of the input layer group 200b-A-β (computed from the data β), that of the input layer group 200b-B-α, and that of the input layer group 200b-B-β.
  • the calculation units 15b of the client terminals 1b-A and 1b-B calculate the results of inputting the output values of the intermediate layer group 201b received from the cloud server 2b into the output layer groups 202b-A and 202b-B, respectively (step S104 in FIG. 4).
  • the data received by the client terminal 1b-A include two types of output values of the intermediate layer group 201b, one calculated from the output value of the input layer group 200b-A-α and one calculated from the output value of the input layer group 200b-A-β, so the process of step S104 is executed for each of these two types of output values.
  • likewise, the data received by the client terminal 1b-B include two types of output values of the intermediate layer group 201b, one calculated from the output value of the input layer group 200b-B-α and one calculated from the output value of the input layer group 200b-B-β, so the process of step S104 is executed for each of these two types of output values.
  • the data acquisition unit 11 of the client terminal 1b-A that could not acquire the data β may generate complementary data (for example, zero values or the average value of past data) in place of the data β.
  • similarly, the data acquisition unit 11 of the client terminal 1b-B that could not acquire the data α may generate complementary data in place of the data α, as in the sketch below.
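  • the complementary data mentioned above might be generated as in the following sketch; the helper is an assumption, since the patent only gives zero values and averages of past data as examples.

```python
import numpy as np

def complementary_data(shape, past_samples=None):
    # Substitute the mean of past samples of this data type if any exist;
    # otherwise fall back to a zero-valued stand-in of the right shape.
    if past_samples:
        return np.mean(np.stack(past_samples), axis=0)
    return np.zeros(shape)
```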
  • FIG. 13 is a flowchart illustrating the learning operation of the client terminals 1b-A and 1b-B of the distributed deep learning system of this embodiment. Since the flow of the learning operation of the cloud server 2b is the same as the operation of the cloud server 2 of the first embodiment, the reference numerals of FIG. 7 will be used for description.
  • the client terminals 1b-A and 1b-B each execute the processing of FIG. 13 for the acquired sample data with labels.
  • the processing of the client terminals 1b-A and 1b-B in steps S500 to S504 of FIG. 13 is the same as the processing of steps S100 to S104 described in this embodiment.
  • the calculation unit 16b and the transmission unit 13 of the client terminal 1b-A execute the same processing as in steps S305 and S306 of FIG. 6 in a time-division manner for each of the data α and the data β (steps S505 and S506 in FIG. 13). Specifically, the calculation unit 16b calculates the gradient of the error function with respect to each layer weight in the output layer group 202b-A based on the output value of the first model and the label of the data α, and calculates the gradient of the error function with respect to each layer weight in the output layer group 202b-A based on the output value of the second model and the label of the data β.
  • similarly, the calculation unit 16b and the transmission unit 13 of the client terminal 1b-B execute the same processing as in steps S305 and S306 in a time-division manner for each of the data α and the data β (steps S505 and S506). Specifically, the calculation unit 16b calculates the gradient of the error function with respect to each layer weight in the output layer group 202b-B based on the output value of the third model and the label of the data α, and calculates the gradient of the error function with respect to each layer weight in the output layer group 202b-B based on the output value of the fourth model and the label of the data β.
  • the time division processing of the cloud server 2b in steps S401 and S402 of FIG. 7 is the same as the processing described in steps S201 and S202 of this embodiment.
  • the calculation unit 24b and the transmission unit 23 of the cloud server 2b execute the processing of steps S404 and S405 in FIG. 7 in a time-division manner for each of the following four gradients: the gradient of the error function calculated by the client terminal 1b-A based on the output value of the first model for the data α and the label of the data α; the gradient calculated by the client terminal 1b-A based on the output value of the second model for the data β and the label of the data β; the gradient calculated by the client terminal 1b-B based on the output value of the third model for the data α and the label of the data α; and the gradient calculated by the client terminal 1b-B based on the output value of the fourth model for the data β and the label of the data β.
  • the model update unit 25b of the cloud server 2b calculates, for each layer weight in the intermediate layer group 201b, the average of the calculation results of the calculation unit 24b based on the gradients of the error function received from the client terminal 1b-A and the client terminal 1b-B, and updates the weights of the layers in the intermediate layer group 201b based on the calculated average (step S406 in FIG. 7).
  • the calculation results of the calculation unit 24b are of four types: the result based on the gradient of the error function calculated by the client terminal 1b-A using the output value of the first model, the result based on the gradient calculated by the client terminal 1b-A using the output value of the second model, the result based on the gradient calculated by the client terminal 1b-B using the output value of the third model, and the result based on the gradient calculated by the client terminal 1b-B using the output value of the fourth model.
  • suppose that the client terminal 1b-A could not acquire the data β, or that the data β acquired by the client terminal 1b-A was not labeled, and likewise that the client terminal 1b-B could not acquire the data α, or that the data α acquired by the client terminal 1b-B was not labeled.
  • in this case, the calculation unit 16b of the client terminal 1b-A cannot calculate the gradient of the error function with respect to the layer weights in the output layer group 202b-A using the output value of the second model, and the calculation unit 16b of the client terminal 1b-B cannot calculate the gradient of the error function with respect to the layer weights in the output layer group 202b-B using the output value of the third model.
  • consequently, the calculation unit 24b of the cloud server 2b cannot calculate the gradient of the error function with respect to the layer weights in the intermediate layer group 201b based on the result that the client terminal 1b-A should have calculated using the output value of the second model, nor based on the result that the client terminal 1b-B should have calculated using the output value of the third model.
  • the calculation unit 17b of the client terminal 1b-A calculates the gradient of the error function with respect to each layer weight in the input layer group 200b-A-α, based on the calculation result of the cloud server 2b derived from the gradient of the error function calculated by the client terminal 1b-A using the output value of the first model (step S508 in FIG. 13).
  • the calculation unit 17b of the client terminal 1b-A calculates the gradient of the error function with respect to each layer weight in the input layer group 200b-A-β, based on the calculation result of the cloud server 2b derived from the gradient of the error function calculated by the client terminal 1b-A using the output value of the second model (step S508).
  • the calculation unit 17b of the client terminal 1b-B calculates the gradient of the error function with respect to each layer weight in the input layer group 200b-B-α, based on the calculation result of the cloud server 2b derived from the gradient of the error function calculated by the client terminal 1b-B using the output value of the third model (step S508).
  • the calculation unit 17b of the client terminal 1b-B calculates the gradient of the error function with respect to each layer weight in the input layer group 200b-B-β, based on the calculation result of the cloud server 2b derived from the gradient of the error function calculated by the client terminal 1b-B using the output value of the fourth model (step S508).
  • the model update unit 18b of the client terminal 1b-A updates the weights of the layers in the input layer group 200b-A-α based on the gradient of the error function calculated by the calculation unit 17b for those weights, and updates the weights of the layers in the input layer group 200b-A-β based on the gradient of the error function calculated by the calculation unit 17b for those weights.
  • further, the model update unit 18b of the client terminal 1b-A calculates, for each layer weight in the output layer group 202b-A, the average of the gradient of the error function calculated by the calculation unit 16b based on the output value of the first model and the label of the data α and the gradient of the error function calculated by the calculation unit 16b based on the output value of the second model and the label of the data β, and updates the weights of the layers in the output layer group 202b-A based on the calculated average (step S509 in FIG. 13).
  • similarly, the model update unit 18b of the client terminal 1b-B updates the weights of the layers in the input layer group 200b-B-α based on the gradient of the error function calculated by the calculation unit 17b for those weights, and updates the weights of the layers in the input layer group 200b-B-β based on the gradient of the error function calculated by the calculation unit 17b for those weights.
  • the model update unit 18b of the client terminal 1b-B calculates, for each layer weight in the output layer group 202b-B, the average of the gradient of the error function calculated by the calculation unit 16b based on the output value of the third model and the label of the data α and the gradient of the error function calculated by the calculation unit 16b based on the output value of the fourth model and the label of the data β, and updates the weights of the layers in the output layer group 202b-B based on the calculated average (step S509).
  • on the other hand, when labeled data cannot be acquired, the client terminals 1b-A and 1b-B cannot update the corresponding input layer groups using the calculation results of the error function in their own devices.
  • suppose again that the client terminal 1b-A could not acquire the data β, or that the data β acquired by the client terminal 1b-A was not labeled, and that the client terminal 1b-B could not acquire the data α, or that the data α acquired by the client terminal 1b-B was not labeled.
  • in this case, the calculation unit 16b of the client terminal 1b-A cannot calculate the gradient of the error function with respect to the layer weights in the output layer group 202b-A using the output value of the second model, and the calculation unit 16b of the client terminal 1b-B cannot calculate the gradient of the error function with respect to the layer weights in the output layer group 202b-B using the output value of the third model.
  • therefore, the model update unit 18b of the client terminal 1b-A cannot use the result that the calculation unit 16b should have calculated using the output value of the second model to update the output layer group 202b-A, and the model update unit 18b of the client terminal 1b-B cannot use the result that the calculation unit 16b should have calculated using the output value of the third model to update the output layer group 202b-B.
  • similarly, the calculation unit 17b of the client terminal 1b-A cannot calculate the gradient of the error function with respect to the layer weights in the input layer group 200b-A-β, and the calculation unit 17b of the client terminal 1b-B cannot calculate the gradient with respect to the layer weights in the input layer group 200b-B-α. Therefore, the model update unit 18b of the client terminal 1b-A cannot update the input layer group 200b-A-β, and the model update unit 18b of the client terminal 1b-B cannot update the input layer group 200b-B-α. To update the input layer groups 200b-A-β and 200b-B-α, it is therefore necessary to transmit the weights from the client terminals that can acquire the labeled data β and α, respectively.
  • since the model update unit 18b of the client terminal 1b-A could not update the input layer group 200b-A-β, the transmission unit 19 of the client terminal 1b-A requests the update result of the layer weights in the input layer group for the data β from another client terminal (step S510 in FIG. 13).
  • likewise, since the model update unit 18b of the client terminal 1b-B could not update the input layer group 200b-B-α, the transmission unit 19 of the client terminal 1b-B requests the update result of the layer weights in the input layer group for the data α from another client terminal (step S510).
  • the receiving unit 30 of the client terminal 1b-A receives the request from the client terminal 1b-B (step S511 in FIG. 13).
  • the transmission unit 19 of the client terminal 1b-A transmits the update result of the layer weights in the input layer group 200b-A-α to the client terminal 1b-B in response to the request from the client terminal 1b-B (step S512 in FIG. 13).
  • the receiving unit 30 of the client terminal 1b-B receives the request from the client terminal 1b-A (step S511).
  • the transmission unit 19 of the client terminal 1b-B transmits the update result of the layer weights in the input layer group 200b-B-β to the client terminal 1b-A in response to the request from the client terminal 1b-A (step S512).
  • the receiving unit 30 of the client terminal 1b-A receives the update result of the layer weights in the input layer group 200b-B-β from the client terminal 1b-B (step S513 in FIG. 13).
  • the model update unit 18b of the client terminal 1b-A updates the weights of the layers in the input layer group 200b-A-β using the received update result (step S514 in FIG. 13).
  • the receiving unit 30 of the client terminal 1b-B receives the update result of the layer weights in the input layer group 200b-A-α from the client terminal 1b-A (step S513).
  • the model update unit 18b of the client terminal 1b-B updates the weights of the layers in the input layer group 200b-B-α using the received update result (step S514).
  • when the client terminals 1b-A and 1b-B can both acquire the labeled data α and β, the processing of steps S510 to S514 is unnecessary; otherwise the exchange can be pictured as in the sketch below.
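  • the following sketch of the weight exchange in steps S510 to S514 abstracts the transport between terminals; peer.request_weights is a hypothetical method, not part of the patent text.

```python
def fill_missing_updates(own_updates, peer):
    # own_updates maps input-layer-group name -> updated weights, with None
    # for the groups this terminal could not update by itself.
    missing = [name for name, w in own_updates.items() if w is None]
    if missing:                                   # step S510: issue the request
        received = peer.request_weights(missing)  # steps S511 and S512 on the peer
        own_updates.update(received)              # steps S513 and S514: adopt them
    return own_updates
```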
  • when a multifaceted model is constructed as in this embodiment, even a client terminal that can acquire only one of the sample data α and β can perform inference with a certain degree of accuracy, which is useful for initial decision making.
  • further, even a client terminal that cannot acquire some of the sample data can train its model by sharing calculation results with the client terminals that can acquire those data, and the personal information contained in the sample data can be protected.
  • FIG. 14 is a diagram showing a configuration of a distributed deep learning system according to a fourth embodiment of the present invention.
  • the distributed deep learning system of this embodiment is composed of client terminals 1c-A and 1c-B and a cloud server 2c connected to them via a network.
  • in this embodiment, there are a plurality of types of sample data, and each of the plurality of client terminals has a separate input layer group and a separate output layer group for each type of sample data.
  • the input layer group 200c-A-α and the output layer group 202c-A-α of the first model for the sample data α are mounted on the client terminal 1c-A, and the intermediate layer group 201c of the first model is mounted on the cloud server 2c.
  • the input layer group 200c-A-β and the output layer group 202c-A-β of the second model for the sample data β are mounted on the client terminal 1c-A, and the intermediate layer group 201c of the second model is mounted on the cloud server 2c.
  • the input layer group 200c-B-α and the output layer group 202c-B-α of the third model for the data α are mounted on the client terminal 1c-B, and the intermediate layer group 201c of the third model is mounted on the cloud server 2c.
  • the input layer group 200c-B-β and the output layer group 202c-B-β of the fourth model for the data β are mounted on the client terminal 1c-B, and the intermediate layer group 201c of the fourth model is mounted on the cloud server 2c.
  • the first to fourth models share the intermediate layer group 201c.
  • FIG. 15 is a block diagram showing the configurations of client terminals 1c-A and 1c-B, and the same configurations as those in FIG. 11 are designated by the same reference numerals.
  • the client terminals 1c-A and 1c-B each include a storage unit 10c, a data acquisition unit 11, calculation units 12b, 15c, 16c, and 17b, a transmission unit 13, a reception unit 14, a model update unit 18c, a transmission unit 19c, and a reception unit 30c.
  • the storage unit 10c of the client terminal 1c-A stores the data of the input layer groups 200c-A-α and 200c-A-β and the output layer groups 202c-A-α and 202c-A-β, from which these layer groups are constructed. Their construction is performed by the CPU (not shown) of the client terminal 1c-A.
  • the storage unit 10c of the client terminal 1c-B stores the data of the input layer groups 200c-B-α and 200c-B-β and the output layer groups 202c-B-α and 202c-B-β, from which these layer groups are constructed. Their construction is performed by the CPU (not shown) of the client terminal 1c-B. Since the configuration of the cloud server 2c is the same as that of the cloud server 2b of the third embodiment, the reference numerals of FIG. 12 will be used for description.
  • the client terminals 1c-A and 1c-B each execute the process shown in FIG. 4 for the acquired sample data.
  • the processes of steps S100 to S102 are the same as the processes described in the third embodiment.
  • the inference operation of the cloud server 2c is the same as that of the third embodiment.
  • the calculation unit 15c of the client terminal 1c-A receives, via the reception unit 14, the output value of the intermediate layer group 201c calculated from the output value of the input layer group 200c-A-α from the cloud server 2c, and calculates the result of inputting this output value into the output layer group 202c-A-α (step S104 in FIG. 4).
  • the calculation unit 15c of the client terminal 1c-A also receives the output value of the intermediate layer group 201c calculated from the output value of the input layer group 200c-A-β from the cloud server 2c, and calculates the result of inputting it into the output layer group 202c-A-β (step S104).
  • similarly, the calculation unit 15c of the client terminal 1c-B receives, via the reception unit 14, the output value of the intermediate layer group 201c calculated from the output value of the input layer group 200c-B-α from the cloud server 2c, and calculates the result of inputting it into the output layer group 202c-B-α (step S104). Further, the calculation unit 15c of the client terminal 1c-B receives the output value of the intermediate layer group 201c calculated from the output value of the input layer group 200c-B-β from the cloud server 2c, and calculates the result of inputting it into the output layer group 202c-B-β (step S104).
  • FIG. 16 is a flowchart illustrating the learning operation of the client terminals 1c-A and 1c-B of the distributed deep learning system of this embodiment. Since the flow of the learning operation of the cloud server 2c is the same as the operation of the cloud server 2 of the first embodiment, the reference numerals of FIG. 7 will be used for description.
  • the client terminals 1c-A and 1c-B each execute the processing of FIG. 16 for the acquired sample data with labels.
  • the processing of the client terminals 1c-A and 1c-B in steps S600 to S604 of FIG. 16 is the same as the processing of steps S100 to S104 described in this embodiment.
  • the calculation unit 16c and the transmission unit 13 of the client terminal 1c-A execute the same processing as in steps S305 and S306 of FIG. 6 in a time-division manner for each of the data α and the data β (steps S605 and S606 in FIG. 16). Specifically, the calculation unit 16c calculates the gradient of the error function with respect to each layer weight in the output layer group 202c-A-α based on the output value of the first model and the label of the data α, and calculates the gradient of the error function with respect to each layer weight in the output layer group 202c-A-β based on the output value of the second model and the label of the data β.
  • likewise, the calculation unit 16c and the transmission unit 13 of the client terminal 1c-B execute the same processing as in steps S305 and S306 in a time-division manner for each of the data α and the data β (steps S605 and S606). Specifically, the calculation unit 16c calculates the gradient of the error function with respect to each layer weight in the output layer group 202c-B-α based on the output value of the third model and the label of the data α, and calculates the gradient of the error function with respect to each layer weight in the output layer group 202c-B-β based on the output value of the fourth model and the label of the data β.
  • the processing of the cloud server 2c in steps S400 to S406 of FIG. 7 is the same as the processing described in the third embodiment.
  • the processing of the client terminals 1c-A and 1c-B in steps S607 and S608 of FIG. 16 is the same as the processing of steps S507 and S508 described in the third embodiment.
  • the model update unit 18c of the client terminal 1c-A updates the weights of the layers in the input layer group 200c-A-α based on the gradient of the error function calculated by the calculation unit 17b for those weights, and updates the weights of the layers in the input layer group 200c-A-β based on the gradient of the error function calculated by the calculation unit 17b for those weights.
  • the model update unit 18c of the client terminal 1c-A also updates the weights of the layers in the output layer group 202c-A-α based on the gradient of the error function calculated by the calculation unit 16c from the output value of the first model and the label of the data α, and updates the weights of the layers in the output layer group 202c-A-β based on the gradient of the error function calculated by the calculation unit 16c from the output value of the second model and the label of the data β (step S609 in FIG. 16).
  • similarly, the model update unit 18c of the client terminal 1c-B updates the weights of the layers in the input layer group 200c-B-α based on the gradient of the error function calculated by the calculation unit 17b for those weights, and updates the weights of the layers in the input layer group 200c-B-β based on the gradient of the error function calculated by the calculation unit 17b for those weights.
  • the model update unit 18c of the client terminal 1c-B also updates the weights of the layers in the output layer group 202c-B-α based on the gradient of the error function calculated by the calculation unit 16c from the output value of the third model and the label of the data α, and updates the weights of the layers in the output layer group 202c-B-β based on the gradient of the error function calculated by the calculation unit 16c from the output value of the fourth model and the label of the data β (step S609).
  • in some cases, however, the client terminals 1c-A and 1c-B cannot update their input layer groups and output layer groups using the calculation results of the error function obtained in their own devices.
  • suppose that the client terminal 1c-A could not acquire the data β, or that the data β acquired by the client terminal 1c-A was not labeled, and that the client terminal 1c-B could not acquire the data α, or that the data α acquired by the client terminal 1c-B was not labeled.
  • in this case, the calculation unit 16c of the client terminal 1c-A cannot calculate the gradient of the error function for the weights of the layers in the output layer group 202c-A-β, and the calculation unit 16c of the client terminal 1c-B cannot calculate the gradient of the error function for the weights of the layers in the output layer group 202c-B-α.
  • similarly, the calculation unit 17b of the client terminal 1c-A cannot calculate the gradient of the error function for the weights of the layers in the input layer group 200c-A-β, and the calculation unit 17b of the client terminal 1c-B cannot calculate the gradient of the error function for the weights of the layers in the input layer group 200c-B-α.
  • it is therefore necessary to send the weights from a client terminal that could acquire the labeled data α or β.
  • since the model update unit 18c of the client terminal 1c-A could not update the input layer group 200c-A-β and the output layer group 202c-A-β, the transmission unit 19c of the client terminal 1c-A requests the update result of the weights of the layers in the input layer group for the data β and the update result of the weights of the layers in the output layer group for the data β from the other client terminal (step S610 in FIG. 16).
  • since the model update unit 18c of the client terminal 1c-B could not update the input layer group 200c-B-α and the output layer group 202c-B-α, the transmission unit 19c of the client terminal 1c-B requests the update result of the weights of the layers in the input layer group for the data α and the update result of the weights of the layers in the output layer group for the data α from the other client terminal (step S610).
  • the receiving unit 30c of the client terminal 1c-A receives the request from the client terminal 1c-B (step S611 in FIG. 16).
  • in response to the request from the client terminal 1c-B, the transmission unit 19c of the client terminal 1c-A transmits the update result of the weights of the layers in the input layer group 200c-A-α and the update result of the weights of the layers in the output layer group 202c-A-α to the client terminal 1c-B (step S612 in FIG. 16).
  • the receiving unit 30c of the client terminal 1c-B receives the request from the client terminal 1c-A (step S611).
  • in response to the request from the client terminal 1c-A, the transmission unit 19c of the client terminal 1c-B transmits the update result of the weights of the layers in the input layer group 200c-B-β and the update result of the weights of the layers in the output layer group 202c-B-β to the client terminal 1c-A (step S612).
  • the receiving unit 30c of the client terminal 1c-A receives the update result of the weights of the layers in the input layer group 200c-B-β and the update result of the weights of the layers in the output layer group 202c-B-β from the client terminal 1c-B (step S613 in FIG. 16).
  • the model update unit 18c of the client terminal 1c-A updates the weights of the layers in the input layer group 200c-A-β using the update result of the weights of the layers in the input layer group 200c-B-β, and updates the weights of the layers in the output layer group 202c-A-β using the update result of the weights of the layers in the output layer group 202c-B-β (step S614 of FIG. 16).
  • the receiving unit 30c of the client terminal 1c-B receives the update result of the weights of the layers in the input layer group 200c-A-α and the update result of the weights of the layers in the output layer group 202c-A-α from the client terminal 1c-A (step S613).
  • the model update unit 18c of the client terminal 1c-B updates the weights of the layers in the input layer group 200c-B-α using the update result of the weights of the layers in the input layer group 200c-A-α, and updates the weights of the layers in the output layer group 202c-B-α using the update result of the weights of the layers in the output layer group 202c-A-α (step S614).
  • when the client terminals 1c-A and 1c-B can both acquire the labeled data α and β, the processing of steps S610 to S614 becomes unnecessary.
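  • as an illustration of the exchange in steps S610 to S614, the following minimal sketch shows each client adopting, for every (layer group, data type) pair it could not update itself, the update result obtained from its peer. The dictionary layout, the string stand-ins for weight arrays, and the adopt-as-is policy are illustrative assumptions, not details fixed by this description.

```python
# Hypothetical sketch of the peer exchange of steps S610-S614.
# Keys are (layer group kind, data type); None marks a group that could not be
# updated because the labeled data was unavailable.
groups_A = {("input", "alpha"): "updated-by-A", ("output", "alpha"): "updated-by-A",
            ("input", "beta"): None, ("output", "beta"): None}    # 1c-A lacked labeled beta
groups_B = {("input", "beta"): "updated-by-B", ("output", "beta"): "updated-by-B",
            ("input", "alpha"): None, ("output", "alpha"): None}  # 1c-B lacked labeled alpha

def exchange(mine, peer):
    for key, value in mine.items():
        if value is None and peer.get(key) is not None:  # S610: request what is missing
            mine[key] = peer[key]                        # S612-S614: receive and apply

exchange(groups_A, groups_B)   # 1c-A fills its beta groups from 1c-B
exchange(groups_B, groups_A)   # 1c-B fills its alpha groups from 1c-A
print(groups_A[("input", "beta")], groups_B[("output", "alpha")])
# -> updated-by-B updated-by-A
```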
  • the weights in the third and fourth embodiments can be communicated in either a centralized or a distributed manner. A centralized configuration is shown in FIG. 17.
  • the distributed deep learning system of this embodiment is composed of client terminals 1d-A and 1d-B, a cloud server 2c, and a storage server 3 connected to client terminals 1d-A and 1d-B via a network.
  • FIG. 18 is a block diagram showing the configurations of client terminals 1d-A and 1d-B, and the same configurations as those in FIG. 15 are designated by the same reference numerals.
  • the client terminals 1d-A and 1d-B each include a storage unit 10c, a data acquisition unit 11, calculation units 12b, 15c, 16c, and 17b, a transmission unit 13, a reception unit 14, a model update unit 18c, a writing unit 31, and a reading unit 32.
  • FIG. 19 is a flowchart illustrating the learning operation of the client terminals 1d-A and 1d-B.
  • the processing of steps S600 to S609 of FIG. 19 is the same as that of the fourth embodiment.
  • the writing unit 31 of the client terminal 1d-A writes the update results of the weights of the layers in the input layer groups 200c-A-α and 200c-A-β and the update results of the weights of the layers in the output layer groups 202c-A-α and 202c-A-β to the storage server 3 via the network (step S615 in FIG. 19).
  • the writing unit 31 of the client terminal 1d-B writes the update results of the weights of the layers in the input layer groups 200c-B-α and 200c-B-β and the update results of the weights of the layers in the output layer groups 202c-B-α and 202c-B-β to the storage server 3 via the network (step S615).
  • however, the client terminals 1d-A and 1d-B cannot write at least some of the update results to the storage server 3 when the sample data cannot be acquired or the acquired sample data is not labeled.
  • suppose that the client terminal 1d-A could not acquire the data β, or that the data β acquired by the client terminal 1d-A was not labeled, and that the client terminal 1d-B could not acquire the data α, or that the data α acquired by the client terminal 1d-B was not labeled.
  • the client terminal 1d-A cannot write the update results of the input layer group 200c-A- ⁇ and the output layer group 202c-A- ⁇ to the storage server 3.
  • the client terminal 1d-B cannot write the update results of the input layer group 200c-B- ⁇ and the output layer group 202c-B- ⁇ to the storage server 3.
  • since the model update unit 18c of the client terminal 1d-A could not update the input layer group 200c-A-β and the output layer group 202c-A-β, the reading unit 32 of the client terminal 1d-A reads the update result of the weights of the layers in the input layer group for the data β and the update result of the weights of the layers in the output layer group for the data β from the storage server 3 (step S616 in FIG. 19).
  • since the model update unit 18c of the client terminal 1d-B could not update the input layer group 200c-B-α and the output layer group 202c-B-α, the reading unit 32 of the client terminal 1d-B reads the update result of the weights of the layers in the input layer group for the data α and the update result of the weights of the layers in the output layer group for the data α from the storage server 3 (step S616).
  • the model update unit 18c of the client terminal 1d-A updates the weights of the layers in the input layer group 200c-A-β using the update result of the weights of the layers in the input layer group 200c-B-β, and updates the weights of the layers in the output layer group 202c-A-β using the update result of the weights of the layers in the output layer group 202c-B-β (step S617 in FIG. 19).
  • the model update unit 18c of the client terminal 1d-B updates the weights of the layers in the input layer group 200c-B-α using the update result of the weights of the layers in the input layer group 200c-A-α, and updates the weights of the layers in the output layer group 202c-B-α using the update result of the weights of the layers in the output layer group 202c-A-α (step S617).
  • in this way, a client terminal that could not acquire the sample data can update its input layer group and output layer group by reading from the storage server 3 the update results of a client terminal that could acquire the sample data.
  • the storage server 3 stores weight data for each type of input layer group (for each type of data), and also stores weight data for each type of output layer group (for each type of data).
  • when writing a weight to the storage server 3, the writing unit 31 of the client terminals 1d-A and 1d-B may overwrite the stored weight if a weight of the same type is already stored in the storage server 3. Alternatively, the writing unit 31 may calculate the average of the accumulated weight of the same type and the newly written weight, and overwrite the accumulated weight with this average value (a sketch of both policies follows).
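  • a minimal sketch of these write and read policies is given below: a plain overwrite, or an overwrite with the average of the accumulated weight and the newly written weight. The in-memory dictionary standing in for the storage server 3 and the key layout are illustrative assumptions.

```python
# Hypothetical sketch of the storage server 3 write policies (step S615) and
# the corresponding read (step S616).
import numpy as np

storage = {}  # key: (layer group kind, data type), value: weight array

def write_weights(key, w, average=False):
    if average and key in storage:
        storage[key] = (storage[key] + w) / 2.0  # overwrite with the average
    else:
        storage[key] = w                         # first write, or plain overwrite

def read_weights(key):
    return storage.get(key)                      # read another client's update result

write_weights(("input", "alpha"), np.full(3, 2.0))
write_weights(("input", "alpha"), np.full(3, 4.0), average=True)
print(read_weights(("input", "alpha")))          # [3. 3. 3.]
```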
  • in the third and fourth embodiments, steps S510 to S514 and S610 to S614 are performed by one-to-one communication between the two client terminals, that is, in a distributed manner. With the distributed type, the system keeps operating even if the number of client terminals increases or decreases, so it is robust against communication failures, and the communication load and the delay are also small. With the centralized type, on the other hand, the cost increases and the robustness against communication failures decreases, since the storage server 3 must be provided and all exchanges pass through it.
  • the centralized configuration is applied to the fourth embodiment, but it goes without saying that the configuration may be applied to the third embodiment.
  • when the centralized configuration is applied to the third embodiment, the update results are written to and read from the storage server only for the input layer groups.
  • the number of client terminals is two, but it goes without saying that there may be three or more client terminals.
  • each of the client terminals described in the first to fifth embodiments can be realized by a computer provided with a CPU (Central Processing Unit), a storage device, and an interface, and by a program that controls these hardware resources.
  • An example of the configuration of this computer is shown in FIG. 20.
  • the computer includes a CPU 300, a storage device 301, and an interface device (I/F) 302.
  • a network or the like is connected to the I/F 302.
  • the program for realizing the present invention is stored in the storage device 301.
  • the CPU 300 of each client terminal executes the processing described in the first to fifth embodiments according to the program stored in its storage device 301.
  • the cloud server and the storage server can also be realized by a computer having the same configuration as in FIG. 20.
  • the present invention can be applied to a distributed deep learning system that executes deep learning in a distributed and coordinated manner on a client terminal and a cloud server.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Transfer Between Computers (AREA)
  • Image Analysis (AREA)

Abstract

This distributed deep learning system comprises a client terminal (1) and a cloud server (2) connected to the client terminal (1) via a network. An input layer group (200) and an output layer group (202) of a model are built on the client terminal (1), and an intermediate layer group (201) of the model is built on the cloud server (2).

Description

For deep learning, a variety of applications have been proposed owing to its high performance and broad applicability, and its performance has been shown to surpass that of earlier technologies. It is known that deep learning requires both computational resources and data resources in order to surpass existing methods.
In the system of Non-Patent Document 1 shown in FIG. 21, the client terminal 101-C that has updated the model 1000-C returns to the cloud server 100 the differences between the weights of each layer of the model distributed by the cloud server 100 and those of the updated model 1000-C. The cloud server 100 averages the models sent from the client terminals 101-A, 101-B, and 101-C, updates its own model 1000, and distributes the updated model 1000 to the client terminals 101-A, 101-B, and 101-C again.
The method disclosed in Non-Patent Document 1 has the following effects.
(I) A client terminal with scarce computational resources does not have to perform learning.
(II) Even the client terminal of a user with scarce data resources and little expertise in tasks such as labeling can benefit from the learning results.
(III) Since the client terminal and the cloud server exchange only the weights of each layer of the model, personal information contained in the learning data is protected.
However, the method disclosed in Non-Patent Document 1 has the problem that a model specialized for a client terminal cannot be created when the tendency of the learning data that can be acquired differs from client terminal to client terminal.
The present invention has been made to solve the above problem, and an object of the present invention is to provide a distributed deep learning system that requires fewer computational resources on the client terminal than conventional methods and that can create a model specialized for the client terminal.
The distributed deep learning system of the present invention includes a client terminal and a cloud server connected to the client terminal via a network. The client terminal includes: a first calculation unit configured to calculate an output value obtained by inputting sample data into an input layer group of a model; a second calculation unit configured to input an output value of an intermediate layer group calculated by the cloud server into an output layer group of the model and calculate an output value of the model; a third calculation unit configured to calculate, at the time of training the model, an error function of the weights of the output layer group based on the output value of the model and the label of the sample data; a fourth calculation unit configured to calculate, at the time of training the model, an error function of the weights of the input layer group based on an error function of the weights of the intermediate layer group calculated by the cloud server; a first model update unit configured to update the weights of the input layer group based on the error function calculated by the fourth calculation unit and to update the weights of the output layer group based on the error function calculated by the third calculation unit; a first transmission unit configured to transmit the output value of the input layer group and the error function of the weights of the output layer group to the cloud server; and a first reception unit configured to receive the output value of the intermediate layer group and the error function of the weights of the intermediate layer group calculated by the cloud server. The cloud server includes: a fifth calculation unit configured to calculate an output value obtained by inputting the output value of the input layer group calculated by the client terminal into the intermediate layer group; a sixth calculation unit configured to calculate, at the time of training the model, the error function of the weights of the intermediate layer group based on the error function of the weights of the output layer group calculated by the client terminal; a second model update unit configured to update the weights of the intermediate layer group based on the error function calculated by the sixth calculation unit; a second transmission unit configured to transmit the output value of the intermediate layer group and the error function of the weights of the intermediate layer group to the client terminal; and a second reception unit configured to receive the output value of the input layer group and the error function of the weights of the output layer group calculated by the client terminal.
According to the present invention, by constructing the input layer group and the output layer group of the model on the client terminal and constructing the intermediate layer group of the model on the cloud server, it is possible to realize a distributed deep learning system that requires fewer computational resources on the client terminal than conventional methods and that can create a model specialized for the client terminal.
FIG. 1 is a diagram showing the configuration of a distributed deep learning system according to a first embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of a client terminal of the distributed deep learning system according to the first embodiment of the present invention.
FIG. 3 is a block diagram showing the configuration of a cloud server of the distributed deep learning system according to the first embodiment of the present invention.
FIG. 4 is a flowchart illustrating the inference operation of the client terminal of the distributed deep learning system according to the first embodiment of the present invention.
FIG. 5 is a flowchart illustrating the inference operation of the cloud server of the distributed deep learning system according to the first embodiment of the present invention.
FIG. 6 is a flowchart illustrating the learning operation of the client terminal of the distributed deep learning system according to the first embodiment of the present invention.
FIG. 7 is a flowchart illustrating the learning operation of the cloud server of the distributed deep learning system according to the first embodiment of the present invention.
FIG. 8 is a diagram showing the configuration of a distributed deep learning system according to a second embodiment of the present invention.
FIG. 9 is a block diagram showing the configuration of a cloud server of the distributed deep learning system according to the second embodiment of the present invention.
FIG. 10 is a diagram showing the configuration of a distributed deep learning system according to a third embodiment of the present invention.
FIG. 11 is a block diagram showing the configuration of a client terminal of the distributed deep learning system according to the third embodiment of the present invention.
FIG. 12 is a block diagram showing the configuration of a cloud server of the distributed deep learning system according to the third embodiment of the present invention.
FIG. 13 is a flowchart illustrating the learning operation of a client terminal of the distributed deep learning system according to the third embodiment of the present invention.
FIG. 14 is a diagram showing the configuration of a distributed deep learning system according to a fourth embodiment of the present invention.
FIG. 15 is a block diagram showing the configuration of a client terminal of the distributed deep learning system according to the fourth embodiment of the present invention.
FIG. 16 is a flowchart illustrating the learning operation of a client terminal of the distributed deep learning system according to the fourth embodiment of the present invention.
FIG. 17 is a diagram showing the configuration of a distributed deep learning system according to a fifth embodiment of the present invention.
FIG. 18 is a block diagram showing the configuration of a client terminal of the distributed deep learning system according to the fifth embodiment of the present invention.
FIG. 19 is a flowchart illustrating the learning operation of a client terminal of the distributed deep learning system according to the fifth embodiment of the present invention.
FIG. 20 is a block diagram showing a configuration example of a computer that realizes the client terminals according to the first to fifth embodiments of the present invention.
FIG. 21 is a diagram showing the configuration of a conventional distributed deep learning system.
[First embodiment]
Hereinafter, embodiments of the present invention will be described with reference to the drawings. FIG. 1 shows the configuration of a distributed deep learning system according to a first embodiment of the present invention. The distributed deep learning system comprises a client terminal 1 and a cloud server 2 connected to the client terminal 1 via a network.
The model (neural network model) used in this embodiment is divided into three parts: an input layer group 200, an output layer group 202, and an intermediate layer group 201 between the input layer group 200 and the output layer group 202. The input layer group 200, the intermediate layer group 201, and the output layer group 202 each consist of one or more layers. In this embodiment, the input layer group 200 and the output layer group 202 are implemented on the client terminal 1, and the intermediate layer group 201 is implemented on the cloud server 2.
FIG. 2 is a block diagram showing the configuration of the client terminal 1, and FIG. 3 is a block diagram showing the configuration of the cloud server 2. The client terminal 1 includes a storage unit 10, a data acquisition unit 11, a calculation unit 12 (first calculation unit), a transmission unit 13 (first transmission unit), a reception unit 14 (first reception unit), a calculation unit 15 (second calculation unit), a calculation unit 16 (third calculation unit), a calculation unit 17 (fourth calculation unit), and a model update unit 18 (first model update unit). The storage unit 10 stores the data of the input layer group 200 and the output layer group 202, in which these groups are constructed. The construction of the input layer group 200 and the output layer group 202 is performed by a CPU (not shown) of the client terminal 1.
The cloud server 2 includes a storage unit 20, a reception unit 21 (second reception unit), a calculation unit 22 (fifth calculation unit), a transmission unit 23 (second transmission unit), a calculation unit 24 (sixth calculation unit), and a model update unit 25 (second model update unit). The storage unit 20 stores the data of the intermediate layer group 201, in which the intermediate layer group 201 is constructed. The construction of the intermediate layer group 201 is performed by a CPU (not shown) of the cloud server 2.
FIG. 4 is a flowchart illustrating the inference operation of the client terminal 1 of the distributed deep learning system of this embodiment, and FIG. 5 is a flowchart illustrating the inference operation of the cloud server 2.
The data acquisition unit 11 of the client terminal 1 acquires the sample data input by the user (step S100 in FIG. 4).
The calculation unit 12 of the client terminal 1 calculates the result of inputting the sample data acquired by the data acquisition unit 11 into the input layer group 200 (step S101 in FIG. 4).
The transmission unit 13 of the client terminal 1 receives the calculation result of the output value of the input layer group 200 from the calculation unit 12 and transmits this calculation result to the cloud server 2 (step S102 in FIG. 4).
The reception unit 21 of the cloud server 2 receives the output value of the input layer group 200 from the client terminal 1 (step S200 in FIG. 5).
The calculation unit 22 of the cloud server 2 calculates the result of inputting the output value of the input layer group 200 into the intermediate layer group 201 (step S201 in FIG. 5).
The transmission unit 23 of the cloud server 2 receives the calculation result of the output value of the intermediate layer group 201 from the calculation unit 22 and transmits this calculation result to the client terminal 1 (step S202 in FIG. 5).
The reception unit 14 of the client terminal 1 receives the output value of the intermediate layer group 201 from the cloud server 2 (step S103 in FIG. 4).
The calculation unit 15 of the client terminal 1 calculates the result of inputting the output value of the intermediate layer group 201 into the output layer group 202 (step S104 in FIG. 4).
In this way, the output value of the output layer group 202, that is, the output value of the model, can be calculated. Since this process computes the model in order from the input layer group 200 toward the output layer group 202, it is called forward propagation.
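As a concrete illustration, the following is a minimal single-process sketch of this split forward pass, assuming simple fully connected layers with tanh activations. The network transport of steps S102/S200 and S202/S103 is replaced by plain function calls, and all names (W_in, client_input_forward, and so on) are illustrative assumptions rather than anything specified here.

```python
# Hypothetical sketch of the split forward pass (steps S100-S104, S200-S202).
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(8, 16))    # input layer group 200 (client side)
W_mid = rng.normal(size=(16, 16))  # intermediate layer group 201 (cloud side)
W_out = rng.normal(size=(16, 3))   # output layer group 202 (client side)

def client_input_forward(x):
    return np.tanh(x @ W_in)       # step S101: client computes the input-group output

def cloud_forward(h_in):
    return np.tanh(h_in @ W_mid)   # step S201: cloud computes the middle-group output

def client_output_forward(h_mid):
    return h_mid @ W_out           # step S104: client computes the model output

x = rng.normal(size=(1, 8))        # step S100: sample data acquired on the client
h_in = client_input_forward(x)     # "sent" to the cloud in step S102
h_mid = cloud_forward(h_in)        # cloud receives it (S200) and replies (S202)
y = client_output_forward(h_mid)   # client receives h_mid (S103) and finishes
print(y.shape)                     # (1, 3): only activations cross the network
```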
FIG. 6 is a flowchart illustrating the learning operation of the client terminal 1 of the distributed deep learning system of this embodiment, and FIG. 7 is a flowchart illustrating the learning operation of the cloud server 2.
The data acquisition unit 11 of the client terminal 1 acquires labeled sample data (learning data) input by the user (step S300 in FIG. 6).
The operations of the client terminal 1 in steps S301 to S304 of FIG. 6 are as described for steps S101 to S104.
The operations of the cloud server 2 in steps S400 to S402 of FIG. 7 are as described for steps S200 to S202.
The calculation unit 16 of the client terminal 1 calculates the gradient of the error function for each of the weights of the layers in the output layer group 202 based on the output value of the model and the label attached to the sample data (step S305 in FIG. 6).
The transmission unit 13 of the client terminal 1 receives the calculation result of the gradient of the error function from the calculation unit 16 and transmits this calculation result to the cloud server 2 (step S306 in FIG. 6).
The reception unit 21 of the cloud server 2 receives the calculation result of the gradient of the error function from the client terminal 1 (step S403 in FIG. 7).
The calculation unit 24 of the cloud server 2 calculates the gradient of the error function for each of the weights of the layers in the intermediate layer group 201 based on the gradient of the error function received from the client terminal 1 (step S404 in FIG. 7).
The transmission unit 23 of the cloud server 2 receives the calculation result of the gradient of the error function from the calculation unit 24 and transmits this calculation result to the client terminal 1 (step S405 in FIG. 7).
The model update unit 25 of the cloud server 2 updates the weights of the layers in the intermediate layer group 201 based on the gradient of the error function calculated by the calculation unit 24 (step S406 in FIG. 7).
The reception unit 14 of the client terminal 1 receives the calculation result of the gradient of the error function from the cloud server 2 (step S307 in FIG. 6).
The calculation unit 17 of the client terminal 1 calculates the gradient of the error function for each of the weights of the layers in the input layer group 200 based on the gradient of the error function received from the cloud server 2 (step S308 in FIG. 6).
The model update unit 18 of the client terminal 1 updates the weights of the layers in the input layer group 200 based on the gradient of the error function calculated by the calculation unit 17, and updates the weights of the layers in the output layer group 202 based on the gradient of the error function calculated by the calculation unit 16 (step S309 in FIG. 6).
This completes the update of the entire model. Since the error function is computed in order from the output layer group 202 toward the input layer group 200, this process is called back propagation.
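Continuing the forward-pass sketch above, the following shows one way the split back propagation could look, assuming a squared error function E = 0.5 * ||y - t||^2 and the same tanh layers. Which quantities cross the network (steps S306 and S405) follows the flow above; the shapes, the learning rate, and the plain gradient-descent update are illustrative assumptions.

```python
# Hypothetical sketch of one split learning step (steps S300-S309, S400-S406).
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.normal(size=(8, 16))    # input layer group 200 (client)
W_mid = rng.normal(size=(16, 16))  # intermediate layer group 201 (cloud)
W_out = rng.normal(size=(16, 3))   # output layer group 202 (client)
lr = 0.01

x = rng.normal(size=(1, 8))        # labeled sample data (step S300)
t = np.array([[1.0, 0.0, 0.0]])    # its label

# Forward pass (steps S301-S304 / S400-S402), as in the inference sketch.
h_in = np.tanh(x @ W_in)           # client
h_mid = np.tanh(h_in @ W_mid)      # cloud
y = h_mid @ W_out                  # client

# Backward pass.
d_y = y - t                              # client: dE/dy
grad_W_out = h_mid.T @ d_y               # S305: output-group gradient (client)
d_hmid = d_y @ W_out.T                   # S306: gradient sent to the cloud
d_zmid = d_hmid * (1.0 - h_mid ** 2)     # cloud: through the middle-group tanh
grad_W_mid = h_in.T @ d_zmid             # S404: middle-group gradient (cloud)
d_hin = d_zmid @ W_mid.T                 # S405: gradient sent back to the client
d_zin = d_hin * (1.0 - h_in ** 2)        # client: through the input-group tanh
grad_W_in = x.T @ d_zin                  # S308: input-group gradient (client)

W_mid -= lr * grad_W_mid                 # S406: cloud updates the middle group
W_in -= lr * grad_W_in                   # S309: client updates the input group...
W_out -= lr * grad_W_out                 # ...and the output group
```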
In this embodiment, in both the inference operation and the learning operation, the computation of the intermediate layer group 201 is executed by the cloud server 2, so fewer computational resources are demanded of the client terminal 1 than with existing methods.
Furthermore, since the input layer group 200 and the output layer group 202 are learned on the client terminal 1, a model specialized for the client terminal 1 can be created.
Moreover, in this embodiment, the sample data is never sent to the cloud server 2, and labels are attached to the sample data on the client terminal 1, so the information contained in the sample data and the labels can be protected.
[Second embodiment]
Next, a second embodiment of the present invention will be described. FIG. 8 shows the configuration of a distributed deep learning system according to the second embodiment of the present invention. The distributed deep learning system of this embodiment comprises client terminals 1a-A and 1a-B and a cloud server 2a connected to the client terminals 1a-A and 1a-B via a network.
In this embodiment, when there are a plurality of client terminals, each client terminal has its own input layer group and output layer group. For example, when there are two client terminals as in FIG. 8, two models are created. The input layer group 200a-A and the output layer group 202a-A of the first model are implemented on the client terminal 1a-A, and the intermediate layer group 201a of the first model is implemented on the cloud server 2a. The input layer group 200a-B and the output layer group 202a-B of the second model are implemented on the client terminal 1a-B, and the intermediate layer group 201a of the second model is implemented on the cloud server 2a. The first model and the second model thus share the intermediate layer group 201a.
Since the configuration of the client terminals 1a-A and 1a-B is the same as that of the client terminal 1 of the first embodiment, it will be described with the reference numerals of FIG. 2.
FIG. 9 is a block diagram showing the configuration of the cloud server 2a. The cloud server 2a includes a storage unit 20a, a reception unit 21, a calculation unit 22a, a transmission unit 23, a calculation unit 24a, and a model update unit 25a. The storage unit 20a stores the data of the intermediate layer group 201a, in which the intermediate layer group 201a is constructed. The construction of the intermediate layer group 201a is performed by a CPU (not shown) of the cloud server 2a.
The flow of the inference operation of each of the client terminals 1a-A and 1a-B is the same as that of the client terminal 1 of the first embodiment, and the flow of the inference operation of the cloud server 2a is the same as that of the cloud server 2 of the first embodiment, so the inference operation of this embodiment will be described with reference to FIGS. 4 and 5.
The client terminals 1a-A and 1a-B each execute the processing of FIG. 4 on the sample data they have acquired.
The difference from the first embodiment in the inference operation of this embodiment is that, when data arrives from the client terminal 1a-A and the client terminal 1a-B at the same time, the client terminal 1a-A and the client terminal 1a-B share the intermediate layer group 201a in a time-division manner. That is, the cloud server 2a processes the data from the client terminal 1a-A and the data from the client terminal 1a-B in a time-division manner.
Specifically, when the calculation unit 22a of the cloud server 2a receives the output values of the input layer groups 200a-A and 200a-B from the client terminals 1a-A and 1a-B, it first calculates, for example, the result of inputting the output value of the input layer group 200a-A of the client terminal 1a-A into the intermediate layer group 201a (step S201 in FIG. 5). The transmission unit 23 of the cloud server 2a returns the calculation result of the calculation unit 22a to the client terminal 1a-A, the sender of the output value of the input layer group 200a-A (step S202 in FIG. 5).
Subsequently, the calculation unit 22a calculates the result of inputting the output value of the input layer group 200a-B of the client terminal 1a-B into the intermediate layer group 201a (step S201). The transmission unit 23 returns the calculation result of the calculation unit 22a to the client terminal 1a-B, the sender of the output value of the input layer group 200a-B (step S202).
The flow of the learning operation of each of the client terminals 1a-A and 1a-B is the same as that of the client terminal 1 of the first embodiment, and the flow of the learning operation of the cloud server 2a is the same as that of the cloud server 2 of the first embodiment, so the learning operation of this embodiment will be described with reference to FIGS. 6 and 7.
The client terminals 1a-A and 1a-B each execute the processing of FIG. 6 on the labeled sample data they have acquired.
The difference from the first embodiment in the learning operation of this embodiment is that, when data arrives from the client terminal 1a-A and the client terminal 1a-B at the same time, the cloud server 2a processes the data from the client terminal 1a-A and the data from the client terminal 1a-B in a time-division manner. The time-division processing of the cloud server 2a in steps S401 and S402 of FIG. 7 is the same as the processing described above for steps S201 and S202.
When the calculation unit 24a of the cloud server 2a receives the calculation results of the gradient of the error function from the client terminals 1a-A and 1a-B, it first calculates, for example, the gradient of the error function for each of the weights of the layers in the intermediate layer group 201a based on the gradient of the error function received from the client terminal 1a-A (step S404 in FIG. 7). The transmission unit 23 of the cloud server 2a returns the calculation result of the calculation unit 24a to the client terminal 1a-A, the sender of that gradient of the error function (step S405 in FIG. 7).
Subsequently, the calculation unit 24a calculates the gradient of the error function for each of the weights of the layers in the intermediate layer group 201a based on the gradient of the error function received from the client terminal 1a-B (step S404). The transmission unit 23 returns the calculation result of the calculation unit 24a to the client terminal 1a-B, the sender of that gradient of the error function (step S405).
The model update unit 25a of the cloud server 2a calculates, for each layer weight in the intermediate layer group 201a, the average of the calculation result of the calculation unit 24a based on the gradient of the error function received from the client terminal 1a-A and the calculation result of the calculation unit 24a based on the gradient of the error function received from the client terminal 1a-B, and updates the weights of the layers in the intermediate layer group 201a based on the calculated averages (step S406 in FIG. 7).
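A minimal sketch of this time-division handling and per-weight averaging on the cloud side might look as follows. The request list, the identity stand-in for the middle-group backpropagation, and the learning rate are illustrative assumptions.

```python
# Hypothetical sketch of serving two clients in a time-division manner and
# averaging their gradients before updating the shared intermediate layer
# group 201a (steps S404-S406).
import numpy as np

W_mid = np.zeros((4, 4))           # shared intermediate layer weights
lr = 0.1

def middle_grad(upstream_grad):
    return upstream_grad           # stand-in for the real step S404 backpropagation

requests = [("client-1a-A", np.ones((4, 4))),      # gradients arriving together
            ("client-1a-B", 3 * np.ones((4, 4)))]

per_client = []
for client_id, g_out in requests:  # processed one at a time (time division)
    g_mid = middle_grad(g_out)     # S404 for this client's request
    per_client.append(g_mid)       # the result is also returned to client_id (S405)

W_mid -= lr * np.mean(per_client, axis=0)  # S406: update with the per-weight average
print(W_mid[0, 0])                 # -0.2 = -0.1 * (1 + 3) / 2
```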
When there are a plurality of client terminals as in this embodiment, the sample data acquired by, for example, the client terminal 1a-A and the client terminal 1a-B may be biased by the environment surrounding each client terminal. With the conventional method, the bias in the data of the client terminal 1a-A and the bias in the data of the client terminal 1a-B are averaged out, so differences in data between client terminals may adversely affect inference and learning. In this embodiment, by contrast, the input layer group and the output layer group can be learned for each client terminal, so learning specialized for each client terminal can be carried out.
Moreover, in this embodiment, the client terminal 1a-B can make use of the intermediate layers learned through the client terminal 1a-A, so the client terminal 1a-B only has to acquire data, and its labeling cost can be kept low.
[Third embodiment]
Next, a third embodiment of the present invention will be described. FIG. 10 shows the configuration of a distributed deep learning system according to the third embodiment of the present invention. The distributed deep learning system of this embodiment comprises client terminals 1b-A and 1b-B and a cloud server 2b connected to the client terminals 1b-A and 1b-B via a network.
In this embodiment, each of the plurality of client terminals has its own input layer groups and output layer group, and there are a plurality of types of sample data (for example, image data and audio data), with each client terminal having an input layer group for each type of sample data.
For example, when there are two client terminals and two types of sample data as in FIG. 10, four models are created. The input layer group 200b-A-α and the output layer group 202b-A of the first model, for the sample data α (for example, image data), are implemented on the client terminal 1b-A, and the intermediate layer group 201b of the first model is implemented on the cloud server 2b. The input layer group 200b-A-β and the output layer group 202b-A of the second model, for the sample data β (for example, audio data), are implemented on the client terminal 1b-A, and the intermediate layer group 201b of the second model is implemented on the cloud server 2b. The first model and the second model share the intermediate layer group 201b and the output layer group 202b-A.
The input layer group 200b-B-α and the output layer group 202b-B of the third model, for the data α, are implemented on the client terminal 1b-B, and the intermediate layer group 201b of the third model is implemented on the cloud server 2b. The input layer group 200b-B-β and the output layer group 202b-B of the fourth model, for the data β, are implemented on the client terminal 1b-B, and the intermediate layer group 201b of the fourth model is implemented on the cloud server 2b. The third model and the fourth model share the intermediate layer group 201b and the output layer group 202b-B.
FIG. 11 is a block diagram showing the configuration of the client terminals 1b-A and 1b-B, and FIG. 12 is a block diagram showing the configuration of the cloud server 2b. The client terminals 1b-A and 1b-B each include a storage unit 10b, a data acquisition unit 11, calculation units 12b, 15b, 16b, and 17b, a transmission unit 13, a reception unit 14, a model update unit 18b, a transmission unit 19, and a reception unit 30.
The storage unit 10b of the client terminal 1b-A stores the data of the input layer groups 200b-A-α and 200b-A-β and the output layer group 202b-A, in which these groups are constructed. Their construction is performed by a CPU (not shown) of the client terminal 1b-A.
The storage unit 10b of the client terminal 1b-B stores the data of the input layer groups 200b-B-α and 200b-B-β and the output layer group 202b-B, in which these groups are constructed. Their construction is performed by a CPU (not shown) of the client terminal 1b-B.
The cloud server 2b includes a storage unit 20b, a reception unit 21, calculation units 22b and 24b, a transmission unit 23, and a model update unit 25b. The storage unit 20b stores the data of the intermediate layer group 201b, in which the intermediate layer group 201b is constructed. The construction of the intermediate layer group 201b is performed by a CPU (not shown) of the cloud server 2b.
The flow of the inference operation of each of the client terminals 1b-A and 1b-B is the same as that of the client terminal 1 of the first embodiment, and the flow of the inference operation of the cloud server 2b is the same as that of the cloud server 2 of the first embodiment, so the inference operation of this embodiment will be described with reference to FIGS. 4 and 5.
The client terminals 1b-A and 1b-B each execute the processing of FIG. 4 on the sample data they have acquired.
The calculation unit 12b of the client terminal 1b-A calculates the result of inputting the data α acquired by the data acquisition unit 11 into the input layer group 200b-A-α (step S101 in FIG. 4). The transmission unit 13 of the client terminal 1b-A receives the calculation result of the output value of the input layer group 200b-A-α from the calculation unit 12b and transmits this calculation result to the cloud server 2b (step S102 in FIG. 4).
The calculation unit 12b of the client terminal 1b-A calculates the result of inputting the data β acquired by the data acquisition unit 11 into the input layer group 200b-A-β (step S101). The transmission unit 13 of the client terminal 1b-A transmits the calculation result of the output value of the input layer group 200b-A-β to the cloud server 2b (step S102).
The calculation unit 12b of the client terminal 1b-B calculates the result of inputting the data α acquired by the data acquisition unit 11 into the input layer group 200b-B-α (step S101). The transmission unit 13 of the client terminal 1b-B transmits the calculation result of the output value of the input layer group 200b-B-α to the cloud server 2b (step S102).
The calculation unit 12b of the client terminal 1b-B calculates the result of inputting the data β acquired by the data acquisition unit 11 into the input layer group 200b-B-β (step S101). The transmission unit 13 of the client terminal 1b-B transmits the output value of the input layer group 200b-B-β to the cloud server 2b (step S102).
The type of each piece of data can be identified easily, for example by an identifier attached to the data.
As in the second embodiment, the cloud server 2b processes the data from the client terminal 1b-A and the data from the client terminal 1b-B in a time-division manner. That is, the cloud server 2b executes the processing of steps S201 and S202 of FIG. 5 in a time-division manner for each of four calculation results: the output value of the input layer group 200b-A-α computed from data α, the output value of the input layer group 200b-A-β computed from data β, the output value of the input layer group 200b-B-α computed from data α, and the output value of the input layer group 200b-B-β computed from data β.
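The server-side time division can be pictured as a single queue drained one request at a time, as in the sketch below; the FIFO queue, the MiddleLayerGroup class, and the tanh layer are illustrative assumptions rather than details of the embodiment.

    import numpy as np
    from collections import deque

    class MiddleLayerGroup:
        # Hypothetical shared intermediate layer group on the cloud server.
        def __init__(self, dim, rng):
            self.W = rng.standard_normal((dim, dim)) * 0.01

        def forward(self, h):
            return np.tanh(h @ self.W)

    rng = np.random.default_rng(1)
    middle = MiddleLayerGroup(32, rng)
    queue = deque()

    # Activations arrive from both terminals and both data types.
    for client in ("1b-A", "1b-B"):
        for data_type in ("alpha", "beta"):
            queue.append((client, data_type, rng.standard_normal(32)))

    # Time division: one received result is processed at a time, in order
    # of arrival (steps S201 and S202), and the output is returned to the
    # originating terminal tagged with its data type.
    while queue:
        client, data_type, activation = queue.popleft()
        out = middle.forward(activation)
        print(client, data_type, out.shape)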
The calculation units 15b of the client terminals 1b-A and 1b-B calculate the results of inputting the output values of the intermediate layer group 201b received from the cloud server 2b into the output layer groups 202b-A and 202b-B, respectively (step S104 in FIG. 4).
At this time, the data received by the client terminal 1b-A contains two kinds of output values of the intermediate layer group 201b: those calculated from the output values of the input layer group 200b-A-α and those calculated from the output values of the input layer group 200b-A-β, so the processing of step S104 is executed for each of these two kinds of output values. Similarly, the data received by the client terminal 1b-B contains the output values of the intermediate layer group 201b calculated from the output values of the input layer group 200b-B-α and those calculated from the output values of the input layer group 200b-B-β, and the processing of step S104 is executed for each of these two kinds of output values.
If the client terminal 1b-A acquires only data α and the client terminal 1b-B acquires only data β, the data acquisition unit 11 of the client terminal 1b-A, which could not acquire data β, may generate complementary data for data β (for example, zero values or the average of past data). Likewise, the data acquisition unit 11 of the client terminal 1b-B, which could not acquire data α, may generate complementary data for data α.
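The complement-data generation mentioned above might be sketched as follows; the zero-value and past-average strategies come from the text, while the class name, buffer length, and interface are assumptions.

    import numpy as np

    class ComplementGenerator:
        # Produces substitute data for a type this terminal could not acquire.
        def __init__(self, dim, history_size=100):
            self.dim = dim
            self.history_size = history_size
            self.history = []  # past samples of this data type, if any

        def observe(self, sample):
            self.history.append(np.asarray(sample, dtype=float))
            self.history = self.history[-self.history_size:]

        def complement(self):
            if self.history:
                return np.mean(self.history, axis=0)  # average of past data
            return np.zeros(self.dim)                 # fall back to zero values

    gen = ComplementGenerator(dim=4)
    print(gen.complement())            # no history yet -> zero vector
    gen.observe([1.0, 2.0, 3.0, 4.0])
    gen.observe([3.0, 2.0, 1.0, 0.0])
    print(gen.complement())            # mean of the observed samples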
FIG. 13 is a flowchart illustrating the learning operation of the client terminals 1b-A and 1b-B of the distributed deep learning system of this embodiment. Since the learning operation flow of the cloud server 2b is the same as that of the cloud server 2 of the first embodiment, it will be described with reference to the reference numerals of FIG. 7.
The client terminals 1b-A and 1b-B each execute the processing of FIG. 13 for the labeled sample data they have acquired.
The processing of the client terminals 1b-A and 1b-B in steps S500 to S504 of FIG. 13 is the same as the processing of steps S100 to S104 described in this embodiment.
The calculation unit 16b and the transmission unit 13 of the client terminal 1b-A execute the same processing as in steps S305 and S306 of FIG. 6 in a time-division manner for each of data α and data β (steps S505 and S506 in FIG. 13). Specifically, the calculation unit 16b calculates the gradient of the error function for each of the weights of the layers in the output layer group 202b-A based on the output value of the first model and the label of data α, and calculates the gradient of the error function for each of the weights of the layers in the output layer group 202b-A based on the output value of the second model and the label of data β.
The calculation unit 16b and the transmission unit 13 of the client terminal 1b-B execute the same processing as in steps S305 and S306 in a time-division manner for each of data α and data β (steps S505 and S506). Specifically, the calculation unit 16b calculates the gradient of the error function for each of the weights of the layers in the output layer group 202b-B based on the output value of the third model and the label of data α, and calculates the gradient of the error function for each of the weights of the layers in the output layer group 202b-B based on the output value of the fourth model and the label of data β.
The time-division processing of the cloud server 2b in steps S401 and S402 of FIG. 7 is the same as the processing described for steps S201 and S202 of this embodiment.
The calculation unit 24b and the transmission unit 23 of the cloud server 2b execute the processing of steps S404 and S405 of FIG. 7 in a time-division manner for each of four error-function gradients: the gradient calculated by the client terminal 1b-A based on the output value of the first model for data α and the label of that data α, the gradient calculated by the client terminal 1b-A based on the output value of the second model for data β and the label of that data β, the gradient calculated by the client terminal 1b-B based on the output value of the third model for data α and the label of that data α, and the gradient calculated by the client terminal 1b-B based on the output value of the fourth model for data β and the label of that data β.
The model update unit 25b of the cloud server 2b calculates, for each weight of the layers in the intermediate layer group 201b, the average of the calculation result of the calculation unit 24b based on the error-function gradients received from the client terminal 1b-A and the calculation result of the calculation unit 24b based on the error-function gradients received from the client terminal 1b-B, and updates the weights of the layers in the intermediate layer group 201b based on the calculated averages (step S406 in FIG. 7).
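Step S406 can be pictured with the sketch below, which assumes plain gradient descent and an illustrative learning rate: the gradients derived from the two terminals' errors are averaged weight by weight, and the average drives the update of the intermediate layer group.

    import numpy as np

    learning_rate = 0.01  # assumed hyperparameter, not specified in the text

    # Error-function gradients for the same intermediate layer weights, one
    # set derived from client 1b-A's errors and one from client 1b-B's.
    grads_from_A = {"layer1.W": np.array([[0.2, -0.1], [0.0, 0.4]])}
    grads_from_B = {"layer1.W": np.array([[0.0, 0.3], [-0.2, 0.0]])}

    middle_weights = {"layer1.W": np.zeros((2, 2))}

    for name in middle_weights:
        avg = (grads_from_A[name] + grads_from_B[name]) / 2.0  # per-weight mean
        middle_weights[name] -= learning_rate * avg            # update (S406)

    print(middle_weights["layer1.W"])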
However, if the client terminals 1b-A and 1b-B cannot acquire sample data, or if the sample data is not labeled, the error function cannot be calculated.
The calculation results of the calculation unit 24b are of four kinds: the result based on the error-function gradient calculated by the client terminal 1b-A using the output value of the first model, the result based on the gradient calculated by the client terminal 1b-A using the output value of the second model, the result based on the gradient calculated by the client terminal 1b-B using the output value of the third model, and the result based on the gradient calculated by the client terminal 1b-B using the output value of the fourth model.
Suppose, for example, that the client terminal 1b-A could not acquire data β, or that the data β it acquired was not labeled, and likewise that the client terminal 1b-B could not acquire data α, or that the data α it acquired was not labeled. In this case, the calculation unit 16b of the client terminal 1b-A cannot calculate the gradient of the error function for the weights of the layers in the output layer group 202b-A using the output value of the second model, and the calculation unit 16b of the client terminal 1b-B cannot calculate the gradient of the error function for the weights of the layers in the output layer group 202b-B using the output value of the third model.
Consequently, the calculation unit 24b of the cloud server 2b cannot calculate the gradient of the error function for the weights of the layers in the intermediate layer group 201b based on the result that the client terminal 1b-A would otherwise have calculated using the output value of the second model, nor based on the result that the client terminal 1b-B would otherwise have calculated using the output value of the third model.
The calculation unit 17b of the client terminal 1b-A calculates the gradient of the error function for each of the weights of the layers in the input layer group 200b-A-α based on the calculation result of the cloud server 2b, which is itself based on the error-function gradient calculated by the client terminal 1b-A using the output value of the first model (step S508 in FIG. 13). The calculation unit 17b of the client terminal 1b-A also calculates the gradient of the error function for each of the weights of the layers in the input layer group 200b-A-β based on the calculation result of the cloud server 2b based on the error-function gradient calculated by the client terminal 1b-A using the output value of the second model (step S508).
The calculation unit 17b of the client terminal 1b-B calculates the gradient of the error function for each of the weights of the layers in the input layer group 200b-B-α based on the calculation result of the cloud server 2b based on the error-function gradient calculated by the client terminal 1b-B using the output value of the third model (step S508). The calculation unit 17b of the client terminal 1b-B also calculates the gradient of the error function for each of the weights of the layers in the input layer group 200b-B-β based on the calculation result of the cloud server 2b based on the error-function gradient calculated by the client terminal 1b-B using the output value of the fourth model (step S508).
The model update unit 18b of the client terminal 1b-A updates the weights of the layers in the input layer group 200b-A-α based on the error-function gradients that the calculation unit 17b calculated for those weights, and updates the weights of the layers in the input layer group 200b-A-β based on the error-function gradients that the calculation unit 17b calculated for those weights. In addition, the model update unit 18b of the client terminal 1b-A calculates, for each weight of the layers in the output layer group 202b-A, the average of the error-function gradient that the calculation unit 16b calculated based on the output value of the first model and the label of data α and the error-function gradient that the calculation unit 16b calculated based on the output value of the second model and the label of data β, and updates the weights of the layers in the output layer group 202b-A based on the calculated averages (step S509 in FIG. 13).
The model update unit 18b of the client terminal 1b-B updates the weights of the layers in the input layer group 200b-B-α based on the error-function gradients that the calculation unit 17b calculated for those weights, and updates the weights of the layers in the input layer group 200b-B-β based on the error-function gradients that the calculation unit 17b calculated for those weights. In addition, the model update unit 18b of the client terminal 1b-B calculates, for each weight of the layers in the output layer group 202b-B, the average of the error-function gradient that the calculation unit 16b calculated based on the output value of the third model and the label of data α and the error-function gradient that the calculation unit 16b calculated based on the output value of the fourth model and the label of data β, and updates the weights of the layers in the output layer group 202b-B based on the calculated averages (step S509).
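As an illustration of step S509 on a terminal, the sketch below again assumes plain gradient descent with an illustrative learning rate: each per-data-type input layer group is updated from its own gradient, while the terminal's single output layer group is updated from the per-weight average of the gradients obtained for data α and data β.

    import numpy as np

    lr = 0.01  # assumed learning rate

    # Per-data-type input layer groups, each updated from its own gradient
    # (computed in step S508).
    input_W = {"alpha": np.ones((3, 2)), "beta": np.ones((3, 2))}
    input_grad = {"alpha": np.full((3, 2), 0.1), "beta": np.full((3, 2), -0.2)}
    for data_type in input_W:
        input_W[data_type] -= lr * input_grad[data_type]

    # One shared output layer group per terminal: average the gradient from
    # the data-alpha model and the gradient from the data-beta model.
    output_W = np.ones((2, 2))
    grad_alpha = np.full((2, 2), 0.3)
    grad_beta = np.full((2, 2), 0.1)
    output_W -= lr * (grad_alpha + grad_beta) / 2.0  # step S509

    print(input_W["alpha"][0], output_W[0])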
However, if a client terminal 1b-A or 1b-B could not acquire sample data, or if the sample data is not labeled, it cannot use its own error-function calculation results to update the corresponding input layer group.
Suppose again, for example, that the client terminal 1b-A could not acquire data β, or that the data β it acquired was not labeled, and likewise that the client terminal 1b-B could not acquire data α, or that the data α it acquired was not labeled. In this case, the calculation unit 16b of the client terminal 1b-A cannot calculate the gradient of the error function for the weights of the layers in the output layer group 202b-A using the output value of the second model, and the calculation unit 16b of the client terminal 1b-B cannot calculate the gradient of the error function for the weights of the layers in the output layer group 202b-B using the output value of the third model.
Therefore, the model update unit 18b of the client terminal 1b-A cannot use the result that the calculation unit 16b would otherwise have calculated using the output value of the second model to update the output layer group 202b-A. Similarly, the model update unit 18b of the client terminal 1b-B cannot use the result that the calculation unit 16b would otherwise have calculated using the output value of the third model to update the output layer group 202b-B.
In addition, the calculation unit 17b of the client terminal 1b-A cannot calculate the gradient of the error function for the weights of the layers in the input layer group 200b-A-β, and the calculation unit 17b of the client terminal 1b-B cannot calculate the gradient of the error function for the weights of the layers in the input layer group 200b-B-α. The model update unit 18b of the client terminal 1b-A therefore cannot use the result that the calculation unit 17b would otherwise have calculated to update the input layer group 200b-A-β, and similarly the model update unit 18b of the client terminal 1b-B cannot use the result that the calculation unit 17b would otherwise have calculated to update the input layer group 200b-B-α. For this reason, in order to update the input layer groups 200b-A-β and 200b-B-α, the weights must be transmitted from the client terminals that were able to acquire the labeled data α and β.
Specifically, because the model update unit 18b of the client terminal 1b-A could not update the input layer group 200b-A-β, the transmission unit 19 of the client terminal 1b-A requests the update result of the weights of the layers in the input layer group for data β from the other client terminal (step S510 in FIG. 13).
Likewise, because the model update unit 18b of the client terminal 1b-B could not update the input layer group 200b-B-α, the transmission unit 19 of the client terminal 1b-B requests the update result of the weights of the layers in the input layer group for data α from the other client terminal (step S510).
The reception unit 30 of the client terminal 1b-A receives the request from the client terminal 1b-B (step S511 in FIG. 13). In response to the request, the transmission unit 19 of the client terminal 1b-A transmits the update result of the weights of the layers in the input layer group 200b-A-α to the client terminal 1b-B (step S512 in FIG. 13).
The reception unit 30 of the client terminal 1b-B receives the request from the client terminal 1b-A (step S511). In response to the request, the transmission unit 19 of the client terminal 1b-B transmits the update result of the weights of the layers in the input layer group 200b-B-β to the client terminal 1b-A (step S512).
The reception unit 30 of the client terminal 1b-A receives the update result of the weights of the layers in the input layer group 200b-B-β from the client terminal 1b-B (step S513 in FIG. 13). The model update unit 18b of the client terminal 1b-A updates the weights of the layers in the input layer group 200b-A-β using the received update result (step S514 in FIG. 13).
The reception unit 30 of the client terminal 1b-B receives the update result of the weights of the layers in the input layer group 200b-A-α from the client terminal 1b-A (step S513). The model update unit 18b of the client terminal 1b-B updates the weights of the layers in the input layer group 200b-B-α using the received update result (step S514).
Needless to say, if the client terminals 1b-A and 1b-B are each able to acquire the labeled data α and β, the processing of steps S510 to S514 becomes unnecessary.
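Steps S510 to S514 amount to a small request/response exchange between terminals. The sketch below models each terminal as an in-process object; the network transport, serialization, and method names are assumptions, and only the control flow mirrors the steps above.

    import numpy as np

    class Terminal:
        def __init__(self, name, weights):
            self.name = name
            self.weights = weights  # data type -> input layer group weights
            self.stale = set()      # data types it could not train this round

        def request_missing(self, peer):
            for data_type in self.stale:                     # step S510
                update = peer.serve(data_type)               # steps S511, S512
                if update is not None:                       # step S513
                    self.weights[data_type] = update.copy()  # step S514

        def serve(self, data_type):
            # Only answer for data types this terminal updated itself.
            return None if data_type in self.stale else self.weights[data_type]

    a = Terminal("1b-A", {"alpha": np.array([1.0, 2.0]),
                          "beta": np.array([9.0, 9.0])})
    b = Terminal("1b-B", {"alpha": np.array([9.0, 9.0]),
                          "beta": np.array([3.0, 4.0])})
    a.stale = {"beta"}   # 1b-A had no labeled data beta this round
    b.stale = {"alpha"}  # 1b-B had no labeled data alpha this round

    a.request_missing(b)
    b.request_missing(a)
    print(a.weights["beta"], b.weights["alpha"])  # both hold fresh weights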
In deep learning, there are cases where it is desirable to integrate data acquired by different methods for inference and learning, yet only the client terminal 1b-A can acquire data α and only the client terminal 1b-B can acquire data β. Even in such a case, inference can be realized by the client terminals 1b-A and 1b-B each contributing the sample data it was able to acquire, while the personal information contained in the sample data remains protected.
If a multifaceted model can be constructed as in this embodiment, even a client terminal that can acquire only one of the sample data α and β can perform inference with a certain degree of accuracy, which is useful for initial decision making. Moreover, even a client terminal that can acquire only one of the sample data α and β can carry out model learning by sharing calculation results with a client terminal that was able to acquire the other sample data, while the personal information contained in the sample data remains protected.
[Fourth Embodiment]
Next, a fourth embodiment of the present invention will be described. FIG. 14 is a diagram showing the configuration of a distributed deep learning system according to the fourth embodiment of the present invention. The distributed deep learning system of this embodiment is composed of client terminals 1c-A and 1c-B and a cloud server 2c connected to the client terminals 1c-A and 1c-B via a network.
As in the third embodiment, in this embodiment each of the plurality of client terminals has its own separate input layer groups and output layer groups, there are a plurality of types of sample data, and each client terminal has an input layer group and an output layer group for each type of sample data.
For example, when there are two client terminals and two types of sample data as in the example of FIG. 14, four models are created. The input layer group 200c-A-α and the output layer group 202c-A-α of the first model, for sample data α (for example, image data), are implemented on the client terminal 1c-A, and the intermediate layer group 201c of the first model is implemented on the cloud server 2c. The input layer group 200c-A-β and the output layer group 202c-A-β of the second model, for sample data β (for example, audio data), are implemented on the client terminal 1c-A, and the intermediate layer group 201c of the second model is implemented on the cloud server 2c.
The input layer group 200c-B-α and the output layer group 202c-B-α of the third model, for data α, are implemented on the client terminal 1c-B, and the intermediate layer group 201c of the third model is implemented on the cloud server 2c. The input layer group 200c-B-β and the output layer group 202c-B-β of the fourth model, for data β, are implemented on the client terminal 1c-B, and the intermediate layer group 201c of the fourth model is implemented on the cloud server 2c. The first to fourth models share the intermediate layer group 201c.
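This partitioning can be summarized with the small routing sketch below; the dictionary and the function are illustrative, and the strings simply echo the reference numerals. Each (terminal, data type) pair selects its own input and output layer groups, while all four models pass through the single shared intermediate layer group on the cloud server.

    # Hypothetical routing table for the four models of this embodiment.
    MODELS = {
        # (terminal, data type): (input layer group, output layer group)
        ("1c-A", "alpha"): ("200c-A-alpha", "202c-A-alpha"),  # first model
        ("1c-A", "beta"):  ("200c-A-beta",  "202c-A-beta"),   # second model
        ("1c-B", "alpha"): ("200c-B-alpha", "202c-B-alpha"),  # third model
        ("1c-B", "beta"):  ("200c-B-beta",  "202c-B-beta"),   # fourth model
    }

    def route(terminal, data_type):
        # Every model shares the one intermediate layer group 201c.
        in_group, out_group = MODELS[(terminal, data_type)]
        return in_group, "201c (shared)", out_group

    print(route("1c-B", "alpha"))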
FIG. 15 is a block diagram showing the configuration of the client terminals 1c-A and 1c-B, in which the same components as in FIG. 11 are given the same reference numerals. The client terminals 1c-A and 1c-B each include a storage unit 10c, a data acquisition unit 11, calculation units 12b, 15c, 16c, and 17b, a transmission unit 13, a reception unit 14, a model update unit 18c, a transmission unit 19c, and a reception unit 30c.
The storage unit 10c of the client terminal 1c-A stores the data of the input layer groups 200c-A-α and 200c-A-β and the output layer groups 202c-A-α and 202c-A-β, from which those input layer groups and output layer groups are constructed. The construction of the input layer groups 200c-A-α and 200c-A-β and the output layer groups 202c-A-α and 202c-A-β is performed by the CPU (not shown) of the client terminal 1c-A.
The storage unit 10c of the client terminal 1c-B stores the data of the input layer groups 200c-B-α and 200c-B-β and the output layer groups 202c-B-α and 202c-B-β, from which those input layer groups and output layer groups are constructed. The construction of the input layer groups 200c-B-α and 200c-B-β and the output layer groups 202c-B-α and 202c-B-β is performed by the CPU (not shown) of the client terminal 1c-B.
Since the configuration of the cloud server 2c is the same as that of the cloud server 2b of the third embodiment, it will be described with reference to the reference numerals of FIG. 12.
The client terminals 1c-A and 1c-B each execute the processing of FIG. 4 for the sample data they have acquired. The processing of steps S100 to S102 is the same as the processing described in the third embodiment. The inference operation of the cloud server 2c is also the same as in the third embodiment.
The calculation unit 15c of the client terminal 1c-A receives from the cloud server 2c, via the reception unit 14, the output value of the intermediate layer group 201c calculated from the output value of the input layer group 200c-A-α, and calculates the result of inputting this output value of the intermediate layer group 201c into the output layer group 202c-A-α (step S104 in FIG. 4). The calculation unit 15c of the client terminal 1c-A also receives from the cloud server 2c the output value of the intermediate layer group 201c calculated from the output value of the input layer group 200c-A-β, and calculates the result of inputting it into the output layer group 202c-A-β (step S104).
The calculation unit 15c of the client terminal 1c-B receives from the cloud server 2c, via the reception unit 14, the output value of the intermediate layer group 201c calculated from the output value of the input layer group 200c-B-α, and calculates the result of inputting this output value of the intermediate layer group 201c into the output layer group 202c-B-α (step S104). The calculation unit 15c of the client terminal 1c-B also receives from the cloud server 2c the output value of the intermediate layer group 201c calculated from the output value of the input layer group 200c-B-β, and calculates the result of inputting it into the output layer group 202c-B-β (step S104).
FIG. 16 is a flowchart illustrating the learning operation of the client terminals 1c-A and 1c-B of the distributed deep learning system of this embodiment. Since the learning operation flow of the cloud server 2c is the same as that of the cloud server 2 of the first embodiment, it will be described with reference to the reference numerals of FIG. 7.
The client terminals 1c-A and 1c-B each execute the processing of FIG. 16 for the labeled sample data they have acquired.
The processing of the client terminals 1c-A and 1c-B in steps S600 to S604 of FIG. 16 is the same as the processing of steps S100 to S104 described in this embodiment.
The calculation unit 16c and the transmission unit 13 of the client terminal 1c-A execute the same processing as in steps S305 and S306 of FIG. 6 in a time-division manner for each of data α and data β (steps S605 and S606 in FIG. 16). Specifically, the calculation unit 16c calculates the gradient of the error function for each of the weights of the layers in the output layer group 202c-A-α based on the output value of the first model and the label of data α, and calculates the gradient of the error function for each of the weights of the layers in the output layer group 202c-A-β based on the output value of the second model and the label of data β.
The calculation unit 16c and the transmission unit 13 of the client terminal 1c-B execute the same processing as in steps S305 and S306 in a time-division manner for each of data α and data β (steps S605 and S606). Specifically, the calculation unit 16c calculates the gradient of the error function for each of the weights of the layers in the output layer group 202c-B-α based on the output value of the third model and the label of data α, and calculates the gradient of the error function for each of the weights of the layers in the output layer group 202c-B-β based on the output value of the fourth model and the label of data β.
The processing of the cloud server 2c in steps S400 to S406 of FIG. 7 is the same as the processing described in the third embodiment.
The processing of the client terminals 1c-A and 1c-B in steps S607 and S608 of FIG. 16 is the same as the processing of steps S507 and S508 described in the third embodiment.
The model update unit 18c of the client terminal 1c-A updates the weights of the layers in the input layer group 200c-A-α based on the error-function gradients that the calculation unit 17b calculated for those weights, and updates the weights of the layers in the input layer group 200c-A-β based on the error-function gradients that the calculation unit 17b calculated for those weights. In addition, the model update unit 18c of the client terminal 1c-A updates the weights of the layers in the output layer group 202c-A-α based on the error-function gradients that the calculation unit 16c calculated from the output value of the first model and the label of data α, and updates the weights of the layers in the output layer group 202c-A-β based on the error-function gradients that the calculation unit 16c calculated from the output value of the second model and the label of data β (step S609 in FIG. 16).
The model update unit 18c of the client terminal 1c-B updates the weights of the layers in the input layer group 200c-B-α based on the error-function gradients that the calculation unit 17b calculated for those weights, and updates the weights of the layers in the input layer group 200c-B-β based on the error-function gradients that the calculation unit 17b calculated for those weights. In addition, the model update unit 18c of the client terminal 1c-B updates the weights of the layers in the output layer group 202c-B-α based on the error-function gradients that the calculation unit 16c calculated from the output value of the third model and the label of data α, and updates the weights of the layers in the output layer group 202c-B-β based on the error-function gradients that the calculation unit 16c calculated from the output value of the fourth model and the label of data β (step S609).
However, if a client terminal 1c-A or 1c-B could not acquire sample data, or if the sample data is not labeled, it cannot use its own error-function calculation results to update the corresponding input layer group and output layer group.
Suppose, for example, that the client terminal 1c-A could not acquire data β, or that the data β it acquired was not labeled, and likewise that the client terminal 1c-B could not acquire data α, or that the data α it acquired was not labeled. In this case, the calculation unit 16c of the client terminal 1c-A cannot calculate the gradient of the error function for the weights of the layers in the output layer group 202c-A-β, and the calculation unit 16c of the client terminal 1c-B cannot calculate the gradient of the error function for the weights of the layers in the output layer group 202c-B-α. Similarly, the calculation unit 17b of the client terminal 1c-A cannot calculate the gradient of the error function for the weights of the layers in the input layer group 200c-A-β, and the calculation unit 17b of the client terminal 1c-B cannot calculate the gradient of the error function for the weights of the layers in the input layer group 200c-B-α.
Therefore, in order to update the input layer groups 200c-A-β and 200c-B-α and the output layer groups 202c-A-β and 202c-B-α, the weights must be transmitted from the client terminals that were able to acquire the labeled data α and β.
Specifically, because the model update unit 18c of the client terminal 1c-A could not update the input layer group 200c-A-β and the output layer group 202c-A-β, the transmission unit 19c of the client terminal 1c-A requests the update results of the weights of the layers in the input layer group for data β and the weights of the layers in the output layer group for data β from the other client terminal (step S610 in FIG. 16).
Likewise, because the model update unit 18c of the client terminal 1c-B could not update the input layer group 200c-B-α and the output layer group 202c-B-α, the transmission unit 19c of the client terminal 1c-B requests the update results of the weights of the layers in the input layer group for data α and the weights of the layers in the output layer group for data α from the other client terminal (step S610).
The reception unit 30c of the client terminal 1c-A receives the request from the client terminal 1c-B (step S611 in FIG. 16). In response to the request, the transmission unit 19c of the client terminal 1c-A transmits the update results of the weights of the layers in the input layer group 200c-A-α and the weights of the layers in the output layer group 202c-A-α to the client terminal 1c-B (step S612 in FIG. 16).
The reception unit 30c of the client terminal 1c-B receives the request from the client terminal 1c-A (step S611). In response to the request, the transmission unit 19c of the client terminal 1c-B transmits the update results of the weights of the layers in the input layer group 200c-B-β and the weights of the layers in the output layer group 202c-B-β to the client terminal 1c-A (step S612).
The reception unit 30c of the client terminal 1c-A receives the update results of the weights of the layers in the input layer group 200c-B-β and the weights of the layers in the output layer group 202c-B-β from the client terminal 1c-B (step S613 in FIG. 16). The model update unit 18c of the client terminal 1c-A updates the weights of the layers in the input layer group 200c-A-β using the update result for the input layer group 200c-B-β, and updates the weights of the layers in the output layer group 202c-A-β using the update result for the output layer group 202c-B-β (step S614 in FIG. 16).
The reception unit 30c of the client terminal 1c-B receives the update results of the weights of the layers in the input layer group 200c-A-α and the weights of the layers in the output layer group 202c-A-α from the client terminal 1c-A (step S613). The model update unit 18c of the client terminal 1c-B updates the weights of the layers in the input layer group 200c-B-α using the update result for the input layer group 200c-A-α, and updates the weights of the layers in the output layer group 202c-B-α using the update result for the output layer group 202c-A-α (step S614).
Needless to say, if the client terminals 1c-A and 1c-B are each able to acquire the labeled data α and β, the processing of steps S610 to S614 becomes unnecessary.
Normally, deep learning of diverse data and diverse tasks requires experts who can label each kind of data. With this embodiment, even when the data and the experts who can label it are in separate locations, processing can proceed without transmitting the data and labels themselves. Personal information can therefore be protected more effectively.
[Fifth Embodiment]
The weight communication methods of the third and fourth embodiments can be either centralized or distributed. A centralized configuration is shown in FIG. 17. The distributed deep learning system of this embodiment is composed of client terminals 1d-A and 1d-B, a cloud server 2c, and a storage server 3 connected to the client terminals 1d-A and 1d-B via a network.
FIG. 18 is a block diagram showing the configuration of the client terminals 1d-A and 1d-B, in which the same components as in FIG. 15 are given the same reference numerals. The client terminals 1d-A and 1d-B each include a storage unit 10c, a data acquisition unit 11, calculation units 12b, 15c, 16c, and 17b, a transmission unit 13, a reception unit 14, a model update unit 18c, a writing unit 31, and a reading unit 32.
The inference operation of the client terminals 1d-A and 1d-B and the inference and learning operations of the cloud server 2c are the same as in the fourth embodiment.
FIG. 19 is a flowchart illustrating the learning operation of the client terminals 1d-A and 1d-B. The processing of steps S600 to S609 in FIG. 19 is the same as in the fourth embodiment.
The writing unit 31 of the client terminal 1d-A writes the update results of the weights of the layers in the input layer groups 200c-A-α and 200c-A-β and the weights of the layers in the output layer groups 202c-A-α and 202c-A-β to the storage server 3 via the network (step S615 in FIG. 19).
The writing unit 31 of the client terminal 1d-B writes the update results of the weights of the layers in the input layer groups 200c-B-α and 200c-B-β and the weights of the layers in the output layer groups 202c-B-α and 202c-B-β to the storage server 3 via the network (step S615).
However, if a client terminal 1d-A or 1d-B could not acquire sample data, or if the sample data is not labeled, it cannot write at least some of the update results to the storage server 3.
Suppose, for example, that the client terminal 1d-A could not acquire data β, or that the data β it acquired was not labeled, and likewise that the client terminal 1d-B could not acquire data α, or that the data α it acquired was not labeled. In this case, the client terminal 1d-A cannot write the update results of the input layer group 200c-A-β and the output layer group 202c-A-β to the storage server 3, and the client terminal 1d-B cannot write the update results of the input layer group 200c-B-α and the output layer group 202c-B-α to the storage server 3.
Because the model update unit 18c of the client terminal 1d-A could not update the input layer group 200c-A-β and the output layer group 202c-A-β, the reading unit 32 of the client terminal 1d-A reads the update results of the weights of the layers in the input layer group for data β and the weights of the layers in the output layer group for data β from the storage server 3 (step S616 in FIG. 19).
Likewise, because the model update unit 18c of the client terminal 1d-B could not update the input layer group 200c-B-α and the output layer group 202c-B-α, the reading unit 32 of the client terminal 1d-B reads the update results of the weights of the layers in the input layer group for data α and the weights of the layers in the output layer group for data α from the storage server 3 (step S616).
The model update unit 18c of the client terminal 1d-A updates the weights of the layers in the input layer group 200c-A-β using the update result for the input layer group 200c-B-β, and updates the weights of the layers in the output layer group 202c-A-β using the update result for the output layer group 202c-B-β (step S617 in FIG. 19).
The model update unit 18c of the client terminal 1d-B updates the weights of the layers in the input layer group 200c-B-α using the update result for the input layer group 200c-A-α, and updates the weights of the layers in the output layer group 202c-B-α using the update result for the output layer group 202c-A-α (step S617).
In this way, a client terminal that could not acquire sample data can update its input layer groups and output layer groups by reading from the storage server 3 the update results of a client terminal that could.
The storage server 3 accumulates weight data for each type of input layer group (that is, for each type of data), and likewise accumulates weight data for each type of output layer group.
When writing a weight update result to the storage server 3, if weights of the same type are already accumulated in the storage server 3, the writing units 31 of the client terminals 1d-A and 1d-B may simply overwrite them. Alternatively, the writing unit 31 may calculate the average of the accumulated weights of the same type and the weights to be newly written, and overwrite the accumulated weights with this average.
The distributed configuration and its operation are as described in the third and fourth embodiments. In that case, the processing of steps S510 to S514 and S610 to S614 is performed by one-to-one communication between the two client terminals.
With this embodiment, even a client terminal that cannot acquire labeled sample data becomes able to perform inference and learning for various types of data and tasks. Because only weights are exchanged between client terminals, the personal information contained in the data can be protected.
In the centralized case, the system keeps operating even when client terminals are added or removed, so it is robust against communication failures.
In the distributed case, the client terminals communicate directly with each other, so the communication load is small and the delay is short. However, the network becomes more complicated, which raises the cost and reduces the robustness against communication failures.
In this embodiment, the centralized configuration is applied to the fourth embodiment, but it goes without saying that it may also be applied to the third embodiment. In that case, the update results are written to and read from the storage server only for the input layer groups.
In the second to fifth embodiments, two client terminals are used, but it goes without saying that there may be three or more client terminals.
Each of the client terminals described in the first to fifth embodiments can be realized by a computer provided with a CPU (Central Processing Unit), a storage device, and interfaces, and a program that controls these hardware resources. An example of the configuration of such a computer is shown in FIG. 20.
The computer includes a CPU 300, a storage device 301, and an interface device (I/F) 302. A network or the like is connected to the I/F 302. In such a computer, the program for realizing the present invention is stored in the storage device 301. The CPU 300 of each client terminal executes the processing described in the first to fifth embodiments according to the program stored in its storage device 301. The cloud server and the storage server can also be realized by computers having the same configuration as in FIG. 20.
The present invention can be applied to distributed deep learning systems in which deep learning is executed in a distributed and coordinated manner by client terminals and a cloud server.
1, 1a, 1b, 1c, 1d…client terminal; 2, 2a, 2b, 2c…cloud server; 3…storage server; 10, 20…storage unit; 11…data acquisition unit; 12, 12b, 15, 15b, 15c, 16, 16b, 16c, 17, 17b, 22, 22a, 22b, 24, 24a, 24b…calculation unit; 13, 19, 19c, 23…transmission unit; 14, 21, 30, 30c…reception unit; 18, 18b, 25, 25a, 25b…model update unit; 31…writing unit; 32…reading unit; 200, 200a, 200b, 200c…input layer group; 201, 201a, 201b, 201c…intermediate layer group; 202, 202a, 202b, 202c…output layer group.

Claims (8)

1.  A distributed deep learning system comprising:
     a client terminal; and
     a cloud server connected to the client terminal via a network,
     wherein the client terminal comprises:
     a first calculation unit configured to calculate an output value obtained by inputting sample data into an input layer group of a model;
     a second calculation unit configured to input an output value of an intermediate layer group calculated by the cloud server into an output layer group of the model and calculate an output value of the model;
     a third calculation unit configured to calculate, at the time of training the model, an error function of weights of the output layer group based on the output value of the model and a label of the sample data;
     a fourth calculation unit configured to calculate, at the time of training the model, an error function of weights of the input layer group based on an error function of weights of the intermediate layer group calculated by the cloud server;
     a first model update unit configured to update the weights of the input layer group based on the error function calculated by the fourth calculation unit, and to update the weights of the output layer group based on the error function calculated by the third calculation unit;
     a first transmission unit configured to transmit the output value of the input layer group and the error function of the weights of the output layer group to the cloud server; and
     a first reception unit configured to receive the output value of the intermediate layer group and the error function of the weights of the intermediate layer group calculated by the cloud server, and
     wherein the cloud server comprises:
     a fifth calculation unit configured to calculate an output value obtained by inputting the output value of the input layer group calculated by the client terminal into the intermediate layer group;
     a sixth calculation unit configured to calculate, at the time of training the model, the error function of the weights of the intermediate layer group based on the error function of the weights of the output layer group calculated by the client terminal;
     a second model update unit configured to update the weights of the intermediate layer group based on the error function calculated by the sixth calculation unit;
     a second transmission unit configured to transmit the output value of the intermediate layer group and the error function of the weights of the intermediate layer group to the client terminal; and
     a second reception unit configured to receive the output value of the input layer group and the error function of the weights of the output layer group calculated by the client terminal.
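For illustration only (not part of the claimed subject matter), the split forward and backward passes recited in claim 1 can be sketched as a single-process program. The sketch below uses PyTorch, replaces the network transport of the transmission and reception units with direct tensor hand-offs, and assumes arbitrary layer sizes, loss function, and optimizer:

    import torch
    import torch.nn as nn

    # Client-side input/output layer groups and cloud-side intermediate layer
    # group; all sizes here are assumptions for illustration.
    input_layers = nn.Sequential(nn.Linear(32, 64), nn.ReLU())   # client
    intermediate = nn.Sequential(nn.Linear(64, 64), nn.ReLU())   # cloud
    output_layers = nn.Linear(64, 10)                            # client

    opt_client = torch.optim.SGD(
        list(input_layers.parameters()) + list(output_layers.parameters()), lr=0.1)
    opt_cloud = torch.optim.SGD(intermediate.parameters(), lr=0.1)

    x = torch.randn(8, 32)              # sample data, held by the client
    y = torch.randint(0, 10, (8,))      # labels, also held by the client

    opt_client.zero_grad()
    opt_cloud.zero_grad()

    # Forward pass: client -> cloud -> client.
    a1 = input_layers(x)                        # first calculation unit
    a1_cloud = a1.detach().requires_grad_()     # "sent" by first transmission unit
    a2 = intermediate(a1_cloud)                 # fifth calculation unit
    a2_client = a2.detach().requires_grad_()    # "sent" by second transmission unit
    out = output_layers(a2_client)              # second calculation unit

    # Backward pass: client -> cloud -> client.
    loss = nn.functional.cross_entropy(out, y)  # third calculation unit
    loss.backward()                             # grads for output layers and a2_client
    a2.backward(a2_client.grad)                 # sixth calculation unit (cloud)
    a1.backward(a1_cloud.grad)                  # fourth calculation unit (client)

    opt_client.step()                           # first model update unit
    opt_cloud.step()                            # second model update unit

As in the claim, the sample data and labels never leave the client; only layer-group outputs and weight-error gradients cross the client-cloud boundary.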
2.  The distributed deep learning system according to claim 1, wherein
     the input layer group and the output layer group are constructed on each of a plurality of the client terminals,
     the fifth calculation unit of the cloud server processes the output values of the input layer groups calculated by the plurality of client terminals in a time-division manner to calculate the output value of the intermediate layer group,
     the sixth calculation unit of the cloud server processes the error functions of the weights of the output layer groups calculated by the plurality of client terminals in a time-division manner to calculate the error function of the weights of the intermediate layer group, and
     the second model update unit of the cloud server updates the weights of the intermediate layer group based on the error function calculated for each client terminal by the sixth calculation unit.
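A minimal sketch of the time-division processing recited in claim 2, assuming hypothetical `cloud_forward` and `cloud_backward` callables that stand in for the fifth and sixth calculation units:

    from collections import deque

    def serve_time_division(requests, cloud_forward, cloud_backward):
        """Handle one client's tensors per time slice. `requests` holds
        per-client tuples (client_id, a1, g_out), where a1 is that client's
        input-layer-group output and g_out its output-layer weight-error."""
        queue = deque(requests)
        replies = []
        while queue:                        # one time slice per client request
            client_id, a1, g_out = queue.popleft()
            replies.append((client_id, cloud_forward(a1), cloud_backward(g_out)))
        return replies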
3.  The distributed deep learning system according to claim 1, wherein
     the input layer group and the output layer group are constructed on each of a plurality of the client terminals, and the input layer group is constructed for each type of the sample data,
     the first calculation unit of each client terminal calculates an output value obtained by inputting sample data into the input layer group for the type of that data,
     the second calculation unit of each client terminal inputs the output value of the intermediate layer group calculated by the cloud server for each type of sample data into the output layer group and calculates the output value of the model for each type of sample data,
     the third calculation unit of each client terminal calculates the error function of the weights of the output layer group for each type of sample data,
     the fourth calculation unit of each client terminal calculates the error function of the weights of the input layer group for each type of sample data based on the error function of the weights of the intermediate layer group calculated by the cloud server for each type of sample data,
     the first model update unit of each client terminal updates the weights of the input layer group for each type of sample data based on the error function calculated for each type of sample data by the fourth calculation unit, and updates the weights of the output layer group based on the error function calculated for each type of sample data by the third calculation unit,
     the fifth calculation unit of the cloud server processes the output values of the input layer groups calculated for each type of sample data by the plurality of client terminals in a time-division manner to calculate the output value of the intermediate layer group,
     the sixth calculation unit of the cloud server processes the error functions of the weights of the output layer groups calculated for each type of sample data by the plurality of client terminals in a time-division manner to calculate the error function of the weights of the intermediate layer group, and
     the second model update unit of the cloud server updates the weights of the intermediate layer group based on the error function calculated for each client terminal and for each type of sample data by the sixth calculation unit.
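The per-type input layer groups of claim 3 can be sketched, again for illustration only, as a dictionary of input heads selected by each sample's type tag; the type names and layer sizes below are assumptions:

    import torch.nn as nn

    # One input layer group per sample-data type (claim 3).
    input_heads = nn.ModuleDict({
        "image": nn.Sequential(nn.Linear(784, 64), nn.ReLU()),
        "audio": nn.Sequential(nn.Linear(128, 64), nn.ReLU()),
    })

    def forward_input(sample, data_type):
        # First calculation unit: route each sample to the input layer group
        # built for its type; the shared intermediate layer group on the cloud
        # then receives a 64-dimensional activation regardless of type.
        return input_heads[data_type](sample)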
4.  The distributed deep learning system according to claim 3, wherein each client terminal further comprises:
     a third transmission unit configured to request an update result of the weights of the input layer group from another client terminal when the input layer group cannot be updated because the sample data cannot be acquired, and to transmit the update result of the weights of the input layer group in response to a request from another client terminal when the input layer group has been updated through acquisition of the sample data; and
     a third reception unit configured to receive the update result of the weights of the input layer group from another client terminal when the input layer group cannot be updated because the sample data cannot be acquired, and to receive a request from another client terminal when the input layer group has been updated through acquisition of the sample data,
     wherein the first model update unit of each client terminal updates the weights of the input layer group that could not be updated, based on the weights of the input layer group received from another client terminal.
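A sketch of the peer exchange in claim 4, assuming a hypothetical `request_head_weights` peer API and `updated`/`input_heads` client attributes in place of the third transmission and reception units:

    def sync_missing_input_head(client, peers, data_type):
        """If this client could not acquire sample data of `data_type`, ask
        peers for their update result; otherwise keep the local update."""
        if client.updated.get(data_type):
            return                              # updated from our own samples
        for peer in peers:
            state = peer.request_head_weights(data_type)
            if state is not None:               # this peer trained the head
                client.input_heads[data_type].load_state_dict(state)
                client.updated[data_type] = True
                return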
5.  The distributed deep learning system according to claim 3, further comprising a storage server connected to each client terminal via a network,
     wherein each client terminal further comprises:
     a writing unit configured to write an update result of the weights of the input layer group to the storage server when the input layer group has been updated through acquisition of the sample data; and
     a reading unit configured to read the update result of the weights of the input layer group from the storage server when the input layer group cannot be updated because the sample data cannot be acquired, and
     wherein the first model update unit of each client terminal updates the weights of the input layer group that could not be updated, based on the weights of the input layer group read from the storage server.
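A sketch of the writing and reading units of claim 5, with a plain dictionary standing in for the storage server:

    weight_store = {}   # hypothetical stand-in for the storage server

    def publish_or_restore(client, data_type):
        """A client that updated the per-type input layer group writes its
        weights to shared storage (writing unit); a client that could not
        update reads them back instead (reading unit)."""
        key = f"input_head/{data_type}"
        if client.updated.get(data_type):
            weight_store[key] = client.input_heads[data_type].state_dict()
        elif key in weight_store:
            client.input_heads[data_type].load_state_dict(weight_store[key])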
6.  The distributed deep learning system according to claim 1, wherein
     the input layer group and the output layer group are constructed on each of a plurality of the client terminals, and the input layer group and the output layer group are constructed for each type of the sample data,
     the first calculation unit of each client terminal calculates an output value obtained by inputting sample data into the input layer group for the type of that data,
     the second calculation unit of each client terminal inputs the output value of the intermediate layer group calculated by the cloud server for each type of sample data into the output layer group for the type of that data and calculates the output value of the model for each type of sample data,
     the third calculation unit of each client terminal calculates the error function of the weights of the output layer group for each type of sample data,
     the fourth calculation unit of each client terminal calculates the error function of the weights of the input layer group for each type of sample data based on the error function of the weights of the intermediate layer group calculated by the cloud server for each type of sample data,
     the first model update unit of each client terminal updates the weights of the input layer group for each type of sample data based on the error function calculated for each type of sample data by the fourth calculation unit, and updates the weights of the output layer group for each type of sample data based on the error function calculated for each type of sample data by the third calculation unit,
     the fifth calculation unit of the cloud server processes the output values of the input layer groups calculated for each type of sample data by the plurality of client terminals in a time-division manner to calculate the output value of the intermediate layer group,
     the sixth calculation unit of the cloud server processes the error functions of the weights of the output layer groups calculated for each type of sample data by the plurality of client terminals in a time-division manner to calculate the error function of the weights of the intermediate layer group, and
     the second model update unit of the cloud server updates the weights of the intermediate layer group based on the error function calculated for each client terminal and for each type of sample data by the sixth calculation unit.
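Claim 6 differs from claim 3 in that the output layer group is also built per type, so tasks with different label spaces can share the cloud's intermediate layers. A sketch under the same illustrative assumptions as before:

    import torch.nn as nn

    # One output layer group per sample-data type (claim 6); sizes assumed.
    output_heads = nn.ModuleDict({
        "image": nn.Linear(64, 10),   # e.g. a 10-class image task
        "audio": nn.Linear(64, 5),    # e.g. a 5-class audio task
    })

    def forward_output(a2, data_type):
        # Second calculation unit: the intermediate output returned by the
        # cloud is fed to the output layer group for the sample's type.
        return output_heads[data_type](a2)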
7.  The distributed deep learning system according to claim 6, wherein each client terminal further comprises:
     a third transmission unit configured to request update results of the weights of the input layer group and the output layer group from another client terminal when the input layer group and the output layer group cannot be updated because the sample data cannot be acquired, and to transmit the update results of the weights of the input layer group and the output layer group in response to a request from another client terminal when the input layer group and the output layer group have been updated through acquisition of the sample data; and
     a third reception unit configured to receive the update results of the weights of the input layer group and the output layer group from another client terminal when the input layer group and the output layer group cannot be updated because the sample data cannot be acquired, and to receive a request from another client terminal when the input layer group and the output layer group have been updated through acquisition of the sample data,
     wherein the first model update unit of each client terminal updates the weights of the input layer group that could not be updated, based on the weights of the input layer group received from another client terminal, and updates the weights of the output layer group that could not be updated, based on the weights of the output layer group received from another client terminal.
8.  The distributed deep learning system according to claim 6, further comprising a storage server connected to each client terminal via a network,
     wherein each client terminal further comprises:
     a writing unit configured to write update results of the weights of the input layer group and the output layer group to the storage server when the input layer group and the output layer group have been updated through acquisition of the sample data; and
     a reading unit configured to read the update results of the weights of the input layer group and the output layer group from the storage server when the input layer group and the output layer group cannot be updated because the sample data cannot be acquired, and
     wherein the first model update unit of each client terminal updates the weights of the input layer group that could not be updated, based on the weights of the input layer group read from the storage server, and updates the weights of the output layer group that could not be updated, based on the weights of the output layer group read from the storage server.
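Claims 7 and 8 repeat the recovery mechanisms of claims 4 and 5 for both per-type layer groups. A sketch of the storage-server variant (claim 8), with a dict-like `store` as a hypothetical stand-in; the peer variant (claim 7) would obtain `state` from another client instead:

    def restore_both_heads(client, store, data_type):
        """When neither per-type layer group could be updated from local
        sample data, restore both from shared storage (reading unit)."""
        for name, heads in (("input", client.input_heads),
                            ("output", client.output_heads)):
            state = store.get(f"{name}_head/{data_type}")
            if state is not None:
                heads[data_type].load_state_dict(state)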
PCT/JP2020/020708 2020-05-26 2020-05-26 Distributed deep learning system WO2021240636A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2022527309A JP7464118B2 (en) 2020-05-26 2020-05-26 Distributed Deep Learning Systems
PCT/JP2020/020708 WO2021240636A1 (en) 2020-05-26 2020-05-26 Distributed deep learning system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/020708 WO2021240636A1 (en) 2020-05-26 2020-05-26 Distributed deep learning system

Publications (1)

Publication Number Publication Date
WO2021240636A1 true WO2021240636A1 (en) 2021-12-02

Family

ID=78723038

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/020708 WO2021240636A1 (en) 2020-05-26 2020-05-26 Distributed deep learning system

Country Status (2)

Country Link
JP (1) JP7464118B2 (en)
WO (1) WO2021240636A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018208939A1 (en) * 2017-05-09 2018-11-15 Neurala, Inc. Systems and methods to enable continual, memory-bounded learning in artificial intelligence and deep learning continuously operating applications across networked compute edges
CN110942147A (en) * 2019-11-28 2020-03-31 支付宝(杭州)信息技术有限公司 Neural network model training and predicting method and device based on multi-party safety calculation
CN111091182A (en) * 2019-12-16 2020-05-01 北京澎思科技有限公司 Data processing method, electronic device and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0692914B2 (en) * 1989-04-14 1994-11-16 株式会社日立製作所 Equipment / facility condition diagnosis system


Also Published As

Publication number Publication date
JPWO2021240636A1 (en) 2021-12-02
JP7464118B2 (en) 2024-04-09


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20938278; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2022527309; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20938278; Country of ref document: EP; Kind code of ref document: A1)
Kind code of ref document: A1