WO2022146050A1

WO2022146050A1 - Federated artificial intelligence training method and system for depression diagnosis

Info

Publication number: WO2022146050A1
Application number: PCT/KR2021/020216
Authority: WO
Inventors: 김현승; 최준희; 이종민; 최민규
Original assignee: 성균관대학교산학협력단
Priority date: 2020-12-29
Filing date: 2021-12-29
Publication date: 2022-07-07
Also published as: KR20220094967A; KR102562377B1

Abstract

The present invention relates to a federated artificial intelligence training method and system for depression diagnosis, and the federated artificial intelligence training method for depression diagnosis, according to one embodiment of the present invention, comprises the steps of: pre-training a global model by using global training data; pre-training a local model on the basis of weight parameters of the pre-trained global model; updating the weight parameters of the pre-trained local model by using a feature vector extracted from pre-stored local training data; and updating the weight parameters of the pre-trained global model on the basis of the updated weight parameters of the local model.

Description

Artificial intelligence federated learning method and system for diagnosing depression

The present invention relates to an artificial intelligence federated learning method and system for diagnosing depression.

Artificial intelligence technology is expected to have a significant impact on various medical fields in the near future. Artificial intelligence-based medical treatment is expected to improve reading accuracy and contribute to disease prediction and prevention. Artificial intelligence-based medical treatment has the characteristic of being able to improve performance and efficiency compared to existing medical treatment. In particular, a convolutional neural network in the field of computer vision can be directly applied to medical image analysis.

The previously published artificial intelligence model for diagnosing depression (Tuka Alhanai, Mohammad Ghassemi, & James Glass, Detecting Depression with Audio/Text Sequence modeling of Interviews, Interspeech 2018) diagnoses depression only with words and intonation through an interview with a clinician. The AI model developed in this way utilizes text and voice data and responds more quickly to text information. In addition, in the conventional independent AI model, it is difficult to guarantee the accuracy of the AI model for diagnosing depression due to lack of data. In fact, there is no system in place to smoothly apply big data and artificial intelligence technology in the medical field. In addition, there is a problem of personal information leakage when the data required for AI model learning is shared between each institution. There is a limit to creating an artificial intelligence model for diagnosing depression by directly using data owned by a number of institutions.

Embodiments of the present invention are intended to provide an artificial intelligence federated learning method and system for diagnosing depression that improve the accuracy of an artificial intelligence model through federated learning between a global model and a plurality of local models.

Embodiments of the present invention are also intended to provide a federated learning method for an artificial intelligence model that diagnoses depression using brain wave (EEG) data and brain imaging (fMRI) data.

In addition, embodiments of the present invention are intended to provide an artificial intelligence federated learning method and system for diagnosing depression that avoid the risk of personal information leakage, because the patient personal information data held by each institution is not shared, while improving the accuracy of the global artificial intelligence model.

However, the problem to be solved by the present invention is not limited thereto, and may be variously expanded in an environment within the scope not departing from the spirit and scope of the present invention.

According to an embodiment of the present invention, there may be provided an artificial intelligence federated learning method for diagnosing depression, performed by an artificial intelligence federated learning apparatus, the method comprising: pre-training a global model using global training data; pre-training a local model based on the weight parameters of the pre-trained global model; updating the weight parameters of the pre-trained local model using a feature vector extracted from pre-stored local training data; and updating the weight parameters of the pre-trained global model based on the updated weight parameters of the local model.

The method may further include retraining the local model based on a weight parameter of the updated global model.

The global model may include any one of a support vector machine (SVM), a convolutional neural network (CNN), and a recurrent neural network (RNN).

The local model may be configured with the same neural network as that of the global model.

The global model may be configured as a support vector machine if the number of samples in the global training data is less than a preset number, and as a fully-connected neural network (NN), in which a fully-connected layer is attached to the end of a convolutional neural network or a recurrent neural network, if the number of samples is greater than or equal to the preset number.

The updating of the weight parameter of the local model may include extracting a feature vector using a convolutional neural network if the pre-stored local training data is brain image data, or using a recurrent neural network if the pre-stored local training data is time series data.

The updating of the weight parameter of the local model may include updating the weight parameter of the pre-trained local model by using individual feature vectors for each of the plurality of local models.

The updating of the weight parameter of the local model may include updating the weight parameter of the pre-trained local model based on the extracted feature vector using stochastic gradient descent.

In the updating of the weight parameter of the local model, the stochastic gradient descent method may update the weight and bias by the gradient of the loss scaled by a step size that represents the learning rate.

The updating of the weight parameters of the global model may include individually receiving the updated weight parameters of each of the plurality of local models, and integrating the individually received weight parameters to update the weight parameters of the pre-trained global model.

Meanwhile, according to another embodiment of the present invention, there may be provided an artificial intelligence federated learning system for diagnosing depression, comprising: a global federated learning apparatus that pre-trains a global model using global training data; and a local federated learning apparatus that pre-trains a local model based on the weight parameters of the pre-trained global model and updates the weight parameters of the pre-trained local model using a feature vector extracted from pre-stored local training data, wherein the global federated learning apparatus updates the weight parameters of the pre-trained global model based on the updated weight parameters of the local model.

The local federated learning apparatus may re-learn the local model based on the weight parameter of the updated global model.

The global model may be configured as a support vector machine if the number of samples in the global training data is less than a preset number, and as a fully-connected neural network (NN), in which a fully-connected layer is attached to the end of a convolutional neural network or a recurrent neural network, if the number of samples is greater than or equal to the preset number.

The local federated learning apparatus may extract a feature vector using a convolutional neural network if the pre-stored local training data is image data, or using a recurrent neural network if the pre-stored local training data is time series data.

The local federated learning apparatus may update each of the weight parameters of the pre-trained local model by using individual feature vectors for each of the plurality of local models.

The local federated learning apparatus may update the weight parameter of the pre-trained local model based on the extracted feature vector using stochastic gradient descent.

In the stochastic gradient descent method, the local federated learning apparatus may update the weights and biases by the gradient of the loss scaled by a step size that represents the learning rate.

The global federated learning apparatus may individually receive the updated weight parameters of each of the plurality of local models, and integrate the individually received weight parameters to update the weight parameters of the pre-trained global model.

Meanwhile, according to another embodiment of the present invention, there is provided a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to execute a method, the method comprising: pre-training a global model using global training data; pre-training a local model based on the weight parameters of the pre-trained global model; updating a weight parameter of the pre-trained local model using a feature vector extracted from pre-stored local training data; and updating the weight parameter of the pre-trained global model based on the updated weight parameter of the local model.

The disclosed technology may have the following effects. However, this does not mean that a specific embodiment should include all of the following effects or only the following effects, so the scope of the disclosed technology should not be understood as being limited thereby.

Embodiments of the present invention can accurately diagnose depression by using brain wave data and brain imaging (fMRI) data in each institution.

In addition, embodiments of the present invention allow each institution to learn an artificial intelligence model within each institution by using such data, and improve the accuracy of the global artificial intelligence model by using the weights of the learned artificial intelligence model.

In addition, embodiments of the present invention can ensure privacy protection by individually managing the patient's personal data by each institution.

FIG. 1 is a block diagram of an artificial intelligence federated learning system for diagnosing depression according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating a process of extracting a feature vector from a brain image used in an embodiment of the present invention.

FIG. 3 is a diagram showing the configuration of a CNN model used in an embodiment of the present invention.

FIG. 4 is a diagram illustrating a feature vector extraction process from EEG data according to an embodiment of the present invention.

FIG. 5 is a diagram illustrating a learning process of a global model according to an embodiment of the present invention.

FIG. 6 is a diagram illustrating a process of transmitting a weight parameter of a global model according to an embodiment of the present invention.

FIG. 7 is a diagram illustrating an update process of a local model according to an embodiment of the present invention.

FIG. 8 is a diagram illustrating a process of transmitting a weight parameter of a local model according to an embodiment of the present invention.

FIGS. 9 and 10 are diagrams showing a learning result according to an embodiment of the present invention.

Since the present invention can apply various transformations and can have various embodiments, specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present invention to a specific embodiment, it can be understood to include all transformations, equivalents or substitutes included in the spirit and scope of the present invention. In describing the present invention, if it is determined that a detailed description of a related known technology may obscure the gist of the present invention, the detailed description thereof will be omitted.

Terms such as first, second, etc. may be used to describe various elements, but the elements are not limited by the terms. The terms are used only for the purpose of distinguishing one component from another.

The terms used in the present invention are used only to describe specific embodiments and are not intended to limit the present invention. Wherever possible, widely used general terms have been selected in consideration of their functions in the present invention, but these may vary depending on the intention of those of ordinary skill in the art, precedent, or the emergence of new technology. In addition, in specific cases, some terms have been arbitrarily selected by the applicant, and in such cases their meaning will be described in detail in the corresponding part of the description.

Therefore, a term used in the present invention should be defined based on its meaning and the overall content of the present invention, rather than simply on its name.

The singular expression includes the plural expression unless the context clearly dictates otherwise. In the present invention, terms such as "comprises" or "have" are intended to designate that the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification exist, and should be understood as not excluding in advance the possibility of the existence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the description with reference to the accompanying drawings, the same or corresponding components are given the same reference numerals, and overlapping descriptions thereof will be omitted.

As shown in FIG. 1 , the artificial intelligence federated learning system 100 for diagnosing depression according to an embodiment of the present invention includes a global federated learning device 120 and a plurality of local federated learning devices. However, not all illustrated components are essential components. The artificial intelligence federated learning system 100 may be implemented by more components than the illustrated components, and the artificial intelligence federated learning system 100 may be implemented by fewer components than that.

The artificial intelligence federated learning system 100 according to an embodiment of the present invention aims to diagnose depression using brain image (fMRI) data and EEG data. Although each item of brain imaging (fMRI) data and EEG data is large, the number of available samples is small, so it is difficult to implement an artificial intelligence model that diagnoses depression with high accuracy.

To solve this, the artificial intelligence federated learning system 100 may extract feature vectors from brain image data and EEG data using a Convolutional Neural Network (CNN) model and a Recurrent Neural Network (RNN) model, respectively, and diagnose depression from the feature vectors.

According to an embodiment of the present invention, an artificial intelligence model for diagnosing depression using CNN, RNN, or Support Vector Machine (SVM) may be trained by using brain wave data and fMRI data.

In addition, an embodiment of the present invention may receive and update a weight parameter of a local AI model of each institution using a global AI model.

Also, according to an embodiment of the present invention, based on the received weight parameters, the weight parameters of the artificial intelligence models of each institution may be updated by utilizing the brain wave data and fMRI data owned by each institution.

Also, according to an embodiment of the present invention, the weight parameter of the global AI model may be updated by transmitting the weight parameter of the local AI model of each institution to the global AI model.

In addition, according to an embodiment of the present invention, the weight parameters of the updated global artificial intelligence model may be transmitted back to the local artificial intelligence models of various institutions.

For example, in an embodiment of the present invention, a support vector machine (SVM) technique capable of effectively performing binary classification may be used among machine learning techniques.

The global federated learning apparatus 120 pre-trains the global SVM model using global learning data (S101).

The global federated learning apparatus 120 updates the weight of the global model (S102).

Thereafter, the local federated learning devices A, B, and C (111, 112, 113) implemented in each institution receive the weight parameters of the pre-trained global SVM model (S103).

The local federated learning devices A, B, and C (111, 112, 113) start learning the local SVM model of each institution based on the received weight parameter (S104, S106, S108).

Local federated learning devices A, B, and C (111, 112, 113) feed the data owned by each institution into a 3D CNN model commonly used by all institutions to extract feature vectors, and then use these to update the weight parameters of the local model by the stochastic gradient descent method (S105, S107, S109).

When the local model training of each institution is completed, the local federated learning devices A, B, and C (111, 112, 113) transfer the learned local SVM model weights of each institution back to the global SVM model (S110, S111, S112).

The global federated learning apparatus 120 updates the weights of the global SVM model with an arithmetic average value or a weighted average value of the weights received from each institution to create a high-accuracy global SVM model (S113).
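
The flow from S103 to S113 can be sketched as one round of a loop. The following is a minimal illustration only, assuming the local training routine and the aggregation rule are supplied as functions; the names `federated_round`, `local_update`, and `aggregate` are hypothetical and do not appear in the specification.

```python
from typing import Callable, List, Sequence, Tuple

Weights = Tuple[List[float], float]              # (weight vector w, bias b)
Dataset = Sequence[Tuple[Sequence[float], int]]  # (feature vector, label in {-1, +1})


def federated_round(
    global_weights: Weights,
    local_datasets: Sequence[Dataset],
    local_update: Callable[[Weights, Dataset], Weights],
    aggregate: Callable[[Sequence[Weights], Sequence[int]], Weights],
) -> Weights:
    """One round: broadcast (S103), local training (S104 to S109),
    upload (S110 to S112), and aggregation (S113)."""
    updated: List[Weights] = []
    for data in local_datasets:
        # Each institution starts from its own copy of the global weights
        # and trains only on its own data; only weights are returned.
        start = (list(global_weights[0]), global_weights[1])
        updated.append(local_update(start, data))
    # Sample counts are passed along so that the aggregation rule may
    # use a weighted average if it chooses to.
    sizes = [len(data) for data in local_datasets]
    return aggregate(updated, sizes)
```

Passing the two routines as arguments keeps the round structure independent of whether the classifier is an SVM or a fully-connected network, and of whether the arithmetic or weighted average is used.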

On the other hand, the artificial intelligence federated learning system 100 according to an embodiment of the present invention may also improve the accuracy of depression diagnosis by performing federated learning on EEG data in a similar manner.

Hereinafter, a detailed configuration and operation of each component of the artificial intelligence federated learning system 100 according to an embodiment of the present invention of FIG. 1 will be described.

The global federated learning apparatus 120 learns the global model in advance by using the global learning data.

The local federated learning device pre-trains a local model based on the weight parameters of the global model trained in advance by the global federated learning device 120, and updates the weight parameters of the pre-trained local model using a feature vector extracted from pre-stored local training data.

The global federated learning apparatus 120 updates the weight parameter of the pre-trained global model based on the updated weight parameter of the local model.

According to embodiments, the local federated learning apparatus may relearn the local model based on the updated weight parameter of the global model.

According to embodiments, the global model may be composed of any one neural network among a support vector machine (SVM), a convolutional neural network (CNN), and a recurrent neural network (RNN).

According to embodiments, the local model may be configured with the same neural network as that of the global model.

According to embodiments, the global model may be configured as a support vector machine when the number of samples in the global training data is less than a preset number, and as a fully-connected neural network (NN), in which a fully-connected layer is connected to the last stage of a convolutional neural network or a recurrent neural network, when the number of samples is greater than or equal to the preset number.

In an embodiment of the present invention, when the number of medical data currently possessed is small, the SVM may be used as a global model and a local model by extracting a feature vector.

This is because SVM has a high classification success rate when there is little training data.

In addition, in another embodiment of the present invention, if there is enough medical data, a fully-connected layer is attached to the last stage of a CNN or RNN, and this part can be used as the global model and the local model instead of the SVM.

A fully connected layer contains a large number of parameters, so it cannot be trained reliably with a small amount of data. Accordingly, in another embodiment of the present invention, regardless of the format of the data, a fully connected layer may be used instead of the SVM if a sufficient amount of data is available.

In an embodiment of the present invention, the CNN or RNN in the preceding stage may vary depending on the format of the medical data: a 3D CNN is used for brain images, and an RNN is used for EEG data because it is time series data.

In an embodiment of the present invention, when classifying features extracted through the CNN or RNN, whether to use an SVM or a fully-connected neural network (NN) as the classifier of the global and local models may depend on the amount of data.
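
The two selection rules above (front-end by data format, classifier by data amount) can be written as a small decision function. This is a sketch under assumptions: the names `ModelPlan`, `plan_model`, and `min_samples_for_fc`, as well as the string labels, are hypothetical, and the specification does not give a value for the preset number of samples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPlan:
    feature_extractor: str  # "3D-CNN" or "RNN"
    classifier: str         # "SVM" or "FC-NN"


def plan_model(data_format: str, num_samples: int, min_samples_for_fc: int) -> ModelPlan:
    """Pick the front-end by data format and the classifier by data amount."""
    if data_format == "fmri":
        extractor = "3D-CNN"  # brain images are 3D volumes
    elif data_format == "eeg":
        extractor = "RNN"     # EEG is time series data
    else:
        raise ValueError(f"unsupported data format: {data_format!r}")
    # Below the preset threshold an SVM is used; otherwise a
    # fully-connected layer is attached after the CNN or RNN.
    classifier = "SVM" if num_samples < min_samples_for_fc else "FC-NN"
    return ModelPlan(extractor, classifier)
```

For example, `plan_model("fmri", 9, 100)` selects a 3D CNN front-end with an SVM classifier, where the threshold of 100 is purely illustrative.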

On the other hand, according to embodiments, the local federated learning apparatus may extract a feature vector using a convolutional neural network if the pre-stored local training data is image data, or using a recurrent neural network if the pre-stored local training data is time series data.

According to embodiments, the local federated learning apparatus may update each of the weight parameters of the pre-trained local model by using individual feature vectors for each of the plurality of local models.

According to embodiments, the local federated learning apparatus may update the weight parameter of the pre-trained local model based on the extracted feature vector using stochastic gradient descent.

According to embodiments, in the stochastic gradient descent method, the local federated learning apparatus may update the weight and bias by the gradient of the loss scaled by a step size that represents the learning rate.

Specifically, in the stochastic gradient descent method, the weight is denoted w, the bias b, the feature vector x, and the label y.

The goal is to make (wx-b) greater than 1 when the class is 1(+) and less than -1 when the class is -1(-). Both conditions are equivalent to making y(wx-b) greater than 1, so the model is trained so that 1-y(wx-b)<0.

Therefore, a stochastic gradient descent update is performed only when 1-y(wx-b)>0, and the square of 1-y(wx-b) is used as the loss value.

The stochastic gradient descent method then updates w and b by the gradient of the loss multiplied by the step size (step_size), which represents the learning rate.
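
As a worked illustration of this update rule, write m = 1-y(wx-b). When m>0 the loss is m squared, so its gradient with respect to w is -2·m·y·x and with respect to b is 2·m·y. The sketch below applies exactly that; the function names `sgd_step` and `train_local_svm`, the per-epoch shuffling, and the `seed` parameter are assumptions made for the example and are not taken from the specification.

```python
import random
from typing import List, Sequence, Tuple


def sgd_step(w: Sequence[float], b: float, x: Sequence[float], y: int,
             step_size: float) -> Tuple[List[float], float]:
    """One stochastic gradient step on the squared hinge loss."""
    score = sum(wi * xi for wi, xi in zip(w, x)) - b
    m = 1.0 - y * score
    if m <= 0.0:
        # The sample already satisfies the margin: no update.
        return list(w), b
    # loss = m**2, so d(loss)/dw = -2*m*y*x and d(loss)/db = 2*m*y.
    grad_w = [-2.0 * m * y * xi for xi in x]
    grad_b = 2.0 * m * y
    new_w = [wi - step_size * gi for wi, gi in zip(w, grad_w)]
    new_b = b - step_size * grad_b
    return new_w, new_b


def train_local_svm(w: Sequence[float], b: float,
                    data: Sequence[Tuple[Sequence[float], int]],
                    epochs: int, step_size: float,
                    seed: int = 0) -> Tuple[List[float], float]:
    """Run several epochs of SGD, starting from the received (w, b)."""
    rng = random.Random(seed)
    order = list(range(len(data)))
    w = list(w)
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in random order (an assumption)
        for i in order:
            x, y = data[i]
            w, b = sgd_step(w, b, x, y, step_size)
    return w, b
```

Starting `train_local_svm` from the weights received from the global model, rather than from zeros, is what ties the local training to the federated scheme.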

According to embodiments, the global federated learning apparatus 120 may individually receive the updated weight parameters of each of the plurality of local models, and integrate the individually received weight parameters to update the weight parameters of the pre-trained global model.

An embodiment of the present invention extracts a feature vector from a brain image (fMRI image) through a convolutional neural network.

Brain imaging (fMRI) data is data in the form of a 3D image. A 3D Convolutional Neural Network (CNN) model pre-trained with fMRI data is used for feature extraction from the fMRI data, and the detailed model configuration is as shown in FIG. 3.

Using the output of the final convolution layer (conv_layer) of the 3D CNN model, a feature vector consisting of 128 features is extracted. This 3D CNN model is shared in common by all institutions.

An embodiment of the present invention extracts a feature vector from EEG data through a recurrent neural network.

An embodiment of the present invention obtains a corresponding feature vector through global training data, and learns a global support vector machine model based on the feature vector.

FIG. 6 is a diagram illustrating a process of transmitting a weight parameter of a global model according to an embodiment of the present invention. An embodiment of the present invention transmits the weight parameters of the pre-trained global model, that is, the weight and bias, to each local support vector machine model (local SVM model).

In an embodiment of the present invention, each local federated learning device updates the pre-trained weight parameters received from the global model using feature vectors extracted from the local training data it holds.

In this case, in an embodiment of the present invention, a weight parameter of a local model is updated using a stochastic gradient descent method.

An embodiment of the present invention transfers the weight parameters of the local support vector machine model (Local SVM model) learned by each local federated learning apparatus to the global model (Global model).

Here, the weight parameters of the global support vector machine model may be updated using an arithmetic average or a weighted average.
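
Both averaging options can be illustrated with a single routine. This is a minimal sketch, assuming that the weighted average uses each institution's number of training samples as its weight, as in the experiment described later; the name `aggregate_weights` and its parameters are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def aggregate_weights(
    local_weights: Sequence[Sequence[float]],
    local_biases: Sequence[float],
    num_samples: Optional[Sequence[int]] = None,
) -> Tuple[List[float], float]:
    """Arithmetic mean if num_samples is None, else a mean weighted
    by the number of training samples at each institution."""
    n = len(local_weights)
    if n == 0 or n != len(local_biases):
        raise ValueError("need one bias per local weight vector")
    if num_samples is None:
        coeffs = [1.0 / n] * n
    else:
        if len(num_samples) != n:
            raise ValueError("need one sample count per local model")
        total = sum(num_samples)
        coeffs = [count / total for count in num_samples]
    dim = len(local_weights[0])
    w = [sum(c * lw[i] for c, lw in zip(coeffs, local_weights)) for i in range(dim)]
    b = sum(c * lb for c, lb in zip(coeffs, local_biases))
    return w, b
```

With two local models whose weights are [1, 2] and [3, 4] and whose biases are 0 and 2, the arithmetic average gives weights [2, 3] and bias 1, while sample counts of 1 and 3 give weights [2.5, 3.5] and bias 1.5.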

As shown in FIGS. 9 and 10, the federated learning result improved according to the number of training data. This will be described in detail.

An experiment according to an embodiment of the present invention was conducted using brain imaging (fMRI) data. Brain image (fMRI) data is put into a pre-trained 3D CNN model to extract a feature vector of the 3D image, and the last convolution layer value of the 3D CNN model was used for the feature vector. Depression was diagnosed by applying the extracted feature vector to a machine learning technique. In this experiment, a linear support vector machine (SVM) was used among the machine learning techniques.

The federated learning method of the SVM model is as follows. After training the global SVM model using the global training data set, the weights and bias values of the global SVM model are transferred to Model A, Model B, and Model C. Model A, model B, and model C train each model using the data available to each model based on the received SVM weights and biases. The stochastic gradient descent method is used as a method for learning the SVM.

When training in each model is completed, the global SVM model receives weights and bias values from each model and updates the weights and biases of the global SVM model. For the update method, an arithmetic average value of the weight and bias values of each model or a weighted average value according to the number of training data was used.

The number of training epochs used in the stochastic gradient descent method was set to 10 and the step size to 10e-3. The federated learning results for the SVM model according to the size of the training data set are as follows.

After federated learning, it can be seen that when the number of initial global learning data is 9, accuracy is improved by 12.2%, and when the number of initial global learning data is 14, accuracy is improved by 6.8%. Both the arithmetic mean and weighted mean methods improved accuracy. The arithmetic average method showed better results than the weighted average method according to the number of training data.

On the other hand, in an embodiment of the present invention, by applying federated learning to a machine learning technique, the patient personal information data held by each institution is not used directly; only the weight values of each model are used to improve the accuracy of the global classification model.

Therefore, the risk of personal information leakage can be reduced by not sharing patient personal information data, and the accuracy of the depression diagnosis model can be greatly improved, which will be useful in the field of depression diagnosis.

Meanwhile, there may be provided a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to execute a method, the method comprising: pre-training a global model using global training data; pre-training a local model based on the weight parameters of the pre-trained global model; updating a weight parameter of the pre-trained local model using a feature vector extracted from pre-stored local training data; and updating the weight parameter of the pre-trained global model based on the updated weight parameter of the local model.

Meanwhile, according to an embodiment of the present invention, the various embodiments described above may be implemented as software including instructions stored in a storage medium readable by a machine (eg, a computer).

The device is a device capable of calling a stored command from a storage medium and operating according to the called command, and may include an electronic device (eg, the electronic device A) according to the disclosed embodiments. When the instruction is executed by the processor, the processor may perform a function corresponding to the instruction by using other components directly or under the control of the processor.

Instructions may include code generated or executed by a compiler or interpreter. The device-readable storage medium may be provided in the form of a non-transitory storage medium. Here, 'non-transitory' means that the storage medium does not include a signal and is tangible, and does not distinguish that data is semi-permanently or temporarily stored in the storage medium.

In addition, according to an embodiment of the present invention, the methods according to the various embodiments described above may be provided by being included in a computer program product. Computer program products may be traded between sellers and buyers as commodities. The computer program product may be distributed in the form of a device-readable storage medium (eg, compact disc read only memory (CD-ROM)) or online through an application store (eg, PlayStore™). In the case of online distribution, at least a portion of the computer program product may be temporarily stored or temporarily generated in a storage medium such as a memory of a server of a manufacturer, a server of an application store, or a relay server.

In addition, according to an embodiment of the present invention, the various embodiments described above may be implemented in a recording medium readable by a computer or a similar device using software, hardware, or a combination thereof. In some cases, the embodiments described herein may be implemented by the processor itself. According to the software implementation, embodiments such as the procedures and functions described in this specification may be implemented as separate software modules. Each of the software modules may perform one or more functions and operations described herein.

Meanwhile, computer instructions for performing the processing operation of the device according to the above-described various embodiments may be stored in a non-transitory computer-readable medium. The computer instructions stored in the non-transitory computer-readable medium, when executed by the processor of the specific device, cause the specific device to perform the processing operation in the device according to the various embodiments described above. The non-transitory computer-readable medium refers to a medium that stores data semi-permanently, not a medium that stores data for a short moment, such as a register, cache, memory, etc., and can be read by a device. Specific examples of the non-transitory computer-readable medium may include a CD, DVD, hard disk, Blu-ray disk, USB, memory card, ROM, and the like.

In addition, each of the components (e.g., a module or a program) according to the various embodiments described above may be composed of a single entity or a plurality of entities, and some of the above-described sub-components may be omitted, or other sub-components may be further included in various embodiments. Alternatively or additionally, some components (e.g., a module or a program) may be integrated into a single entity that performs the same or similar functions performed by each corresponding component prior to integration. According to various embodiments, operations performed by a module, program, or other component may be executed sequentially, in parallel, repetitively, or heuristically, or at least some operations may be executed in a different order or omitted, or other operations may be added.

In the above, preferred embodiments of the present invention have been illustrated and described, but the present invention is not limited to the specific embodiments described above. Various modifications may of course be made by those of ordinary skill in the technical field to which the present disclosure pertains without departing from the gist of the present invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or perspective of the present invention.

Claims

An artificial intelligence federated learning method for diagnosing depression, performed by an artificial intelligence federated learning device, the method comprising:

pre-training a global model using global training data;

pre-training a local model based on a weight parameter of the pre-trained global model;

updating a weight parameter of the pre-trained local model using a feature vector extracted from pre-stored local training data; and

updating the weight parameter of the pre-trained global model based on the updated weight parameter of the local model.
The method of claim 1,

further comprising re-training the local model based on the updated weight parameter of the global model.
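
The sequence of steps recited in the two claims above can be illustrated with a minimal sketch of one training round. Everything concrete here is an assumption made for illustration and is not taken from the claims: the model is a tiny logistic regression, "pre-training a local model based on the weight parameters of the global model" is interpreted as copying the global weights, the inputs are treated as already-extracted feature vectors, integration is an unweighted mean, and all function names are hypothetical.

```python
import math
import random


def predict(params, x):
    """Logistic model output for feature vector x; params = (weights, bias)."""
    w, b = params
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def train(params, data, lr=0.1, epochs=20, seed=0):
    """Per-sample SGD on the logistic loss; returns new (weights, bias)."""
    w, b = list(params[0]), params[1]  # copy so the caller's weights are untouched
    rng = random.Random(seed)
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            err = predict((w, b), x) - y  # gradient of the loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def integrate(local_params):
    """Assumed integration rule: unweighted mean of the local weights and biases."""
    n = len(local_params)
    dim = len(local_params[0][0])
    w = [sum(p[0][k] for p in local_params) / n for k in range(dim)]
    b = sum(p[1] for p in local_params) / n
    return w, b


def federated_round(global_data, local_datasets):
    dim = len(global_data[0][0])
    # Step 1: pre-train the global model on global training data.
    global_params = train(([0.0] * dim, 0.0), global_data)
    # Step 2: each local model starts from the pre-trained global weights.
    local_params = [(list(global_params[0]), global_params[1])
                    for _ in local_datasets]
    # Step 3: each local model is updated on its own local feature vectors.
    local_params = [train(p, d) for p, d in zip(local_params, local_datasets)]
    # Step 4: the global model is updated from the local weights only.
    global_params = integrate(local_params)
    # Dependent claim: local models re-train from the updated global weights.
    local_params = [train(global_params, d) for d in local_datasets]
    return global_params, local_params
```

Note that only weight parameters cross the boundary between `federated_round` and the local datasets; the raw local samples are never passed to the integration step.
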
The method of claim 1,

wherein the global model

consists of any one of a support vector machine (SVM), a convolutional neural network (CNN), and a recurrent neural network (RNN).
The method of claim 3,

wherein the local model

is composed of the same neural network as the neural network of the global model.
The method of claim 3,

wherein the global model is

configured as a support vector machine if the number of data items in the global training data is less than a preset number, and

configured as a fully-connected neural network (fully-connected NN), in which a fully-connected layer is connected to the last end of a convolutional neural network or a recurrent neural network, if the number of data items in the global training data is greater than or equal to the preset number.
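
The size-based selection in the claim above can be sketched as a small dispatch function. This is an illustrative sketch only: the function name `select_global_model`, the returned labels, and the `sequential` flag are hypothetical. The claim does not say how to choose between the convolutional and the recurrent backbone, so the flag below simply assumes that the choice follows the type of the data.

```python
def select_global_model(num_samples, preset_count, sequential=False):
    """Return a label for the global model family to configure.

    num_samples  -- number of data items in the global training data
    preset_count -- the preset number the claim compares against
    sequential   -- hypothetical flag: True for time-series-like data
    """
    if num_samples < preset_count:
        # Few samples: fall back to a support vector machine.
        return "svm"
    # Enough samples: CNN or RNN backbone with a fully-connected layer on top.
    return "rnn+fc" if sequential else "cnn+fc"
```

For example, with a hypothetical preset number of 1000, `select_global_model(200, 1000)` selects the support vector machine, while `select_global_model(1000, 1000)` selects a convolutional backbone with a fully-connected layer, because the boundary case is "greater than or equal to".
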
The method of claim 1,

wherein updating the weight parameter of the local model comprises

extracting the feature vector using a convolutional neural network if the pre-stored local training data is brain image data, or extracting the feature vector using a recurrent neural network if the pre-stored local training data is time series data.
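
The data-type dispatch in the claim above can be sketched with two deliberately tiny stand-ins for the extractors. These are assumptions for illustration, not the networks of the invention: `conv_features` is a single fixed 2x2 filter with ReLU and global average pooling, `recurrent_features` is a one-unit tanh recurrence with fixed weights, and the `kind` labels are hypothetical.

```python
import math


def conv_features(image, kernel):
    """One 'valid' 2-D filter pass, ReLU, then global average pooling."""
    kh, kw = len(kernel), len(kernel[0])
    activations = []
    for i in range(len(image) - kh + 1):
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            activations.append(max(0.0, s))  # ReLU
    return [sum(activations) / len(activations)]


def recurrent_features(series, w_in=0.5, w_rec=0.5):
    """One-unit recurrence; the final hidden state is the feature vector."""
    h = 0.0
    for x in series:
        h = math.tanh(w_in * x + w_rec * h)
    return [h]


def extract_features(data, kind):
    """Pick the extractor from the type of the local training data."""
    if kind == "image":
        return conv_features(data, kernel=[[1.0, -1.0], [1.0, -1.0]])
    if kind == "time_series":
        return recurrent_features(data)
    raise ValueError(f"unsupported data kind: {kind!r}")
```

In a real system the filter and recurrence weights would themselves be learned; the sketch only shows that both branches yield a feature vector that the subsequent weight update can consume.
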
The method of claim 1,

wherein updating the weight parameter of the local model comprises

updating the weight parameter of each of a plurality of pre-trained local models using an individual feature vector for each of the plurality of local models.
The method of claim 1,

wherein updating the weight parameter of the local model comprises

updating the weight parameter of the pre-trained local model based on the extracted feature vector using stochastic gradient descent.
The method of claim 8,

wherein updating the weight parameter of the local model comprises

updating a weight and a bias by the gradient of the loss in the stochastic gradient descent, scaled by a step size indicating the learning rate.
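
The update recited in the claim above corresponds to the standard stochastic gradient descent step. The symbols are assumptions introduced here for illustration and do not appear in the claims: $w$ and $b$ are the weight and bias of a local model, $L$ is its loss on a sample or mini-batch of extracted feature vectors, and $\eta$ is the step size (learning rate).

```latex
w \leftarrow w - \eta \, \frac{\partial L}{\partial w},
\qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}
```

The minus sign reflects the usual convention of stepping against the gradient so that the loss decreases; the claim itself states only that the update amount is the gradient scaled by the step size.
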
The method of claim 1,

wherein updating the weight parameter of the global model comprises

individually receiving the updated weight parameter of each of a plurality of local models, and updating the weight parameter of the pre-trained global model by integrating the individually received weight parameters of the plurality of local models.
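
The claim above says that the received local weight parameters are "integrated" but does not fix the rule. One common choice, shown here purely as an assumption, is an average weighted by the number of local samples at each site, in the spirit of federated averaging. The function name `integrate_weighted` and the `(weights, bias)` tuple layout are hypothetical.

```python
def integrate_weighted(local_params, sample_counts):
    """Sample-count-weighted mean of local (weights, bias) pairs.

    local_params  -- list of (weights, bias), one entry per local model
    sample_counts -- number of local training samples behind each entry
    """
    if len(local_params) != len(sample_counts):
        raise ValueError("one sample count is required per local model")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(local_params[0][0])
    weights = [
        sum(n * p[0][k] for p, n in zip(local_params, sample_counts)) / total
        for k in range(dim)
    ]
    bias = sum(n * p[1] for p, n in zip(local_params, sample_counts)) / total
    return weights, bias
```

With two sites holding 1 and 3 samples, the second site's parameters count three times as much: `integrate_weighted([([1.0, 2.0], 0.0), ([3.0, 4.0], 1.0)], [1, 3])` gives weights `[2.5, 3.5]` and bias `0.75`. Setting all counts equal reduces this to a plain mean.
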
An artificial intelligence federated learning system for diagnosing depression, comprising:

a global federated learning device that pre-trains a global model using global training data; and

a local federated learning device that pre-trains a local model based on a weight parameter of the pre-trained global model, and updates a weight parameter of the pre-trained local model using a feature vector extracted from pre-stored local training data,

wherein the global federated learning device updates the weight parameter of the pre-trained global model based on the updated weight parameter of the local model.
The system of claim 11,

wherein the local federated learning device

re-trains the local model based on the updated weight parameter of the global model.
The system of claim 11,

wherein the global model

consists of any one of a support vector machine (SVM), a convolutional neural network (CNN), and a recurrent neural network (RNN), and

the local model

is composed of the same neural network as the neural network of the global model.
The system of claim 13,

wherein the global model is

configured as a support vector machine if the number of data items in the global training data is less than a preset number, and configured as a fully-connected neural network (fully-connected NN), in which a fully-connected layer is connected to the last end of a convolutional neural network or a recurrent neural network, if the number of data items in the global training data is greater than or equal to the preset number.
The system of claim 11,

wherein the local federated learning device

extracts the feature vector using a convolutional neural network if the pre-stored local training data is image data, or extracts the feature vector using a recurrent neural network if the pre-stored local training data is time series data.
The system of claim 11,

wherein the local federated learning device

updates the weight parameter of each of a plurality of pre-trained local models using an individual feature vector for each of the plurality of local models.
The system of claim 11,

wherein the local federated learning device

updates the weight parameter of the pre-trained local model based on the extracted feature vector using stochastic gradient descent.
The system of claim 17,

wherein the local federated learning device

updates a weight and a bias by the gradient of the loss in the stochastic gradient descent, scaled by a step size indicating the learning rate.
The system of claim 11,

wherein the global federated learning device

individually receives the updated weight parameter of each of a plurality of local models, and updates the weight parameter of the pre-trained global model by integrating the individually received weight parameters of the plurality of local models.
A non-transitory computer-readable storage medium for storing instructions that, when executed by a processor, cause the processor to execute a method, the method comprising:

pre-training a global model using global training data;

pre-training a local model based on the weight parameters of the pre-trained global model;

updating a weight parameter of the pre-trained local model using a feature vector extracted from pre-stored local training data; and

updating a weight parameter of the pre-trained global model based on the updated weight parameter of the local model.