CN115965782A

CN115965782A - Communication-efficient federated learning method for semantic segmentation of small-sample medical images

Info

Publication number: CN115965782A
Application number: CN202211612180.XA
Authority: CN
Inventors: 赵萌; 吴俊鹏; 石凡; 张欣鹏; 陈胜勇; 薛超
Original assignee: Tianjin University of Technology; Tiandy Technologies Co Ltd
Current assignee: Tianjin University of Technology; Tiandy Technologies Co Ltd
Priority date: 2022-12-14
Filing date: 2022-12-14
Publication date: 2023-04-14

Abstract

The invention provides a communication-efficient federated learning method for semantic segmentation of small-sample medical images. The client module takes the partitioned data set as input and performs local training to obtain a segmentation result, then transmits the resulting local prototypes to the server module. The server module concatenates the received local prototypes to generate a global prototype, which serves as the communication medium and is sent to each client in the next round of communication to compute a contrastive loss. The local training of the clients is thereby continuously regularized, so that an optimal segmentation result is obtained. The invention is suitable for multi-modal medical data, such as MRI or CT images of organs such as the liver, kidney and spleen, and can be used for image-level semantic segmentation of these organs.

Description

Communication-efficient federated learning method for semantic segmentation of small-sample medical images

Technical Field

The invention relates to a communication-efficient federated learning method for semantic segmentation of small-sample medical images, and belongs to the field of computer vision and medical image processing.

Background

Computer vision and artificial intelligence are widely used in medical image processing, but medical data are harder to acquire and scarcer than natural image data. Automatic segmentation can be achieved with deep convolutional neural networks, which are mostly trained in a fully supervised manner on large amounts of labeled data. However, differences in image acquisition between medical devices and hospitals mean that sufficient expert annotations are often unavailable in practice, and models trained this way do not generalize to new, unseen classes.

Because large labeled medical data sets are lacking, many medical image segmentation tasks adopt small-sample training, uploading the data and large models to a central server for centralized training. This approach can achieve good results to a certain extent and is used in clinical experiments and related medical applications such as disease diagnosis. However, medical data are usually real patient data, so centralized training is prone to information leakage during data transmission and to attacks during the centralized training process, both of which ultimately leak patient privacy. Therefore, a federated learning framework for semantic segmentation of small-sample medical images is proposed, which protects patient privacy to a certain extent, reduces the loss of accuracy, and improves the communication efficiency between client and server.

Federated learning allows a client to learn collaboratively with other clients and train a global model without transmitting data. Avoiding data transmission reduces the risk of data leakage. However, most traditional federated learning methods communicate model parameters: the server transmits the global model parameters to initialize the clients, each client trains with these parameters, and after training the clients send their model weights back to the server, which averages them to update the global model parameters. Communicating model parameters in this way is inefficient and costly. Therefore, the abstract feature prototypes extracted by a small-sample segmentation network are used as the communication medium. Each client performs local training and, when training finishes, transmits the resulting feature prototypes to the server, where they are concatenated into a global prototype; the global prototype is sent to each client in the next round of communication to regularize the client's local training. With this communication-efficient federated learning method for small-sample medical image semantic segmentation, both privacy and communication efficiency can be achieved while preserving segmentation performance.

Disclosure of Invention

The invention aims to provide a communication-efficient federated learning method for small-sample medical image semantic segmentation, which avoids the privacy leakage that data transmission may cause and improves the communication efficiency between the server and the clients.

To achieve this purpose, the scheme of the invention is as follows:

Using a general small-sample medical image segmentation network as the basic framework, a federated learning framework suitable for small-sample medical image segmentation is designed, which achieves both privacy and communication efficiency while preserving segmentation performance. The specific steps are as follows:

(1) Acquiring data sets of the MRI and CT modalities, dividing the data sets according to their organ classes, and constructing a heterogeneous scenario;

(2) Constructing a client module, a server module and the communication mode between them, wherein the client module holds the data set;

(3) The local client module trains on the data set, extracts local prototypes, and transmits the extracted local prototypes to the server module;

(4) The server module concatenates the received local prototypes to obtain a global prototype and transmits the global prototype to each client module, where the contrastive loss with the local prototypes is computed, thereby regularizing the local training of the clients.

In step (1), the data sets are divided according to their organ classes as follows: the data samples of the MRI (magnetic resonance imaging) modality and of the CT (computed tomography) modality are each copied into 4 copies, each copy retains only one organ class, and the organ classes of the copies differ from one another, so that a heterogeneous client scenario in which each client has only a single class is constructed for both the MRI modality and the CT modality.
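The partitioning described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: label volumes are represented as nested lists of integer organ ids, the ids 1 to 4 are hypothetical, and the helper names `keep_single_organ` and `build_clients` are introduced here only for the example.

```python
def keep_single_organ(label_volume, organ_id):
    """Return a new label volume (slices x rows x cols) where only
    `organ_id` is kept as foreground (1) and everything else is 0."""
    return [
        [[1 if value == organ_id else 0 for value in row] for row in label_slice]
        for label_slice in label_volume
    ]


def build_clients(samples, organ_ids):
    """Copy the sample list once per organ class; each copy becomes the
    single-class data set of one client."""
    clients = {}
    for organ_id in organ_ids:
        clients[organ_id] = [
            {"image": s["image"], "label": keep_single_organ(s["label"], organ_id)}
            for s in samples
        ]
    return clients


# Toy data: one volume with a single 2x3 slice; organ ids 1..4 are hypothetical.
toy_label = [[[0, 1, 2], [3, 4, 0]]]
samples = [{"image": "volume_0", "label": toy_label}]
clients = build_clients(samples, organ_ids=[1, 2, 3, 4])
```

Each of the four clients sees the same images but a different binary mask, which is what makes the scenario heterogeneous across clients.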

In step (2), constructing the client module, the server module and the communication mode between them mainly comprises the following steps:

(1) The client module takes the partitioned heterogeneous client data set as input, first performs feature extraction on the input with the Deeplab v3 network architecture, and then obtains an abstract class feature prototype through Max Pooling;

(2) The abstract class feature prototype is used to better represent the feature information of the image; the cosine similarity between the abstract class feature prototype and the feature f_θ(x_q) of the image to be segmented is then computed, followed by the softmax probability score, to finally obtain the segmentation result; where x_q denotes the image to be segmented and f_θ(·) denotes the feature extraction network Deeplab v3;

(3) The abstract class feature prototypes obtained by each client are averaged to obtain a local prototype, which serves as the communication medium between the client module and the server module;

(4) The server module concatenates the local prototypes of the clients to obtain a global prototype and sends the global prototype to the clients in the next iteration.
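Client steps (1) and (2) can be illustrated with a small pure-Python sketch. It assumes that features are already extracted (here, toy two-channel vectors stand in for Deeplab v3 outputs), that a background prototype is pooled from the unmasked pixels, and that similarities are multiplied by a scale factor before softmax; the background prototype, the `scale` parameter and all function names are assumptions made for the example.

```python
import math


def masked_max_pool(features, mask, keep):
    """Channel-wise maximum over the pixels whose mask value equals `keep`."""
    selected = [f for f, m in zip(features, mask) if m == keep]
    return [max(channel) for channel in zip(*selected)]


def cosine(a, b):
    """Cosine similarity <a, b> / (||a||_2 * ||b||_2)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def segment(query_features, prototypes, scale=20.0):
    """Assign each query pixel the index of its most probable prototype."""
    labels = []
    for f in query_features:
        probs = softmax([scale * cosine(f, p) for p in prototypes])
        labels.append(probs.index(max(probs)))
    return labels


# Toy support features (4 pixels, 2 channels) and their binary mask.
support_features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
support_mask = [1, 1, 0, 0]
fg = masked_max_pool(support_features, support_mask, keep=1)
bg = masked_max_pool(support_features, support_mask, keep=0)

query_features = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.3]]
pred = segment(query_features, [bg, fg])  # index 0 = background, 1 = organ
```

Only the pooled prototype vectors, never the images or masks themselves, would leave the client in the scheme described above.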

In step (3), the local client module trains and extracts local prototypes. The local prototype P_i^(j) extracted by each client module is obtained by average aggregation of its abstract class feature prototypes, where D_{i,j} denotes the number of class-j slice data of the i-th client module; p_k denotes an abstract class feature prototype, obtained by extracting features from each slice with the Deeplab v3 network architecture and then applying Max Pooling; k is the index of the abstract class feature prototype; and n_j is the number of extracted abstract class feature prototypes. The extracted local prototypes are transmitted to the server module.
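One form consistent with the definitions above can be written out as follows. It is a sketch under an assumption, not the patent's exact formula: it takes the local prototype to be the plain mean of the n_j abstract class feature prototypes of class j on client i, and it does not attempt to place D_{i,j} in the expression.

```latex
P_i^{(j)} = \frac{1}{n_j} \sum_{k=1}^{n_j} p_k
```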

In step (4), the server module concatenates the received local prototypes to obtain a global prototype and sends the global prototype to every client module participating in training during the next iteration, where a contrastive loss is computed to regularize the training of the local client module. The local prototypes in the global prototype that belong to the same class as the current client module serve as positive samples, and the local prototypes of classes different from that of the current client module serve as negative samples. A contrastive loss function is computed for each sample pair, in which the cosine similarity of a and b is <a, b> / (||a||_2 ||b||_2), <a, b> denotes the inner product of a and b, ||·||_2 denotes the 2-norm, the positive sample is the global prototype of the same class as p_k, the candidate set is the global prototype containing all classes, B denotes the batch size of the data read each time, τ denotes the temperature coefficient, exp(·) denotes the exponential function, and log(·) denotes the logarithm. The final prototype contrastive loss L_proto is the average contrastive loss over all positive sample pairs, in which an indicator takes the value 0 for a negative sample pair and 1 for a positive sample pair.
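Read together, the definitions above match the usual InfoNCE-style prototype contrast. The following is a sketch of that form, offered as an assumption rather than as the patent's exact formula; the symbols sim, p^+, \mathcal{P}, \mathbb{1} and N^+ (the number of positive pairs in a batch) are introduced here for illustration.

```latex
\mathrm{sim}(a, b) = \frac{\langle a, b \rangle}{\lVert a \rVert_2 \, \lVert b \rVert_2}

\ell(p_k, p^{+}) = -\log
  \frac{\exp\!\big(\mathrm{sim}(p_k, p^{+}) / \tau\big)}
       {\sum_{p \in \mathcal{P}} \exp\!\big(\mathrm{sim}(p_k, p) / \tau\big)}

L_{proto} = \frac{1}{N^{+}} \sum_{k=1}^{B} \sum_{p \in \mathcal{P}}
  \mathbb{1}_{(p_k, p)} \, \ell(p_k, p)
```

Because the indicator is 0 for negative pairs, only the positive pairs contribute to the average.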

The beneficial effects of the invention are as follows: the invention provides a federated learning method for small-sample medical image semantic segmentation that reduces the loss of accuracy while protecting privacy, maintains good communication efficiency and greatly reduces communication cost. First, an extremely heterogeneous scenario is constructed; then a federated learning framework for small-sample medical image segmentation is designed, which reduces the risk of information leakage by avoiding data transmission. The abstract prototypes extracted by the local training of each client serve as the communication medium. The server obtains the global prototype by aggregating the local prototypes transmitted by the clients and sends it, in the next iteration, to the clients participating in training, where the contrastive loss is computed and, together with the segmentation loss of local training, regularizes the local training. The method thus preserves segmentation performance while protecting privacy and improving communication efficiency.

Brief Description of the Drawings

FIG. 1 is a flow chart of the communication-efficient federated learning method for semantic segmentation of small-sample medical images according to the invention;

FIG. 2 is a diagram of the client module, the server module and the communication mode between them.

Detailed Description

As shown in FIG. 1, the method of the invention first obtains preprocessed data sets of the MRI (Magnetic Resonance Imaging) and CT (Computed Tomography) modalities in the 3D nii format. For the MRI modality there are samples from a total of 20 different patients, covering four organ regions (liver, left kidney, right kidney and spleen); 15 samples are used for training and 5 for testing. For the CT modality there are samples of the visceral regions of a total of 30 different patients, covering 13 organ classes; 23 samples are used for training and 7 for testing.

The data samples of the MRI modality and of the CT modality are each copied into multiple copies. Each copy retains only one organ class, and the organ classes of the copies differ, so that an extremely heterogeneous scenario is manually constructed on each of the two data sets, in which every client holds a data set of only one class. In both modality data sets, a single organ class is used as the experimental class, and heterogeneous client scenarios with a single class per client are divided according to the experimental classes in both modalities. For example, the data samples of the MRI modality and of the CT modality may each be copied into 4 copies, giving clients of 4 classes in total.

A communication-efficient federated learning method for semantic segmentation of small-sample medical images is then constructed. It is designed on the basis of a general small-sample medical image segmentation model, and a client module, a server module and the communication mode between them are constructed. The client module holds the data set and the model parameters and uses them to extract local prototypes and perform local training. The server module is responsible for concatenating the local prototypes extracted by the client modules during local training to obtain a global prototype, and for sending the global prototype to each client in the next iteration to regularize local training, as shown in FIG. 2.
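One communication round between the two modules can be sketched as follows. This is an illustrative outline under assumptions: prototypes are plain lists of floats, the global prototype is represented as a mapping from class name to local prototype (one way to realize the concatenation), and the class names, vectors and function names are hypothetical.

```python
def local_prototype(abstract_prototypes):
    """Element-wise mean of a client's abstract class feature prototypes."""
    count = len(abstract_prototypes)
    return [sum(channel) / count for channel in zip(*abstract_prototypes)]


def server_concatenate(local_prototypes):
    """Collect the clients' local prototypes, ordered by class, into the
    global prototype that is broadcast in the next iteration."""
    return {cls: list(proto) for cls, proto in sorted(local_prototypes.items())}


def payload_floats(global_prototype):
    """Number of floats exchanged when the global prototype is broadcast."""
    return sum(len(proto) for proto in global_prototype.values())


# Hypothetical clients, each holding a single organ class.
client_prototypes = {
    "liver": [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]],
    "spleen": [[0.0, 4.0, 4.0]],
}

# Client side: average locally, then upload only the local prototype.
uploads = {cls: local_prototype(ps) for cls, ps in client_prototypes.items()}

# Server side: concatenate uploads into the global prototype.
global_prototype = server_concatenate(uploads)
```

The payload grows with the number of classes and the prototype dimension rather than with the size of the segmentation network, which is the source of the communication saving claimed for this scheme.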

During local training, each client takes its partitioned heterogeneous client data set as input. Based on a general small-sample medical image segmentation network, after data processing the slices at each depth level of the 3D image are used as the input of the network. Features are extracted from the support image and the query image by the Deeplab v3 network architecture. The mask of the support image and the extracted support features are fed together into the prototype extraction module and processed by Max Pooling (maximum pooling) to obtain the abstract class feature prototype, which better represents the feature information of the image. After the cosine similarity between the abstract class feature prototype and the feature f_θ(x_q) of the image to be segmented and the softmax probability score are computed, the client module obtains the segmentation result, where x_q denotes the image to be segmented and f_θ(·) denotes the feature extraction network Deeplab v3.

Meanwhile, for each client module participating in training, the extracted abstract class feature prototypes are averaged to obtain a local prototype, which is sent to the server module. The local prototype extracted by each client is denoted P_i^(j), where D_{i,j} denotes the number of class-j slice data of the i-th client module; p_k denotes an abstract class feature prototype, obtained by extracting features from each slice with the Deeplab v3 network architecture and then applying Max Pooling; k is the index of the abstract class feature prototype; and n_j is the number of extracted abstract class feature prototypes. The extracted local prototypes are transmitted to the server module and serve as the communication medium between the client module and the server module.

The server module concatenates the received local prototypes to obtain a global prototype and sends the global prototype to every client module participating in training during the next iteration, where the contrastive loss with the local prototypes is computed to regularize the training of the local client module. After a client participating in training receives the global prototype, the local prototypes in the global prototype that belong to the same class as that client serve as positive samples, and the local prototypes of classes different from that of the current client serve as negative samples, from which the contrastive loss is computed. The computed contrastive loss and the segmentation loss of each client's local training are weighted to give the total loss of local training.

First, the contrastive loss function of each sample pair is computed, in which the cosine similarity of a and b is <a, b> / (||a||_2 ||b||_2), <a, b> denotes the inner product of a and b, ||·||_2 denotes the 2-norm, the positive sample is the global prototype of the same class as p_k, the candidate set is the global prototype containing all classes, B denotes the batch size of each read, τ denotes the temperature coefficient, exp(·) denotes the exponential function, and log(·) denotes the logarithm. The final prototype loss L_proto is the average loss over all positive sample pairs, in which an indicator takes the value 0 for a negative sample pair and 1 for a positive sample pair. The local training of the client is regularized jointly by the prototype loss and the segmentation loss of local training.
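The loss computation on the client can be sketched in pure Python. This is a minimal illustration that assumes an InfoNCE-style form for the per-pair loss and a simple weighted sum for the total loss; neither form is confirmed by the text, and the names `prototype_contrast_loss`, `total_loss`, `tau` and `weight`, as well as the toy vectors, are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity <a, b> / (||a||_2 * ||b||_2)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def prototype_contrast_loss(batch_prototypes, batch_classes, global_prototype, tau=0.1):
    """Average, over the batch, of -log softmax of the positive pair's
    similarity against the similarities to all classes (assumed form)."""
    losses = []
    for p_k, cls in zip(batch_prototypes, batch_classes):
        logits = {c: cosine(p_k, g) / tau for c, g in global_prototype.items()}
        top = max(logits.values())
        log_denominator = top + math.log(sum(math.exp(v - top) for v in logits.values()))
        losses.append(log_denominator - logits[cls])
    return sum(losses) / len(losses)


def total_loss(seg_loss, proto_loss, weight=1.0):
    """Weighted combination of segmentation and prototype loss (assumed form)."""
    return seg_loss + weight * proto_loss


# Toy global prototype with two classes.
global_prototype = {"liver": [1.0, 0.0], "spleen": [0.0, 1.0]}
aligned = prototype_contrast_loss([[1.0, 0.0]], ["liver"], global_prototype)
misaligned = prototype_contrast_loss([[0.0, 1.0]], ["liver"], global_prototype)
combined = total_loss(0.5, aligned, weight=0.5)
```

A prototype that points toward its own class entry yields a loss near zero, while one that points toward another class is penalized heavily, which is how the global prototype pulls each client's features toward a shared representation.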

Communicating abstract prototypes between the server and the clients greatly improves communication efficiency, and because an abstract prototype is a one-dimensional tensor, privacy is also protected. The communication-efficient federated learning method for semantic segmentation of small-sample medical images can be applied in real medical research, allows small-sample medical image segmentation to be applied in various fields while protecting patient privacy, and makes a modest contribution to medical research.

It should be noted that the above embodiments are only examples that illustrate the invention and therefore do not limit its scope. Any obvious modification that falls within the technical idea of the invention is within its scope of protection.

Claims

1. A communication-efficient federated learning method for semantic segmentation of small-sample medical images, characterized by comprising the following steps:

(1) Acquiring data sets of the MRI and CT modalities, dividing the data sets according to their organ classes, and constructing a heterogeneous scenario;

(2) Constructing a client module, a server module and the communication mode between them, wherein the client module holds the data set;

(3) The local client module trains on the data set, extracts a local prototype, and transmits the extracted local prototype to the server module;

(4) The server module concatenates the received local prototypes to obtain a global prototype and transmits the global prototype to each client module, where the contrastive loss with the local prototypes is computed, thereby regularizing the local training of the client modules.

2. The communication-efficient federated learning method for semantic segmentation of small-sample medical images as claimed in claim 1, wherein in step (1) the data sets are divided according to their organ classes by copying the data samples of the MRI modality and of the CT modality into 4 copies each, each copy retaining only one organ class and the organ classes of the copies differing from one another, so that a heterogeneous client scenario with only a single class per client is constructed for the MRI modality and the CT modality.

3. The communication-efficient federated learning method for semantic segmentation of small-sample medical images as claimed in claim 1, wherein constructing the client module, the server module and the communication mode between them in step (2) mainly comprises the following steps:

(1) The client module takes the partitioned heterogeneous client data set as input, first performs feature extraction on the input with the Deeplab v3 network architecture, and then obtains an abstract class feature prototype through Max Pooling;

(2) The abstract class feature prototype is used to better represent the feature information of the image; the cosine similarity between the abstract class feature prototype and the feature f_θ(x_q) of the image to be segmented is then computed, followed by the softmax probability score, to finally obtain the segmentation result; where x_q denotes the image to be segmented and f_θ(·) denotes the feature extraction network Deeplab v3;

4. The communication-efficient federated learning method for semantic segmentation of small-sample medical images as claimed in claim 1, wherein in step (3) the local client module trains and extracts local prototypes, and the local prototype P_i^(j) extracted by each client module is obtained by average aggregation of its abstract class feature prototypes, where D_{i,j} denotes the number of class-j slice data of the i-th client module; p_k denotes an abstract class feature prototype, obtained by extracting features from each slice with the Deeplab v3 network architecture and then applying Max Pooling; k is the index of the abstract class feature prototype; and n_j is the number of extracted abstract class feature prototypes; and the extracted local prototypes are transmitted to the server module.

5. The communication-efficient federated learning method for semantic segmentation of small-sample medical images as claimed in claim 1, wherein in step (4) the server module concatenates the received local prototypes to obtain a global prototype and sends the global prototype to every client module participating in training during the next iteration, where a contrastive loss is computed to regularize the training of the local client modules; the local prototypes in the global prototype that belong to the same class as the current client module serve as positive samples, and the local prototypes of classes different from that of the current client module serve as negative samples; a contrastive loss function is computed for each sample pair, in which the cosine similarity of a and b is <a, b> / (||a||_2 ||b||_2), <a, b> denotes the inner product of a and b, ||·||_2 denotes the 2-norm, the positive sample is the global prototype of the same class as p_k, the candidate set is the global prototype containing all classes, B denotes the batch size of each read, τ denotes the temperature coefficient, exp(·) denotes the exponential function, and log(·) denotes the logarithm; and the final prototype contrastive loss L_proto is the average contrastive loss over all positive sample pairs, in which an indicator takes the value 0 for a negative sample pair and 1 for a positive sample pair.