CN114399637A - Federated learning image segmentation method based on model similarity measurement

Info

Publication number: CN114399637A
Application number: CN202111471753.7A
Authority: CN
Inventors: 许京爽; 朱皞罡; 杨汀阳
Original assignee: Beihang University
Current assignee: Beihang University
Priority date: 2021-12-02
Filing date: 2021-12-02
Publication date: 2022-04-26

Abstract

The invention discloses a federated learning image segmentation method based on model similarity measurement, which comprises the following steps. Step one: selecting a U-Net network architecture and a DeepLab v3+ network architecture as the network models adopted by each participant during local training. Step two: proposing the concept of model similarity, designing a model similarity measurement algorithm, and applying federated learning to the segmentation of aorta images. Step three: carrying out network training according to the selected loss function interaction mode. Compared with the traditional federated learning method, the method is more robust, converges more clearly, and improves accuracy, smoothness and universality; it also copes better with differing data distributions and with overfitting, offers a new idea for subsequent applications of federated learning in various scenarios, and lays a firmer foundation.

Description

Federated learning image segmentation method based on model similarity measurement

Technical Field

The invention provides a federated learning image segmentation method based on model similarity measurement. It relates to the application of federated learning in image data processing, in particular to the technical field of medical image segmentation, and provides a new model similarity measurement algorithm under the federated learning paradigm to realize image segmentation.

Background

Over more than sixty years of development, deep learning has brought many new opportunities to the medical field. It has been applied to medical auxiliary diagnosis systems, information analysis, medical image recognition, and genetic and medical big data, and has produced notable breakthroughs and innovations in radiology, pathology, genetics and other areas. Deep learning can accomplish many goals and tasks that humans cannot, but obtaining large data sets to learn from is not straightforward. The Cybersecurity Law of the People's Republic of China of 2017 imposes strict controls on the collection and processing of network data, so large amounts of data cannot be obtained by merging data between institutions. In the medical field in particular, patient privacy and security are involved, and the resulting data island problem has become a major obstacle and bottleneck for the development of the field.

Federated learning arose in response to such data island problems. As times change rapidly, people pay more attention to whether their personal information is properly stored and protected from leakage while their own needs are met. In the medical field in particular, the privacy and security of personal data must be considered: beyond the basic need for a diagnosis, a patient also cares whether his or her examination data and image data are protected and whether there is a risk of exposure that would infringe on privacy. This is a very typical data island problem. The patient data and disease types of any single medical unit are limited in number and coverage, but because of privacy and security concerns the data of multiple medical units cannot be pooled through cooperation to obtain a large sample size. The performance of some deep learning methods depends on a large sample size, and the medical field demands high accuracy, so the data island bottleneck is particularly pronounced there.

The objective of federated learning is to build a model that can learn from a distributed data set. The data can be dispersed across different organizations and stored locally at each one, with no exchange of data between them. This avoids the risk of data leakage and protects the privacy and security of the data participants, while still allowing a model for the relevant task to be learned with a training effect comparable to that of a large data set.

With the development of CT imaging technology, aortic CT angiography (CTA) plays a positive role in the diagnosis of common aortic diseases and provides abundant localization data for clinical research. Given this progress, machines can take over more of the manual preliminary work, so many deep learning semantic segmentation models have been applied to image segmentation. Segmenting CT images of the aorta is a meaningful task. The aorta is the largest artery in the human body and is subject to many types of disease, the more common being aortic aneurysm and aortic dissection. For aortic dissection in particular, the incidence is about one in one hundred thousand to one in two hundred thousand per year, so each medical unit sees few cases, and 65%-70% of patients in the acute stage die from cardiac tamponade, arrhythmia and the like; early diagnosis and treatment are therefore essential. A single patient yields a large number of aortic CT images, and segmenting them helps doctors reduce early localization and screening work and devote more attention to detecting, diagnosing and treating disease.

To resolve the data island predicament in medical scenarios, the invention trains deep learning tasks in a federated learning manner and adds a method for measuring model similarity to the traditional federated learning approach, thereby further improving accuracy, smoothness, robustness and universality.

Disclosure of Invention

The invention aims to provide a federated learning image segmentation method based on model similarity measurement, so as to address the accuracy and universality problems of learning from dispersed data in the prior art.

The overall design flow of the method is shown in fig. 2. The application scenario is medical, and the task is aorta image segmentation: the input is a medical image and the expected output is a segmentation map of the aorta. The algorithm applies the idea of federated learning to neural network training and improves on the traditional federated learning approach. Its key point is a new model similarity measurement method based on interactive training, which is the inner loop of the system in fig. 2.

A federated learning image segmentation method based on model similarity measurement comprises the following steps:

Step one: selecting a network structure

A U-Net network architecture and a DeepLab v3+ network architecture are selected as the network models adopted by all participants during local training, wherein U-Net comprises fully convolutional and deconvolutional network layers, and DeepLab v3+ adopts an encoder-decoder structure;

Step two: selecting interaction means between participants

The invention proposes the concept of model similarity, designs a model similarity measurement algorithm, and applies federated learning to the segmentation of aorta images;

The model similarity measurement algorithm measures the similarity of models through the difference between their parameters; by learning the model parameters of the other participants, each model moves closer to the others, which enhances model similarity. The method is mainly embodied in how the loss function is calculated, and the loss function takes the following form:

Here Loss_l is the loss function for training the current model of participant l; the first term of the formula is the cross-entropy loss function; x_i is the local training data of participant l; y_i is the ground truth of that data; α is the parameter of the current model; α_k is the model parameter of the kth participant; μ₁ is a hyper-parameter; and L is the total number of current participants.

For participant l, the above formula has two parts. The first part is the local model training loss, a cross-entropy loss function. The second part is the formula constructed for the key attention region: the function is built in combination with the related task and introduces parameter-difference learning, where α_k is the local training parameter of the kth other participant and α_l is the local training parameter of participant l. The larger the parameter difference, the larger the value of the second part. μ₁ is a coefficient that balances the magnitude of the parameter-difference learning part against the participant's local cross-entropy loss value. During model training, participant l learns the model parameters of the other participants, which enhances model similarity and widens the range of data distributions to which the model adapts.
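One plausible way to write the two-part loss described above is sketched here. This is a hedged illustration only: the squared Euclidean norm, the sum over the other participants, and the notation f(x_i; α) for the current model's output are assumptions, since the text states only that the second part grows with the parameter difference and is weighted by μ₁.

```latex
\mathrm{Loss}_l \;=\; \ell_{\mathrm{CE}}\bigl(f(x_i;\alpha),\, y_i\bigr)
\;+\; \mu_1 \sum_{\substack{k=1 \\ k \neq l}}^{L} \bigl\lVert \alpha - \alpha_k \bigr\rVert_2^2
```

Under this assumed form, setting μ₁ to zero recovers purely local training, and a larger μ₁ pulls the parameters of participant l more strongly toward those of the other participants.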

Preferably, the invention further provides another method for improving adaptive learning, namely a second model similarity measurement method.

Because the objective of federated learning is for each participant's model to adapt to the data distributions of the other participants, the invention also proposes, in addition to measuring model similarity by the difference in model parameters, another new model similarity measurement algorithm: the key attention region part is constructed differently, so that the degree of similarity between models is measured by the difference in their output results. This method focuses more on the data domain itself and aims to bring the distributions of the result data domains closer together. The calculation formula is as follows:

Here f_k is the current model of participant k. Because the model parameters are transmitted and the model architecture is fixed, the model currently trained by participant k can be reconstructed. Participant l can therefore evaluate its own training data on the model of each other participant k, and the cross-entropy between the result of its current model on that data and the result of participant k's model is calculated;
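The output-difference variant can be sketched in the same notation. Again this is only an assumed form: the weight μ₂ is a hypothetical symbol, and the sum over the other participants and the argument order of the second cross-entropy are assumptions; the text states only that a cross-entropy is taken between the current model's result and the result of f_k on the same local training data.

```latex
\mathrm{Loss}_l \;=\; \ell_{\mathrm{CE}}\bigl(f(x_i;\alpha),\, y_i\bigr)
\;+\; \mu_2 \sum_{\substack{k=1 \\ k \neq l}}^{L} \ell_{\mathrm{CE}}\bigl(f(x_i;\alpha),\, f_k(x_i)\bigr)
```

Only the second term differs from the parameter-difference loss: the comparison is made on segmentation outputs rather than on parameter vectors.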

Step three: carrying out deep learning network training according to the selected loss function interaction mode.

Using the network architecture and the loss function interaction mode selected in the first two steps, each participant carries out local training.

The federated learning image segmentation method based on model similarity measurement has the following advantages and effects: compared with the traditional federated learning method, the method is more robust, converges more clearly, and improves accuracy, smoothness and universality; it also copes better with differing data distributions and with overfitting, offers a new idea for subsequent applications of federated learning in various scenarios, and lays a firmer foundation.

Drawings

FIG. 1 is a diagram illustrating a relationship between a research data domain and a parameter domain

FIG. 2 is a schematic diagram of the overall flow of the present invention

FIGS. 3a-d show original aorta images and their labeling results, where FIG. 3a is one frame of an aorta image, FIG. 3b is the aorta region of that frame, FIG. 3c is another frame of an aorta image, and FIG. 3d is the aorta region of that frame

FIG. 4 is a diagram illustrating the DeepLab v3+ segmentation effect according to an embodiment of the present invention

FIG. 5 is a diagram illustrating the U-Net segmentation effect according to an embodiment of the present invention

FIGS. 6a-d show the upper and lower bounds of the training loss trend for each model

FIGS. 7a-d illustrate the training convergence of the data-separated and data-mixed models

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments are described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that numerous technical details are set forth in the embodiments to give a better understanding of the invention; the technical solution claimed can nevertheless be implemented without these details and with various changes and modifications based on the following embodiments.

The federated learning image segmentation method based on model similarity measurement of the invention, as shown in fig. 2, comprises the following steps:

Step one: selecting a network structure

A U-Net network architecture and a DeepLab v3+ network architecture are used. U-Net comprises fully convolutional and deconvolutional network layers, and DeepLab v3+, an image segmentation model, adopts an encoder-decoder structure. Comparing two representative network structures allows better segmentation performance to be achieved for different semantic segmentation problems.

Step two: selecting interaction means between participants

The invention proposes the concept of model similarity and designs two new model similarity measurement algorithms, one of which is selected in use. Combined with the idea of federated learning, each is applied to aorta image segmentation. The result is an image segmentation algorithm based on dispersed, non-aggregated data: the data never leave the local site or are exchanged with other hospitals, which ensures data privacy.

The concept of model similarity is introduced here. Model similarity is a measure of how similar several models are during or after training: the higher the model similarity, the closer the models are to each other; the lower it is, the further apart they are. The two methods for improving adaptive learning proposed here, each with its own model similarity measure, are described below.

The first method measures the similarity of models through the difference between their parameters; by learning the model parameters of the other participants, each model moves closer to the others, which enhances model similarity. The method aims to reduce fluctuation of the model parameters in parameter space, so that they are smooth and stable. It is mainly embodied in how the loss function is calculated and is suitable for various deep learning model frameworks. The loss function takes the following form:

Here Loss_l is the loss function for training the current model of participant l; the first term of the formula is the cross-entropy loss function; x_i is the local training data of participant l; y_i is the ground truth of that data; α is the parameter of the current model; α_k is the model parameter of the kth participant; μ₁ is a hyper-parameter; and L is the total number of current participants.

The second part of the formula, the key attention region part, is built in combination with the related task and introduces parameter-difference learning, where α_k is the local training parameter of the kth other participant and α_l is the local training parameter of participant l. The larger the parameter difference, the larger the value of the second part. μ₁ is a coefficient that balances the magnitude of the parameter-difference learning part against the participant's local cross-entropy loss value. During model training, participant l learns the model parameters of the other participants, which enhances model similarity and widens the range of data distributions to which the model adapts.
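The first method can be illustrated with a minimal pure-Python sketch. It assumes binary (aorta versus background) per-pixel probabilities, flattened parameter vectors, and a squared Euclidean distance summed over the other participants; none of these choices is fixed by the text, and the function names are hypothetical.

```python
import math


def cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between foreground probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def parameter_difference(alpha, others):
    """Sum of squared distances from alpha to each other participant's
    parameters (assumed form of the second part of the loss)."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(alpha, alpha_k))
        for alpha_k in others
    )


def loss_parameter_similarity(probs, labels, alpha, others, mu1):
    """Local cross-entropy plus mu1 times the parameter-difference term."""
    return cross_entropy(probs, labels) + mu1 * parameter_difference(alpha, others)
```

With `mu1 = 0` the function reduces to the local cross-entropy, which corresponds to training each group on its own; increasing `mu1` penalizes distance from the other participants' parameters more heavily.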

The second method measures the degree of similarity between models through the difference in their output results: participant l evaluates its local training data on the current model f_k of each other participant k, and the cross-entropy between its own model's result and the result of f_k is calculated.

Compared with the first model similarity measurement method, the only change is the formula constructed in the second part, the key attention region part: the difference in model output results, rather than the difference in parameters between models, becomes the key to measuring model similarity. The change is easy to understand: the aim of decentralized learning is to learn the data distributions of the other participants, whereas the first similarity measure tends more toward making the final models alike. The second approach, which uses the difference in segmentation results as the measure of model similarity, is therefore believed to learn more of the data distribution knowledge that the invention is aiming for.
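A minimal sketch of the second method follows, under stated assumptions: outputs are per-pixel foreground probabilities, the peer outputs are treated as soft targets in the cross-entropy, the terms are summed over peers, and the weight `mu` is a hypothetical stand-in for whatever coefficient the original formula uses.

```python
import math


def soft_cross_entropy(target, pred, eps=1e-12):
    """Mean binary cross-entropy of pred against targets in [0, 1]
    (hard labels or another model's probabilities)."""
    total = 0.0
    for t, p in zip(target, pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def loss_output_similarity(own_probs, labels, peer_probs_list, mu):
    """Local cross-entropy plus mu times the cross-entropy between this
    model's output and each peer model's output on the same local data."""
    local = soft_cross_entropy(labels, own_probs)
    agreement = sum(soft_cross_entropy(peer, own_probs) for peer in peer_probs_list)
    return local + mu * agreement
```

Here `peer_probs_list` would hold the outputs of each f_k on participant l's own training images, which is possible because the peers' parameters have been transmitted; no image data leaves the participant.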

Step three: network training is performed according to the selected loss function interaction mode

As shown in figs. 3a-d, figs. 3a and 3c are original CT images of the aorta, and figs. 3b and 3d are the ground-truth labeled aorta regions. Besides the aorta region, the original images contain many other organs, which increases the difficulty of labeling.

To simulate a real data island and apply the idea of federated learning during training, the data are distributed among different participants and divided into three groups; the specific numbers are shown in the table below. Each group holds about 6000 images. All groups share the same test data, 5097 CT images, so that the test can verify whether a model generalizes well. Meanwhile, to preserve both the differences and the similarities between data distributions, the data are partitioned by patient: all CT images of the same patient belong to exactly one group, a property that must also be respected when splitting image data.

Table 1 Data distribution quantities
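The patient-level partition described above can be sketched as follows. The greedy size balancing and the `(patient_id, slice_id)` record layout are assumptions made for illustration; the text requires only that all images of one patient fall into a single group.

```python
def split_by_patient(slices, n_groups=3):
    """Assign every slice of a patient to exactly one group.

    slices: iterable of (patient_id, slice_id) pairs.
    Returns a list of n_groups lists of (patient_id, slice_id) pairs.
    """
    by_patient = {}
    for patient_id, slice_id in slices:
        by_patient.setdefault(patient_id, []).append(slice_id)

    groups = [[] for _ in range(n_groups)]
    # Place patients with the most slices first, each into the currently
    # smallest group, so that group sizes stay roughly balanced.
    for patient_id in sorted(by_patient, key=lambda p: (-len(by_patient[p]), p)):
        target = min(range(n_groups), key=lambda g: len(groups[g]))
        groups[target].extend((patient_id, s) for s in by_patient[patient_id])
    return groups
```

Splitting by patient rather than by image keeps adjacent slices of one scan from appearing in more than one group, so each group's distribution reflects its own set of patients.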

Models are trained with both the U-Net and the DeepLab v3+ frameworks. First, each group is trained locally to obtain its best local result. Next, training is carried out with the loss function improvement strategies of formula 1 and formula 2 respectively, giving different results. Finally, an information interaction strategy between participants, namely a model averaging method with a certain strategy, is added to the training process that gave the best result, yielding the training results of all these modes.

The results are shown in figs. 4 and 5, which give the segmentation results of the DeepLab v3+ and U-Net models respectively. The leftmost column is the ground truth, and each row is a different participant, of which there are three. There are four sets of results, A, B, C and D, showing how the four training results perform on the same test data. Result A is the best result each of the three groups achieves with local training only; result B is obtained by training the three groups with the model parameter difference as the measure of model similarity; result C is obtained by training with the model output result difference as the measure of model similarity; and result D is obtained when, on the basis of result C, each participant additionally applies a model averaging method with a certain strategy to its local model.
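The text does not specify the averaging strategy used for result D. As a hedged illustration only, the sketch below shows a plain weighted average of flattened parameter vectors, with uniform weights by default; the function name and the weighting scheme are assumptions.

```python
def average_models(param_sets, weights=None):
    """Element-wise weighted average of several parameter vectors.

    param_sets: list of equal-length lists of floats, one per participant.
    weights: optional per-participant weights; uniform if omitted.
    """
    n = len(param_sets)
    if weights is None:
        weights = [1.0 / n] * n
    total = sum(weights)
    return [
        sum(w * params[i] for w, params in zip(weights, param_sets)) / total
        for i in range(len(param_sets[0]))
    ]
```

For example, `average_models([[1.0, 2.0], [3.0, 4.0]])` gives `[2.0, 3.0]`. Weights proportional to each group's data volume would be one possible non-uniform strategy.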

Looking first along the sequence of results, the segmentation maps on the test data show that for both U-Net and DeepLab v3+ the four training results improve step by step, with group D agreeing best with the original ground truth. Looking across participants, each group is good at different images, and the locally trained models of group A perform differently on the same image, which is caused by the differences between the groups' training data. In the group D experiment, by contrast, the three data groups perform almost identically under both models: as the model averaging interaction is applied repeatedly to each participant's local model, the final models of the groups keep moving closer to each other and generalize to a greater extent to the other data distributions.

The specific results of each training group, including cross-entropy and IoU results, are given in table 2 below. Across the four experiments, DeepLab v3+ performs somewhat better overall than U-Net. With the DeepLab v3+ model, the results of the three data groups improve progressively over the four experiments, and the best final result is that of Group 2 in experiment D. With the model parameter difference as the similarity measure (experiment B), the results under the U-Net model do not completely exceed the local training results of each group in experiment A. Under both models, training with the model output result difference as the similarity measure (experiment C) improves to some extent on each group's local training. Adding parameter interaction between participants on top of experiment C gives experiment D, whose results show that this interaction brings a further positive effect.

TABLE 2 Algorithm results
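For reference, the IoU (intersection over union) metric reported alongside cross-entropy can be computed as in the sketch below. It assumes flattened binary masks and, as a convention chosen here, returns 1.0 when both masks are empty.

```python
def iou(pred_mask, true_mask):
    """Intersection over union of two flattened binary masks."""
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, true_mask) if p or t)
    # Two empty masks agree completely; treat that case as a perfect score.
    return intersection / union if union else 1.0
```

A value of 1.0 means the predicted aorta region coincides exactly with the ground truth, and lower values indicate missed or spurious pixels.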

The embodiment of the invention further analyzes the superiority of two algorithms:

After the four groups of experiments are evaluated on the test set, the training loss of the four experiments during training is analyzed to measure the robustness and superiority of the methods used.

Figs. 6a-d show the fluctuation of the loss value in the four experiments A, B, C and D; error bars are added to make the trend easier to analyze visually. The four experimental settings are the same as in the previous section. The base model is DeepLab v3+; the behaviour on U-Net is similar. Only the results of Group 1 on DeepLab v3+ are shown for illustration, the results of the other groups being similar. Each group is trained for 150 rounds.

As shown in figs. 6a-d, the overall downward trend of the four graphs is similar: the loss falls rapidly in the first 20 training rounds and then converges smoothly. In group A the models are trained independently; training is stable overall, but some fluctuation occurs roughly every 20 rounds and persists late in training. Group B learns between groups using the model similarity measurement algorithm based on the model parameter difference. Interactive learning takes place once every 5 training rounds, and the training loss fluctuates strongly and jitters visibly at each interaction, so the model parameter difference is considered to have poor robustness during interaction. Group C learns using the proposed model similarity measurement algorithm based on the difference in model output results, again interacting every 5 training rounds, and its training loss does not fluctuate strongly during interaction. Group D adds to group C a parameter interaction with a certain strategy, namely the model averaging method, which neutralizes the jitter of each group during the output-difference interaction. Its loss converges very steadily in the middle and late stages of training, more smoothly than group A. Group D therefore shows the best stability, smoothness and convergence over the whole training process.
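The schedule described above, 150 training rounds with an interaction every 5 rounds, can be sketched as a simple loop. The callback signature, the snapshot mechanism and the toy `toward_peers` update are hypothetical stand-ins for real local training; only the round count and the interval come from the text.

```python
def train_with_interaction(local_step, params, total_rounds=150, interval=5):
    """Run local training for every participant and refresh the shared
    parameter snapshots once every `interval` rounds.

    local_step(l, own, peers) returns the updated parameters of participant l.
    """
    snapshot = [list(p) for p in params]  # parameters last shared by each participant
    exchanges = 0
    for round_no in range(1, total_rounds + 1):
        params = [
            local_step(l, params[l], [snapshot[k] for k in range(len(params)) if k != l])
            for l in range(len(params))
        ]
        if round_no % interval == 0:
            snapshot = [list(p) for p in params]  # participants exchange parameters
            exchanges += 1
    return params, exchanges


def toward_peers(l, own, peers, rate=0.1):
    """Toy local step: move a fraction of the way toward the peers' mean."""
    mean = [sum(column) / len(peers) for column in zip(*peers)]
    return [o + rate * (m - o) for o, m in zip(own, mean)]
```

Between exchanges each participant sees only the last snapshot of its peers, so with the default settings the parameters are exchanged 30 times over the 150 rounds.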

All three groups are trained separately and locally on both the DeepLab v3+ and U-Net models. The loss convergence during training is shown as the data-separated training case in figs. 7a-d; the results on the DeepLab v3+ model are taken as the example here, and the results on the U-Net model are consistent with them. In addition, the three groups of data are merged into a new group containing all the data of the three groups. This all-data group serves as a control: its degree of convergence during training is taken to match that of a neural network trained on a large data set, and its convergence is shown as group all in fig. 7d.

Comparing the convergence of the all-data group with that of the three separated groups: the three locally trained groups converge fastest during rounds 0-10, slow gradually during rounds 10-20, and level off after round 20, with occasional jitter. The all-data group converges more slowly than the separately trained groups during rounds 0-10 and faster during rounds 10-20, then gradually flattens after round 20 with slight jitter. The results show that after 30-40 rounds of training the data-separated groups and the all-data group can be considered to have reached the same degree of convergence, matching the convergence of a neural network trained on a large data set.

The invention mainly proposes a novel model similarity measurement algorithm that uses federated learning for interaction among participants. An aorta region segmentation experiment is carried out in an image segmentation scenario with real data islands, and comparison with traditional methods verifies the effectiveness, superiority and universality of the model.

Claims

1. A federated learning image segmentation method based on model similarity measurement, characterized in that the method comprises the following steps:

step one: selecting a network structure;

step two: selecting interaction means between participants

proposing the concept of model similarity, designing a model similarity measurement algorithm, and applying it to aorta image segmentation in combination with the idea of federated learning;

the model similarity measurement algorithm measures the similarity of models through the difference between their parameters, and each model approaches the other models by learning the model parameters of the other participants, so as to enhance model similarity; the method is mainly embodied in how the loss function is calculated, and the calculation formula of the loss function is as follows:

here Loss_l is the loss function for training the current model of participant l; the first term of the formula is the cross-entropy loss function; x_i is the local training data of participant l; y_i is the ground truth of that data; α is the parameter of the current model; α_k is the model parameter of the kth other participant; μ₁ is a hyper-parameter; L is the total number of current participants; α_l is the local training parameter of participant l; and the larger the parameter difference, the greater the value of the parameter-difference term;

step three: and carrying out network training according to the selected loss function interaction mode.

2. A federated learning image segmentation method based on model similarity measurement, characterized in that the method comprises the following steps:

step one: selecting a network structure;

step two: selecting interaction means between participants

proposing the concept of model similarity, designing a model similarity measurement algorithm, and applying it to aorta image segmentation in combination with the idea of federated learning; each model approaches the other models by learning from the models of the other participants, so as to enhance model similarity; the model similarity measurement algorithm measures the degree of similarity between models through the difference in their output results, and the calculation formula of the loss function is as follows:

wherein f_k is the current model of participant k; because the model parameters are transmitted and the model architecture is fixed, the model currently trained by participant k can be obtained, and the cross-entropy between the result of the current model and the result of participant k's model is calculated;