CN114169339A - Medical named entity recognition model training method, recognition method and federated learning system
Medical named entity recognition model training method, recognition method and federated learning system
- Publication number
- CN114169339A (application CN202210131792.0A)
- Authority
- CN
- China
- Prior art keywords
- medical
- model
- named entity
- entity recognition
- global model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/186—Templates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention discloses a medical named entity recognition model training method, a medical named entity recognition method and a federated learning system. The training method comprises the following steps: receiving a global model for recognizing medical named entities sent by a central server; training the global model on local annotated medical text data and computing the corresponding gradient data; sending the gradient data to the central server, so that the central server trains the global model on all the gradient data received within the federated learning system to obtain a new global model and, once the new global model has converged, distributes the converged global model; receiving the converged global model; and performing localized fine-tuning on the converged global model based on a local prompt template to form a localized medical named entity recognition model. The technical scheme of the invention protects the privacy of medical data while enabling local personalization of the medical named entity recognition model.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a medical named entity recognition model training method, a medical named entity recognition method and a federal learning system.
Background
Named entity recognition in the medical field (MNER) is a foundation for constructing medical knowledge graphs and medical big data, an important basis for intelligent analysis of medical records and for medical intelligence in general, and a key technology for applications such as medical record structuring, medical knowledge graph construction and medical record retrieval; it therefore has important value for applications such as intelligent medicine and auxiliary diagnosis.
Existing medical named entity recognition technology mainly depends on large-scale annotated data. However, because medical data is private, acquiring a large amount of annotated medical named entity data is difficult and costly, and the medical professionals able to produce high-quality annotations are scarce. Moreover, transferring data to an external server creates data privacy risks, so medical institutions are unwilling to share their data with other institutions. In addition, medical data is strongly personalized: different regions and hospitals describe some diseases and symptoms differently, and a single uniform model cannot handle this localization well. In summary, traditional deep learning methods have difficulty achieving ideal performance on the medical named entity recognition task.
Disclosure of Invention
In order to solve the problems in the prior art that medical data cannot easily be shared and that model learning does not take the localization of medical vocabulary descriptions into account, the invention provides the following technical scheme.
The invention provides a medical named entity recognition model training method in a first aspect, which comprises the following steps:
receiving a global model which is sent by a central server and used for identifying medical named entities;
training the global model based on local medical text labeling data, and calculating to obtain corresponding gradient data;
sending the gradient data to the central server, so that the central server trains the global model based on all the gradient data received within the federated learning system to obtain a new global model and, if the new global model has currently converged, distributes the converged global model;
receiving the converged global model from the central server in the federated learning system;
and performing localized fine-tuning on the converged global model based on a local prompt template to form a localized medical named entity recognition model.
Preferably, before performing the localized fine-tuning on the converged global model based on the local prompt template, the method further includes:
receiving localized medical example sentence data;
and automatically generating a prompt template corresponding to the localized medical example sentence data based on a preset template generation model.
Preferably, the sending the gradient data to the central server includes:
and encrypting the gradient data through a preset encryption algorithm, and sending the encrypted gradient data to the central server.
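The node-side round summarized above (receive the global model, train on local annotated text, compute gradient data, return it to the central server) can be sketched as follows. This is a minimal PyTorch sketch under assumed interfaces: the model, data loader and loss function are placeholders, and the upload/encryption step is only indicated, not the claimed implementation.

```python
import torch

def local_training_round(global_model, local_loader, loss_fn):
    """One federated round at a medical institution node: forward/backward on the
    locally annotated medical text and collect per-parameter gradient data."""
    global_model.train()
    global_model.zero_grad()
    for tokens, labels in local_loader:        # local medical text labeling data
        logits = global_model(tokens)
        loss = loss_fn(logits, labels)
        loss.backward()                        # gradients accumulate across batches
    # Only the gradient data (never the raw medical records) leaves the institution.
    gradients = {name: p.grad.detach().clone()
                 for name, p in global_model.named_parameters() if p.grad is not None}
    return gradients
```

The returned dictionary is what would be (optionally encrypted and) uploaded to the central server in the next step.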
The invention provides a medical named entity recognition model training method in a second aspect, which comprises the following steps:
distributing a global model for recognizing medical named entities to each medical institution node in a federated learning system, so that each medical institution node trains the global model based on its own local annotated medical text data and computes its corresponding gradient data;
receiving gradient data respectively sent by each medical institution node, and training the global model based on each gradient data to obtain a new global model;
if the new global model has currently converged, distributing the converged global model to each medical institution node in the federated learning system, so that each medical institution node performs localized fine-tuning on the converged global model based on its corresponding local prompt template to form a localized medical named entity recognition model applicable to that node.
Preferably, the method further comprises: if the new global model has not yet converged, redistributing the global model to all medical institution nodes in the federated learning system and training it based on the gradient data sent back by the medical institution nodes to obtain a new global model, repeating until the new global model converges.
Preferably, before the distributing the global model for identifying medical named entities to each medical institution node in the federal learning system, the method further comprises:
training a sequence labeling model on preset annotated medical text data to obtain a corresponding initial global model for recognizing medical named entities, and taking the initial global model as the current global model.
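The server-side cycle of this second aspect (distribute, collect gradients, update, test for convergence) can be sketched as below. The plain gradient-averaging rule, the loss-based convergence test and the node interface are illustrative assumptions, not the patented aggregation scheme.

```python
import torch

def aggregate_and_update(global_model, node_gradients, lr=1e-3):
    """Average the gradient data returned by the medical institution nodes and
    apply one update step to the global model."""
    with torch.no_grad():
        for name, param in global_model.named_parameters():
            stacked = torch.stack([grads[name] for grads in node_gradients])
            param -= lr * stacked.mean(dim=0)       # simple federated gradient averaging
    return global_model

def federated_training(global_model, nodes, max_rounds=100, tol=1e-4):
    """Repeat distribute/collect/update rounds until the global model converges."""
    previous_loss = float("inf")
    for _ in range(max_rounds):
        node_grads = [node.compute_gradients(global_model) for node in nodes]   # hypothetical node API
        global_model = aggregate_and_update(global_model, node_grads)
        loss = sum(node.evaluate(global_model) for node in nodes) / len(nodes)  # hypothetical node API
        if abs(previous_loss - loss) < tol:         # treat a small loss change as convergence
            return global_model                     # the converged model is then distributed to the nodes
        previous_loss = loss
    return global_model
```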
The third aspect of the invention provides a medical named entity identification method, which comprises the following steps:
acquiring medical data;
inputting the medical data into a localized medical named entity recognition model so that the localized medical named entity recognition model outputs a medical named entity recognition result corresponding to the medical data;
the localization medical named entity recognition model is obtained in advance based on the medical named entity recognition model training method in the first aspect.
The fourth aspect of the present invention provides a medical named entity recognition model training device, including:
a first receiving module, configured to receive a global model for recognizing medical named entities sent by a central server;
the gradient calculation module is used for training the global model based on local medical text labeling data and calculating to obtain corresponding gradient data;
the feedback module is used for sending the gradient data to the central server so that the central server trains the global model based on each gradient data received by the federal learning system to obtain a new global model, and if the new global model is converged currently, the converged global model is distributed;
a second receiving module for receiving the converged global model from the central server in the federated learning system;
and the fine tuning module is used for carrying out localization fine tuning processing on the converged global model based on a local prompt template so as to form a localization medical named entity recognition model.
A fifth aspect of the present invention provides a medical named entity recognition apparatus, comprising:
the acquisition module is used for acquiring medical data;
the input module is used for inputting the medical data into a localized medical named entity recognition model so as to enable the localized medical named entity recognition model to output a medical named entity recognition result corresponding to the medical data;
the localized medical named entity recognition model is obtained in advance based on the medical named entity recognition model training method of the first aspect.
In still another aspect, the invention provides a federated learning system, which comprises a central server and a plurality of medical institution nodes,
the central server is configured to execute the medical named entity recognition model training method according to the second aspect;
the medical institution node is configured to execute the medical named entity recognition model training method according to the first aspect, and is further configured to execute the medical named entity recognition method according to the third aspect.
Yet another aspect of the present invention provides an electronic device comprising a processor and a memory, the memory storing a plurality of instructions, the processor being configured to read the instructions and execute the method of the first, second or third aspect.
Yet another aspect of the present invention provides a computer readable storage medium storing a plurality of instructions readable by a processor and performing the method of the first, second or third aspect.
The invention has the following beneficial effects. Training the medical named entity recognition model under a federated learning framework guarantees the security of the data distributed across different medical institutions and avoids the risk of data leakage caused by data transmission. While solving the data privacy problem, the approach can fully mine the commonalities of data among different medical institutions; and through an interactive Prompt template design framework, locally deployed medical personnel can more easily understand and use the models, the output of the models can be better controlled and anticipated, and the models can more accurately recognize and describe locally personalized medical entities.
Drawings
FIG. 1 is a flow chart of a localized medical named entity recognition model training method according to a first embodiment of the present invention.
FIG. 2 is a block diagram of a federated learning framework for the training of a localized medical named entity recognition model in accordance with the present invention.
FIG. 3 is a schematic diagram of the localized medical named entity recognition logic according to the present invention.
FIG. 4 is a flowchart of a localized medical named entity recognition model training method according to a second embodiment of the invention.
Fig. 5 is a flowchart of a localized medical named entity identification method according to the present invention.
FIG. 6 is a flowchart of a localized medical named entity recognition model training method based on federated learning according to the present invention.
FIG. 7 is a schematic diagram of a localized recognition logic of the medical named entity recognition model according to the present invention.
Detailed Description
For a better understanding of the above technical solutions, detailed descriptions are provided below in conjunction with the accompanying drawings and specific embodiments.
The method provided by the invention can be implemented in the following terminal environment, and the terminal can comprise one or more of the following components: a processor, a memory, and a display screen. Wherein the memory has stored therein at least one instruction that is loaded and executed by the processor to implement the methods described in the embodiments described below.
A processor may include one or more processing cores. The processor connects the various parts of the terminal using various interfaces and lines, and performs the functions of the terminal and processes data by running or executing the instructions, programs, code sets or instruction sets stored in the memory and by calling data stored in the memory.
The memory may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory may be used to store instructions, programs, code sets or instruction sets.
The display screen is used for displaying user interfaces of all the application programs.
In addition, those skilled in the art will appreciate that the above-described terminal configurations are not intended to be limiting, and that the terminal may include more or fewer components, or some components may be combined, or a different arrangement of components. For example, the terminal further includes a radio frequency circuit, an input unit, a sensor, an audio circuit, a power supply, and other components, which are not described herein again.
The invention provides a medical named entity recognition method and system based on federated learning and Prompt-NER. The system uses federated learning to solve the problem that medical data cannot be shared for model learning, and uses the Prompt method to handle the localization of medical vocabulary descriptions. The method can fully exploit the data distributed across different medical institutions without sharing the data, and train a general model with better performance. Through distributed deployment, the gradient of each node is obtained and transmitted back to the central server, which uses the gradients to learn and update the model; in this way the privacy and security of the data are guaranteed. After a medical institution downloads the latest model, a template is automatically generated according to the characteristics of the local data to adapt to local medical term descriptions, thereby achieving localization and taking both data privacy and model personalization into account.
Example one
As shown in FIG. 1, a first aspect of the present invention provides a medical named entity recognition model training method. The method may be executed by a medical institution node, where the medical institution node may refer to an institution that performs model training through a distributed server in the federated learning system, or to that distributed server itself. The method specifically includes:
and S101, receiving the global model which is sent by the central server and used for identifying the medical named entity.
The overall federated learning framework of the present invention is shown, for example, in fig. 2, and includes a central server and distributed servers. The distributed servers correspond to the medical named entity recognition systems deployed in the various medical institutions and perform model training and prediction locally; the central server is responsible for training the initial model, distributing models, and collecting gradients for training.
It is to be understood that the distributed server described above may also be replaced by a client device having data processing capability, that is, the medical institution node mentioned in one or more embodiments of the present application may be a distributed server or a client device. The selection may be specifically performed according to the processing capability of the client device, the limitation of the user usage scenario, and the like. This is not a limitation of the present application.
If all operations are completed in the client device, the client device may further include a processor for performing specific processing of the medical named entity recognition model training.
It is understood that the client device may include any device capable of loading an application and having data processing capability, such as a home service robot, an indoor medical robot, a smart car, a smart phone, a tablet electronic device, a network set-top box, a portable computer, a Personal Digital Assistant (PDA), an in-vehicle device, a smart wearable device, and the like. The smart wearable device may include smart glasses, a smart watch, a smart bracelet, and so on.
In addition, in step S101, the global model may refer to an initial global model obtained by the first training of the central server, or may refer to a new global model generated by the central server after the last training of the global model according to the gradient data fed back by each medical institution node.
S102, training the global model based on local medical text labeling data, and calculating to obtain corresponding gradient data.
The gradient data is used for enabling the central server to carry out updating training on the global model. The converged global model generation process for the central server is shown in fig. 3.
S103, the gradient data are sent to the central server, so that the central server trains the global model based on the gradient data received by the federal learning system to obtain a new global model, and if the new global model is converged currently, the converged global model is distributed. In S103, if the global model is converged currently, the central server distributes the converged global model as result data to each medical institution node in the federal learning system; however, if the global model has not converged, the loop needs to be executed from S101 again until the global model converges.
S104, receiving the converged global model from the central server in the federated learning system. After the medical named entity recognition model on the central server has converged through the preceding training process, the central server can, when distributing the converged global model to each medical institution node, also send content or an identifier indicating that the model has converged. A medical institution node that receives this content or identifier then knows that the current global model is the final result model and can be applied online directly, and downloads the latest model file, which further increases the degree of automation of the whole model training process.
Alternatively, the central server may send only the content or identifier indicating that the model has converged, so that the medical institution nodes that need the model can download it on demand; this further improves the usage experience of the medical institution nodes and avoids wasting resources.
S105, performing localization fine tuning processing on the converged global model based on a local prompt template to form a localization medical named entity recognition model.
The medical institution node fine-tunes the updated global labeling model (also called the global model) using the local data of its distributed server to form a local medical named entity recognition model.
In order to enhance transmission security, when the gradient data are sent to the central server, the gradient data can be encrypted through a preset encryption algorithm, the encrypted gradient data are sent to the central server, and then the central server can decrypt the received encrypted data based on a key corresponding to the pre-obtained encryption algorithm to obtain the gradient data. It is understood that, in order to further improve the security of data transmission, different encryption algorithms may be adopted for different medical institution nodes; and the central server may encrypt a packet corresponding to the global model when transmitting the global model to each medical institution node.
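The patent does not fix a particular encryption algorithm. As one possibility, symmetric encryption of the serialized gradient data might look like the following sketch, which uses the `cryptography` package; the shared-key exchange between node and central server is assumed to happen out of band.

```python
import io
import torch
from cryptography.fernet import Fernet

def encrypt_gradients(gradients, key):
    """Node side: serialize the gradient dict and encrypt it before upload."""
    buffer = io.BytesIO()
    torch.save(gradients, buffer)                  # serialize the tensors to bytes
    return Fernet(key).encrypt(buffer.getvalue())

def decrypt_gradients(ciphertext, key):
    """Central-server side: recover the gradient dict with the matching key."""
    plaintext = Fernet(key).decrypt(ciphertext)
    return torch.load(io.BytesIO(plaintext))

# key = Fernet.generate_key() could be agreed per node, matching the idea of
# using different encryption schemes or keys for different institution nodes.
```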
In an embodiment of the medical named entity recognition model training method executed by the medical institution node, the step S105 may further include the following steps:
s001: localized medical example sentence data is received.
In S001, the medical institution node receives medical example sentences uploaded by a user or sent by another terminal, namely the localized medical example sentence data. The localized medical example sentence data can then be cleaned, screened for invalid data and normalized in format according to preset preprocessing rules, so as to further improve the efficiency and accuracy of prompt template generation in S002.
S002: and automatically generating a prompt template corresponding to the localized medical example sentence data based on a preset template generation model.
It can be understood that the template generation model may be a template generation rule file preset by a user, or may be set by using an existing template generation tool, and the like, specifically according to an actual application situation.
In addition, in a specific implementation manner of S001, in order to further improve convenience of setting a localized medical example sentence for a user, a visual interaction platform may be further constructed in advance, the user may edit or upload localized medical example sentence data on the interaction platform, and then the medical institution node extracts the localized medical example sentence data and the like from the interaction platform.
Meanwhile, based on the visual interaction platform, after the medical institution node generates the prompt template in the S002, the prompt template can be sent to the interaction platform for visual display, so that a user can visually check the adaptation condition of the prompt template, and the application reliability of the prompt template is effectively improved.
Furthermore, the editing function of the interaction platform can also be applied to the prompt template. When the user considers that the prompt template automatically generated by the medical institution node needs to be modified, the user can edit it directly on the interaction platform; after the prompt template is updated, the medical institution node re-extracts it and replaces the originally stored local prompt template, which further improves the application reliability of the prompt template and the user experience.
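One very simple way to realize the automatic template generation of S001 and S002 is sketched below: each localized example sentence with a marked entity mention is turned into a cloze-style prompt. The `[ENTITY]`/`[MASK]` conventions and the rule-based generator are illustrative assumptions; the patent leaves the concrete template generation model open.

```python
def generate_prompt_templates(example_sentences):
    """Turn localized medical example sentences into cloze-style prompt templates.

    Each example is a (sentence, entity_text, entity_type) triple, e.g.
    ("patient reports stomach-pit pain for three days", "stomach-pit pain", "symptom").
    Returns (template, expected_filler) pairs for review or fine-tuning.
    """
    pairs = []
    for sentence, entity, entity_type in example_sentences:
        context = sentence.replace(entity, "[ENTITY]")                 # slot for the local expression
        template = f"{context} [ENTITY] is a kind of [MASK]."          # [MASK] to be filled with the type
        pairs.append((template, entity_type))
    return pairs

# The generated templates can then be pushed to the interaction platform for review and editing.
```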
Different medical institutions usually describe medical nouns, terms and conclusions in different ways. To better adapt to the characteristics of local language use, the disclosed method recognizes medical named entities with a Prompt-based approach and uses the Prompt mechanism to handle the localized and personalized descriptions of different medical institutions, so that the model can better process locally personalized medical entity descriptions.
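As one reading of this Prompt mechanism, the converged encoder can be queried (or further fine-tuned) through a cloze template so that a locally phrased mention is mapped onto a standard entity type. The public checkpoint, the one-character verbalizer and the template wording below are stand-ins for illustration, not the patented configuration.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")     # public stand-in for MedBert
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-chinese")

LABEL_WORDS = {"病": "disease", "症": "symptom", "药": "drug"}      # illustrative one-character verbalizer

def prompt_entity_type(sentence, mention):
    """Score candidate entity types for a localized mention via a cloze prompt."""
    prompt = f"{sentence}，其中{mention}是一种{tokenizer.mask_token}。"
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = mlm(**inputs).logits[0, mask_index]                # vocabulary scores at the [MASK] slot
    scores = {label: logits[0, tokenizer.convert_tokens_to_ids(char)].item()
              for char, label in LABEL_WORDS.items()}
    return max(scores, key=scores.get)
```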
Example two
As shown in FIG. 4, a second aspect of the present invention provides a medical named entity recognition model training method. The method can be executed by a central server, and specifically comprises the following steps:
s201, distributing a global model for identifying the medical named entities to each medical institution node in the federal learning system, so that each medical institution node trains the global model based on local medical text labeling data of each medical institution node, and calculating to obtain corresponding gradient data of each medical institution node.
S202, receiving gradient data respectively sent by each medical institution node, and training the global model based on each gradient data to obtain a new global model.
And S203, if the new global model is converged currently, distributing the converged global model to each medical institution node in the federated learning system, so that each medical institution node performs localized fine tuning processing on the converged global model based on a corresponding local prompt template to form a localized medical named entity recognition model which is suitable for each medical institution node.
The distributed servers correspond to the medical named entity recognition systems deployed in the various medical institutions and perform model training and prediction locally; the central server is responsible for training the initial model, distributing models, and collecting gradients for training. Once the medical named entity recognition model on the central server has converged through the preceding training process, each medical institution node downloads the latest model file and fine-tunes the updated global labeling model with the local data of its distributed server to form a local medical named entity recognition model. In other words, there is a data interaction process between the medical named entity recognition model training method performed by the central server and the medical named entity recognition model training method performed by the medical institution nodes described above.
For example: the central server executes S201 and distributes the initial global model to each medical institution node in the federated learning system, and each medical institution node executes S101 and S102 respectively; each medical institution node then executes S103 to send the obtained gradient data to the central server, and the central server executes S202 to receive the gradient data sent by each medical institution node and trains the initial global model based on the gradient data to generate the current global model. If the global model has converged, the central server executes S203 and each medical institution node executes S104 and S105 respectively. If, however, the central server determines that the current global model has not converged, the global model is taken as the new initial model and execution starts again from S201, until the central server determines at S202 that the generated global model has converged.
In addition, other contents executed by the central server and the medical institution node can also be applied to the interaction process between the central server and the medical institution node, and specific reference is made to the description of the contents executed by the central server and the medical institution node in the application, which is not described herein again.
Different medical institutions usually describe medical nouns, terms and conclusions in different ways. To better adapt to the characteristics of local language use, the disclosed method recognizes medical named entities with a Prompt-based approach and uses the Prompt mechanism to handle the localized and personalized descriptions of different medical institutions, so that the model can better process locally personalized medical entity descriptions.
In a further embodiment, the medical named entity recognition model training method executed by the central server further includes the following steps:
after S202, if the new global model is not currently converged, the non-converged global model is redistributed to each medical institution node in the federal learning system, and the global model is trained based on the gradient data respectively sent by each medical institution node to obtain the new global model until the new global model is currently converged.
Namely: after S202, if the new global model is not converged currently, the unconverged global model is used as the global model to be distributed currently, and then S201 is executed again to enter the iterative process. And (3) continuing to execute the step (S203) until the central server confirms that the new global model is converged after the step (S202) of a certain time in the iteration process, so as to effectively ensure the application reliability of the global model finally distributed to each medical institution node by the central server, and further improve the reliability and the accuracy of online application after the medical institution nodes are subjected to localization processing by adopting the global model.
In a further embodiment, before the global model for recognizing medical named entities is distributed to the medical institution nodes in the federated learning system, a sequence labeling model may be trained on the central server's own annotated medical text data to obtain a corresponding initial global model for recognizing medical named entities, which is then used as the current global model. In a preferred embodiment, a MedBert + Span named entity recognition model is used, where MedBert is a language model pre-trained on large-scale unsupervised medical data and Span refers to a text block. MedBert + Span therefore denotes a named entity recognition model that labels text spans on top of the pre-trained language model MedBert. MedBert may be implemented with Wudao, ERNIE, Pangu, BERT, GPT, T5 or other pre-trained language models. Those skilled in the art will appreciate that the global labeling model can also be implemented with a mainstream named entity recognition model such as BERT + BiLSTM + CRF.
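A minimal span-style tagging head of the kind referred to as MedBert + Span might look as follows. The encoder checkpoint is a public stand-in, since MedBert itself is only described as a medical-domain pre-trained language model, and the start/end classification scheme is one common way to label spans, assumed here for illustration.

```python
import torch
from torch import nn
from transformers import AutoModel

class SpanNER(nn.Module):
    """BERT-style encoder with start/end classifiers over entity types (span labeling)."""

    def __init__(self, num_entity_types, encoder_name="bert-base-chinese"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)       # stand-in for MedBert
        hidden = self.encoder.config.hidden_size
        self.start_head = nn.Linear(hidden, num_entity_types + 1)    # +1 for the "no entity" class
        self.end_head = nn.Linear(hidden, num_entity_types + 1)

    def forward(self, input_ids, attention_mask):
        hidden_states = self.encoder(input_ids=input_ids,
                                     attention_mask=attention_mask).last_hidden_state
        return self.start_head(hidden_states), self.end_head(hidden_states)
```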
EXAMPLE III
As shown in fig. 5, the present invention provides, in a third aspect, a localized medical named entity recognition method. The method is performed by a medical institution node and specifically includes:
s301, acquiring medical data.
In S301, the medical institution node may perform data cleaning, invalid data screening, and data format normalization processing on the medical data based on preset preprocessing rules, so as to further improve the execution efficiency and accuracy of S302.
S302, inputting the medical data into a localized medical named entity recognition model so that the localized medical named entity recognition model outputs a medical named entity recognition result corresponding to the medical data.
The localization medical named entity recognition model is obtained in advance based on the medical named entity recognition model training method of the first aspect.
After S302, the medical institution node may further output the medical named entity recognition result corresponding to the medical data, for example by sending it to the aforementioned visual interaction platform, or directly to a client device held by a user (a doctor of the medical institution and/or the patient corresponding to the medical data), so that the user can view the recognition result promptly and intuitively. The medical institution node can also carry out subsequent analysis, sorting and archiving of medical named entity data according to the recognition result.
The localized medical named entity recognition model may be a Prompt-NER model. When the medical institution node reads medical data to be recognized locally, it performs localized medical named entity recognition on the data using the Prompt-NER model to obtain the medical named entity recognition result.
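At deployment time, the flow of S301 and S302 reduces to a plain inference call on whichever localized model the node holds. The preprocessing step and the `extract_entities` helper in the sketch below are hypothetical placeholders used only to make the data flow concrete.

```python
import torch

def recognize_medical_entities(text, tokenizer, localized_model, extract_entities):
    """S301-S302: feed locally acquired medical text to the localized model and
    return the recognized (entity span, entity type) pairs."""
    cleaned = text.strip()                              # stand-in for the S301 preprocessing rules
    inputs = tokenizer(cleaned, return_tensors="pt")
    with torch.no_grad():
        outputs = localized_model(**inputs)             # model-specific scores or generations
    # extract_entities is a hypothetical decoder turning model outputs into entities.
    return extract_entities(cleaned, outputs)
```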
Example four
As shown in FIG. 6, the present invention further provides a localized medical named entity recognition model training method that is executed jointly by the central server and the distributed servers, and specifically includes:
s401, training the medical named entity labeling model at the central server according to preset medical text labeling data to form a global labeling model.
The overall federated learning framework of the present invention is shown in fig. 2 and includes a central server and distributed servers. The distributed servers correspond to the medical named entity recognition systems deployed in the various medical institutions and perform model training and prediction locally; the central server is responsible for training the initial model, distributing models, and collecting gradients for training.
The central server first trains a medical named entity labeling model on the existing annotated data to form a global labeling model. In a preferred embodiment, a MedBert + Span named entity recognition model is used, where MedBert is a language model pre-trained on large-scale unsupervised medical data. MedBert may be implemented with Wudao, ERNIE, Pangu, BERT, GPT, T5 or other pre-trained language models. Those skilled in the art will appreciate that the global labeling model can also be implemented with a mainstream named entity recognition model such as BERT + BiLSTM + CRF.
S402, remotely obtaining the global annotation model through a plurality of distributed servers, calculating gradient information of each distributed server according to the global annotation model, and uploading the calculated gradient information to the central server;
the distributed server, namely the local server of the medical institution is responsible for downloading the model and calculating the local labeled data to obtain the gradient, and uploads the gradient to the central server, and the effect of local named entity recognition is trained and optimized based on a Prompt method. In this step, the servers distributed at the various medical institutions download the trained up-to-date global model from the central server. Then, each distributed server utilizes local training data to predict the model and calculate the gradient, and can further encrypt the gradient through an encryption algorithm and upload the gradient to a central server.
And S403, updating parameters of the global labeling model at the central server based on the gradient information of each distributed server to form an updated global labeling model.
Specifically, after the distributed servers upload the computed gradient information to the central server, the central server aggregates the gradients from the different medical institutions and then updates the parameters to form an updated global labeling model. To obtain a stable model, steps S402 and S403 are repeated until the global labeling model converges. By training the model jointly on the data of different medical institutions using encrypted gradients, the invention avoids the data privacy problem, fully mines the commonalities of the data among different medical institutions, and trains a model capable of recognizing commonly used standard medical named entities.
S404, the updated global labeling model is obtained again through the distributed servers, and fine tuning is carried out on the updated global labeling model through the local medical text training data of each distributed server, so that a local entity recognition model is obtained.
And after the medical named entity recognition model of the central server is converged, the medical institution node downloads the latest model file, and fine-tunes the updated global labeling model by using the local data of the distributed server to form a localized medical named entity recognition model.
S405, optimizing the localization entity recognition model in each distributed server to obtain a Prompt-NER model for localization medical named entity recognition.
Different medical institutions usually describe medical nouns, terms and conclusions in different ways. To better adapt to the characteristics of local language use, the disclosed method recognizes medical named entities with a Prompt-based approach and uses the Prompt mechanism to handle the localized and personalized descriptions of different medical institutions, so that the model can better process locally personalized medical entity descriptions. In a further preferred embodiment, during the optimization of the localized entity recognition model, example sentences uploaded by deployment personnel of the medical institution can be received through a visual interaction platform, and the Prompt template can be generated automatically or manually and then edited by the user. Preferably, the final Prompt template can also be determined by a manual selection among the templates recommended by the system.
Compared with the prior art, the federated-learning-based localized medical named entity recognition model training method guarantees the security of the data held by different medical institutions and avoids the risk of data leakage caused by data transmission, because the medical named entity recognition model is trained under a federated learning framework. While solving the data privacy problem, it can fully mine the commonalities of data among different medical institutions; and through the interactive Prompt template design framework, locally deployed medical personnel can more easily understand and use the models, the output of the models can be better controlled and anticipated, and the models can more accurately recognize and describe locally personalized medical entities.
In practice, referring to fig. 7, after model training is completed, the model finally obtained by a distributed server may include the updated global labeling model (see step S103), for example the MedBert + Span sequence labeling model, and the optimized Prompt-NER model (see step S105), i.e. the Prompt-based generative model. The MedBert + Span model mainly addresses relatively universal medical named entity recognition, while the Prompt-based method mainly addresses locally personalized entity descriptions. For example, when a medical institution has few localized descriptions, i.e. no local data for model fine-tuning, the MedBert + Span model can itself serve as the localized entity recognition model, in which case the distributed server directly applies the updated global labeling model for localized medical named entity recognition. The two methods complement each other, so that medical named entity recognition is achieved more effectively while data leakage caused by data transmission is avoided.
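This complementary use of the two models amounts to a simple dispatch rule at deployment; the availability test used below is an assumption made purely for illustration.

```python
def pick_local_recognizer(global_span_model, prompt_ner_model, has_local_prompt_data):
    """Use the Prompt-NER model when localized descriptions were available for
    fine-tuning; otherwise fall back to the updated global MedBert + Span model."""
    return prompt_ner_model if has_local_prompt_data else global_span_model
```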
EXAMPLE five
Another aspect of the present invention further includes a functional module architecture completely corresponding to and consistent with the medical named entity recognition model training method in the first embodiment, that is, a medical named entity recognition model training apparatus is provided, including:
a first receiving module 501, configured to receive a global model for identifying a medical named entity sent by a central server;
a gradient calculation module 502, configured to train the global model based on local medical text labeling data, and calculate to obtain corresponding gradient data;
a feedback module 503, configured to send the gradient data to the central server, so that the central server trains the global model based on each gradient data received by the federal learning system to obtain a new global model, and if the new global model is currently converged, distributes the converged global model;
a second receiving module 504, configured to receive the converged global model from the central server in the federated learning system;
and a fine-tuning module 505, configured to perform localized fine-tuning processing on the converged global model based on a local prompt template, so as to form a localized medical named entity recognition model.
EXAMPLE six
Another aspect of the present invention further includes a functional module architecture completely corresponding and consistent with the medical named entity recognition method of the third embodiment, that is, a medical named entity recognition apparatus, including:
an obtaining module 601, configured to obtain medical data;
an input module 602, configured to input the medical data into a localized medical named entity recognition model, so that the localized medical named entity recognition model outputs a medical named entity recognition result corresponding to the medical data;
the localization medical named entity recognition model is obtained in advance based on the medical named entity recognition model training method of the first aspect.
EXAMPLE seven
Another aspect of the present invention further includes a functional module architecture completely corresponding to and consistent with the flow of the aforementioned model training method, that is, an embodiment of the present invention further provides a localized medical named entity recognition model training apparatus, including:
the initialization module 701 is used for training the medical named entity labeling model at the central server according to preset labeling data to form a global labeling model;
the distributed feedback module 702 is configured to remotely obtain the global annotation model through a plurality of distributed servers, calculate gradient information of each distributed server according to the global annotation model, and then upload the calculated gradient information to the central server;
an updating module 703, configured to perform parameter updating on the global labeling model at the central server based on the gradient information of each distributed server, so as to form an updated global labeling model;
the localization module 704 is configured to obtain the updated global labeling model again through the plurality of distributed servers, and perform fine tuning on the updated global labeling model by using the local training data of each distributed server to obtain a localization entity identification model;
an optimizing module 705, configured to optimize the localized entity identification model in each distributed server, to obtain a Prompt-NER model for localized medical named entity identification.
The device can be implemented by the localization medical named entity recognition model training method provided in the fourth embodiment, and specific implementation methods can be referred to the description in the fourth embodiment and are not described herein again.
Example eight
As shown in fig. 2, another aspect of the present invention further includes a federated learning system, which includes a central server and a plurality of medical institution nodes, where the medical institution nodes may be distributed servers, and each of the distributed servers is communicatively connected to the central server.
The central server is used for distributing the initial global model for identifying the medical named entity to each distributed server in the federal learning system; receiving gradient data respectively sent by each distributed server, and training the initial global model based on each gradient data until the global model converges; distributing the converged global model to respective distributed servers in the federated learning system;
the distributed servers are used for respectively calculating to obtain respective corresponding gradient data based on respective local medical text labeling data and the initial global model; sending the gradient data to the central server; receiving a converged global model from the central server; carrying out localization fine tuning processing on the global model based on the corresponding local prompt template to form localization medical named entity identification models respectively suitable for the distributed servers; acquiring medical data; and inputting the medical data into the localized medical named entity recognition model so that the localized medical named entity recognition model outputs a medical named entity recognition result corresponding to the medical data.
Example nine
The invention also provides an electronic device, which comprises a processor and a memory, wherein the memory stores a plurality of instructions, and the processor is used for reading the instructions and executing the method of any one of the first to third embodiments.
Example ten
The invention also provides an electronic device comprising a processor and a memory connected with the processor, wherein the memory stores a plurality of instructions which can be loaded and executed by the processor to enable the processor to execute the method of any one of the first to third embodiments.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention. It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (10)
1. A medical named entity recognition model training method is characterized by comprising the following steps:
receiving a global model which is sent by a central server and used for identifying medical named entities;
training the global model based on local medical text labeling data, and calculating to obtain corresponding gradient data;
sending the gradient data to the central server so that the central server trains the global model based on each gradient data received by the federal learning system to obtain a new global model, and distributing the converged global model if the new global model is converged currently;
receiving the converged global model from the central server in the federated learning system;
and carrying out localization fine adjustment processing on the converged global model based on a local prompt template to form a localization medical named entity recognition model.
2. The method according to claim 1, further comprising, before the performing a localized fine-tuning process on the converged global model based on the local prompt template:
receiving a localized medical example sentence;
and automatically generating a prompt template corresponding to the localized medical example sentence based on a preset template generation model.
3. The method of claim 1, wherein sending the gradient data to the central server comprises:
and encrypting the gradient data through a preset encryption algorithm, and sending the encrypted gradient data to the central server.
4. A medical named entity recognition model training method is characterized by comprising the following steps:
distributing a global model for identifying medical named entities to each medical institution node in a federal learning system, so that each medical institution node trains the global model based on local medical text marking data of each medical institution node and calculates to obtain corresponding gradient data of each medical institution node;
receiving gradient data respectively sent by each medical institution node, and training the global model based on each gradient data to obtain a new global model;
if the new global model is converged currently, the converged global model is distributed to each medical institution node in the federated learning system, so that each medical institution node performs localization fine tuning processing on the converged global model based on a corresponding local prompt template to form a localization medical named entity recognition model which is applicable to each medical institution node.
5. The medical named entity recognition model training method of claim 4, further comprising:
if the new global model is not converged currently, the non-converged global model is distributed to all medical institution nodes in the federal learning system again, and the global model is trained based on the gradient data sent by all the medical institution nodes respectively to obtain the new global model until the new global model is converged currently.
6. The medical named entity recognition model training method of claim 4, further comprising, prior to the distributing the global model for recognizing medical named entities to the various medical institution nodes in the federated learning system:
and training the sequence labeling model according to preset medical text labeling data to obtain a corresponding initial global model for identifying the medical named entity, and taking the initial global model as a current global model.
7. A medical named entity recognition method, comprising:
acquiring medical data;
inputting the medical data into a localized medical named entity recognition model so that the localized medical named entity recognition model outputs a medical named entity recognition result corresponding to the medical data;
wherein, the localization medical named entity recognition model is obtained in advance based on the medical named entity recognition model training method of any one of claims 1 to 3.
8. A federated learning system is characterized in that the system comprises a central server and a plurality of medical institution nodes,
the central server is used for executing the medical named entity recognition model training method of any one of claims 4 to 6;
the medical institution node is adapted to perform the medical named entity recognition model training method of any one of claims 1 to 3, and to perform the medical named entity recognition method of claim 7.
9. An electronic device comprising a processor and a memory, the memory storing a plurality of instructions, the processor configured to read the instructions and perform the medical named entity recognition model training method of any one of claims 1 to 3, the medical named entity recognition model training method of any one of claims 4 to 6, or the medical named entity recognition method of claim 7.
10. A computer-readable storage medium storing a plurality of instructions readable by a processor for performing the medical named entity recognition model training method of any one of claims 1 to 3, the medical named entity recognition model training method of any one of claims 4 to 6, or the medical named entity recognition method of claim 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210131792.0A CN114169339B (en) | 2022-02-14 | 2022-02-14 | Medical named entity recognition model training method, recognition method and federal learning system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210131792.0A CN114169339B (en) | 2022-02-14 | 2022-02-14 | Medical named entity recognition model training method, recognition method and federal learning system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114169339A true CN114169339A (en) | 2022-03-11 |
CN114169339B CN114169339B (en) | 2022-05-17 |
Family
ID=80489887
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210131792.0A Active CN114169339B (en) | 2022-02-14 | 2022-02-14 | Medical named entity recognition model training method, recognition method and federal learning system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114169339B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190065460A1 (en) * | 2017-08-31 | 2019-02-28 | Ebay Inc. | Deep hybrid neural network for named entity recognition |
CN113239972A (en) * | 2021-04-19 | 2021-08-10 | 温州医科大学 | Artificial intelligence auxiliary diagnosis model construction system for medical images |
CN113627085A (en) * | 2021-08-20 | 2021-11-09 | 深圳前海微众银行股份有限公司 | Method, apparatus, medium, and program product for optimizing horizontal federated learning modeling |
CN114023412A (en) * | 2021-11-23 | 2022-02-08 | 大连海事大学 | ICD code prediction method and system based on joint learning and denoising mechanism |
CN113901799A (en) * | 2021-12-07 | 2022-01-07 | 苏州浪潮智能科技有限公司 | Model training method, text prediction method, model training device, text prediction device, electronic equipment and medium |
Non-Patent Citations (1)
Title |
---|
SUYU GE et al.: "FedNER: Privacy-preserving Medical Named Entity Recognition with Federated Learning", ARXIV.ORG *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114822866A (en) * | 2022-07-01 | 2022-07-29 | 北京惠每云科技有限公司 | Medical data learning system |
CN114822866B (en) * | 2022-07-01 | 2022-09-02 | 北京惠每云科技有限公司 | Medical data learning system |
CN114943308A (en) * | 2022-07-04 | 2022-08-26 | 北京交通大学 | Data classification method and device based on federal learning |
CN116564535A (en) * | 2023-05-11 | 2023-08-08 | 之江实验室 | Central disease prediction method and device based on local graph information exchange under privacy protection |
CN116564535B (en) * | 2023-05-11 | 2024-02-20 | 之江实验室 | Central disease prediction method and device based on local graph information exchange under privacy protection |
Also Published As
Publication number | Publication date |
---|---|
CN114169339B (en) | 2022-05-17 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |