CN117332784A - Intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning - Google Patents

Intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning

Info

Publication number
CN117332784A
Authority
CN
China
Prior art keywords
learning
attention
learning model
layer
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311278469.7A
Other languages
Chinese (zh)
Inventor
屠静
王亚
苏岳
万晶晶
李伟伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuoshi Future Beijing technology Co ltd
Original Assignee
Zhuoshi Future Beijing technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuoshi Future Beijing technology Co ltd filed Critical Zhuoshi Future Beijing technology Co ltd
Priority to CN202311278469.7A
Publication of CN117332784A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/096Transfer learning
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00ICT specially adapted for the handling or processing of medical references
    • G16H70/40ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Primary Health Care (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Epidemiology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Chemical & Material Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medicinal Chemistry (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Toxicology (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides an intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning, which comprises the following steps: S101, performing data preprocessing to obtain structured data; S102, training an adaptive transfer learning model based on the structured data; S103, performing feature extraction by adopting the trained adaptive transfer learning model; S104, constructing a plurality of hierarchical graph attention networks based on different feature extraction results according to different subtasks, and optimizing the hierarchical graph attention networks by adopting a dynamic meta-learning algorithm; S105, outputting results of different subtasks by adopting the optimized hierarchical graph attention networks. The method accurately captures complex knowledge structures, flexibly realizes inter-domain knowledge transfer and adaptation, and efficiently responds to dynamic environment changes, so it has wide application prospects and practical value in the field of intelligent knowledge enhancement and provides a powerful and flexible tool for solving practical problems.

Description

Intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning
Technical Field
The invention relates to the technical field of knowledge-enhanced learning, and in particular to an intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning.
Background
Today, with the rapid development of big data and artificial intelligence technology, knowledge acquisition, representation and enhancement are becoming important research directions. In particular, in fields such as medicine, finance and education, how to accurately and efficiently extract knowledge from huge data sources, and how to enhance and optimize that knowledge through intelligent algorithms, are currently important challenges. The graph neural network (GNN), transfer learning and meta-learning techniques currently employed in isolation generally suffer from the following problems when enhancing and optimizing such knowledge: (1) Graph neural network (GNN): the graph neural network is a powerful structured-data learning framework that has emerged in recent years. It can capture complex relations between objects, but the traditional GNN lacks support for hierarchical structure; some methods use a GNN to learn relation and entity representations in a knowledge graph, but they usually attend only to a static graph structure and neglect the hierarchy and dynamics of knowledge. (2) Transfer learning: transfer learning aims to transfer knowledge from one field to another and reduces the workload of manual feature engineering. Although some domain-adaptive transfer learning schemes exist, many existing methods cannot adapt well to the characteristics of different fields; they mainly focus on specific tasks, such as image classification or text analysis, and lack versatility and extensibility. (3) Meta-learning: meta-learning is a method of learning how to learn. While it can provide flexibility between tasks, conventional meta-learning schemes often lack adaptability to dynamic environments, are often complex, require a large number of manual adjustments, and are not directly combined with knowledge representation and enhancement.
Disclosure of Invention
The invention aims to provide an intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning, so as to solve the problems set out in the background art.
The invention is realized by the following technical scheme: an intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning, the method comprising the following steps:
s101, acquiring general original data in any main field, and preprocessing the data to obtain structured data;
s102, training an adaptive transfer learning model based on the structured data;
s103, performing feature extraction by adopting a self-adaptive transfer learning model after training based on subtasks related to the main field;
s104, constructing a plurality of hierarchical graph attention networks based on different feature extraction results according to different subtasks, and optimizing the hierarchical graph attention networks by adopting a dynamic element learning algorithm;
s105, outputting results of different subtasks by adopting the optimized hierarchical graph attention network.
Optionally, the data preprocessing process includes: inputting the relevant raw data into a named entity recognition model, which changes the raw data into a structured representation, wherein the named entity recognition model comprises an embedding layer, a self-attention mechanism layer, a multi-head attention layer and an output layer.
Optionally, the adaptive transfer learning model includes a BERT language pre-training layer, a bidirectional long short-term memory (BiLSTM) layer, a conditional random field (CRF) layer and a transfer learning module. The BERT language pre-training layer is used to vectorize the structured data and convert it into a machine-readable form; the BiLSTM layer is used to further process the vectorized data and extract vectorized feature data; the CRF layer is used to decode the output of the BiLSTM layer to obtain a predicted labeling sequence; and the transfer learning module is used to transfer the parameters of the adaptive transfer learning model trained on general raw data to a new model in a specific target domain.
Optionally, based on the subtasks related to the main field, performing feature extraction by adopting the trained adaptive transfer learning model specifically comprises:
based on a subtask related to the main field, obtaining structured data of the specific target domain related to the subtask, and constructing a deep transfer learning model based on the target domain;
transferring the training parameters of the BERT language pre-training layer in the adaptive transfer learning model to the target-domain-based deep transfer learning model to perform word embedding on the input structured data of the specific target domain, obtaining each word vector in all sentences;
transferring the training parameters of the BiLSTM layer in the adaptive transfer learning model to the target-domain-based deep transfer learning model, and then inputting the word vectors into that model for training;
and transferring the training parameters of the CRF layer in the adaptive transfer learning model to the target-domain-based deep transfer learning model, which decodes its output to obtain the feature output result.
Optionally, for each subtask, a hierarchical graph attention network related to the subtask is constructed, wherein the nodes in the hierarchical graph attention network represent different object entities in the subtask, and the features output by the target-domain-based deep transfer learning model serve as the attributes of the nodes in the corresponding hierarchical graph attention network.
Optionally, the dynamic meta-learning algorithm adopts a meta-learning scheme based on FOMAML (first-order model-agnostic meta-learning).
Optionally, in the FOMAML-based meta-learning scheme, the parameters of the hierarchical graph attention networks are dynamically adjusted according to the loss functions of different subtasks.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides an intelligent knowledge enhancement method based on hierarchical graph attention and dynamic element learning, which comprises the steps of 1. Capturing and representing the hierarchical structure and complexity of knowledge by adopting a hierarchical graph attention network. Compared with the simple relation mining in the prior art, the method can draw a richer and more accurate knowledge graph. The importance of different layers and relations is reasonably measured by introducing an attention mechanism, so that a real-world complex knowledge structure is reflected more accurately;
2. more flexible cross-domain migration capability: through BERT-based field self-adaptive migration learning, the invention realizes flexible migration and self-adaptation among different fields. The existing migration learning method is poor in effect when large differences exist between the source field and the target field, the characteristic of the embodiment of the application is beneficial to breaking barriers among the fields and promoting knowledge sharing and integration among different fields, and therefore the universality and expansibility of the model are improved; 3. more efficient dynamic adaptation: the meta learning scheme based on FOMAML can be adopted to quickly adapt to the change of task demands and environmental conditions. Compared with the existing static model and complex meta-learning scheme, the scheme not only enables the model to respond to changes in real time, but also reduces the calculation and storage requirements. The method provides an effective solution for maintaining the real-time performance and accuracy of the model in a dynamic and continuously-changing environment, and the advantages enable the method to have wide application prospect and practical value in the field of intelligent knowledge enhancement, and provide a powerful and flexible tool for solving the practical problem.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only preferred embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, exemplary embodiments according to the present invention will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present invention and not all embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described herein. Based on the embodiments of the invention described in the present application, all other embodiments that a person skilled in the art would have without inventive effort shall fall within the scope of the invention.
In the following description, numerous specific details are set forth in order to provide a more thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the invention may be practiced without one or more of these details. In other instances, well-known features have not been described in detail in order to avoid obscuring the invention.
It should be understood that the present invention may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term "and/or" includes any and all combinations of the associated listed items.
In order to provide a thorough understanding of the present invention, detailed structures will be presented in the following description in order to illustrate the technical solutions presented by the present invention. Alternative embodiments of the invention are described in detail below, however, the invention may have other implementations in addition to these detailed descriptions.
Referring to fig. 1, an intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning comprises the following steps:
s101, acquiring general original data in any main field, and preprocessing the data to obtain structured data;
specifically, the data preprocessing process includes: inputting relevant original data into a named entity recognition model, and changing the relevant original data into a structural representation by the named entity recognition model, wherein the named entity recognition model comprises an embedded layer, a self-attention mechanism layer, a multi-head attention layer and an output layer, and the core of the self-attention mechanism layer is to capture the dependency relationship inside a sequence by calculating the relationship between each element and other elements in an input sequence. It relates to the following formula:
Attention(Q,K,V)=softmax(QK^T/sqrt(d_k))*V
Q, K and V represent the query, key and value respectively, and d_k is the dimension of the key. The softmax function converts the attention scores into a probability distribution. The formula determines the weights of the different values through query-key similarity calculations, allowing the model to give different attention to different parts of the input sequence.
Multi-head self-attention means that the self-attention process described above is performed not just once but multiple times in parallel, with each "head" focusing on different information. Each head has its own weight matrices for queries, keys and values, capturing information in different aspects.
Through the multi-head self-attention mechanism, the model can understand the complex relationships between the individual elements of the input sequence. This is particularly important for named entity recognition (NER), because the recognition of an entity often depends on its context. For example, a word may have different meanings in different contexts, and multi-head self-attention can effectively capture these dependencies, improving the accuracy of NER.
Illustratively, the specific steps by which the named entity recognition model transforms the relevant raw data into a structured representation are as follows (a runnable sketch of this pipeline is given after the step list):
(1) Text cleaning: remove irrelevant symbols, punctuation, redundant spaces, etc. from the text and convert it into a normalized form. Let d be one sample in the original data set D and d' the cleaned sample; the process can be expressed as d' = Clean(d), where Clean denotes the cleaning function;
(2) Word segmentation: divide the text into words or phrases as the input units of the model. Let d' be the cleaned sample and w the segmented sample; the process can be expressed as w = Tokenize(d'), where Tokenize denotes the word segmentation function;
(3) Stemming and lemmatization: convert words into their base forms to reduce the number of distinct words the model must process. Let w be the segmented sample and s the stemmed sample; the process can be expressed as s = Stem(w), where Stem denotes the stemming function;
(4) Part-of-speech tagging and word sense disambiguation: determine the part of speech of each word and resolve word-sense ambiguity from context. Let s be the stemmed sample and p the tagged and disambiguated sample; the process can be expressed as p = POS(s), where POS denotes the part-of-speech tagging and word sense disambiguation function;
(5) Word vector construction: convert each word or phrase into a vector in a high-dimensional space so that the model can process the text data. Let p be the tagged and disambiguated sample and v the word-vectorized sample; the process can be expressed as v = Vector(p), where Vector denotes the function constructing word vectors.
In summary, through the above five steps, the named entity recognition model changes the relevant raw data into a structured representation. Within the self-attention layers, the outputs of the multiple heads are combined and processed through a linear layer to form the final output of the layer, which encodes the complex interactions among the parts of the input sequence and provides rich information for the subsequent steps. The data preprocessing step thus effectively captures the internal dependencies of the input sequence through the multi-head self-attention mechanism and provides powerful support for named entity recognition (NER), combining strong attention capability with the diversity of the multi-head mechanism to obtain a deep understanding of the input data. It can be understood that, in the medical field, a Transformer-based multi-head self-attention mechanism is used for named entity recognition (NER) to identify key entities such as diseases, drugs and symptoms.
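For illustration, a minimal sketch of the five preprocessing steps, assuming NLTK-style helpers (the library, the regex cleaning rule and the embedding lookup are assumptions; word sense disambiguation is reduced to plain POS tagging in this sketch):

```python
import re
import nltk                      # assumes punkt / averaged_perceptron_tagger data are installed
from nltk.stem import PorterStemmer

def clean(d: str) -> str:        # d' = Clean(d): strip irrelevant symbols, normalize whitespace
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s.,]", " ", d)).strip().lower()

def tokenize(d_clean: str) -> list[str]:   # w = Tokenize(d'): split text into word units
    return nltk.word_tokenize(d_clean)

def stem(w: list[str]) -> list[str]:       # s = Stem(w): reduce each word to its base form
    stemmer = PorterStemmer()
    return [stemmer.stem(tok) for tok in w]

def pos(s: list[str]):                     # p = POS(s): attach a part-of-speech tag per token
    return nltk.pos_tag(s)

def vectorize(p, embedding: dict):         # v = Vector(p): map each (token, tag) to a vector
    return [embedding.get(tok, embedding["<unk>"]) for tok, _ in p]

# v = Vector(POS(Stem(Tokenize(Clean(d)))))
def preprocess(d: str, embedding: dict):
    return vectorize(pos(stem(tokenize(clean(d)))), embedding)
```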
S102, training an adaptive transfer learning model based on the structured data;
specifically, the self-adaptive migration learning model comprises a Bert language pre-training layer, a bidirectional long-short-term memory network BiLSTM layer, a conditional random field CRF layer and a migration learning module, wherein the Bert language pre-training model is used for vectorizing the structured data and converting the structured data into a machine-readable form, the bidirectional long-short-term memory network BiLSTM layer is used for further processing the vectorized data and extracting vectorized characteristic data, the conditional random field CRF layer is used for decoding an output result of the bidirectional long-short-term memory network BiLSTM to obtain a prediction labeling sequence, and the migration learning module is used for migrating parameters of the self-adaptive migration learning model trained based on general original data to a new model in a specific target field.
Illustratively, the BERT language pre-training layer serves as a feature extractor that transfers learned features from the source task to the target task. The specific process is as follows:
Pre-training stage: BERT is pre-trained on a large amount of unlabeled data to learn a generic language representation.
Fine-tuning stage: fine-tuning is performed for the specific source and target tasks. The fine-tuning process of the model can be expressed by the following formula:
L(θ)=ΣL_t(y_t,f(x_t;θ))
wherein: l (θ) is the total loss function. L_t (y_t, f (x_t; θ)) is the loss function of the target task, where y_t is the target variable, x_t is the input feature, and θ is the model parameter.
Domain adaptation: the domain adaptation is mainly to align domain differences between the source task and the target task, so that knowledge of the source task can be better migrated to the target task. This may involve some domain-aligned techniques, such as domain-invariant feature learning.
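A minimal sketch of such a BERT + BiLSTM + CRF tagger, assuming the HuggingFace transformers and pytorch-crf packages (the package choices, hidden size and checkpoint name are assumptions; the patent names no concrete libraries):

```python
import torch
from transformers import BertModel          # assumption: HuggingFace transformers
from torchcrf import CRF                    # assumption: pytorch-crf package

class BertBiLSTMCRF(torch.nn.Module):
    """BERT pre-training layer -> BiLSTM layer -> CRF decoding layer."""
    def __init__(self, num_labels: int, hidden: int = 256,
                 bert_name: str = "bert-base-chinese"):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)     # vectorizes the structured text
        self.bilstm = torch.nn.LSTM(self.bert.config.hidden_size, hidden,
                                    batch_first=True, bidirectional=True)
        self.emit = torch.nn.Linear(2 * hidden, num_labels)  # per-token emission scores
        self.crf = CRF(num_labels, batch_first=True)         # decodes the labeling sequence

    def forward(self, input_ids, attention_mask, labels=None):
        feats = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        feats, _ = self.bilstm(feats)                        # further processes the vectorized data
        emissions = self.emit(feats)
        if labels is not None:                               # training: CRF negative log-likelihood
            return -self.crf(emissions, labels, mask=attention_mask.bool())
        return self.crf.decode(emissions, mask=attention_mask.bool())  # predicted label sequence
```

Under this sketch, the total loss L(θ) above corresponds to the CRF negative log-likelihood summed over the target-task batches.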
S103, based on the subtasks related to the main field, performing feature extraction by adopting the trained adaptive transfer learning model;
further, the process specifically comprises the following steps:
based on a subtask related to the main field, obtaining structured data of the specific target domain related to the subtask, and constructing a deep transfer learning model based on the target domain;
transferring the training parameters of the BERT language pre-training layer in the adaptive transfer learning model to the target-domain-based deep transfer learning model to perform word embedding on the input structured data of the specific target domain, obtaining each word vector in all sentences;
transferring the training parameters of the BiLSTM layer in the adaptive transfer learning model to the target-domain-based deep transfer learning model, and then inputting the word vectors into that model for training;
and transferring the training parameters of the CRF layer in the adaptive transfer learning model to the target-domain-based deep transfer learning model, which decodes its output to obtain the feature output result. A minimal sketch of this parameter migration follows.
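The sketch below reuses the BertBiLSTMCRF class from the previous sketch (the function name and freeze option are illustrative assumptions, and the source and target label sets are assumed to coincide, since the CRF parameters are transferred directly):

```python
def migrate_parameters(source: BertBiLSTMCRF, target: BertBiLSTMCRF,
                       freeze_bert: bool = True) -> BertBiLSTMCRF:
    # Transfer the BERT pre-training, BiLSTM and CRF parameters trained on the
    # general main-field data into the target-domain deep transfer learning model.
    target.bert.load_state_dict(source.bert.state_dict())
    target.bilstm.load_state_dict(source.bilstm.state_dict())
    target.crf.load_state_dict(source.crf.state_dict())
    # The emission layer is left untouched: its size depends on the target-domain label set.
    if freeze_bert:  # optionally keep the general language representation fixed during fine-tuning
        for p in target.bert.parameters():
            p.requires_grad = False
    return target
```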
S104, constructing a plurality of hierarchical graph attention networks based on different feature extraction results according to different subtasks, and optimizing the hierarchical graph attention networks by adopting a dynamic meta-learning algorithm.
S105, outputting results of different subtasks by adopting the optimized hierarchical graph attention network.
It can be understood that a plurality of hierarchical graph attention networks are constructed based on different feature extraction results according to different subtasks, wherein the nodes in each hierarchical graph attention network represent different object entities in the subtask, and the features output by the target-domain-based deep transfer learning model serve as the attributes of the nodes in the corresponding hierarchical graph attention network.
In the invention, the hierarchical graph attention mechanism adopts multi-layer graph convolution, where each layer captures neighborhood information of a different range. The specific operations are as follows:
attention calculation: the attention weight of each node to its neighbors is calculated as:
α_{ij}=softmax_j(LeakyReLU(a^T[W*x_i||W*x_j]))
wherein α_{ij} is the attention weight of node i with respect to node j, W is a weight matrix that linearly transforms the node features, a is a weight vector used to calculate the importance of a node pair, and || denotes the concatenation operation.
Feature update: the new feature representation for each node is calculated as:
h_i^{(l)}=σ(Σ_jα_{ij}*W*x_j)
wherein h_i^{(l)} is the new feature representation of node i in the l-th layer, σ is a nonlinear activation function such as ReLU, and Σ_j denotes summation over all neighbors j of node i.
Layered structure: through multi-layer graph convolution operations, neighborhood information of progressively wider range can be captured layer by layer, realizing a deep understanding of the whole graph. Through the hierarchical graph attention mechanism, the model can capture the local characteristics of nodes while understanding the global structure of the whole graph. This is critical for many knowledge-driven tasks such as recommendation systems and knowledge graph mining.
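For a single head, the attention and update formulas above correspond to the following sketch (a simplified dense-adjacency illustration; multi-head aggregation is omitted and all names are assumptions):

```python
import torch
import torch.nn.functional as F

class GraphAttentionLayer(torch.nn.Module):
    """One single-head layer of the graph attention mechanism (dense-adjacency sketch)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.W = torch.nn.Linear(in_dim, out_dim, bias=False)  # W: linear transform of node features
        self.a = torch.nn.Parameter(torch.empty(2 * out_dim))  # a: importance vector for node pairs
        torch.nn.init.normal_(self.a, std=0.02)

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) adjacency, assumed to include self-loops
        h = self.W(x)                                          # W * x
        N = h.size(0)
        pairs = torch.cat([h.unsqueeze(1).expand(N, N, -1),    # [W*x_i || W*x_j] for every pair
                           h.unsqueeze(0).expand(N, N, -1)], dim=-1)
        e = F.leaky_relu(pairs @ self.a)                       # LeakyReLU(a^T [W*x_i || W*x_j])
        e = e.masked_fill(adj == 0, float("-inf"))             # attend only to graph neighbors
        alpha = torch.softmax(e, dim=-1)                       # α_ij = softmax_j(...)
        return torch.relu(alpha @ h)                           # h_i^(l) = σ(Σ_j α_ij W x_j)
```

Stacking several such layers yields the layered structure described above: the l-th layer aggregates information from l-hop neighborhoods.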
Illustratively, in the medical field, a multi-level medical graph model includes the following layers:
First layer: the patient individual layer, containing an individual's basic information, medical history, laboratory results, etc.
Second layer: the disease classification layer, classifying and connecting different diseases and symptoms.
Third layer: the treatment protocol layer, containing various drugs, surgeries and other treatment methods.
As a further example, consider a diabetic patient who needs a personalized treatment regimen; the graph model of the hierarchical graph attention network (HGAT) is:
Individual layer: at the individual level, the HGAT first analyzes the patient's personal information, including age, gender, weight, blood glucose level, family history, etc.
Disease classification layer: the HGAT links the patient to the diabetes disease classification and further identifies other underlying diseases or symptoms associated with diabetes.
Treatment plan layer: at this level, the HGAT analyzes the various possible treatment regimens, including medication, diet control, exercise programs, etc., and recommends a personalized treatment regimen based on the patient's personal information and disease classification.
Knowledge enhancement: through the hierarchical attention mechanism, the HGAT can automatically mine useful information from the various layers and use it to augment existing medical knowledge bases for more accurate diagnosis and treatment in the future.
For optimizing the hierarchical graph attention networks with a dynamic meta-learning algorithm, the dynamic meta-learning algorithm in the embodiments of the application adopts a FOMAML-based meta-learning scheme. The aim of meta-learning is to train a model on a series of tasks so that it can quickly adapt to new tasks with a small number of samples and a small number of gradient updates. Assume there is a set of tasks T, where each task T_i is defined by a loss function L_i. The goal of meta-learning is to find a set of initialization parameters θ that can quickly adapt to all tasks.
FOMAML (first-order model-agnostic meta-learning) is a model-agnostic meta-learning algorithm applicable to any differentiable model. Its key point is that only first-order gradient information is used for the meta-update. The specific steps are as follows:
Intra-task update: for each task T_i, the parameters are first adjusted by several gradient updates. Specifically, the gradient is calculated on the training set of task T_i and the parameters are updated:
θ'_i = θ − α ∇_θ L_i(θ)
wherein θ'_i is the parameter updated for task T_i, α is the learning rate, and ∇_θ L_i(θ) is the gradient of the loss function L_i with respect to θ.
Meta-update: the meta-update is then calculated from the updated parameters of all tasks. Specifically, the gradient is calculated on the test set of each task T_i and the meta-parameters are updated:
θ ← θ − β Σ_i ∇ L_i(θ'_i)
wherein β is the meta-learning rate and Σ_i denotes summation over all tasks.
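A minimal first-order sketch of these two update steps (the task objects with support_loss/query_loss callables are assumptions for illustration; this is FOMAML in its plainest form, not the patent's exact training code):

```python
import copy
import torch

def fomaml_step(model: torch.nn.Module, tasks, alpha: float = 0.01,
                beta: float = 0.001, inner_steps: int = 1) -> None:
    """One FOMAML meta-update over a batch of tasks (first-order sketch)."""
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]
    for task in tasks:
        fast = copy.deepcopy(model)                      # θ'_i starts from the shared θ
        inner_opt = torch.optim.SGD(fast.parameters(), lr=alpha)
        for _ in range(inner_steps):                     # intra-task update on the training set
            inner_opt.zero_grad()
            task.support_loss(fast).backward()           # L_i(θ) on the task's training set
            inner_opt.step()                             # θ'_i = θ - α ∇_θ L_i(θ)
        fast.zero_grad()
        task.query_loss(fast).backward()                 # L_i(θ'_i) on the task's test set
        for g, p in zip(meta_grads, fast.parameters()):
            if p.grad is not None:                       # first-order: reuse ∇_{θ'} L_i as-is
                g.add_(p.grad)
    with torch.no_grad():                                # meta-update: θ ← θ - β Σ_i ∇ L_i(θ'_i)
        for p, g in zip(model.parameters(), meta_grads):
            p.add_(g, alpha=-beta)
```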
Illustratively, in the medical field, disease types and treatment regimens are highly diverse, and individual differences among patients are large. A highly flexible and adaptable model is therefore needed to handle different types of medical tasks, such as disease diagnosis, personalized treatment regimen generation and drug recommendation.
Task definition:
task one: disease diagnosis (e.g. cancer diagnosis, diabetes diagnosis, etc.)
Task two: drug recommendation (for specific diseases or symptoms)
Task three: personalized treatment plan generation (e.g., sports and diet plans)
Task four: high risk group prediction (e.g., heart disease high risk patients)
Meta-learning using FOMAML:
First, the model can be pre-trained over multiple tasks. The FOMAML meta-learning scheme is then used to dynamically adjust the model parameters according to the loss functions of the different tasks.
Dynamic task adjustment:
For example, suppose the model is found to have high accuracy on the cancer diagnosis task (task one) but performs poorly on the drug recommendation task (task two). Based on the FOMAML algorithm, the model can automatically adjust to optimize performance on task two without significantly affecting performance on task one.
Task fine tuning and personalization:
The model can be quickly fine-tuned for newly added tasks or patient-specific data. For example, when a personalized treatment regimen is required for a particular patient (task three), the model can be quickly fine-tuned on that patient's data to generate a targeted treatment regimen.
The foregoing description of the preferred embodiments of the invention is not intended to limit the invention; any modification, equivalent replacement, improvement or the like made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (7)

1. An intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning, characterized by comprising the following steps:
s101, acquiring general original data in any main field, and preprocessing the data to obtain structured data;
s102, training an adaptive transfer learning model based on the structured data;
S103, based on the subtasks related to the main field, performing feature extraction by adopting the trained adaptive transfer learning model;
S104, constructing a plurality of hierarchical graph attention networks based on different feature extraction results according to different subtasks, and optimizing the hierarchical graph attention networks by adopting a dynamic meta-learning algorithm;
s105, outputting results of different subtasks by adopting the optimized hierarchical graph attention network.
2. The intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning according to claim 1, wherein the data preprocessing process comprises: inputting the relevant raw data into a named entity recognition model, which changes the raw data into a structured representation, wherein the named entity recognition model comprises an embedding layer, a self-attention mechanism layer, a multi-head attention layer and an output layer.
3. The intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning according to claim 2, wherein the adaptive transfer learning model comprises a BERT language pre-training layer, a bidirectional long short-term memory (BiLSTM) layer, a conditional random field (CRF) layer and a transfer learning module; the BERT language pre-training layer is used to vectorize the structured data into a machine-readable form, the BiLSTM layer is used to further process the vectorized data and extract vectorized feature data, the CRF layer is used to decode the output of the BiLSTM layer to obtain a predicted labeling sequence, and the transfer learning module is used to transfer the parameters of the adaptive transfer learning model trained on general raw data to a new model in a specific target domain.
4. The intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning according to claim 3, wherein, based on the subtasks related to the main field, performing feature extraction by adopting the trained adaptive transfer learning model specifically comprises:
based on a subtask related to the main field, obtaining structured data of the specific target domain related to the subtask, and constructing a deep transfer learning model based on the target domain;
transferring the training parameters of the BERT language pre-training layer in the adaptive transfer learning model to the target-domain-based deep transfer learning model to perform word embedding on the input structured data of the specific target domain, obtaining each word vector in all sentences;
transferring the training parameters of the BiLSTM layer in the adaptive transfer learning model to the target-domain-based deep transfer learning model, and then inputting the word vectors into that model for training;
and transferring the training parameters of the CRF layer in the adaptive transfer learning model to the target-domain-based deep transfer learning model, which decodes its output to obtain the feature output result.
5. The intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning according to claim 4, wherein, for each subtask, a hierarchical graph attention network related to the subtask is constructed; nodes in the hierarchical graph attention network represent different object entities in the subtask, and the features output by the target-domain-based deep transfer learning model serve as the attributes of the nodes in the corresponding hierarchical graph attention network.
6. The intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning according to claim 1, wherein the dynamic meta-learning algorithm adopts a FOMAML-based meta-learning scheme.
7. The intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning according to claim 6, wherein, in the FOMAML-based meta-learning scheme, the parameters of the hierarchical graph attention networks are dynamically adjusted according to the loss functions of different subtasks.
CN202311278469.7A 2023-09-28 2023-09-28 Intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning Pending CN117332784A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311278469.7A CN117332784A (en) 2023-09-28 2023-09-28 Intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311278469.7A CN117332784A (en) 2023-09-28 2023-09-28 Intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning

Publications (1)

Publication Number Publication Date
CN117332784A true CN117332784A (en) 2024-01-02

Family

ID=89278562

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311278469.7A Pending CN117332784A (en) 2023-09-28 2023-09-28 Intelligent knowledge enhancement method based on hierarchical graph attention and dynamic meta-learning

Country Status (1)

Country Link
CN (1) CN117332784A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111967266A (en) * 2020-09-09 2020-11-20 中国人民解放军国防科技大学 Chinese named entity recognition model and construction method and application thereof
US20220147836A1 (en) * 2020-11-06 2022-05-12 Huazhong University Of Science And Technology Method and device for text-enhanced knowledge graph joint representation learning
US20220198276A1 (en) * 2020-12-17 2022-06-23 Zhejiang Lab Method and platform for pre-trained language model automatic compression based on multilevel knowledge distillation
CN116167378A (en) * 2023-02-16 2023-05-26 广东工业大学 Named entity recognition method and system based on countermeasure migration learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YAN Meiyang et al.: "Dual-stream deep transfer learning with multi-source domain confusion", Journal of Image and Graphics, no. 12, 16 December 2019 (2019-12-16), pages 191-202 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination