CN117476184A - Knowledge service system and service platform for clinical trial of traditional Chinese medicine - Google Patents
- Publication number
- CN117476184A (application CN202311222119.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- unit
- knowledge
- entity
- relationship
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/90—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to alternative medicines, e.g. homeopathy or oriental medicines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H80/00—ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Epidemiology (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Pathology (AREA)
- Human Computer Interaction (AREA)
- Artificial Intelligence (AREA)
- Pharmacology & Pharmacy (AREA)
- Biomedical Technology (AREA)
- Alternative & Traditional Medicine (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a knowledge service system and a service platform for clinical trials of traditional Chinese medicine, relating to the technical field of medical systems. The system comprises a data acquisition layer, a data processing and storage layer, and a data analysis module. The data acquisition layer acquires unstructured data and structured data and comprises an unstructured data acquisition unit and a structured data acquisition unit; the data processing and storage layer processes and stores the acquired unstructured and structured data; the data analysis module comprises a data conversion unit, an entity-relationship joint extraction unit, a knowledge graph unit, a knowledge question-answering unit and an evaluation model recommendation quantization scoring unit. The knowledge service system and platform address the difficulty clinicians face in finding relevant evidence chains and their inability to make rapid, full use of literature data resources, which indirectly wastes those resources.
Description
Technical Field
The invention relates to the technical field of medical systems, in particular to a knowledge service system and a service platform for clinical trials of traditional Chinese medicine.
Background
Traditional Chinese medicine has a unique theoretical system. With the development of modern traditional Chinese medicine, researchers have combined its characteristics with the ideas and methods of evidence-based medicine, establishing its scientific validity, effectiveness and safety by collecting, evaluating and applying the best available research evidence, so as to guide clinical practice and research. Besides the ancient books and the experience of renowned physicians, the clinical research literature of traditional Chinese medicine contains abundant scientific evidence and results, and is an important source of the best research evidence.
However, the volume of relevant literature grows geometrically over time, which makes it difficult for clinicians to find relevant evidence chains. Clinicians cannot make rapid, accurate and full use of literature data resources, so those resources are indirectly wasted. Furthermore, clinical practice literature is not directly translated into medical decisions, which greatly limits the usability of evidence-based literature in clinical decision-making and practice. How to collect, organize and evaluate evidence-based literature, how to support quick retrieval and association analysis, and how to improve the efficiency with which traditional Chinese medicine evidence is translated into practice are therefore problems that need to be solved.
Therefore, the invention aims to provide a knowledge service system and a service platform for clinical trials of traditional Chinese medicine for clinicians to solve the related problems.
Disclosure of Invention
The invention aims to provide a knowledge service system and a knowledge service platform for clinical trials of traditional Chinese medicine, to solve the problem that the accumulation of relevant literature makes it difficult for clinicians to find relevant evidence chains and prevents them from making rapid, full use of literature data resources, which indirectly wastes those resources.
The technical aim of the invention is realized by the following technical scheme: the knowledge service system comprises a data acquisition layer, a data processing and storing layer, a data analysis layer, a user interaction layer and a functional module layer;
the data acquisition layer is used for acquiring unstructured data and structured data and comprises an unstructured data acquisition unit and a structured data acquisition unit;
the data processing and storing layer is used for processing and storing the obtained unstructured data and structured data;
the data analysis module comprises a data conversion unit, an entity-relationship joint extraction unit, a knowledge graph unit, a knowledge question-answering unit and an evaluation model recommendation quantization scoring unit;
the user interaction layer is used for modifying the entity and the relation by the user, is used for cooperation and coordination among the users, and is used for establishing an expert knowledge base and an evaluation knowledge base.
The invention is further provided with: the unstructured data acquisition unit is used for acquiring PDF documents, CAJ documents and DOC documents; the structured data acquisition unit is used for acquiring a domain dictionary.
The invention is further provided with: the data processing and storage layer filters and transcodes the data, and stores the data in the relational database MySQL, the graph database Neo4j and the distributed search engine Elasticsearch.
The invention is further provided with: the data conversion unit is used for carrying out data format conversion on the document entity, and comprises OCR recognition processing, manual correction, data cleaning and text word segmentation;
the entity-relationship joint extraction unit is used for extracting and aligning the entities in the literature data through different extraction models, extracting the relationships in the literature data, and analyzing the correspondence between entities and relationships to obtain the entities that each relationship connects;
the knowledge graph unit is used for constructing a knowledge graph of entities and relationships based on the entities, the relationships and the entity-relationship correspondences obtained by the entity-relationship joint extraction unit;
the knowledge question-answering unit is used for matching the questions raised by the user and ranking the answers based on the constructed knowledge graph;
the evaluation model recommendation quantization scoring unit is used for ranking the search results for a question raised by the user according to their quantitative scores, so that the user can obtain high-quality search results promptly.
The invention also provides a knowledge service platform for the clinical trial of traditional Chinese medicine: the knowledge service platform comprises the following modules:
the knowledge online processing unit is used for online processing of the entity by a user;
the document update management unit is used for updating and managing documents by a user;
the evaluation model recommending unit is used for recommending a result after evaluation according to the quantitative score, namely a high-quality RCT experiment document;
the automatic question-answering unit is used for automatically answering the questions raised by the user;
the data management unit is used by the user to manage literature data, including uploading documents, prompting when an uploaded document has the same name as an existing one, deleting documents and retrieving documents.
In summary, the invention has the following beneficial effects: the knowledge service system and the knowledge service platform for clinical trials of traditional Chinese medicine provided by the invention address the difficulty clinicians face in finding relevant evidence chains and their inability to make rapid, full use of literature data resources, which indirectly wastes those resources.
Drawings
FIG. 1 is a system configuration diagram of a knowledge service system for clinical trials in TCM according to embodiment 1 of the present invention;
FIG. 2 is a schematic diagram showing entity relationship extraction in a knowledge service system for clinical trials in TCM according to embodiment 1 of the present invention;
FIG. 3 is a schematic diagram of data processing in a knowledge service system for clinical trials in TCM according to embodiment 1 of the present invention;
FIG. 4 is a schematic diagram of an automatic question-answering flow in a knowledge service system for clinical trials of traditional Chinese medicine in embodiment 2 of the present invention;
FIG. 5 is a schematic diagram of a user login interface in a knowledge service system for clinical trials in TCM according to embodiment 2 of the present invention;
FIG. 6 is a schematic diagram of a knowledge graph interface in a knowledge service system for clinical trials of TCM according to embodiment 2 of the present invention;
FIG. 7 is a schematic diagram of a knowledge graph interface in a knowledge service system for clinical trials of TCM according to embodiment 2 of the present invention;
FIG. 8 is a schematic diagram of a knowledge graph interface in a knowledge service system for clinical trials of traditional Chinese medicine in embodiment 2 of the present invention;
FIG. 9 is a schematic diagram of a knowledge extraction interface in a knowledge service system for clinical trials in TCM according to embodiment 2 of the present invention;
FIG. 10 is a schematic diagram of a data management interface in a knowledge service system for clinical trials in TCM according to embodiment 2 of the present invention.
Detailed Description
The invention is described in further detail below with reference to FIGS. 1-10.
Example 1: as shown in FIG. 1, the knowledge service platform comprises a data acquisition layer, a data processing and storage layer, a data analysis layer, a user interaction layer and a functional module layer;
the data acquisition layer is used for acquiring unstructured data and structured data and comprises an unstructured data acquisition unit and a structured data acquisition unit; the unstructured data acquisition unit is used for acquiring PDF documents, CAJ documents and DOC documents; the structured data acquisition unit is used for acquiring a domain dictionary.
The data processing and storage layer is used for processing and storing the acquired unstructured data and structured data; it filters and transcodes the data, and stores the data in the relational database MySQL, the graph database Neo4j and the distributed search engine Elasticsearch.
The data analysis module comprises a data conversion unit, an entity-relationship joint extraction unit, a knowledge graph unit, a knowledge question-answering unit and an evaluation model recommendation quantization scoring unit;
the data conversion unit is used for carrying out data format conversion on the document entity, and comprises OCR recognition processing, manual correction, data cleaning and text word segmentation;
in this embodiment, the filtering and transcoding of the data mainly uses end-to-end OCR for image recognition. As shown in FIG. 3, a neural network architecture integrating feature extraction, sequence modeling and transcription is built, and two networks, CNN + RNN, are trained together; the model structure is shown in the following figure. The algorithm naturally handles sequences of any length, requires neither character segmentation nor horizontal scale normalization, performs well on both dictionary-free and dictionary-based scene text recognition tasks, and can recognize Chinese and English text with a recognition accuracy of up to 95%.
The CRNN framework is divided into three parts. The first is the CNN (convolutional neural network) layer, the bottom layer of the framework, which extracts character features from the document images. Document images are first input to the CRNN and preprocessed, typically by scaling them to a specified size, after which CNN feature extraction is performed. A feature-sequence mapping layer then converts the extracted features into an image feature vector sequence for the document and passes it to the recurrent layer. The second part is the RNN (recurrent neural network) layer, which predicts the label sequence. This layer uses a long short-term memory (LSTM) network, whose internal memory module allows it to predict a label for each feature vector, producing a label distribution for every vector in the sequence. Note that the label distribution from this step already contains the information to be predicted. The third part is the CTC transcription layer, which handles decoding: it converts the label distributions predicted by the recurrent layer into a label sequence and outputs the final text.
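The decoding step of the CTC transcription layer can be illustrated with a minimal sketch. It assumes greedy (best-path) decoding and a blank label at index 0; the source does not say which decoding strategy is used, and the toy alphabet and scores below are hypothetical.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label


def ctc_greedy_decode(frame_scores, alphabet):
    """Turn per-frame label scores into text with best-path CTC decoding.

    frame_scores: one list of scores per time step (one score per label).
    alphabet: maps a label index to a character; index BLANK is the blank.
    """
    # 1. Pick the highest-scoring label at every time step.
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # 2. Collapse runs of the same label into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Drop blanks; a blank between two equal labels keeps both of them.
    return "".join(alphabet[label] for label in collapsed if label != BLANK)


# Hypothetical three-label alphabet and six frames of scores.
alphabet = ["-", "a", "b"]
scores = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank
    [0.20, 0.70, 0.10],  # a (kept, separated by the blank)
    [0.10, 0.20, 0.70],  # b
    [0.10, 0.10, 0.80],  # b (repeat, collapsed)
]
print(ctc_greedy_decode(scores, alphabet))  # aab
```

The blank label is what lets the decoder distinguish a genuinely doubled character from one character spread over several frames.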
The entity-relationship joint extraction unit is used for extracting and aligning the entities in the data literature through different extraction models, extracting the relationship in the data literature, and analyzing the relationship between the entities and the relationship to obtain the entity corresponding to the relationship;
in this embodiment, the entity-relationship joint extraction unit operates as follows. First, medical concept entities and evidence-based concept entities are summarized according to the characteristics of the evidence-based traditional Chinese medicine literature and of the field. Second, different entity extraction strategies are designed and implemented for the entity-type characteristics of the different concepts. For medical concept entities with clear boundaries in the traditional Chinese medicine literature, a deep learning model is used to extract entities and relations automatically; for evidence-based concept entities without clear boundaries, a scheme based on pattern matching and an expert lexicon is used.
(1) Entity relationship joint extraction model based on partition filtering encoder (PFN)
First, 9 kinds of entities are defined for RCT literature, such as diseases, symptoms, research groups, prescriptions, Chinese patent medicines, Chinese medicinal materials, western medicines and bureau indexes. To describe accurately the medical knowledge connecting different entities, 7 classes of edges are defined, such as reporting, intervention, joint use, presentation, pharmaceutical composition and off-index. The relationship types of the entities are shown in the following table.
Based on this entity-relation schema, the system uses a PFN model for joint extraction of entities and relations. The PFN model addresses the problem that joint models do not adequately capture the bidirectional interaction between the NER and RE tasks. It performs feature extraction for both tasks through joint encoding in a single encoder and introduces the ideas of partitioning and filtering: it separates the features related only to NER, the features related only to relation prediction, and the features related to both NER and relation prediction. This ensures proper interaction between NER and RE and improves the accuracy of both entity recognition and relation prediction. The model consists of two main parts: the partition filter encoder and the two task units, the NER unit and the RE unit; see FIG. 2.
At each time step, the partition filter encoder decomposes feature encoding into two steps: partitioning and filtering. In the partitioner, neurons are divided into two task partitions and one shared partition. The filter then selects and combines partitions to form task-specific features and shared features, filtering out information that is irrelevant to each task.
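The partition-then-filter idea can be sketched in a few lines. This is an illustration under stated assumptions, not the patented model: it assumes the gates are built from a cumulative softmax (a common way to obtain soft, ordered partitions), it uses fixed input values instead of learned linear layers, and it omits the recurrent cell in which a real encoder would apply these steps. All function names are hypothetical.

```python
import math


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cummax(xs):
    """Cumulative sum of a softmax: a soft gate that rises from 0 towards 1."""
    out, running = [], 0.0
    for p in softmax(xs):
        running += p
        out.append(running)
    return out


def partition(entity_logits, relation_logits):
    """Split neurons into NER-only, RE-only and shared partitions (assumed gating)."""
    e_gate = cummax(entity_logits)                        # entity gate
    r_gate = [1.0 - g for g in cummax(relation_logits)]   # relation gate
    shared = [e * r for e, r in zip(e_gate, r_gate)]      # overlap of both gates
    ner_only = [e - s for e, s in zip(e_gate, shared)]
    re_only = [r - s for r, s in zip(r_gate, shared)]
    return ner_only, re_only, shared


def filter_features(candidate, ner_only, re_only, shared):
    """Combine partitions into task-specific and shared features."""
    ner_feat = [c * (n + s) for c, n, s in zip(candidate, ner_only, shared)]
    re_feat = [c * (r + s) for c, r, s in zip(candidate, re_only, shared)]
    shared_feat = [c * s for c, s in zip(candidate, shared)]
    return ner_feat, re_feat, shared_feat


# Hypothetical gate inputs and candidate features for a 3-neuron cell.
ner_only, re_only, shared = partition([0.5, -1.0, 2.0], [1.0, 0.0, -0.5])
ner_feat, re_feat, shared_feat = filter_features([1.0, -0.5, 0.25], ner_only, re_only, shared)
print(ner_feat, re_feat, shared_feat)
```

Each task sees its own partition plus the shared one, so information that belongs only to the other task is filtered out.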
In the NER unit, the goal is to identify and classify all entity spans in a given sentence. More specifically, the task is treated as a type-specific table filling problem. If the input sentence has length L, the table has size L × L, and position (i, j) in the table holds the NER representation of the span that starts at the i-th position and ends at the j-th position. This representation is the concatenation of the NER features at the i-th and j-th positions and the NER global feature, passed through a linear layer and an ELU activation function.
Like the NER unit, the RE unit is treated as a relation-specific table filling problem. If the sentence has length L, the table has size L × L, and position (i, j) in the table holds the RE representation for the pair of spans whose first words are at the i-th and j-th positions. As in the NER unit, this representation is the concatenation of the RE features at the i-th and j-th positions and the RE global feature, passed through a linear layer and an ELU activation function and then used for multi-label classification.
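The span representation described for the NER unit can be written as follows. The symbol names are assumptions introduced here for illustration: h_i^ner and h_j^ner stand for the NER features at positions i and j, h_g^ner for the NER global feature, and [ ; ; ] for concatenation.

```latex
h^{\mathrm{ner}}_{ij}
  = \mathrm{ELU}\!\left(
      \mathrm{Linear}\!\left(
        \left[\, h^{\mathrm{ner}}_{i} \,;\; h^{\mathrm{ner}}_{j} \,;\; h^{\mathrm{ner}}_{g} \,\right]
      \right)
    \right),
  \qquad 1 \le i \le j \le L
```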
(2) Evidence-based conceptual entity relation extraction scheme based on pattern matching and expert word stock
Rule-based entity extraction requires a domain expert to define rule templates manually, and performs string matching against a knowledge base or feature dictionary to extract entities. The implementation has three steps: first, word segmentation and part-of-speech tagging are performed on the text sequence; next, the segmented entities are stored in a dictionary and a tag sequence for the original text is generated; finally, regular-expression matching of entities is performed using the predefined rule templates. The specific rule extraction patterns are shown in the following table.
the knowledge graph unit is used for constructing a knowledge graph of entities and relationships based on the entities, the relationships and the entity-relationship correspondences obtained by the entity-relationship joint extraction unit;
the knowledge question-answering unit is used for matching the questions raised by the user and ranking the answers based on the constructed knowledge graph;
the evaluation model recommendation quantization scoring unit is used for ranking the search results for a question raised by the user in descending order of quantitative score, so that high-quality results come first and the user can obtain them promptly.
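A minimal sketch of the lexicon-plus-rule approach follows. The lexicon entries, entity labels and regular expressions are hypothetical examples, since the rule table is not reproduced here. The sketch works on English text and skips the word segmentation and part-of-speech tagging steps, which Chinese text would require.

```python
import re

# Hypothetical expert lexicon: surface term -> evidence-based concept label.
EXPERT_LEXICON = {
    "random number table": "RandomizationMethod",
    "double-blind": "Blinding",
}

# Hypothetical rule templates: concept label -> pattern with one capture group.
RULE_TEMPLATES = [
    ("SampleSize", re.compile(r"(\d+)\s+(?:patients|cases|participants)")),
    ("Duration", re.compile(r"(\d+)\s+(?:weeks|months)")),
]


def extract_entities(text):
    """Return (label, value, start offset) tuples in order of appearance."""
    found = []
    lowered = text.lower()
    # Dictionary matching against the expert lexicon.
    for term, label in EXPERT_LEXICON.items():
        start = lowered.find(term)
        if start != -1:
            found.append((label, text[start:start + len(term)], start))
    # Regular-expression matching with the predefined rule templates.
    for label, pattern in RULE_TEMPLATES:
        for match in pattern.finditer(text):
            found.append((label, match.group(1), match.start(1)))
    return sorted(found, key=lambda item: item[2])


sample = "A total of 120 patients were assigned by random number table and treated for 12 weeks."
for label, value, _ in extract_entities(sample):
    print(label, "->", value)
```

Keeping the lexicon and the rule templates as data makes it possible for domain experts to extend them without touching the matching code.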
It should be noted that, the evaluation model recommendation quantization scoring unit decomposes the user questions by natural language processing and other methods based on knowledge extraction and processing results, calculates possible answers corresponding to the user questions, and performs answer recommendation based on score ranking.
The user interaction layer is used for modifying the entity and the relation by the user, is used for cooperation and coordination among the users, and is used for establishing an expert knowledge base and an evaluation knowledge base;
the data management unit is used by the user to manage literature data, including uploading documents, prompting when an uploaded document has the same name as an existing one, deleting documents and retrieving documents.
In this embodiment, the FR (Fruchterman-Reingold) algorithm is used to visualize the knowledge graph in Unity3D, which makes the relationships in the knowledge graph network structure clearer and facilitates analysis by the user.
The core idea of the algorithm is to model the nodes of the graph as atoms, by analogy with the motion of atoms or planets, and to compute the positions of the nodes by simulating the force field between them until the system reaches dynamic equilibrium. The FR algorithm moves the nodes by iteratively updating the forces acting on them. To keep nodes from moving so far that they leave the drawing boundary, the displacement is limited by a simulated annealing schedule. Each iteration is computed in three steps:
(1) Calculating repulsive forces among all nodes;
(2) Calculating node attractive force connected by edges;
(3) Combining the repulsive and attractive forces acting on each node and calculating the optimal distance between two nodes.
W × L is the node drawing area, d is the distance between two nodes, V is the number of nodes, and c is a constant that, through simulated annealing, prevents displacements from going out of bounds.
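The three steps can be sketched in plain Python. This follows the commonly described form of the Fruchterman-Reingold layout, with repulsive force k²/d, attractive force d²/k and ideal distance k = scale · sqrt(area / |V|); the linear cooling schedule, the `scale` parameter and the default values are assumptions, and the embodiment's exact formula is not reproduced here.

```python
import math
import random


def fr_layout(nodes, edges, width=1.0, height=1.0, iterations=50, scale=1.0, seed=0):
    """Force-directed layout; returns {node: [x, y]} inside the drawing area."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, width), rng.uniform(0, height)] for v in nodes}
    k = scale * math.sqrt(width * height / len(nodes))  # ideal node distance
    temperature = width / 10.0                          # maximum step per iteration
    cooling = temperature / (iterations + 1)            # assumed linear cooling

    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}
        # (1) Repulsive forces between all pairs of nodes.
        for i, v in enumerate(nodes):
            for u in nodes[i + 1:]:
                dx, dy = pos[v][0] - pos[u][0], pos[v][1] - pos[u][1]
                d = max(math.hypot(dx, dy), 1e-9)
                force = k * k / d
                disp[v][0] += dx / d * force
                disp[v][1] += dy / d * force
                disp[u][0] -= dx / d * force
                disp[u][1] -= dy / d * force
        # (2) Attractive forces between nodes connected by an edge.
        for v, u in edges:
            dx, dy = pos[v][0] - pos[u][0], pos[v][1] - pos[u][1]
            d = max(math.hypot(dx, dy), 1e-9)
            force = d * d / k
            disp[v][0] -= dx / d * force
            disp[v][1] -= dy / d * force
            disp[u][0] += dx / d * force
            disp[u][1] += dy / d * force
        # (3) Move each node, capped by the temperature and clamped to the area.
        for v in nodes:
            dx, dy = disp[v]
            d = max(math.hypot(dx, dy), 1e-9)
            step = min(d, temperature)
            pos[v][0] = min(width, max(0.0, pos[v][0] + dx / d * step))
            pos[v][1] = min(height, max(0.0, pos[v][1] + dy / d * step))
        temperature -= cooling  # annealing: allow smaller moves over time
    return pos


layout = fr_layout(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
for node, (x, y) in layout.items():
    print(node, round(x, 3), round(y, 3))
```

The shrinking temperature is what keeps late iterations from undoing the structure found early on, and the clamp in step (3) keeps every node inside the W × L area.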
In this embodiment, a visual hierarchical display is implemented on the existing ontology data as required, and ontology data retrieval and interactive display are supported. The ontology visualization component focuses on providing a graphical module that displays the ontology intuitively, dynamically shows the relationships and hierarchy among the concepts of a given ontology, interacts with the user, and can be customized to the user's requirements.
The Neo4j graph database differs from a traditional database in that it has no concept of tables and fields; instead it has entities, relationships and attributes. It uses explicit relationships in place of the implicit relations between tables in a traditional relational database, which makes it easier to express relationships between things in the real world. In Neo4j, entities are linked through explicit relationships, and the designer can freely define the attributes and names of relationships according to actual conditions, without the normal-form requirements of a traditional relational database. The system therefore uses the Neo4j graph database to display the ontology knowledge graph. In Neo4j, Cypher is the dedicated language for operating on the database. The data is stored using Py2neo, a third-party Python package for Neo4j, and the algorithm operation steps are shown in the following table:
Example 2
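As an illustration of storing extracted triples, the sketch below builds a parameterized Cypher `MERGE` statement for one (head, relation, tail) triple. It only constructs the statement and its parameters and does not call any database client; the helper name, the labels and the example triple are hypothetical. Labels and relationship types cannot be passed as Cypher parameters, so they are validated before being interpolated.

```python
import re

# Labels and relationship types must be plain identifiers before interpolation.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def triple_to_cypher(head_label, head_name, rel_type, tail_label, tail_name):
    """Build a Cypher statement that creates a triple if it does not exist yet."""
    for identifier in (head_label, rel_type, tail_label):
        if not _IDENTIFIER.match(identifier):
            raise ValueError(f"invalid label or relationship type: {identifier!r}")
    query = (
        f"MERGE (h:{head_label} {{name: $head}}) "
        f"MERGE (t:{tail_label} {{name: $tail}}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
    # Entity names travel as parameters, never as part of the query text.
    return query, {"head": head_name, "tail": tail_name}


query, params = triple_to_cypher(
    "disease", "diabetic nephropathy", "disease_symptom", "symptom", "edema"
)
print(query)
print(params)
```

Using `MERGE` rather than `CREATE` means re-importing the same document does not duplicate nodes or edges; the returned pair can be handed to whichever Neo4j client the system uses.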
Based on the same concept as Example 1, this embodiment also provides a knowledge service platform for clinical trials of traditional Chinese medicine. The knowledge service platform comprises the following modules: a knowledge online processing unit, used by the user to process entities online; a document update management unit, used by the user to update and manage documents; an evaluation model recommendation unit, used to recommend the results evaluated according to the quantitative score, namely high-quality RCT experiment documents; and an automatic question-answering unit, used to automatically answer questions raised by the user.
The evaluation model recommendation unit builds an evaluation model from the knowledge extraction results of the source documents, using multidimensional features of the source experiment such as integrity, bias, and scientific rigor, and recommends high-quality RCT experiment documents.
In this embodiment, the automatic question-answering unit supports template-based question answering: the user enters a question, the system recognizes and parses it with rules, and an answer is output from the database by keyword matching or other methods. Display of the answer graph and frequency statistics are also supported.
Specifically, the user enters a question in a dialog box and the system passes it to the back end, which processes the question, identifies its type and the entities it contains, constructs a query template, and retrieves the other entity of the matching triplet. The results are sorted by frequency and document quality evaluation so that answers with high weights are output first, and the query results are returned to the front end. The flow is shown in fig. 4.
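The ranking step can be sketched as follows. This is a minimal illustration only: the weight used here (occurrence frequency plus a weighted mean document quality score) is an assumed formula, since the text states only that answers are sorted by frequency and document quality evaluation, and the name `rank_answers` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rank_answers(
    hits: Iterable[Tuple[str, float]], quality_weight: float = 1.0
) -> List[Tuple[str, float]]:
    """Rank candidate answers, highest weight first.

    Each hit is (answer_entity, quality_score_of_supporting_document).
    Assumed weight: frequency + quality_weight * mean quality score.
    """
    freq: Dict[str, int] = defaultdict(int)
    quality_sum: Dict[str, float] = defaultdict(float)
    for answer, score in hits:
        freq[answer] += 1
        quality_sum[answer] += score

    ranked = [
        (answer, freq[answer] + quality_weight * quality_sum[answer] / freq[answer])
        for answer in freq
    ]
    # Sort by descending weight; break ties alphabetically for stable output.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Hypothetical hits: two documents support "edema", one supports "proteinuria".
ranked = rank_answers([("edema", 0.9), ("edema", 0.7), ("proteinuria", 0.8)])
```

Here "edema" is listed first because it is supported by more documents; raising `quality_weight` would let document quality dominate instead.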
Common pattern matching methods include the maximum matching algorithm and regular expression matching, and the rule dictionary can be built by manual summarization or by automatic statistical induction. For example, given the template "how ${symptom}" for diabetic nephropathy and the query "What are the symptoms of diabetic nephropathy?", symptom = "symptom manifestation" is extracted.
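The two matching methods named above can be sketched in a few lines. The regular expression, the slot names, and the small dictionary below are assumptions made for illustration; they are not the rule dictionary of the system.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical English template with two named slots.
TEMPLATE = re.compile(
    r"(?:what|which) are the (?P<symptom>.+?) of (?P<disease>.+?)\??$",
    re.IGNORECASE,
)


def extract_slots(query: str) -> Optional[Dict[str, str]]:
    """Regular expression matching: return the slot values, or None if no match."""
    match = TEMPLATE.match(query.strip())
    return match.groupdict() if match else None


def forward_max_match(text: str, dictionary: Set[str]) -> List[str]:
    """Forward maximum matching: at each position take the longest dictionary entry."""
    max_len = max((len(word) for word in dictionary), default=0)
    found: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                found.append(candidate)
                i += size
                break
        else:
            i += 1  # no entry starts here; advance one character
    return found


slots = extract_slots("Which are the symptoms of diabetic nephropathy?")
terms = forward_max_match(
    "what are the symptoms of diabetic nephropathy?",
    {"diabetic", "diabetic nephropathy", "symptoms"},
)
```

Maximum matching prefers "diabetic nephropathy" over the shorter entry "diabetic", which is why it is commonly used for dictionary-based entity lookup in unsegmented text.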
Examples of query rule templates are shown in the following table:
To find the answer to a question in the evidence-based traditional Chinese medicine knowledge graph, the corresponding query statement must be generated from the question type and the entities it contains. Using the question types defined in the table above, the Cypher statement for a question about the symptoms of a disease can be written as: match (n:disease)-[l:disease_symptom]->(m:symptom) where n.name='{0}' return m.symptom; and the statement for querying a pharmaceutical composition as: match (n:description)-[l:joint composition]->(m:media heres) where n.name='{0}' return m.name.
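A minimal sketch of this query generation step is given below, assuming a keyword list per question type and a list of known disease names (both hypothetical). It reuses the disease-symptom statement from the text but passes the entity name as a query parameter (`$name`) rather than substituting it into '{0}', which is a substitution made here to avoid quoting problems.

```python
from typing import Dict, Iterable, Optional, Tuple

# Question type -> Cypher statement (pattern taken from the text above).
QUERY_TEMPLATES: Dict[str, str] = {
    "disease_symptom": (
        "MATCH (n:disease)-[l:disease_symptom]->(m:symptom) "
        "WHERE n.name = $name RETURN m.symptom"
    ),
}

# Question type -> trigger keywords (assumed for this example).
TYPE_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "disease_symptom": ("symptom", "manifestation"),
}


def build_query(
    question: str, disease_names: Iterable[str]
) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (cypher, parameters) for a recognised question, else None."""
    text = question.lower()
    # Prefer the longest disease name so that more specific entities win.
    entity = next(
        (d for d in sorted(disease_names, key=len, reverse=True) if d.lower() in text),
        None,
    )
    if entity is None:
        return None
    for question_type, keywords in TYPE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return QUERY_TEMPLATES[question_type], {"name": entity}
    return None


result = build_query(
    "What are the symptoms of diabetic nephropathy?", ["diabetic nephropathy"]
)
```

Further question types can be supported by adding one entry to each of the two dictionaries.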
The data management unit is used by the user to manage data documents: uploading documents, prompting when a document with the same name is uploaded, deleting documents, and retrieving documents (as shown in fig. 10).
In this embodiment, the data management unit manages the documents uploaded by users and supports basic functions such as document upload, a prompt on uploading a document with an existing name, document deletion, and document retrieval. Clicking a document opens the entity extraction page for that article. Documents in PDF format can be uploaded; after a user uploads a document, the system automatically reads and processes the data and extracts entities and relationships.
In this embodiment, the system further includes a user login module, used by the user to log in to the knowledge service platform. In the login interface (shown in fig. 5), the user enters the correct user name, password, and verification code and clicks the login button to enter the home page.
In this embodiment, the system further includes a knowledge graph visualization interaction module, which displays the knowledge graph based on the entity-relationship extraction results of all documents in the system (as shown in fig. 6). Knowledge graph search and statistics functions are supported, and nodes are displayed with colors and sizes that depend on entity type and frequency. When a node is clicked, the panel on the right shows statistics for the first-level nodes connected to it, including node names, relationship types, and frequencies.
All entity relationships are extracted with the article as the unit. Repeated entities within one article are merged without being counted again, whereas entities repeated across articles are merged and counted. When the overall graph is merged, study-group entities of different articles are not merged. For prescription and Chinese patent medicine entities, the drug compositions must be compared: entities are merged only when their compositions are completely identical, and are not merged otherwise.
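These merging rules can be sketched as a choice of merge key per entity type. The type names (`study_group`, `prescription`, `chinese_patent_medicine`), the `Entity` structure, and the sample data are assumptions for this example; in particular, treating a prescription's identity as its name together with its composition is one possible reading of the rule.

```python
from collections import Counter
from typing import Dict, FrozenSet, List, NamedTuple, Optional, Tuple


class Entity(NamedTuple):
    type: str
    name: str
    composition: Optional[FrozenSet[str]] = None  # only for medicines


COMPOSITION_TYPES = {"prescription", "chinese_patent_medicine"}


def merge_key(article_id: str, entity: Entity) -> Tuple:
    """Entities with equal keys are merged into one node of the overall graph."""
    if entity.type == "study_group":
        # Study groups are specific to an article and never merged across articles.
        return (entity.type, entity.name, article_id)
    if entity.type in COMPOSITION_TYPES:
        # Medicines merge only when their compositions are identical.
        return (entity.type, entity.name, entity.composition)
    return (entity.type, entity.name)


def merge_articles(articles: Dict[str, List[Entity]]) -> Counter:
    """Return the frequency of each merged entity (number of articles containing it)."""
    frequency: Counter = Counter()
    for article_id, entities in articles.items():
        # A set removes repeats inside one article, so each counts once per article.
        frequency.update({merge_key(article_id, e) for e in entities})
    return frequency


articles = {
    "a1": [
        Entity("disease", "diabetic nephropathy"),
        Entity("disease", "diabetic nephropathy"),
        Entity("study_group", "treatment group"),
        Entity("prescription", "formula X", frozenset({"herb A", "herb B"})),
    ],
    "a2": [
        Entity("disease", "diabetic nephropathy"),
        Entity("study_group", "treatment group"),
        Entity("prescription", "formula X", frozenset({"herb A", "herb C"})),
    ],
}
frequency = merge_articles(articles)
```

With this data the disease is merged into one node with frequency 2, while the two study groups and the two differently composed prescriptions stay separate.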
As shown in fig. 7, when a node in the knowledge graph is clicked, the panel on the right shows statistics for the first-level nodes connected to it, including node names, relationship types, and frequencies.
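The statistics shown for a clicked node can be computed directly from the list of triples. The sketch below assumes each edge is stored as (head, relationship type, tail, frequency); the function name and the sample edges are hypothetical.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[str, str, str, int]  # (head, relationship type, tail, frequency)


def neighbor_stats(edges: Iterable[Edge], node: str) -> List[Tuple[str, str, int]]:
    """Return (neighbor name, relationship type, frequency) for first-level neighbors."""
    rows: List[Tuple[str, str, int]] = []
    for head, rel_type, tail, freq in edges:
        if head == node:
            rows.append((tail, rel_type, freq))
        elif tail == node:
            rows.append((head, rel_type, freq))
    # Most frequent neighbors first; ties broken by name.
    rows.sort(key=lambda row: (-row[2], row[0]))
    return rows


edges = [
    ("diabetic nephropathy", "disease_symptom", "edema", 5),
    ("diabetic nephropathy", "disease_symptom", "proteinuria", 8),
    ("formula X", "treats", "diabetic nephropathy", 2),
    ("formula X", "contains", "herb A", 1),
]
stats = neighbor_stats(edges, "diabetic nephropathy")
```

Edges are followed in both directions, so the clicked node's incoming and outgoing neighbors both appear in the panel.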
As shown in fig. 8, the ontology relationship page visualizes the ontology structure based on the obtained OWL-format data and supports simple interactions such as expanding and collapsing nodes.
In this embodiment, as shown in fig. 9, the system further includes a knowledge extraction module. For documents in the evidence-based traditional Chinese medicine domain, knowledge entity information is extracted using key natural language processing techniques such as Chinese word segmentation, named entity extraction, and relationship extraction, providing data support for building the traditional Chinese medicine domain knowledge base. Specified entities and relationships in the traditional Chinese medicine literature, including but not limited to disease entities and evidence-based entities, together with their triplet relationships, are extracted automatically using algorithms such as deep learning. Based on the extraction results, a domain knowledge graph is constructed to display entities and relationships visually.
Entities within an article can be highlighted, and entities and relationships can be added, deleted, and modified. Documents can be displayed by keyword search or by previous/next navigation and, where possible, words can be annotated directly.
This embodiment merely explains the present invention and does not limit it. After reading this specification, those skilled in the art may, as needed, make modifications to this embodiment that involve no inventive contribution, and all such modifications are protected by patent law as long as they fall within the scope of the claims of the present invention.
Claims (5)
1. A knowledge service system for clinical trials of traditional Chinese medicine, characterized by comprising a data acquisition layer, a data processing and storage layer, a data analysis layer, a user interaction layer, and a functional module layer;
the data acquisition layer is used for acquiring unstructured data and structured data and comprises an unstructured data acquisition unit and a structured data acquisition unit;
the data processing and storage layer is used for processing and storing the acquired unstructured data and structured data;
the data analysis layer comprises a data conversion unit, an entity-relationship joint extraction unit, a knowledge graph unit, a knowledge question-answering unit, and an evaluation model recommendation quantization scoring unit;
the user interaction layer is used by users to modify entities and relationships, to cooperate and coordinate with one another, and to establish an expert knowledge base and an evaluation knowledge base.
2. The knowledge service system and service platform for clinical trials of traditional Chinese medicine according to claim 1, wherein the unstructured data acquisition unit is used for acquiring PDF documents, CAJ documents, and Doc documents; and the structured data acquisition unit is used for acquiring a domain dictionary.
3. The knowledge service system and service platform for clinical trials of traditional Chinese medicine according to claim 1, wherein the data processing and storage layer filters and transcodes the data, and stores the data in the relational database MySQL, the graph database Neo4j, and the distributed search engine Elasticsearch.
4. The knowledge service system and service platform for clinical trials of traditional Chinese medicine according to claim 1, wherein the data conversion unit is used for converting the data format of literature entities, including OCR recognition processing, manual correction, data cleaning, and text word segmentation;
the entity-relationship joint extraction unit is used for extracting and aligning the entities in the documents through different extraction models, extracting the relationships in the documents, and analyzing the correspondence between entities and relationships to obtain the entities corresponding to each relationship;
the knowledge graph unit is used for constructing a knowledge graph of entities and relationships based on the relationships, and the correspondence between relationships and entities, obtained by the entity-relationship joint extraction unit;
the knowledge question-answering unit is used for matching the questions raised by the user and ranking the answers based on the constructed knowledge graph;
the evaluation model recommendation quantization scoring unit is used for ranking the search results for the questions raised by the user according to the quantization scoring results, so that the user can obtain high-quality search results promptly.
5. A knowledge service platform for the knowledge service system for clinical trials of traditional Chinese medicine according to any one of claims 1 to 4, wherein the knowledge service platform comprises the following modules:
a knowledge online processing unit, used by the user to process entities online;
a document update management unit, used by the user to update and manage documents;
an evaluation model recommendation unit, used for recommending the results evaluated according to the quantitative score, namely high-quality RCT experiment documents;
an automatic question-answering unit, used for automatically answering the questions raised by the user;
a data management unit, used by the user to manage data documents, including uploading documents, prompting when a document with the same name is uploaded, deleting documents, and retrieving documents.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311222119.9A CN117476184A (en) | 2023-09-21 | 2023-09-21 | Knowledge service system and service platform for clinical trial of traditional Chinese medicine |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117476184A (en) | 2024-01-30 |
Family
ID=89622870
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311222119.9A (Pending), published as CN117476184A (en) | 2023-09-21 | 2023-09-21 | Knowledge service system and service platform for clinical trial of traditional Chinese medicine |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117476184A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||