WO2021189971A1 - Medical plan recommendation system and method based on knowledge graph representation learning - Google Patents

Medical plan recommendation system and method based on knowledge graph representation learning

Info

Publication number
WO2021189971A1
Authority
WO
WIPO (PCT)
Prior art keywords
entity
recommendation
medical
knowledge graph
training
Prior art date
Application number
PCT/CN2020/136060
Other languages
French (fr)
Chinese (zh)
Inventor
颜泽龙
王健宗
吴天博
程宁
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021189971A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a medical plan recommendation system and method based on knowledge graph representation learning.
  • The inventor realized that current medical recommendation systems often use a fixed search method, or simply take the historical interaction information of doctors and patients as input, to recommend relevant medical information. They do not comprehensively analyze the patient's personal information, so the recommended medical plan is not accurate enough and is prone to potential risks.
  • This application provides a medical plan recommendation system and method based on knowledge graph representation learning, which mainly solves the problem that the medical information recommended by existing medical recommendation systems is not accurate enough and is prone to potential risks.
  • A medical plan recommendation system based on knowledge graph representation learning, including:
  • the extraction module, which is used to obtain patient data of the target user and extract the target entity in the patient data;
  • the dividing module, which is used to divide a knowledge graph subgraph from the medical knowledge graph according to the target entity;
  • the first determining module, which is used to determine the low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning;
  • the obtaining module, which is used to input the low-dimensional vector into a recommendation model that meets the preset training standard and obtain a medical recommendation result matching the patient data.
  • a method for recommending medical solutions based on knowledge graph representation learning including:
  • the low-dimensional vector is input into a recommendation model that meets a preset training standard, and a medical recommendation result matching the patient data is obtained.
  • A computer device, including a storage medium, a processor, and a computer program that is stored on the storage medium and can run on the processor.
  • When the processor executes the program, the above-mentioned medical plan recommendation method based on knowledge graph representation learning is implemented, which includes:
  • the low-dimensional vector is input into a recommendation model that meets a preset training standard, and a medical recommendation result matching the patient data is obtained.
  • This application provides a medical plan recommendation system and method based on knowledge graph representation learning.
  • This application first uses the extraction module to extract entities from the patient data, and uses the dividing module to extract a subgraph from the knowledge graph based on those entities. Then, through knowledge graph representation learning with triple embedding, each entity (doctor, patient) and relationship (seeing a doctor, professional field, prescription, etc.) is embedded to obtain a low-dimensional vector that preserves the semantic information of the medical relationship graph.
  • The low-dimensional vector obtained by embedding is then input into the recommendation model corresponding to the recommendation algorithm. The recommendation model classifies according to the patient's low-dimensional vector and outputs the medical recommendation result for the patient's reference.
  • Obtaining low-dimensional vectors through representation learning can improve the accuracy of the recommendation results output by the recommendation system and provide better support for subsequent personalized recommendations.
  • FIG. 1 shows a schematic structural diagram of a medical plan recommendation system based on knowledge graph representation learning provided by an embodiment of the present application;
  • FIG. 2 shows a schematic structural diagram of another medical plan recommendation system based on knowledge graph representation learning provided by an embodiment of the present application;
  • FIG. 3 shows a schematic diagram of the principle structure of knowledge graph representation learning provided by an embodiment of the present application;
  • FIG. 4 shows a schematic flowchart of a medical plan recommendation method based on knowledge graph representation learning provided by an embodiment of the present application.
  • An embodiment of the present application provides a medical plan recommendation system based on knowledge graph representation learning.
  • The system includes: an extraction module 31, a dividing module 32, a first determining module 33, and an obtaining module 34.
  • The extraction module 31 can be used to obtain patient data of the target user and extract the target entity in the patient data.
  • The patient data can be case consultation information manually uploaded by the target user in the recommendation system, or case information about the target user extracted from the medical platform.
  • The patient data can take various forms, such as text and images. After the patient data is obtained, existing text conversion technology (such as OCR) is first used to convert the patient data in each form uniformly into text, for the subsequent extraction of the target entity.
  • The target entity refers to a word or phrase that carries descriptive meaning.
  • It is usually a person's name, place name, organization name, or product name, or content with a specific meaning in a certain field, such as the name of a disease, drug, or organism in the medical field.
  • The system also includes a dividing module 32, which is used to divide the knowledge graph subgraph from the medical knowledge graph according to the target entity.
  • The first determining module 33 can be used to determine the low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning.
  • On the basis of the traditional TRANS method, a new method can be used to encode triples: position encoding and a relational memory network are introduced to mine the potential dependencies within triples and thereby obtain the low-dimensional vector of the target entity.
  • This process applies position encoding and relational memory network encoding to medical visit triples. To a certain extent, it solves the problem that the TRANS method cannot describe the potential dependencies of knowledge graph triples, which improves the accuracy of the triple embedding vectors and provides better support for subsequent personalized recommendations.
  • The obtaining module 34 can be used to input the low-dimensional vector determined by the first determining module 33 into a recommendation model that meets the preset training standard, and obtain a medical recommendation result matching the patient data.
  • To determine the medical recommendation plan corresponding to the target user, the recommendation model can be pre-trained in the recommendation system based on preset classification rules, so that the recommendation model can determine the corresponding medical recommendation result from the entity low-dimensional vector corresponding to the patient.
  • The medical recommendation result may include the medication combination, the treatment plan to adopt, and the corresponding candidate attending doctors.
  • Entities in the patient data can be extracted first, and a subgraph can be extracted from the knowledge graph based on those entities. Then, through knowledge graph representation learning with triple embedding, each entity (doctor, patient) and relationship (seeing a doctor, professional field, prescription, etc.) is embedded to obtain a low-dimensional vector that preserves the semantic information of the medical relationship graph. After that, the low-dimensional vector obtained by embedding is input into the recommendation model corresponding to the recommendation algorithm; the recommendation model classifies according to the patient's low-dimensional vector and outputs the medical recommendation result for the patient's reference. In this application, obtaining low-dimensional vectors through representation learning can improve the accuracy of the recommendation results of the recommendation system and provide better support for subsequent personalized recommendations.
  • The extraction module 31 may further include: a first training unit 311 and an extraction unit 312.
  • The first training unit 311 can be used to train an entity extraction model for extracting entity classes. When training the entity extraction model for extracting entity classes in patient data, the first training unit 311 can specifically be used to: perform part-of-speech tagging on the entity classes contained in the training set data; input the tagged training set data into the entity extraction model and train the model to extract entity classes based on the Jieba natural language processing library; if the extraction error of the entity classes is determined to be less than the preset threshold, determine that the entity extraction model has passed training; if the extraction error is determined to be greater than or equal to the preset threshold, determine that the entity extraction model has not passed training, and repeatedly correct and retrain it using the training set data with pre-tagged parts of speech, so that the entity extraction model meets the first preset training standard.
  • When performing part-of-speech tagging on the entity classes in the training set data, the tagging can be based on the ICTCLAS Chinese part-of-speech tag set, so as to determine the part of speech of each entity class after word segmentation.
  • The data can be analyzed through the Jieba natural language processing library to classify all entity classes.
  • The dictionary of the Jieba natural language processing library contains very large-scale corpus data, comprising 349,046 words. Each line corresponds to one word and contains three parts: the word, its word count (frequency), and its part of speech.
  • The preset threshold should be a value between 0 and 1, indicating the maximum extraction error permitted for the entity extraction model to pass training.
  • The specific value can be set according to actual application requirements. The smaller the preset threshold, the higher the accuracy required of the trained entity extraction model.
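  • The pass/fail rule above can be sketched as a small loop. This is only an illustration under stated assumptions: the source does not define how the extraction error is computed, so `extraction_error` (share of annotated entities that were missed), `train_until_standard`, `fake_train_step`, and the sample entities are all hypothetical.

```python
def extraction_error(predicted, annotated):
    """Assumed error measure: share of annotated entities missing from the prediction."""
    if not annotated:
        return 0.0
    missed = sum(1 for entity in annotated if entity not in predicted)
    return missed / len(annotated)


def train_until_standard(train_step, annotated, threshold, max_rounds=10):
    """Repeat training rounds until the error drops below the preset threshold."""
    error = 1.0
    for round_no in range(1, max_rounds + 1):
        predicted = train_step(round_no)
        error = extraction_error(predicted, annotated)
        if error < threshold:
            return round_no, error  # model passes training
    return None, error  # model never met the standard


# Hypothetical annotated entities, for illustration only.
ANNOTATED = {"fever", "cough", "aspirin", "diabetes"}


def fake_train_step(round_no):
    # Stand-in for a real training round: recovers one more entity per round.
    return set(sorted(ANNOTATED)[:round_no])


result = train_until_standard(fake_train_step, ANNOTATED, threshold=0.3)
```

  • With a threshold of 0.3, the stand-in model passes in the third round, once only one of the four annotated entities is still missed.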
  • The extraction unit 312 can be used to extract the target entity in the patient data using an entity extraction model that meets the first preset training standard.
  • The specific implementation process can be: loading the dictionary file and identifying each word segment in the patient data; constructing a directed acyclic graph from the word segments; calculating, according to the directed acyclic graph, the maximum path probability from each node to the end of the sentence, and taking the ending position with the largest probability as the optimal ending position of the word segment that starts at that node; and segmenting the patient data at the optimal ending positions in order, to obtain each target entity.
  • For a phrase to be segmented, a directed acyclic graph (DAG) of the phrase is constructed first.
  • There may be several possible segmentations, and these combinations form a directed acyclic graph. For example, four paths can be formed: 1) 1 one 3 / some 4 / fever; 2) 1 one 3 / some 5 / fever; 3) 1 have 2 / some 4 / fever; 4) 1 have 2 / some 5 / fever. From the directed acyclic graph corresponding to these four paths, the starting position and the possible ending positions of each word can be determined.
  • The probabilities of the different ending positions corresponding to the same starting position are then calculated, and the ending position with the highest probability is taken as the optimal ending position.
  • The probability of each word is the count of that word in the dictionary divided by the total word count of the dictionary. If the starting position of a word segment in the text to be extracted is 1, two corresponding ending positions can be identified, namely 2 and 3, and the probabilities corresponding to the two ending positions can be calculated. If the probability corresponding to "have" is greater than that corresponding to "one", position 2 is determined to be the optimal ending position for position 1. The optimal ending positions for the other starting positions are then determined in the same way.
  • For the starting position 2 of a word segment, two corresponding ending positions can be identified, namely 4 and 5. If position 4 is determined to be the optimal ending position for position 2, the text to be extracted can be segmented at the optimal ending positions 2 and 4, and the target entities obtained are "have", "some", and "fever".
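  • The segmentation steps above (build a DAG of candidate words, then pick the maximum-probability path to the end of the sentence) can be sketched as follows. This is a simplified illustration and not the Jieba implementation: the toy dictionary `FREQ`, its counts, and the sample phrase are assumptions, and positions are 0-based character indices rather than the numbering used in the text.

```python
import math

# Hypothetical toy dictionary: word -> count (not real Jieba data).
FREQ = {"有": 50, "有一": 2, "一": 30, "一些": 40, "些": 5, "发": 10, "发烧": 25, "烧": 5}
TOTAL = sum(FREQ.values())


def build_dag(sentence):
    """Map each start index to the inclusive end indices of dictionary words."""
    dag = {}
    n = len(sentence)
    for start in range(n):
        ends = [end for end in range(start, n) if sentence[start:end + 1] in FREQ]
        # Fall back to the single character so every position has an edge.
        dag[start] = ends or [start]
    return dag


def best_route(sentence, dag):
    """Right-to-left dynamic programme: best log-probability from each index to the end."""
    n = len(sentence)
    route = {n: (0.0, n)}
    for start in range(n - 1, -1, -1):
        route[start] = max(
            (math.log(FREQ.get(sentence[start:end + 1], 1) / TOTAL) + route[end + 1][0], end)
            for end in dag[start]
        )
    return route


def segment(sentence):
    """Cut the sentence at the optimal ending position of each word, in order."""
    dag = build_dag(sentence)
    route = best_route(sentence, dag)
    words, i = [], 0
    while i < len(sentence):
        end = route[i][1]
        words.append(sentence[i:end + 1])
        i = end + 1
    return words


words = segment("有一些发烧")
```

  • Working in log probabilities avoids underflow when many small word probabilities are multiplied along a long path.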
  • The dividing module 32 may specifically include: a marking unit 321, a traversal unit 322, and a dividing unit 323.
  • The marking unit 321 can be used to mark the core object entities and the secondary object entities among the target entities.
  • The traversal unit 322 can be used to traverse the medical knowledge graph with each core object entity as the starting point of the traversal, and to stop traversing in a given direction when a secondary object entity is reached.
  • The dividing unit 323 can be used to divide the knowledge graph subgraph according to the traversal results of each core object entity.
  • Entity labeling marks a type of entity as a core object or a secondary object according to its importance and pivotal role in the knowledge graph. Because knowledge graphs in different fields have different entity types and association relationships, the labeling of core objects and secondary objects can be completed manually.
  • A breadth-first traversal rule can be used to traverse the subgraph starting from the input core object entity.
  • When a core object entity is reached, the entity is retained as a starting point for subsequent traversal; when a secondary object entity is reached, traversal in that direction stops.
  • The entities obtained in each step are the surrounding entities directly connected to the starting entity. The traversal step is repeated until all entities obtained in a traversal, apart from those already in the knowledge graph subgraph, are secondary object entities.
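  • The subgraph division described above can be sketched as a breadth-first traversal that records secondary object entities but does not expand them. The adjacency-list layout, the function name `divide_subgraph`, and the toy graph are assumptions made for illustration; any entity not listed as secondary is treated as a core object.

```python
from collections import deque


def divide_subgraph(graph, core_entities, secondary_entities):
    """Breadth-first traversal from the core entities.

    Secondary entities are added to the subgraph but never expanded,
    so traversal stops in that direction.
    """
    visited = set(core_entities)
    edges = set()
    queue = deque(core_entities)
    while queue:
        node = queue.popleft()
        if node in secondary_entities:
            continue  # stop traversing in this direction
        for relation, neighbour in graph.get(node, []):
            edges.add((node, relation, neighbour))
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return visited, edges


# Hypothetical medical knowledge graph: entity -> [(relation, entity), ...].
GRAPH = {
    "patient_A": [("has_disease", "diabetes"), ("visited", "doctor_B")],
    "diabetes": [("treated_by", "metformin")],
    "doctor_B": [("works_at", "hospital_C")],
    "hospital_C": [("located_in", "city_D")],
    "metformin": [("made_by", "company_E")],
}

nodes, triples = divide_subgraph(GRAPH, ["patient_A"], {"doctor_B", "metformin"})
```

  • In this toy graph the traversal stops at `doctor_B` and `metformin`, so `hospital_C`, `city_D`, and `company_E` stay outside the subgraph. The collected edges are already in (entity, relationship, entity) form.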
  • The first determining module 33 may specifically include: an extraction unit 331, a configuration unit 332, an encoding unit 333, and a second training unit 334.
  • The extraction unit 331 can be used to extract each triple in the knowledge graph subgraph; the configuration unit 332 can be used to configure position vectors for the entity vectors in the triples by position-encoding the triples; the encoding unit 333 can be used to encode the triples, after the position vectors are added, based on the relational network to obtain encoding vectors; the second training unit 334 can be used to score the encoding vectors with a decoder and perform iterative training with the adaptive moment estimation (Adam) optimizer, to further obtain the low-dimensional vector corresponding to the knowledge graph subgraph.
  • The principle of knowledge graph representation learning can be seen in the medical triple embedding and encoding structure shown in FIG. 3.
  • Medical triples are first stored in the form (entity, relationship, entity); for example, triples are constructed in forms such as (patient, disease history, disease) and (doctor, level, specialty).
  • The position relationship can be embedded into the entity vectors corresponding to a triple; that is, by position-encoding the triple, position vectors are configured for the entity vectors used in embedding training.
  • The relational memory network can then be used to encode the triples, and the specific encoding process can be implemented with a multi-head self-attention mechanism.
  • Because the entity vectors obtained when the entity encoding is initialized are not accurate enough, they can be scored with the decoder and iterated with the Adam optimizer. Through the positive and negative network training process, the entity vectors are further optimized and adjusted so that the resulting low-dimensional vectors meet the preset accuracy requirement. The low-dimensional vector of the medical entity finally obtained is input into the recommendation model, completing the sequential learning framework.
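  • A minimal sketch of two pieces of the pipeline above: adding a position vector to each element of a triple, and scoring a triple. The source does not specify the form of the position encoding or of the decoder, so both choices here are assumptions: a sinusoidal encoding of the kind used in Transformer models, and a translation-style distance (as in the TRANS family) standing in for the decoder score. The relational memory network and Adam training are not shown.

```python
import math


def position_vector(pos, dim):
    """Sinusoidal position encoding for slot `pos` (0 = head, 1 = relation, 2 = tail)."""
    return [
        math.sin(pos / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]


def add_positions(triple_vectors):
    """Add a position vector to each of the three embeddings of a triple."""
    dim = len(triple_vectors[0])
    return [
        [v + p for v, p in zip(vec, position_vector(pos, dim))]
        for pos, vec in enumerate(triple_vectors)
    ]


def translation_score(head, relation, tail):
    """Translation-style distance ||h + r - t||; lower means more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


# Hypothetical 4-dimensional embeddings for (patient, disease history, disease).
TRIPLE = [[0.1, 0.2, 0.3, 0.4], [0.0, 0.1, 0.0, 0.1], [0.1, 0.3, 0.3, 0.5]]
encoded = add_positions(TRIPLE)
score = translation_score(*TRIPLE)
```

  • In a trainable version, the embeddings would be parameters updated by an optimizer such as Adam so that observed triples score better than corrupted ones.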
  • The medical plan recommendation system may further include: a labeling module 35, a training module 36, and a second determining module 37.
  • The labeling module 35 can be used to determine sample patient data and label the sample patient data with the corresponding preset medical recommendation plan.
  • The training module 36 can be used to train the recommendation model using the low-dimensional vectors corresponding to the sample patient data.
  • The second determining module 37 can be used to determine that the recommendation model has passed training if the medical recommendation result output by the recommendation model meets the second preset training standard.
  • The training module 36 can also be used, if the recommendation model is determined not to have passed training, to repeatedly train the recommendation model with the sample patient data so that it meets the second preset training standard.
  • The corresponding medical recommendation plan can be labeled in advance for different types of sample patient data.
  • For example, cancer patient data can be labeled with an authoritative attending doctor in the field of cancer, as well as the corresponding treatment plan, medication combination, and so on.
  • Using sample patient data labeled with the corresponding medical recommendation plan to carry out targeted training of the recommendation model further strengthens its classification and recognition ability, so that the output of the recommendation model matches the labeled result.
  • The obtaining module 34 may specifically include: an input unit 341 and a determining unit 342.
  • The input unit 341 can be used to input the low-dimensional vector into a recommendation model that meets the second preset training standard, and obtain the recommendation score corresponding to each preset medical recommendation plan.
  • When the low-dimensional vector is input into the recommendation model that meets the second preset training standard, the recommendation model outputs a recommendation score for each preset recommendation plan. The higher the recommendation score, the higher the reference value.
  • The determining unit 342 can be used to determine the preset medical recommendation plan with the highest recommendation score as the medical recommendation result of the target user.
  • The preset medical recommendation plan with the highest recommendation score can be determined as the medical recommendation result matching the target user, and is then output by the recommendation system and displayed to the target user as a reference.
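  • Selecting the highest-scoring preset plan can be sketched as below. The source does not say how recommendation scores are produced, so normalizing raw model outputs with a softmax is an assumption, and the plan names and output values are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw model outputs into scores that sum to 1 (assumed scoring form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def recommend(plans, logits):
    """Return the preset plan with the highest recommendation score, plus all scores."""
    scores = dict(zip(plans, softmax(logits)))
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical preset plans and model outputs, for illustration only.
PLANS = ["plan_A", "plan_B", "plan_C"]
best_plan, plan_scores = recommend(PLANS, [0.2, 1.5, -0.3])
```

  • Returning the full score dictionary alongside the best plan lets the system also show the runner-up plans if that is useful to the user.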
  • Entities in the patient data can be extracted first, and a subgraph can be extracted from the knowledge graph based on those entities. Then, through knowledge graph representation learning with triple embedding, each entity (doctor, patient) and relationship (seeing a doctor, professional field, prescription, etc.) is embedded to obtain a low-dimensional vector that preserves the semantic information of the medical relationship graph. After that, the low-dimensional vector obtained by embedding is input into the recommendation model corresponding to the recommendation algorithm; the recommendation model classifies according to the patient's low-dimensional vector and outputs the medical recommendation result for the patient's reference.
  • In this application, obtaining low-dimensional vectors through representation learning can improve the accuracy of the recommendation results of the recommendation system and provide better support for subsequent personalized recommendations.
  • This application additionally introduces position encoding and a relational memory network to mine the potential dependencies within triples and thereby obtain the low-dimensional vector of the target entity. This process applies position encoding and relational memory network encoding to medical visit triples. To a certain extent, it solves the problem that existing methods cannot describe the potential dependencies of knowledge graph triples, which improves the accuracy of the triple embedding vectors and provides better support for subsequent personalized recommendations.
  • An embodiment of the present application provides a medical plan recommendation method based on knowledge graph representation learning.
  • The method includes: obtaining patient data of the target user and extracting the target entity in the patient data; dividing a knowledge graph subgraph from the medical knowledge graph according to the target entity; determining the low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning; and inputting the low-dimensional vector into a recommendation model that meets the preset training standard to obtain a medical recommendation result matching the patient data.
  • Extracting the target entity in the patient data may specifically include: training an entity extraction model for extracting entity classes; and using an entity extraction model that meets the first preset training standard to extract the target entity in the patient data.
  • Training the entity extraction model for extracting entity classes may specifically include: performing part-of-speech tagging on the entity classes contained in the training set data; inputting the tagged training set data into the entity extraction model and training the model to extract entity classes based on the Jieba natural language processing library; if the extraction error of the entity classes is determined to be less than the preset threshold, determining that the entity extraction model has passed training; if the extraction error is determined to be greater than or equal to the preset threshold, determining that the entity extraction model has not passed training, and repeatedly correcting and retraining it using the training set data with pre-tagged parts of speech, so that the entity extraction model meets the first preset training standard.
  • Dividing the knowledge graph subgraph from the medical knowledge graph according to the target entity may specifically include: marking the core object entities and the secondary object entities among the target entities; traversing the medical knowledge graph with each core object entity as the starting point of the traversal, and stopping traversal in a given direction when a secondary object entity is reached; and dividing the knowledge graph subgraph according to the traversal results of each core object entity.
  • Determining the low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning may specifically include: extracting each triple in the knowledge graph subgraph; configuring position vectors for the entity vectors in the triples by position-encoding the triples; encoding the triples, after the position vectors are added, based on the relational network to obtain encoding vectors; and scoring the encoding vectors with a decoder and performing iterative training with the adaptive moment estimation (Adam) optimizer, to further obtain the low-dimensional vector corresponding to the knowledge graph subgraph.
  • Inputting the low-dimensional vector into the recommendation model that meets the preset training standard to obtain a medical recommendation result matching the patient data may specifically include: inputting the low-dimensional vector into the recommendation model that meets the second preset training standard to obtain the recommendation score corresponding to each preset medical recommendation plan; and determining the preset medical recommendation plan with the highest recommendation score as the medical recommendation result of the target user.
  • An embodiment of the present application also provides a storage medium.
  • The storage medium may be a volatile storage medium or a non-volatile storage medium, and a computer program is stored on it.
  • When the program is executed by a processor, the medical plan recommendation method based on knowledge graph representation learning shown in FIG. 4 is implemented.
  • The technical solution of this application can be embodied in the form of a software product.
  • The software product can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a portable hard disk, etc.) and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods in each implementation scenario of the present application.
  • An embodiment of the present application also provides a computer device, which may be a personal computer, a server, or a network device.
  • The physical device includes a storage medium and a processor; the storage medium is used to store a computer program; the processor is used to execute the computer program to implement the medical plan recommendation method based on knowledge graph representation learning shown in FIG. 4.
  • The computer device may also include a user interface, a network interface, a camera, a radio frequency (RF) circuit, a sensor, an audio circuit, a WI-FI module, and so on.
  • The user interface may include a display screen and an input unit such as a keyboard; optionally, the user interface may also include a USB interface, a card reader interface, and the like.
  • Optionally, the network interface may include a standard wired interface, a wireless interface (such as a Bluetooth interface or a WI-FI interface), and so on.
  • The computer device structure provided in this embodiment does not limit the physical device, which may include more or fewer components, combine certain components, or use a different arrangement of components.
  • The non-volatile readable storage medium may also include an operating system and a network communication module.
  • The operating system is a program that manages the hardware and software resources of the physical device for knowledge-graph-based data processing, and supports the operation of the information processing program and other software and/or programs.
  • The network communication module is used to implement communication between the components inside the non-volatile readable storage medium, and communication with other hardware and software in the physical device.
  • those skilled in the art can first extract the entities in the patient data and extract subgraphs from the knowledge graph based on those entities. Then, through knowledge graph representation learning using triple embedding, each entity (doctor, patient) and relation (visit, specialty, prescription, medication dispensing, etc.) is embedded as a low-dimensional vector while the semantic information of the medical relationship graph is preserved. The resulting low-dimensional vectors are then input into the recommendation model corresponding to the recommendation algorithm, which classifies according to the patient's low-dimensional vector and outputs a medical recommendation result for the patient's reference.
  • in this application, obtaining low-dimensional vectors through representation learning can improve the accuracy of the recommendation results and provide better support for subsequent personalized recommendation.
  • this application additionally introduces position encoding and a relational memory network to mine the latent dependencies among triples and thereby obtain the low-dimensional vector of the target entity. Encoding the visit triples with position encoding and a relational memory network addresses, to a certain extent, the inability of existing methods to describe the latent dependencies of knowledge graph triples, and can improve the accuracy of the triple embedding vectors, providing better support for subsequent personalized recommendation.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Animal Behavior & Ethology (AREA)
  • Epidemiology (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Probability & Statistics with Applications (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

Disclosed are a medical plan recommendation system and method based on knowledge graph representation learning, which relate to the technical field of artificial intelligence and can solve the problem that the medical information recommended by existing medical recommendation systems is insufficiently accurate and prone to potential risks. The system comprises: an extraction module, used for obtaining patient data of a target user and extracting a target entity from the patient data; a dividing module, used for dividing a knowledge graph sub-graph from the medical knowledge graph according to the target entity; a first determining module, used for determining, on the basis of representation learning, a low-dimensional vector corresponding to the knowledge graph sub-graph; and an obtaining module, used for inputting the low-dimensional vector into a recommendation model that meets a preset training standard and obtaining a medical recommendation result matching the patient data. The present application is suitable for the intelligent recommendation of medical plans.

Description

Medical plan recommendation system and method based on knowledge graph representation learning
This application claims priority to Chinese patent application No. 202011153510.4, filed with the Chinese Patent Office on October 26, 2020 and entitled "Medical Plan Recommendation System and Method Based on Knowledge Graph Representation Learning", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of artificial intelligence technology, and in particular to a medical plan recommendation system and method based on knowledge graph representation learning.
Background
With the development of modern information technology, various intelligent systems have emerged in order to build smart cities on new-generation information technologies such as the Internet of Things, cloud computing, big data, and integrated spatial geographic information, and to improve citizens' digital experience. In the medical field, deploying a medical recommendation system can greatly shorten consultation time for patients and doctors and save manpower and material resources.
The inventors realized that existing medical recommendation systems often use a fixed search method, or simply take the historical interaction information between doctors and patients as input, to recommend relevant medical information. They do not analyze personal information comprehensively enough to recommend medical information well, so the recommended medical plans are not accurate enough and are prone to potential risks.
Technical Problem
In view of this, this application provides a medical plan recommendation system and method based on knowledge graph representation learning, which mainly solve the problem that the medical information recommended by existing medical recommendation systems is not accurate enough and is prone to potential risks.
Technical Solution
According to one aspect of the present application, a medical plan recommendation system based on knowledge graph representation learning is provided, the system including:
an extraction module, used to obtain patient data of a target user and extract a target entity from the patient data;
a dividing module, used to divide a knowledge graph subgraph from a medical knowledge graph according to the target entity;
a first determining module, used to determine, based on representation learning, the low-dimensional vector corresponding to the knowledge graph subgraph;
an obtaining module, used to input the low-dimensional vector into a recommendation model that meets a preset training standard and obtain a medical recommendation result matching the patient data.
According to another aspect of the present application, a medical plan recommendation method based on knowledge graph representation learning is provided, the method including:
obtaining patient data of a target user, and extracting a target entity from the patient data;
dividing a knowledge graph subgraph from a medical knowledge graph according to the target entity;
determining, based on representation learning, the low-dimensional vector corresponding to the knowledge graph subgraph;
inputting the low-dimensional vector into a recommendation model that meets a preset training standard, and obtaining a medical recommendation result matching the patient data.
According to another aspect of the present application, a storage medium is provided on which a computer program is stored, and when the program is executed by a processor, the above medical information recommendation method based on knowledge graph representation learning is implemented, including:
obtaining patient data of a target user, and extracting a target entity from the patient data;
dividing a knowledge graph subgraph from a medical knowledge graph according to the target entity;
determining, based on representation learning, the low-dimensional vector corresponding to the knowledge graph subgraph;
inputting the low-dimensional vector into a recommendation model that meets a preset training standard, and obtaining a medical recommendation result matching the patient data.
According to yet another aspect of the present application, a computer device is provided, including a storage medium, a processor, and a computer program that is stored on the storage medium and can run on the processor. When the processor executes the program, the above medical information recommendation method based on knowledge graph representation learning is implemented, including:
obtaining patient data of a target user, and extracting a target entity from the patient data;
dividing a knowledge graph subgraph from a medical knowledge graph according to the target entity;
determining, based on representation learning, the low-dimensional vector corresponding to the knowledge graph subgraph;
inputting the low-dimensional vector into a recommendation model that meets a preset training standard, and obtaining a medical recommendation result matching the patient data.
Beneficial Effects
With the above technical solution, and compared with current medical recommendation systems, the medical plan recommendation system and method based on knowledge graph representation learning provided by this application can first use the extraction module to extract the entities in the patient data, and use the dividing module to extract subgraphs from the knowledge graph based on those entities. Then, through knowledge graph representation learning using triple embedding, each entity (doctor, patient) and relation (visit, specialty, prescription, medication dispensing, etc.) is embedded as a low-dimensional vector while the semantic information of the medical relationship graph is preserved. The resulting low-dimensional vectors are then input into the recommendation model corresponding to the recommendation algorithm, which classifies according to the patient's low-dimensional vector and outputs a medical recommendation result for the patient's reference. In this application, obtaining low-dimensional vectors through representation learning can improve the accuracy of the recommendation results output by the recommendation system and provide better support for subsequent personalized recommendation.
Description of the Drawings
The drawings described here are provided for a further understanding of the application and constitute a part of it. The exemplary embodiments of the application and their descriptions are used to explain the application and do not constitute an improper limitation on it. In the drawings:
FIG. 1 shows a schematic structural diagram of a medical plan recommendation system based on knowledge graph representation learning provided by an embodiment of the present application;
FIG. 2 shows a schematic structural diagram of another medical plan recommendation system based on knowledge graph representation learning provided by an embodiment of the present application;
FIG. 3 shows a schematic diagram of the principle and structure of knowledge graph representation learning provided by an embodiment of the present application;
FIG. 4 shows a schematic flowchart of a medical plan recommendation method based on knowledge graph representation learning provided by an embodiment of the present application.
Best Mode for Carrying Out the Invention
The present application is described in detail below with reference to the drawings and in conjunction with the embodiments. It should be noted that, where there is no conflict, the embodiments in the application and the features in the embodiments can be combined with each other.
To address the problem that the medical information recommended by existing medical recommendation systems is not accurate enough and is prone to potential risks, an embodiment of the present application provides a medical plan recommendation system based on knowledge graph representation learning. As shown in FIG. 1, the system includes: an extraction module 31, a dividing module 32, a first determining module 33, and an obtaining module 34.
In a specific application scenario, the extraction module 31 can be used to obtain patient data of the target user and extract the target entity from the patient data. The patient data may be case consultation information that the target user uploads manually to the recommendation system, or case information about the target user extracted from a medical platform, and may take several forms such as text and images. After the patient data is obtained, existing text conversion techniques (such as OCR) are first used to convert the patient data in every form into text, so that target entities can be extracted afterwards. A target entity is a word or phrase with a describable meaning, typically the name of a person, place, organization or product, or content with a specific meaning in a given field, such as the name of a disease, drug or organism in the medical field. In this embodiment, the target entity must first be extracted from the structured and unstructured information in the patient data, so that by matching the target entity against the entities of the medical knowledge graph, the knowledge graph subgraph corresponding to the patient data can then be extracted.
Correspondingly, compared with traditional methods, a dimensionality-reduced representation of the knowledge graph based on subgraph division encodes entities more reasonably: it fully accounts for the local features of the knowledge graph, and the resulting entity encoding vectors better reflect the essential characteristics of the entities. The knowledge graph therefore needs to be divided into subgraphs in this application. Because larger subgraphs usually yield better learned features, the extraction range of the subgraph can be preset according to the acceptable running time, so as to ensure the accuracy of the medical plan recommendation result. The system therefore also includes a dividing module 32, used to divide a knowledge graph subgraph from the medical knowledge graph according to the target entity.
In a specific application scenario, the first determining module 33 can be used to determine, based on representation learning, the low-dimensional vector corresponding to the knowledge graph subgraph. In this embodiment, representation learning can build on the traditional TRANS methods while encoding triples in a new way: position encoding and a relational memory network are introduced to mine the latent dependencies among triples and thereby obtain the low-dimensional vector of the target entity. Encoding the visit triples with position encoding and a relational memory network addresses, to a certain extent, the inability of the TRANS methods to describe the latent dependencies of knowledge graph triples. This in turn can improve the accuracy of the triple embedding vectors and provide better support for subsequent personalized recommendation.
Correspondingly, the obtaining module 34 can be used to input the low-dimensional vector determined by the first determining module 33 into a recommendation model that meets the preset training standard, and obtain a medical recommendation result matching the patient data. In this embodiment, to determine the medical recommendation plan that matches the target user, a recommendation model can be trained in advance in the recommendation system based on preset classification rules, so that the model can determine the corresponding medical recommendation result from the low-dimensional entity vector of the patient. The medical recommendation result may include the medication combination, the treatment plan to adopt, and the candidate attending doctor.
With the medical plan recommendation system based on knowledge graph representation learning in this embodiment, the entities in the patient data can first be extracted, and subgraphs can be extracted from the knowledge graph based on those entities. Then, through knowledge graph representation learning using triple embedding, each entity (doctor, patient) and relation (visit, specialty, prescription, medication dispensing, etc.) is embedded as a low-dimensional vector while the semantic information of the medical relationship graph is preserved. The resulting low-dimensional vectors are then input into the recommendation model corresponding to the recommendation algorithm, which classifies according to the patient's low-dimensional vector and outputs a medical recommendation result for the patient's reference. In this application, obtaining low-dimensional vectors through representation learning can improve the accuracy of the recommendation results and provide better support for subsequent personalized recommendation.
Further, as a refinement and extension of the above embodiment, and to describe its implementation fully, another medical plan recommendation system based on knowledge graph representation learning is provided, as shown in FIG. 2. In this system, the extraction module 31 may further include a first training unit 311 and an extraction unit 312.
In a specific application scenario, the first training unit 311 can be used to train the entity extraction model that extracts entity classes from patient data. When training this model, the first training unit 311 can specifically be used to: perform part-of-speech tagging on the entity classes contained in the training set data; input the tagged training set data into the entity extraction model, and train the model to extract entity classes based on the Jieba natural language processing library; if the extraction error for the entity classes is less than a preset threshold, determine that the entity extraction model has passed training; if the extraction error is greater than or equal to the preset threshold, determine that the model has not passed training, and repeatedly retrain and correct it with the pre-tagged training set data until it meets the first preset training standard.
In this embodiment, part-of-speech tagging of the entity classes in the training set data can be based on the ICTCLAS Chinese part-of-speech tag set, which determines the part of speech of each entity class after word segmentation. During training, the Jieba natural language processing library can be used to analyze the parts of speech in the data and separate out all entity classes. The Jieba library contains very large-scale corpus data of 349,046 words; each line corresponds to one word and consists of three parts: the word, its count, and its part of speech. The preset threshold should be a value between 0 and 1 and represents the maximum extraction error at which the entity extraction model passes training. Its value can be set according to the needs of the application: the smaller the preset threshold, the higher the training accuracy required of the entity extraction model.
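The pass/fail criterion described above can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: the source does not define how the extraction error is measured, so the miss-rate metric and the names `extraction_error` and `passes_training` are assumptions.

```python
def extraction_error(predicted, gold):
    """Assumed metric: fraction of labelled entities the model failed to extract."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0
    missed = gold_set - set(predicted)
    return len(missed) / len(gold_set)


def passes_training(predicted, gold, threshold):
    """Return True when the extraction error is below the preset threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie between 0 and 1")
    return extraction_error(predicted, gold) < threshold
```

A model that fails this check would be retrained on the tagged training set and checked again, as the paragraph above describes.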
In a specific application scenario, the extraction unit 312 can be used to extract the target entity from the patient data using the entity extraction model that meets the first preset training standard. When extracting target entities with this model, the process can be: load the dictionary file and identify each word segment in the patient data; build a directed acyclic graph from the word segments; use the directed acyclic graph to compute the maximum path probability from each node to the end of the sentence, and determine the optimal end position of the word segment at each node, that is, the one with the highest probability; and split the patient data at the optimal end positions to obtain the target entities.
For example, suppose the input patient data is "有一些发烧" ("have some fever"). When the entity extraction model extracts target entities, it first builds a directed acyclic graph (DAG) for the phrase. Looking up the dictionary for string matches may yield several possible segmentations, and these combinations form the DAG. Here, with ① to ⑤ marking positions in the text, four paths can be formed: 1) ①有一③/些④/发烧; 2) ①有一③/些发⑤/烧; 3) ①有②/一些④/发烧; 4) ①有②/一些发⑤/烧. From the DAG corresponding to these four paths, the start position and the possible end positions of each word can be determined. The probabilities of the different end positions for the same word are then computed, and the end position with the highest probability is taken as the optimal end position, where the probability of a word = the count of that word in the dictionary / the total word count of the dictionary. If the word segment in the text to be extracted starts at position ①, two candidate end positions are identified, ② and ③, and the probability for each is computed. If the probability of "有" is greater than that of "有一", position ② is the optimal end position for start position ①. The optimal end positions for the other start positions are determined in the same way: for the word segment starting at position ②, the two candidate end positions are ④ and ⑤. If position ④ is determined to be the optimal end position for position ②, the text can then be split at the optimal end positions ② and ④, yielding the target entities "有", "一些" and "发烧".
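The dictionary lookup, DAG construction and maximum-probability path search described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the Jieba library itself: the toy dictionary `TOY_DICT` and its counts are invented for the example, and the function names are hypothetical. Word probability follows the formula above (word count divided by total count), computed in log space.

```python
import math

# Hypothetical toy dictionary: word -> count. Values are illustrative only.
TOY_DICT = {
    "有": 50, "有一": 2, "一": 30, "一些": 20, "一些发": 1,
    "些": 5, "些发": 1, "发": 10, "发烧": 15, "烧": 8,
}


def build_dag(sentence, freq):
    """Map each start index to the end indices of dictionary words starting there."""
    dag = {}
    for start in range(len(sentence)):
        ends = [end for end in range(start, len(sentence))
                if sentence[start:end + 1] in freq]
        dag[start] = ends or [start]  # unknown character: one-character word
    return dag


def best_route(sentence, dag, freq):
    """For each start index, store (best log-probability to sentence end, optimal end index)."""
    log_total = math.log(sum(freq.values()))
    n = len(sentence)
    route = {n: (0.0, n)}
    for start in range(n - 1, -1, -1):  # work backwards from the sentence end
        route[start] = max(
            (math.log(freq.get(sentence[start:end + 1], 1)) - log_total
             + route[end + 1][0], end)
            for end in dag[start]
        )
    return route


def segment(sentence, freq=TOY_DICT):
    """Cut the sentence at the optimal end positions."""
    dag = build_dag(sentence, freq)
    route = best_route(sentence, dag, freq)
    words, start = [], 0
    while start < len(sentence):
        end = route[start][1]
        words.append(sentence[start:end + 1])
        start = end + 1
    return words
```

With this toy dictionary, `segment("有一些发烧")` returns `["有", "一些", "发烧"]`, matching the split at positions ② and ④ in the example.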
In a specific application scenario, to obtain the knowledge graph subgraph, the dividing module 32 in this medical plan recommendation system, as shown in FIG. 2, may specifically include a marking unit 321, a traversal unit 322, and a dividing unit 323. The marking unit 321 can be used to mark core object entities and secondary object entities among the target entities. The traversal unit 322 can be used to traverse the medical knowledge graph starting from each core object entity, stopping the traversal in a given direction when a secondary object entity is reached. The dividing unit 323 can be used to divide the knowledge graph subgraph according to the traversal results of the core object entities.
Entity marking labels an entity as a core object or a secondary object according to the importance and centrality of that type of entity in the knowledge graph. Because knowledge graphs in different fields have different entity types and relations, the marking of core and secondary objects can be done manually. Subgraph traversal can proceed breadth-first from the input core object entities: when a core object entity is reached, it is retained as a starting point for subsequent traversal; when a secondary object entity is reached, the traversal stops in that direction. The entities obtained in each step are the neighbors directly connected to the starting entity. The traversal step is repeated until, in some pass, all entities obtained other than those already in the knowledge graph subgraph are secondary object entities.
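The breadth-first traversal with stopping at secondary entities can be sketched as below. This is a hedged illustration: the adjacency-list graph format, the function name `extract_subgraph`, and the simplification that every entity not marked as core is treated as secondary are all assumptions, not details given in the source.

```python
from collections import deque


def extract_subgraph(graph, core_entities, start_entities):
    """Breadth-first subgraph extraction.

    graph maps an entity to a list of (relation, neighbour) pairs.
    Entities outside core_entities are treated as secondary (an assumption).
    """
    visited = set(start_entities)
    triples = []
    queue = deque(start_entities)
    while queue:
        head = queue.popleft()
        for relation, tail in graph.get(head, []):
            triples.append((head, relation, tail))
            if tail in visited:
                continue
            visited.add(tail)
            if tail in core_entities:
                queue.append(tail)  # core entity: becomes a new traversal start
            # secondary entity: kept in the subgraph, but traversal stops here
    return visited, triples
```

The loop ends when the queue is empty, that is, when the last pass reached only secondary entities or entities already in the subgraph.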
Correspondingly, to obtain the low-dimensional vector corresponding to the knowledge graph subgraph, the first determining module 33 in this medical plan recommendation system, as shown in FIG. 2, may specifically include an extraction unit 331, a configuration unit 332, an encoding unit 333, and a second training unit 334. The extraction unit 331 can be used to extract each triple in the knowledge graph subgraph. The configuration unit 332 can be used to assign position vectors to the entity vectors in the triples by position-encoding the triples. The encoding unit 333 can be used to encode the triples with added position vectors based on a relational network, to obtain encoding vectors. The second training unit 334 can be used to score the encoding vectors with a decoder and to train iteratively with the adaptive moment estimation (Adam) optimizer, thereby obtaining the low-dimensional vector corresponding to the knowledge graph subgraph.
In this embodiment, the principle of knowledge graph representation learning is illustrated by the embedded encoding structure for medical triples shown in FIG. 3. Specifically, to obtain low-dimensional vectors through representation learning, the medical triples are first stored in the form (entity, relation, entity), for example (patient, has disease history, disease) or (doctor, level, specialty). The positional relationship is then embedded into the entity vectors of each triple: position-encoding the triples assigns position vectors to the entity vectors during embedding training. Next, a relational memory network can be used to encode the triples, and the encoding can be implemented with a multi-head self-attention mechanism. In addition, because the entity vectors produced when the entity encodings are initialized may not be accurate enough, a decoder can be used for scoring and an optimizer such as Adam for iteration, so that training the network on positive and negative examples further optimizes the entity vectors until the final low-dimensional vectors meet the preset accuracy requirement. Finally, once the low-dimensional vectors of the medical entities are obtained, they are input into the recommendation model, completing the step-by-step learning framework.
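The first two stages of this pipeline, adding position vectors to the elements of a triple and then letting them attend to each other, can be sketched in plain Python. This is a deliberately reduced illustration under several assumptions: the sinusoidal form of the position vector is not specified in the source, a single scaled dot-product self-attention step with identity projections stands in for the multi-head relational memory network, and the decoder scoring and Adam training are omitted. All function names are hypothetical.

```python
import math
import random


def position_vector(pos, dim):
    """Sinusoidal position vector (assumed form; the source does not specify one)."""
    return [
        math.sin(pos / 10000 ** (i / dim)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / dim))
        for i in range(dim)
    ]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """One scaled dot-product self-attention step with identity projections."""
    dim = len(tokens[0])
    encoded = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        weights = softmax(scores)
        encoded.append([sum(w * value[i] for w, value in zip(weights, tokens))
                        for i in range(dim)])
    return encoded


def encode_triple(triple, table, rng, dim=8):
    """Embed (head, relation, tail), add position vectors, then apply self-attention."""
    tokens = []
    for pos, name in enumerate(triple):
        if name not in table:  # random initial embedding, to be refined by training
            table[name] = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
        tokens.append([e + p for e, p in zip(table[name], position_vector(pos, dim))])
    return self_attention(tokens)
```

Because the position vector depends on whether an entity appears as head or tail, the same entity is encoded differently in (patient, has disease history, disease) than it would be in the reversed triple, which is the property the position encoding is meant to provide.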
In a specific application scenario, as shown in FIG. 2, the medical plan recommendation system may further include a labeling module 35, a training module 36, and a second determining module 37. The labeling module 35 may be used to determine sample patient data and to label the sample patient data with the corresponding preset medical recommendation plan. The training module 36 may be used to train the recommendation model with the low-dimensional vectors corresponding to the sample patient data. The second determining module 37 may be used to determine that the recommendation model has passed training if the medical recommendation result output by the recommendation model meets the second preset training standard. The training module 36 may also be used, if the recommendation model is determined not to have passed training, to train the recommendation model repeatedly with the sample patient data until it meets the second preset training standard.
For this embodiment, the corresponding medical recommendation plan may be labeled in advance for each type of sample patient data. For example, patient data of the cancer type may be labeled with an authoritative attending physician in the field of cancer, and may additionally be labeled with the corresponding treatment plan, drug combination, and so on. Training the recommendation model in a targeted way with sample patient data labeled in this manner further strengthens its classification and recognition ability, so that the results it outputs match the labeled results.
Correspondingly, as shown in FIG. 2, the acquisition module 34 of the medical plan recommendation system may specifically include an input unit 341 and a determination unit 342.
In a specific application scenario, the input unit 341 may be used to input the low-dimensional vector into a recommendation model that meets the second preset training standard and to obtain the recommendation score corresponding to each preset medical recommendation plan. For this embodiment, after the low-dimensional vector corresponding to the target patient is obtained, it may be input into the recommendation model that meets the second preset training standard; the recommendation model outputs a recommendation score for each preset recommendation plan, and a higher score indicates a higher reference value.
Correspondingly, the determination unit 342 may be used to determine the preset medical recommendation plan with the highest recommendation score as the medical recommendation result for the target user. For this embodiment, the preset medical recommendation plan with the highest recommendation score may be determined as the medical recommendation result matching the target user, and the recommendation system then outputs it and displays it to the target user for reference.
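The selection step performed by the determination unit can be sketched in a few lines. The function name `pick_recommendation` and the plan identifiers are hypothetical; the sketch assumes the recommendation model has already produced one score per preset plan.

```python
def pick_recommendation(scores):
    """Return the preset plan whose recommendation score is highest."""
    if not scores:
        raise ValueError("no preset plans were scored")
    return max(scores, key=scores.get)


# Hypothetical scores output by the recommendation model for one patient.
plan_scores = {"plan_a": 0.2, "plan_b": 0.7, "plan_c": 0.1}
best_plan = pick_recommendation(plan_scores)
```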
With the above medical plan recommendation system based on knowledge graph representation learning, the entities in the patient data are first extracted, and a subgraph is extracted from the knowledge graph based on those entities. Then, through knowledge graph representation learning with triple embedding, each entity (doctor, patient) and relation (visit, specialty, prescription, dispensing, etc.) is embedded into a low-dimensional vector while the semantic information of the medical relation graph is preserved. The resulting low-dimensional vectors are then input into the recommendation model corresponding to the recommendation algorithm, which classifies the patient's low-dimensional vector and outputs a medical recommendation result for the patient's reference. In this application, obtaining low-dimensional vectors through representation learning can improve the accuracy of the recommendation system's results and better support subsequent personalized recommendation. In addition, for representation learning, this application builds on traditional methods by introducing position encoding and a relational memory network to mine the latent dependencies within triples and thereby obtain the low-dimensional vector of the target entity. Encoding the visit triples with position encoding and a relational memory network addresses, to a certain extent, the inability of existing methods to describe the latent dependencies of knowledge graph triples, and improves the accuracy of the triple embedding vectors.
Further, as a specific implementation corresponding to FIG. 1 and FIG. 2, an embodiment of this application provides a medical plan recommendation method based on knowledge graph representation learning. As shown in FIG. 4, the method includes: acquiring the patient data of a target user and extracting the target entities in the patient data; dividing a knowledge graph subgraph from the medical knowledge graph according to the target entities; determining the low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning; and inputting the low-dimensional vector into a recommendation model that meets a preset training standard to obtain a medical recommendation result matching the patient data.
In a specific application scenario, extracting the target entities in the patient data may specifically include: training an entity extraction model for extracting entity classes; and using the entity extraction model that meets the first preset training standard to extract the target entities in the patient data. The entity extraction model may be trained as follows: perform part-of-speech tagging on the entity classes contained in the training set data; input the tagged training set data into the entity extraction model, and train the model to extract entity classes based on the Jieba natural language processing library; if the extraction error for the entity classes is less than a preset threshold, determine that the entity extraction model has passed training; if the extraction error is greater than or equal to the preset threshold, determine that the entity extraction model has not passed training, and repeatedly correct and retrain it with the pre-tagged training set data until it meets the first preset training standard.
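The pass/fail criterion and the repeated correction described above can be sketched as a loop. This is a toy sketch, not the application's extraction model: a plain lexicon stands in for the model, the "error" is assumed to be the fraction of labeled entities the lexicon misses, and the names `extraction_error` and `train_extractor` are hypothetical. It deliberately does not call Jieba, so no library API is assumed.

```python
def extraction_error(lexicon, samples):
    """Fraction of labeled entities that the lexicon fails to extract."""
    missed = total = 0
    for tokens, labeled in samples:
        found = {token for token in tokens if token in lexicon}
        missed += len(labeled - found)
        total += len(labeled)
    return missed / total if total else 0.0


def train_extractor(samples, threshold, max_rounds=20):
    """Repeat correction rounds until the extraction error drops below the threshold."""
    lexicon = set()
    for rounds in range(max_rounds + 1):
        error = extraction_error(lexicon, samples)
        if error < threshold:
            return lexicon, error, rounds  # the model passes training
        # Correction round: absorb the labeled entities of one more sample.
        _, labeled = samples[rounds % len(samples)]
        lexicon |= labeled
    raise RuntimeError("extractor did not reach the training standard")


# Hypothetical pre-tokenized, pre-labeled training data.
training_samples = [
    (["patient", "has", "diabetes"], {"patient", "diabetes"}),
    (["doctor", "prescribes", "insulin"], {"doctor", "insulin"}),
]
lexicon, final_error, rounds_used = train_extractor(training_samples, threshold=0.1)
```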
In a specific application scenario, dividing the knowledge graph subgraph from the medical knowledge graph according to the target entities may specifically include: marking core object entities and secondary object entities among the target entities; traversing the medical knowledge graph with each core object entity as a starting point, and stopping the traversal in a given direction when a secondary object entity is reached; and dividing the knowledge graph subgraph according to the traversal results of the core object entities.
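The traversal described above can be sketched as a breadth-first search that records every edge it crosses but does not expand secondary entities. The adjacency-list layout, the function name `extract_subgraph`, and the example entities are assumptions made for illustration; the application does not prescribe a traversal order or a graph storage format.

```python
from collections import deque


def extract_subgraph(graph, core, secondary):
    """Collect triples reachable from core entities, stopping at secondary ones.

    graph maps an entity to a list of (relation, neighbor) pairs.
    """
    triples = set()
    visited = set(core)
    queue = deque(core)
    while queue:
        node = queue.popleft()
        for relation, neighbor in graph.get(node, []):
            triples.add((node, relation, neighbor))
            if neighbor in visited:
                continue
            visited.add(neighbor)
            # A secondary entity is kept in the subgraph but not expanded further.
            if neighbor not in secondary:
                queue.append(neighbor)
    return triples


# Hypothetical medical knowledge graph.
medical_graph = {
    "patient": [("diagnosed_with", "diabetes")],
    "diabetes": [("treated_by", "insulin"), ("seen_by", "dr_li")],
    "dr_li": [("works_at", "hospital")],
}
subgraph = extract_subgraph(medical_graph, core={"patient"}, secondary={"dr_li"})
```

In this example the edge leading to `dr_li` is part of the subgraph, while the edge from `dr_li` to `hospital` is not, because traversal stops in that direction.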
Correspondingly, determining the low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning may specifically include: extracting each triple in the knowledge graph subgraph; configuring position vectors for the entity vectors in the triples by position-encoding the triples; encoding the triples, after the position vectors have been added, based on a relational network to obtain encoding vectors; and scoring the encoding vectors with a decoder and performing iterative training with an adaptive moment estimation (Adam) optimizer to obtain the low-dimensional vector corresponding to the knowledge graph subgraph.
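The iterative training step relies on the Adam update rule, which can be sketched as below. The update equations are the standard Adam ones; everything else is an assumption for illustration: the function name `adam_minimize`, the hyperparameter values, and in particular the toy quadratic loss, which merely stands in for the decoder's scoring loss that the application does not spell out.

```python
import math


def adam_minimize(grad_fn, params, steps=1000, lr=0.1,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a function with Adam, given a function returning its gradient."""
    params = list(params)
    m = [0.0] * len(params)  # first-moment (mean) estimates
    v = [0.0] * len(params)  # second-moment (uncentered variance) estimates
    for t in range(1, steps + 1):
        grads = grad_fn(params)
        for i, g in enumerate(grads):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** t)  # bias correction
            v_hat = v[i] / (1 - beta2 ** t)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params


# Toy stand-in loss: squared distance between an embedding and a fixed target.
target = [3.0, -1.0]


def toy_gradient(embedding):
    return [2 * (x - t) for x, t in zip(embedding, target)]


trained_embedding = adam_minimize(toy_gradient, [0.0, 0.0])
```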
In a specific application scenario, before the low-dimensional vector is input into the recommendation model that meets the preset training standard to obtain the medical recommendation result matching the patient data, the method further includes: determining sample patient data and labeling the sample patient data with the corresponding preset medical recommendation plan; training the recommendation model with the low-dimensional vectors corresponding to the sample patient data; if the medical recommendation result output by the recommendation model meets the second preset training standard, determining that the recommendation model has passed training; and if the recommendation model is determined not to have passed training, training it repeatedly with the sample patient data until it meets the second preset training standard.
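The train-until-standard loop described above can be sketched with a small classifier. The application does not name the recommendation model, so a multiclass perceptron is swapped in purely for illustration, and training accuracy is assumed to be the "second preset training standard"; `train_recommender` and the sample vectors are hypothetical.

```python
def train_recommender(samples, plans, target_accuracy=1.0, max_epochs=100):
    """Train a multiclass perceptron until it meets the accuracy standard."""
    dim = len(samples[0][0])
    weights = {plan: [0.0] * dim for plan in plans}

    def predict(vector):
        scores = {
            plan: sum(w * x for w, x in zip(ws, vector))
            for plan, ws in weights.items()
        }
        return max(scores, key=scores.get)

    for epoch in range(1, max_epochs + 1):
        for vector, label in samples:
            guess = predict(vector)
            if guess != label:
                # Pull the labeled plan toward the vector, push the wrong guess away.
                weights[label] = [w + x for w, x in zip(weights[label], vector)]
                weights[guess] = [w - x for w, x in zip(weights[guess], vector)]
        accuracy = sum(predict(v) == y for v, y in samples) / len(samples)
        if accuracy >= target_accuracy:
            return weights, accuracy, epoch  # the model passes training
    raise RuntimeError("recommendation model did not reach the training standard")


# Hypothetical low-dimensional vectors labeled with preset plans.
labeled_samples = [([1.0, 0.0], "plan_a"), ([0.0, 1.0], "plan_b")]
model_weights, train_accuracy, epochs_used = train_recommender(
    labeled_samples, ["plan_a", "plan_b"]
)
```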
Correspondingly, inputting the low-dimensional vector into the recommendation model that meets the preset training standard to obtain the medical recommendation result matching the patient data may specifically include: inputting the low-dimensional vector into the recommendation model that meets the second preset training standard to obtain the recommendation score corresponding to each preset medical recommendation plan; and determining the preset medical recommendation plan with the highest recommendation score as the medical recommendation result for the target user.
It should be noted that, for other descriptions of the medical plan recommendation method based on knowledge graph representation learning provided in this embodiment, reference may be made to the corresponding descriptions of FIG. 1 and FIG. 2, which are not repeated here.
Based on the method shown in FIG. 4, an embodiment of this application correspondingly also provides a storage medium, which may be a volatile or a non-volatile storage medium. A computer program is stored on it, and when the program is executed by a processor, it implements the medical plan recommendation method based on knowledge graph representation learning shown in FIG. 4.
Based on this understanding, the technical solution of this application may be embodied in the form of a software product. The software product may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes several instructions that cause a computer device (such as a personal computer, a server, or a network device) to execute the methods of the implementation scenarios of this application.
Based on the system shown in FIG. 1 and FIG. 2 and the method embodiment shown in FIG. 4, and to achieve the above objective, an embodiment of this application also provides a computer device, which may specifically be a personal computer, a server, a network device, or the like. The physical device includes a storage medium and a processor; the storage medium stores a computer program, and the processor executes the computer program to implement the medical plan recommendation method based on knowledge graph representation learning shown in FIG. 4.
Optionally, the computer device may also include a user interface, a network interface, a camera, a radio frequency (RF) circuit, sensors, an audio circuit, a Wi-Fi module, and so on. The user interface may include a display and an input unit such as a keyboard, and optionally may also include a USB interface, a card reader interface, and the like. The network interface may optionally include a standard wired interface and a wireless interface (such as a Bluetooth interface or a Wi-Fi interface).
Those skilled in the art will understand that the computer device structure provided in this embodiment does not limit the physical device, which may include more or fewer components, combine certain components, or arrange the components differently.
The non-volatile readable storage medium may also include an operating system and a network communication module. The operating system is a program that manages the hardware and software resources of the physical device for knowledge-graph-based data processing, and it supports the running of the information processing program and other software and/or programs. The network communication module implements communication among the components inside the non-volatile readable storage medium, as well as communication with other hardware and software in the physical device.
From the description of the above embodiments, those skilled in the art will understand that the entities in the patient data are first extracted, and a subgraph is extracted from the knowledge graph based on those entities. Then, through knowledge graph representation learning with triple embedding, each entity (doctor, patient) and relation (visit, specialty, prescription, dispensing, etc.) is embedded into a low-dimensional vector while the semantic information of the medical relation graph is preserved. The resulting low-dimensional vectors are then input into the recommendation model corresponding to the recommendation algorithm, which classifies the patient's low-dimensional vector and outputs a medical recommendation result for the patient's reference. In this application, obtaining low-dimensional vectors through representation learning can improve the accuracy of the recommendation system's results and better support subsequent personalized recommendation. In addition, for representation learning, this application builds on traditional methods by introducing position encoding and a relational memory network to mine the latent dependencies within triples and thereby obtain the low-dimensional vector of the target entity. Encoding the visit triples with position encoding and a relational memory network addresses, to a certain extent, the inability of existing methods to describe the latent dependencies of knowledge graph triples, and improves the accuracy of the triple embedding vectors.
Those skilled in the art will understand that the accompanying drawings are only schematic diagrams of a preferred implementation scenario, and that the modules or processes in the drawings are not necessarily required to implement this application. Those skilled in the art will also understand that the modules of the apparatus in an implementation scenario may be distributed in that apparatus as described, or may be changed accordingly and located in one or more apparatuses different from the one in this implementation scenario. The modules of the above implementation scenarios may be combined into one module or further split into multiple sub-modules.
The serial numbers used in this application are for description only and do not indicate the relative merits of the implementation scenarios. The above discloses only a few specific implementation scenarios of this application; however, this application is not limited to them, and any change that a person skilled in the art can conceive of shall fall within the protection scope of this application.

Claims (20)

  1. A medical plan recommendation system based on knowledge graph representation learning, comprising:
    an extraction module, configured to acquire patient data of a target user and extract a target entity in the patient data;
    a dividing module, configured to divide a knowledge graph subgraph from a medical knowledge graph according to the target entity;
    a first determining module, configured to determine a low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning;
    an acquisition module, configured to input the low-dimensional vector into a recommendation model that meets a preset training standard and obtain a medical recommendation result matching the patient data.
  2. A medical plan recommendation method based on knowledge graph representation learning, comprising:
    acquiring patient data of a target user, and extracting a target entity in the patient data;
    dividing a knowledge graph subgraph from a medical knowledge graph according to the target entity;
    determining a low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning;
    inputting the low-dimensional vector into a recommendation model that meets a preset training standard, and obtaining a medical recommendation result matching the patient data.
  3. The medical plan recommendation method based on knowledge graph representation learning according to claim 2, wherein extracting the target entity in the patient data specifically comprises:
    training an entity extraction model for extracting entity classes;
    extracting the target entity in the patient data using the entity extraction model that meets a first preset training standard.
  4. The medical plan recommendation method based on knowledge graph representation learning according to claim 3, wherein training the entity extraction model for extracting entity classes specifically comprises:
    performing part-of-speech tagging on the entity classes contained in training set data;
    inputting the tagged training set data into the entity extraction model, and training the entity extraction model to extract entity classes based on the Jieba natural language processing library;
    if the extraction error for the entity classes is less than a preset threshold, determining that the entity extraction model has passed training;
    if the extraction error for the entity classes is greater than or equal to the preset threshold, determining that the entity extraction model has not passed training, and repeatedly correcting and retraining the entity extraction model with the pre-tagged training set data so that the entity extraction model meets the first preset training standard.
  5. The medical plan recommendation method based on knowledge graph representation learning according to claim 2, wherein dividing the knowledge graph subgraph from the medical knowledge graph according to the target entity specifically comprises:
    marking core object entities and secondary object entities among the target entities;
    traversing the medical knowledge graph with each core object entity as a traversal starting point, and stopping the traversal in a given direction when a secondary object entity is reached;
    dividing the knowledge graph subgraph according to the traversal results of the core object entities.
  6. The medical plan recommendation method based on knowledge graph representation learning according to claim 2, wherein determining the low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning specifically comprises:
    extracting each triple in the knowledge graph subgraph;
    configuring position vectors for the entity vectors in the triples by position-encoding the triples;
    encoding the triples, after the position vectors have been added, based on a relational network to obtain encoding vectors;
    scoring the encoding vectors with a decoder, and performing iterative training with an adaptive moment estimation optimizer, to obtain the low-dimensional vector corresponding to the knowledge graph subgraph.
  7. The medical plan recommendation method based on knowledge graph representation learning according to claim 6, wherein, before inputting the low-dimensional vector into the recommendation model that meets the preset training standard and obtaining the medical recommendation result matching the patient data, the method further comprises:
    determining sample patient data, and labeling the sample patient data with a corresponding preset medical recommendation plan;
    training the recommendation model using the low-dimensional vectors corresponding to the sample patient data;
    if the medical recommendation result output by the recommendation model meets a second preset training standard, determining that the recommendation model has passed training;
    if the recommendation model is determined not to have passed training, repeatedly training the recommendation model with the sample patient data so that the recommendation model meets the second preset training standard.
  8. The medical plan recommendation method based on knowledge graph representation learning according to claim 7, wherein inputting the low-dimensional vector into the recommendation model that meets the preset training standard and obtaining the medical recommendation result matching the patient data specifically comprises:
    inputting the low-dimensional vector into the recommendation model that meets the second preset training standard, and obtaining a recommendation score corresponding to each preset medical recommendation plan;
    determining the preset medical recommendation plan with the highest recommendation score as the medical recommendation result for the target user.
  9. A storage medium storing a computer program, wherein the program, when executed by a processor, implements a medical information recommendation method based on knowledge graph representation learning, the method comprising:
    acquiring patient data of a target user, and extracting a target entity in the patient data;
    dividing a knowledge graph subgraph from a medical knowledge graph according to the target entity;
    determining a low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning;
    inputting the low-dimensional vector into a recommendation model that meets a preset training standard, and obtaining a medical recommendation result matching the patient data.
  10. The storage medium according to claim 9, wherein extracting the target entity in the patient data specifically comprises:
    training an entity extraction model for extracting entity classes;
    extracting the target entity in the patient data using the entity extraction model that meets a first preset training standard.
  11. The storage medium according to claim 10, wherein training the entity extraction model for extracting entity classes specifically comprises:
    performing part-of-speech tagging on the entity classes contained in training set data;
    inputting the tagged training set data into the entity extraction model, and training the entity extraction model to extract entity classes based on the Jieba natural language processing library;
    if the extraction error for the entity classes is less than a preset threshold, determining that the entity extraction model has passed training;
    if the extraction error for the entity classes is greater than or equal to the preset threshold, determining that the entity extraction model has not passed training, and repeatedly correcting and retraining the entity extraction model with the pre-tagged training set data so that the entity extraction model meets the first preset training standard.
  12. The storage medium according to claim 9, wherein dividing the knowledge graph subgraph from the medical knowledge graph according to the target entity specifically comprises:
    marking core object entities and secondary object entities among the target entities;
    traversing the medical knowledge graph with each core object entity as a traversal starting point, and stopping the traversal in a given direction when a secondary object entity is reached;
    dividing the knowledge graph subgraph according to the traversal results of the core object entities.
  13. The storage medium according to claim 9, wherein determining the low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning specifically comprises:
    extracting each triple in the knowledge graph subgraph;
    configuring position vectors for the entity vectors in the triples by position-encoding the triples;
    encoding the triples, after the position vectors have been added, based on a relational network to obtain encoding vectors;
    scoring the encoding vectors with a decoder, and performing iterative training with an adaptive moment estimation optimizer, to obtain the low-dimensional vector corresponding to the knowledge graph subgraph.
  14. The storage medium according to claim 13, wherein, before inputting the low-dimensional vector into the recommendation model that meets the preset training standard and obtaining the medical recommendation result matching the patient data, the method further comprises:
    determining sample patient data, and labeling the sample patient data with a corresponding preset medical recommendation plan;
    training the recommendation model using the low-dimensional vectors corresponding to the sample patient data;
    if the medical recommendation result output by the recommendation model meets a second preset training standard, determining that the recommendation model has passed training;
    if the recommendation model is determined not to have passed training, repeatedly training the recommendation model with the sample patient data so that the recommendation model meets the second preset training standard.
  15. The storage medium according to claim 14, wherein inputting the low-dimensional vector into a recommendation model that meets a preset training standard to obtain a medical recommendation result matching the patient data specifically comprises:
    inputting the low-dimensional vector into the recommendation model that meets the second preset training standard to obtain a recommendation score for each preset medical recommendation plan;
    determining the preset medical recommendation plan with the highest recommendation score as the medical recommendation result for the target user.
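The selection step of claim 15 amounts to taking the highest-scoring plan among the preset plans. A minimal Python sketch follows, assuming a dot-product scorer as a stand-in for the trained recommendation model; the names `score_plans`, `recommend`, and `plan_weights` are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Tuple


def score_plans(vector: List[float],
                plan_weights: Dict[str, List[float]]) -> Dict[str, float]:
    """Score every preset plan against the subgraph vector.

    Assumption: a dot product stands in for the trained recommendation model.
    """
    return {
        plan: sum(v * w for v, w in zip(vector, weights))
        for plan, weights in plan_weights.items()
    }


def recommend(vector: List[float],
              plan_weights: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the preset plan with the highest recommendation score."""
    scores = score_plans(vector, plan_weights)
    best_plan = max(scores, key=scores.get)
    return best_plan, scores[best_plan]


if __name__ == "__main__":
    # Hypothetical two-dimensional example with two preset plans.
    weights = {"plan_a": [1.0, 0.0], "plan_b": [0.0, 1.0]}
    print(recommend([0.2, 0.9], weights))
```

Any scorer that returns one value per preset plan could replace `score_plans`; only the final maximum over scores is taken from the claim.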
  16. A computer device, comprising a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor, when executing the program, implements a medical information recommendation method based on knowledge graph representation learning, the method comprising:
    acquiring patient data of a target user, and extracting target entities from the patient data;
    dividing out a knowledge graph subgraph from a medical knowledge graph according to the target entities;
    determining a low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning;
    inputting the low-dimensional vector into a recommendation model that meets a preset training standard to obtain a medical recommendation result matching the patient data.
  17. The computer device according to claim 16, wherein extracting the target entities from the patient data specifically comprises:
    training an entity extraction model for extracting entity classes;
    extracting the target entities from the patient data with the entity extraction model that meets a first preset training standard.
  18. The computer device according to claim 17, wherein the specific method for training the entity extraction model for extracting entity classes is:
    performing part-of-speech tagging on the entity classes contained in the training set data;
    inputting the tagged training set data into the entity extraction model, and training the entity extraction model to extract entity classes based on the Jieba natural language processing library;
    if the extraction error for the entity classes is determined to be less than a preset threshold, determining that the entity extraction model has passed training;
    if the extraction error for the entity classes is determined to be greater than or equal to the preset threshold, determining that the entity extraction model has not passed training, and repeatedly correcting and retraining the entity extraction model with the training set data whose parts of speech have been tagged in advance, until the entity extraction model meets the first preset training standard.
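The pass/fail logic of claim 18 (train, measure the extraction error, repeat until it falls below a preset threshold) can be sketched in Python as follows. This is only an illustration under stated assumptions: `LexiconExtractor` is a toy stand-in that memorises labelled entities and does not use Jieba, and the names `extraction_error`, `train_until_threshold`, and `max_rounds` are hypothetical.

```python
from typing import List, Set, Tuple

Sample = Tuple[str, Set[str]]  # (text, labelled entities)


class LexiconExtractor:
    """Toy stand-in for the entity extraction model (assumption, not Jieba)."""

    def __init__(self) -> None:
        self.lexicon: Set[str] = set()

    def fit_one(self, entities: Set[str]) -> None:
        # "Training" here just memorises the labelled entities.
        self.lexicon.update(entities)

    def extract(self, text: str) -> Set[str]:
        return {entity for entity in self.lexicon if entity in text}


def extraction_error(model: LexiconExtractor, dataset: List[Sample]) -> float:
    """Fraction of labelled entities that the model fails to extract."""
    total = sum(len(entities) for _, entities in dataset)
    missed = sum(len(entities - model.extract(text)) for text, entities in dataset)
    return missed / total if total else 0.0


def train_until_threshold(model: LexiconExtractor, dataset: List[Sample],
                          threshold: float, max_rounds: int = 10) -> bool:
    """Repeat training rounds until the error drops below the threshold.

    Returns True if the model passes training, False if max_rounds is exhausted.
    """
    for round_index in range(max_rounds):
        _, entities = dataset[round_index % len(dataset)]
        model.fit_one(entities)
        if extraction_error(model, dataset) < threshold:
            return True
    return False
```

The `max_rounds` cap is an addition for safety; the claim itself only states that training repeats until the first preset training standard is met.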
  19. The computer device according to claim 16, wherein dividing out the knowledge graph subgraph from the medical knowledge graph according to the target entities specifically comprises:
    marking core object entities and secondary object entities among the target entities;
    traversing the medical knowledge graph with each core object entity as a starting point, and stopping the traversal in a given direction when a secondary object entity is reached;
    dividing out the knowledge graph subgraph according to the traversal results of the core object entities.
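The traversal in claim 19 can be read as a graph search that starts at the core object entities and does not expand past secondary object entities. The Python sketch below takes that reading; it assumes the graph is given as (head, relation, tail) triples, that edges are followed in both directions, and that the per-entity traversal results are merged into one subgraph. The function name `extract_subgraph` and the sample entities are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def extract_subgraph(triples: Iterable[Triple],
                     core_entities: Iterable[str],
                     secondary_entities: Set[str]) -> Set[Triple]:
    """Breadth-first traversal from core entities, halting at secondary ones."""
    # Build an adjacency map; each entry keeps the triple and the neighbour.
    adjacency: Dict[str, List[Tuple[Triple, str]]] = {}
    for head, relation, tail in triples:
        triple = (head, relation, tail)
        adjacency.setdefault(head, []).append((triple, tail))
        adjacency.setdefault(tail, []).append((triple, head))

    starts = list(core_entities)
    visited: Set[str] = set(starts)
    queue = deque(starts)
    subgraph: Set[Triple] = set()

    while queue:
        node = queue.popleft()
        if node in secondary_entities:
            # A secondary entity is kept as an endpoint but not expanded.
            continue
        for triple, neighbour in adjacency.get(node, []):
            subgraph.add(triple)
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return subgraph
```

With this reading, a triple that links a core entity to a secondary entity is part of the subgraph, while triples reachable only through the secondary entity are left out.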
  20. The computer device according to claim 16, wherein determining the low-dimensional vector corresponding to the knowledge graph subgraph based on representation learning specifically comprises:
    extracting each triple in the knowledge graph subgraph;
    configuring position vectors for the entity vectors in the triples by performing position encoding on the triples;
    encoding the triples, after the position vectors have been added, with a relational network to obtain encoding vectors;
    scoring the encoding vectors with a decoder, and performing iterative training with an adaptive moment estimation optimizer to obtain the low-dimensional vector corresponding to the knowledge graph subgraph.
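The "adaptive moment estimation optimizer" named in the last step of claim 20 is commonly known as Adam. The sketch below shows the standard Adam update rule in plain Python, applied to a toy quadratic loss rather than to the encoder and decoder of the claim, which are not specified here; `adam_minimize`, `grad_fn`, and all hyperparameter defaults are illustrative assumptions.

```python
import math
from typing import Callable, List


def adam_minimize(grad_fn: Callable[[List[float]], List[float]],
                  params: List[float],
                  lr: float = 0.1,
                  beta1: float = 0.9,
                  beta2: float = 0.999,
                  eps: float = 1e-8,
                  steps: int = 500) -> List[float]:
    """Iteratively update parameters with the standard Adam rule."""
    theta = list(params)
    m = [0.0] * len(theta)  # first-moment (mean) estimates
    v = [0.0] * len(theta)  # second-moment (uncentred variance) estimates
    for t in range(1, steps + 1):
        grads = grad_fn(theta)
        for i, g in enumerate(grads):
            m[i] = beta1 * m[i] + (1 - beta1) * g
            v[i] = beta2 * v[i] + (1 - beta2) * g * g
            m_hat = m[i] / (1 - beta1 ** t)  # bias-corrected first moment
            v_hat = v[i] / (1 - beta2 ** t)  # bias-corrected second moment
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


def toy_gradient(theta: List[float]) -> List[float]:
    """Gradient of the toy loss (x - 3)^2 + (y + 1)^2 (hypothetical example)."""
    return [2 * (theta[0] - 3.0), 2 * (theta[1] + 1.0)]
```

In a full implementation the gradient would come from the decoder's scoring loss over the encoded triples, and the parameters would be the entity, relation, and position vectors.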
PCT/CN2020/136060 2020-10-26 2020-12-14 Medical plan recommendation system and method based on knowledge graph representation learning WO2021189971A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011153510.4A CN112242187B (en) 2020-10-26 2020-10-26 Medical scheme recommendation system and method based on knowledge graph characterization learning
CN202011153510.4 2020-10-26

Publications (1)

Publication Number Publication Date
WO2021189971A1 (en) 2021-09-30

Family

ID=74169617

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/136060 WO2021189971A1 (en) 2020-10-26 2020-12-14 Medical plan recommendation system and method based on knowledge graph representation learning

Country Status (2)

Country Link
CN (1) CN112242187B (en)
WO (1) WO2021189971A1 (en)

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114005509A (en) * 2021-10-30 2022-02-01 平安国际智慧城市科技股份有限公司 Treatment scheme recommendation system, method, device and storage medium
CN114004228A (en) * 2021-10-28 2022-02-01 泰康保险集团股份有限公司 Medical text data standardization processing method and device
CN114121212A (en) * 2021-11-19 2022-03-01 东南大学 Traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning
CN114218402A (en) * 2021-12-17 2022-03-22 迈创企业管理服务股份有限公司 Method for recommending computer hardware fault replacement part
CN114496234A (en) * 2022-04-18 2022-05-13 浙江大学 Cognitive-atlas-based personalized diagnosis and treatment scheme recommendation system for general patients
CN114547345A (en) * 2022-04-18 2022-05-27 支付宝(杭州)信息技术有限公司 Input prompting method and device combining map mode
CN114582443A (en) * 2022-02-23 2022-06-03 西北大学 Medicine relation extraction method based on knowledge graph
CN114707004A (en) * 2022-05-24 2022-07-05 国网浙江省电力有限公司信息通信分公司 Method and system for extracting and processing case-affair relation based on image model and language model
CN114707005A (en) * 2022-06-02 2022-07-05 浙江建木智能系统有限公司 Knowledge graph construction method and system for ship equipment
CN114783580A (en) * 2022-06-20 2022-07-22 武汉博科国泰信息技术有限公司 Medical data quality evaluation method and system
CN114820139A (en) * 2022-05-25 2022-07-29 重庆大学 Multi-user recommendation system based on knowledge graph path reasoning
CN114840777A (en) * 2022-07-04 2022-08-02 杭州城市大脑有限公司 Multi-dimensional endowment service recommendation method and device and electronic equipment
CN114864037A (en) * 2022-04-26 2022-08-05 泰康保险集团股份有限公司 Medical aid recommendation method and device, readable storage medium and electronic equipment
CN114884727A (en) * 2022-05-06 2022-08-09 天津大学 Internet of things risk positioning method based on dynamic hierarchical knowledge graph
CN114969557A (en) * 2022-07-29 2022-08-30 之江实验室 Propaganda and education pushing method and system based on multi-source information fusion
CN115036034A (en) * 2022-08-11 2022-09-09 之江实验室 Similar patient identification method and system based on patient characterization map
CN115050441A (en) * 2022-08-16 2022-09-13 北京嘉和美康信息技术有限公司 Treatment scheme display method and device, electronic equipment and medium
CN115148344A (en) * 2022-09-06 2022-10-04 深圳市指南针医疗科技有限公司 Ant colony algorithm-based medical technology management method, device, equipment and storage medium
CN115344717A (en) * 2022-10-18 2022-11-15 国网江西省电力有限公司电力科学研究院 Method and device for constructing regulation and control operation knowledge graph for multi-type energy supply and consumption system
CN115579104A (en) * 2022-09-08 2023-01-06 广东技术师范大学 Artificial intelligence-based liver cancer full-course digital management method and system
CN115952296A (en) * 2022-12-12 2023-04-11 江苏电子信息职业学院 Enterprise technical service recommendation method and device based on knowledge enhancement and graph contrast learning
WO2023071845A1 (en) * 2021-10-25 2023-05-04 支付宝(杭州)信息技术有限公司 Knowledge graph processing
CN116186359A (en) * 2023-05-04 2023-05-30 安徽宝信信息科技有限公司 Integrated management method, system and storage medium for multi-source heterogeneous data of universities
CN116343980A (en) * 2023-05-30 2023-06-27 深圳市即达健康医疗科技有限公司 Intelligent medical review follow-up data processing method and system
CN116383413A (en) * 2023-06-05 2023-07-04 湖南云略信息技术有限公司 Knowledge graph updating method and system based on medical data extraction
CN116610871A (en) * 2023-07-18 2023-08-18 腾讯科技(深圳)有限公司 Media data recommendation method, device, computer equipment and storage medium
CN116842109A (en) * 2023-06-27 2023-10-03 北京大学 Information retrieval knowledge graph embedding method, device and computer equipment
CN117010494A (en) * 2023-09-27 2023-11-07 之江实验室 Medical data generation method and system based on causal expression learning
CN117149998A (en) * 2023-10-30 2023-12-01 北京南师信息技术有限公司 Intelligent diagnosis recommendation method and system based on multi-objective optimization
CN117196027A (en) * 2023-11-07 2023-12-08 北京航天晨信科技有限责任公司 Training sample generation method and device based on knowledge graph
CN117893694A (en) * 2024-03-15 2024-04-16 北京大学第三医院(北京大学第三临床医学院) Atlantoaxial dislocation treatment scheme recommendation method and system

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113327691B (en) * 2021-06-01 2022-08-12 平安科技(深圳)有限公司 Query method and device based on language model, computer equipment and storage medium
CN113434692B (en) * 2021-06-22 2023-08-01 上海交通大学医学院附属仁济医院 Method, system and equipment for constructing graphic neural network model and recommending diagnosis and treatment scheme
CN113535974B (en) * 2021-06-28 2024-04-09 科大讯飞华南人工智能研究院(广州)有限公司 Diagnostic recommendation method and related device, electronic equipment and storage medium
CN113792104B (en) * 2021-09-16 2024-03-01 平安科技(深圳)有限公司 Medical data error detection method and device based on artificial intelligence and storage medium
CN113808664B (en) * 2021-09-26 2024-03-19 平安科技(深圳)有限公司 Antibody screening method and device based on machine learning
CN114461734B (en) * 2022-04-12 2022-07-12 支付宝(杭州)信息技术有限公司 Dynamic control method and system for knowledge graph subgraph matching
CN115148330B (en) * 2022-05-24 2023-07-25 中国医学科学院北京协和医院 POP treatment scheme forming method and system
CN114996412B (en) * 2022-08-02 2022-11-15 医智生命科技(天津)有限公司 Medical question and answer method and device, electronic equipment and storage medium
CN115658877B (en) * 2022-12-27 2023-03-21 神州医疗科技股份有限公司 Medicine recommendation method and device based on reinforcement learning, electronic equipment and medium
CN116364240B (en) * 2023-02-02 2024-01-26 复旦大学附属肿瘤医院 Remote nutrition information processing method and system based on Internet
CN116612892B (en) * 2023-07-17 2023-09-26 天津市疾病预防控制中心 Health monitoring method and system of wearable device
CN116796007B (en) * 2023-08-03 2024-05-03 苏州浪潮智能科技有限公司 Target knowledge graph embedding method, target knowledge graph embedding device and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287337A (en) * 2019-06-19 2019-09-27 上海交通大学 The system and method for medicine synonym is obtained based on deep learning and knowledge mapping
CN111159424A (en) * 2019-12-27 2020-05-15 东软集团股份有限公司 Method, device, storage medium and electronic equipment for labeling knowledge graph entities
CN111613339A (en) * 2020-05-15 2020-09-01 山东大学 Similar medical record searching method and system based on deep learning

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110334339B (en) * 2019-04-30 2021-04-13 华中科技大学 Sequence labeling model and labeling method based on position perception self-attention mechanism
CN110275960B (en) * 2019-06-11 2021-09-14 中国电子科技集团公司电子科学研究院 Method and system for expressing knowledge graph and text information based on named sentence
CN111767410B (en) * 2020-06-30 2023-05-30 深圳平安智慧医健科技有限公司 Method, device, equipment and storage medium for constructing clinical medical knowledge graph

Cited By (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023071845A1 (en) * 2021-10-25 2023-05-04 支付宝(杭州)信息技术有限公司 Knowledge graph processing
CN114004228A (en) * 2021-10-28 2022-02-01 泰康保险集团股份有限公司 Medical text data standardization processing method and device
CN114005509A (en) * 2021-10-30 2022-02-01 平安国际智慧城市科技股份有限公司 Treatment scheme recommendation system, method, device and storage medium
CN114121212B (en) * 2021-11-19 2024-04-02 东南大学 Traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning
CN114121212A (en) * 2021-11-19 2022-03-01 东南大学 Traditional Chinese medicine prescription generation method based on knowledge graph and group representation learning
CN114218402A (en) * 2021-12-17 2022-03-22 迈创企业管理服务股份有限公司 Method for recommending computer hardware fault replacement part
CN114218402B (en) * 2021-12-17 2024-05-28 迈创企业管理服务股份有限公司 Method for recommending computer hardware fault replacement parts
CN114582443A (en) * 2022-02-23 2022-06-03 西北大学 Medicine relation extraction method based on knowledge graph
CN114582443B (en) * 2022-02-23 2023-08-18 西北大学 Knowledge graph-based drug relation extraction method
CN114547345B (en) * 2022-04-18 2022-07-19 支付宝(杭州)信息技术有限公司 Input prompting method and device combining map mode
CN114496234B (en) * 2022-04-18 2022-07-19 浙江大学 Cognitive-atlas-based personalized diagnosis and treatment scheme recommendation system for general patients
CN114547345A (en) * 2022-04-18 2022-05-27 支付宝(杭州)信息技术有限公司 Input prompting method and device combining map mode
CN114496234A (en) * 2022-04-18 2022-05-13 浙江大学 Cognitive-atlas-based personalized diagnosis and treatment scheme recommendation system for general patients
CN114864037A (en) * 2022-04-26 2022-08-05 泰康保险集团股份有限公司 Medical aid recommendation method and device, readable storage medium and electronic equipment
CN114884727A (en) * 2022-05-06 2022-08-09 天津大学 Internet of things risk positioning method based on dynamic hierarchical knowledge graph
CN114884727B (en) * 2022-05-06 2023-02-24 天津大学 Internet of things risk positioning method based on dynamic hierarchical knowledge graph
CN114707004B (en) * 2022-05-24 2022-08-16 国网浙江省电力有限公司信息通信分公司 Method and system for extracting and processing case-affair relation based on image model and language model
CN114707004A (en) * 2022-05-24 2022-07-05 国网浙江省电力有限公司信息通信分公司 Method and system for extracting and processing case-affair relation based on image model and language model
CN114820139A (en) * 2022-05-25 2022-07-29 重庆大学 Multi-user recommendation system based on knowledge graph path reasoning
CN114820139B (en) * 2022-05-25 2024-05-28 重庆大学 Multi-user recommendation system based on knowledge graph path reasoning
CN114707005A (en) * 2022-06-02 2022-07-05 浙江建木智能系统有限公司 Knowledge graph construction method and system for ship equipment
CN114707005B (en) * 2022-06-02 2022-10-25 浙江建木智能系统有限公司 Knowledge graph construction method and system for ship equipment
CN114783580B (en) * 2022-06-20 2022-09-13 武汉博科国泰信息技术有限公司 Medical data quality evaluation method and system
CN114783580A (en) * 2022-06-20 2022-07-22 武汉博科国泰信息技术有限公司 Medical data quality evaluation method and system
CN114840777A (en) * 2022-07-04 2022-08-02 杭州城市大脑有限公司 Multi-dimensional endowment service recommendation method and device and electronic equipment
CN114969557A (en) * 2022-07-29 2022-08-30 之江实验室 Propaganda and education pushing method and system based on multi-source information fusion
CN115036034B (en) * 2022-08-11 2022-11-08 之江实验室 Similar patient identification method and system based on patient characterization map
CN115036034A (en) * 2022-08-11 2022-09-09 之江实验室 Similar patient identification method and system based on patient characterization map
CN115050441A (en) * 2022-08-16 2022-09-13 北京嘉和美康信息技术有限公司 Treatment scheme display method and device, electronic equipment and medium
CN115148344B (en) * 2022-09-06 2022-11-29 深圳市指南针医疗科技有限公司 Ant colony algorithm-based medical and technical management method, device, equipment and storage medium
CN115148344A (en) * 2022-09-06 2022-10-04 深圳市指南针医疗科技有限公司 Ant colony algorithm-based medical technology management method, device, equipment and storage medium
CN115579104A (en) * 2022-09-08 2023-01-06 广东技术师范大学 Artificial intelligence-based liver cancer full-course digital management method and system
CN115344717A (en) * 2022-10-18 2022-11-15 国网江西省电力有限公司电力科学研究院 Method and device for constructing regulation and control operation knowledge graph for multi-type energy supply and consumption system
CN115952296A (en) * 2022-12-12 2023-04-11 江苏电子信息职业学院 Enterprise technical service recommendation method and device based on knowledge enhancement and graph contrast learning
CN116186359B (en) * 2023-05-04 2023-09-01 安徽宝信信息科技有限公司 Integrated management method, system and storage medium for multi-source heterogeneous data of universities
CN116186359A (en) * 2023-05-04 2023-05-30 安徽宝信信息科技有限公司 Integrated management method, system and storage medium for multi-source heterogeneous data of universities
CN116343980A (en) * 2023-05-30 2023-06-27 深圳市即达健康医疗科技有限公司 Intelligent medical review follow-up data processing method and system
CN116343980B (en) * 2023-05-30 2023-08-29 深圳市即达健康医疗科技有限公司 Intelligent medical review follow-up data processing method and system
CN116383413A (en) * 2023-06-05 2023-07-04 湖南云略信息技术有限公司 Knowledge graph updating method and system based on medical data extraction
CN116383413B (en) * 2023-06-05 2023-08-29 湖南云略信息技术有限公司 Knowledge graph updating method and system based on medical data extraction
CN116842109A (en) * 2023-06-27 2023-10-03 北京大学 Information retrieval knowledge graph embedding method, device and computer equipment
CN116610871A (en) * 2023-07-18 2023-08-18 腾讯科技(深圳)有限公司 Media data recommendation method, device, computer equipment and storage medium
CN116610871B (en) * 2023-07-18 2024-01-26 腾讯科技(深圳)有限公司 Media data recommendation method, device, computer equipment and storage medium
CN117010494B (en) * 2023-09-27 2024-01-05 之江实验室 Medical data generation method and system based on causal expression learning
CN117010494A (en) * 2023-09-27 2023-11-07 之江实验室 Medical data generation method and system based on causal expression learning
CN117149998A (en) * 2023-10-30 2023-12-01 北京南师信息技术有限公司 Intelligent diagnosis recommendation method and system based on multi-objective optimization
CN117149998B (en) * 2023-10-30 2024-01-23 北京南师信息技术有限公司 Intelligent diagnosis recommendation method and system based on multi-objective optimization
CN117196027B (en) * 2023-11-07 2024-02-02 北京航天晨信科技有限责任公司 Training sample generation method and device based on knowledge graph
CN117196027A (en) * 2023-11-07 2023-12-08 北京航天晨信科技有限责任公司 Training sample generation method and device based on knowledge graph
CN117893694A (en) * 2024-03-15 2024-04-16 北京大学第三医院(北京大学第三临床医学院) Atlantoaxial dislocation treatment scheme recommendation method and system

Also Published As

Publication number Publication date
CN112242187A (en) 2021-01-19
CN112242187B (en) 2023-06-27

Similar Documents

Publication Publication Date Title
WO2021189971A1 (en) Medical plan recommendation system and method based on knowledge graph representation learning
CN111339774B (en) Text entity relation extraction method and model training method
CN108984683B (en) Method, system, equipment and storage medium for extracting structured data
CN106776711B (en) Chinese medical knowledge map construction method based on deep learning
WO2021151353A1 (en) Medical entity relationship extraction method and apparatus, and computer device and readable storage medium
CN112270196B (en) Entity relationship identification method and device and electronic equipment
CN111324743A (en) Text relation extraction method and device, computer equipment and storage medium
CN110675944A (en) Triage method and device, computer equipment and medium
WO2021042516A1 (en) Named-entity recognition method and device, and computer readable storage medium
CN112015917A (en) Data processing method and device based on knowledge graph and computer equipment
US20220284174A1 (en) Correcting content generated by deep learning
WO2021179693A1 (en) Medical text translation method and device, and storage medium
WO2023029502A1 (en) Method and apparatus for constructing user portrait on the basis of inquiry session, device, and medium
WO2023029513A1 (en) Artificial intelligence-based search intention recognition method and apparatus, device, and medium
WO2024099037A1 (en) Data processing method and apparatus, entity linking method and apparatus, and computer device
CN112270184B (en) Natural language processing method, device and storage medium
CN115292457A (en) Knowledge question answering method and device, computer readable medium and electronic equipment
CN115858886B (en) Data processing method, device, equipment and readable storage medium
CN113707299A (en) Auxiliary diagnosis method and device based on inquiry session and computer equipment
CN113657105A (en) Medical entity extraction method, device, equipment and medium based on vocabulary enhancement
WO2023168810A1 (en) Method and apparatus for predicting properties of drug molecule, storage medium, and computer device
CN110969005B (en) Method and device for determining similarity between entity corpora
CN116662583B (en) Text generation method, place retrieval method and related devices
CN113378569A (en) Model generation method, entity identification method, model generation device, entity identification device, electronic equipment and storage medium
CN116757195A (en) Implicit emotion recognition method based on prompt learning

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
    Ref document number: 20927237; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: PCT application non-entry in European phase
    Ref document number: 20927237; Country of ref document: EP; Kind code of ref document: A1