CN113744886A

CN113744886A - Traditional Chinese medicine syndrome differentiation and treatment mode mining method and system based on traditional Chinese medicine case mining

Info

Publication number: CN113744886A
Application number: CN202010461494.9A
Authority: CN
Inventors: 白琳; 任晋宇; 周志阳; 钟华; 刘杰; 叶丹
Original assignee: Institute of Software of CAS
Current assignee: Institute of Software of CAS
Priority date: 2020-05-27
Filing date: 2020-05-27
Publication date: 2021-12-03
Anticipated expiration: 2040-05-27
Also published as: CN113744886B

Abstract

The invention discloses a method and system for mining traditional Chinese medicine (TCM) syndrome differentiation and treatment modes from TCM medical records. The method includes: 1) standardizing and segmenting the data information in the TCM medical records, where standardization assigns, according to a TCM term vocabulary, standard names to the symptom names, disease names, syndrome type names, syndrome names, prescription names and Chinese medicine names that appear in the medical records; a symptom is composed of a symptom attribute and a symptom value, where the symptom attribute represents the object described by the symptom and the symptom value is the concrete manifestation of that attribute; 2) modeling the TCM medical record data processed in step 1) as a graph data structure; 3) mining the graph data structure according to the mining conditions to discover the key symptom characteristics for diagnosing a disease or syndrome, the treatment rules and methods adopted, the applicable prescriptions and their Chinese medicine composition, thereby forming a TCM syndrome differentiation and treatment mode that meets the mining conditions. The invention effectively solves the problem of missing key symptoms and guarantees the completeness of the syndrome differentiation and treatment mode.

Description

Traditional Chinese medicine syndrome differentiation and treatment mode mining method and system based on traditional Chinese medicine case mining

Technical Field

The invention relates to a traditional Chinese medicine syndrome differentiation and treatment mode mining method and system based on traditional Chinese medicine case mining, and belongs to the technical field of software.

Background

Traditional Chinese medicine (TCM) is a treasure of the traditional culture of the Chinese nation. TCM diagnosis and treatment is founded on the concept of treatment based on syndrome differentiation, and both the process and its results depend to a great extent on the practitioner's grasp of TCM thinking and concepts and on individual practical experience. Over thousands of years of practice and development, the clinical experience of famous physicians of past generations has gradually accumulated into a precious asset of Chinese medicine culture and is an important part of its inheritance. However, this clinical experience exists as subjective knowledge and is mostly hidden in individual medical records. Adopting a scientific method to refine, summarize and abstract the diagnosis and treatment experience hidden in medical records, so that it becomes objective, standardized and visualized, is of great significance for developing and inheriting Chinese medicine culture.

Data mining is a common data analysis technology in computer science that aims to find regularities in large-scale data records and to discover latent, implicit information. Compared with traditional mining algorithms such as classification and clustering, graph data mining can model and represent more complex data structures and discover more complex data associations, and it has become a fundamental research problem of wide interest in the data mining field.

Disclosure of Invention

In view of the technical problems in the prior art, the invention aims to provide a TCM syndrome differentiation and treatment mode mining method and system based on graph data mining. A graph data structure is used to model the complex logical associations among the elements of a TCM medical record, such as symptoms, diseases, syndrome types, syndromes, treatment rules, prescriptions and Chinese medicine compositions, and an extended frequent subgraph mining algorithm is used to extract the key elements of the syndrome differentiation and treatment process, thereby forming TCM syndrome differentiation and treatment modes oriented to diseases, syndrome types and syndromes. The system presents the diagnosis and treatment experience hidden in TCM medical records as objective, graphical syndrome differentiation and treatment mode diagrams, which helps to inherit and develop the clinical experience of famous physicians and has guiding significance and application value for research on TCM syndrome differentiation and treatment theory, intelligent TCM-assisted diagnosis and the like.

The technical solution of the invention is as follows: the TCM syndrome differentiation and treatment mode mining system based on TCM medical record mining constructs each TCM medical record as a graph of the syndrome differentiation and treatment process followed by the doctor during the consultation, and extracts the key diagnosis and treatment information in the medical records with an extended graph mining algorithm to form TCM syndrome differentiation and treatment modes for different diseases, syndrome types and syndromes. The system comprises a data preprocessing module, a data modeling module and a syndrome differentiation and treatment mode mining module, wherein:

The data preprocessing module is used for standardizing and segmenting the data information in the medical records. Standardization assigns standard names, according to the TCM term vocabulary, to the symptom names, disease names, syndrome type names, syndrome names, prescription names and Chinese medicine names appearing in the medical records. Word segmentation is applied mainly to symptom information and splits a complex symptom description into fine-grained, minimal symptom description units. A symptom is composed of a symptom attribute and a symptom value. The symptom attribute represents the object described by the symptom, such as tongue coating color, tongue coating texture or pulse condition; the symptom value describes how that attribute manifests, such as dark red, thick or slippery.

The data modeling module is used for modeling the TCM medical record data processed by the data preprocessing module as a graph data structure. The nodes in the graph represent the symptoms, diseases, syndrome types, syndromes, treatment rules, prescriptions and Chinese medicine compositions in the medical record, namely symptom nodes, disease nodes, syndrome type nodes, syndrome nodes, treatment rule nodes, prescription nodes and Chinese medicine composition nodes. Each symptom node is connected by edges to the disease nodes, syndrome type nodes and syndrome nodes, indicating that the symptom it describes is a manifestation of the connected disease, syndrome type or syndrome. Syndrome nodes are connected by edges to treatment rule nodes, and treatment rule nodes to prescription nodes, indicating the treatment rule that corresponds to the syndrome and the prescriptions available under that rule. An edge between a prescription node and a Chinese medicine composition node indicates that the Chinese medicine is an ingredient of the prescription.
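The graph construction described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `CaseRecord` fields, the `(node type, name)` node encoding and the sample record are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import List, Set, Tuple

Node = Tuple[str, str]    # (node type, standard name); hypothetical encoding
Edge = Tuple[Node, Node]


@dataclass
class CaseRecord:
    """One preprocessed medical record (field names are assumptions)."""
    case_id: str
    symptoms: List[str]
    diseases: List[str]
    syndrome_types: List[str]
    syndromes: List[str]
    # Each entry: (syndrome, treatment rule, prescription, [Chinese medicines])
    treatments: List[Tuple[str, str, str, List[str]]] = field(default_factory=list)


def build_case_graph(record: CaseRecord) -> Set[Edge]:
    """Model one record as a set of typed edges."""
    edges: Set[Edge] = set()
    targets = ([("disease", d) for d in record.diseases]
               + [("syndrome_type", t) for t in record.syndrome_types]
               + [("syndrome", s) for s in record.syndromes])
    # Every symptom is linked to every disease, syndrome type and syndrome.
    for symptom, target in product(record.symptoms, targets):
        edges.add((("symptom", symptom), target))
    # Syndrome -> treatment rule -> prescription -> Chinese medicine.
    for syndrome, rule, prescription, medicines in record.treatments:
        edges.add((("syndrome", syndrome), ("rule", rule)))
        edges.add((("rule", rule), ("prescription", prescription)))
        for medicine in medicines:
            edges.add((("prescription", prescription), ("medicine", medicine)))
    return edges


# Illustrative record only; the clinical content is made up for the example.
record = CaseRecord(
    case_id="case-001",
    symptoms=["tongue coating: thick", "pulse condition: wiry"],
    diseases=["stomachache"],
    syndrome_types=[],
    syndromes=["liver-stomach disharmony"],
    treatments=[("liver-stomach disharmony", "soothe the liver and harmonize the stomach",
                 "Chaihu Shugan San", ["Chaihu", "Baishao"])],
)
graph = build_case_graph(record)
```

Storing a graph as a plain set of edges keeps the sketch dependency-free; a graph library could be substituted without changing the modeling idea.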

The syndrome differentiation and treatment mode mining module uses an extended frequent subgraph mining algorithm to mine the graph-structured TCM medical records, finds the key symptom characteristics for disease diagnosis, the treatment rules and methods adopted, the applicable prescriptions and the core Chinese medicine composition, and forms TCM syndrome differentiation and treatment modes for different diseases, syndrome types and syndromes. The module is implemented by the following steps:

1) According to the user's mining target, select from the output of the data preprocessing module the set Z of TCM medical records that meet the user's mining conditions. The mining conditions include the doctor name and the name of the disease, syndrome type or syndrome to be mined. The disease name, syndrome type name and syndrome name are the mandatory mining conditions of the disease, syndrome type and syndrome modes respectively; the doctor name is an optional mining condition. The user can either specify a doctor and mine that doctor's syndrome differentiation and treatment mode for a given disease, syndrome type or syndrome, or leave the doctor unspecified and mine the mode for that disease, syndrome type or syndrome across all medical records.

2) Traversing all medical records in the medical record set Z, and counting the occurrence frequency of each symptom, treatment rule, prescription and traditional Chinese medicine composition.

Here, occurrence frequency = number of occurrences / total number of medical records in Z.

3) Set the minimum support parameters according to node type.

3.1) Disease, syndrome type and syndrome nodes are tied to the user's mining target: they specify which disease, syndrome type or syndrome the mined mode addresses. Each is the mandatory node of its respective mode, so no minimum support parameter needs to be set for these three node types.

3.2) For symptom nodes, obtain the symptom attribute to which each symptom node belongs, and then set a separate minimum support parameter for each symptom attribute according to the occurrence frequency of each symptom value of that attribute. The setting steps are as follows:

3.2.1) sorting the symptom values of the current symptom attribute in descending order of occurrence frequency to obtain a symptom value sequence L, where list(L) is the length of the sequence L;

3.2.2) selecting each element L(i) of L in turn, starting from the first element, where 1 ≤ i ≤ list(L), and accumulating the occurrence frequencies corresponding to L(i) until the accumulated frequency exceeds a preset threshold;

3.2.3) taking the occurrence frequency of the element L(i) most recently taken from L as the minimum support parameter of the current symptom attribute.

3.3) For the treatment rule, prescription and Chinese medicine composition nodes, set the minimum support parameter of each of the three node types according to actual application requirements.

4) Using the medical record IDs, select from the output of the data modeling module the graph data set G(Z) corresponding to the medical record set Z, and then mine G(Z) with the extended frequent subgraph mining algorithm. The resulting frequent subgraph is a TCM syndrome differentiation and treatment mode composed of key elements such as symptoms, syndrome types, syndromes, diseases, treatment rules, prescriptions and Chinese medicines, and it reflects the internal logical associations among them. The mining steps are as follows:

4.1) selecting a corresponding minimum support parameter as a filtering condition according to the node type, mining on G (Z) by adopting a frequent subgraph mining algorithm to obtain frequent nodes and frequent edges, and sequencing the frequent edges according to the sequence of descending frequency and ascending DFS (depth-first search) code values;

4.2) sequentially acquiring edges in the frequent edge set, calculating DFS codes, and constructing frequent subtrees by using the frequent edges;

4.3) for each frequent subtree, finding out all the inner edges which can be connected with the frequent subtree from the frequent edge set, and adding the inner edges into the subtree to form a frequent subgraph, namely, a traditional Chinese medicine dialectical treatment mode corresponding to the mining condition.
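Steps 2) and 4) can be illustrated with a deliberately simplified sketch. It is not the extended frequent subgraph algorithm of the invention: the DFS-code ordering and subtree growth of steps 4.1) to 4.3) are replaced here by a plain reachability search over edges that are individually frequent. The node encoding, the `min_support` callback, and the rule that an edge must meet the stricter of its two endpoint thresholds are assumptions made for the example.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Set, Tuple

Node = Tuple[str, str]    # (node type, standard name); hypothetical encoding
Edge = Tuple[Node, Node]  # undirected edge stored in one fixed orientation


def mine_pattern(graphs: List[Set[Edge]],
                 min_support: Callable[[Node], float],
                 target: Node) -> Set[Edge]:
    """Return the individually frequent edges reachable from the target node."""
    total = len(graphs)
    if total == 0:
        return set()

    # Occurrence frequency = number of cases containing the item / total cases.
    node_count: Counter = Counter()
    edge_count: Counter = Counter()
    for graph in graphs:
        node_count.update({node for edge in graph for node in edge})
        edge_count.update(graph)

    def frequent(node: Node) -> bool:
        return node_count[node] / total >= min_support(node)

    # Keep an edge only if both endpoints pass their own (type-specific)
    # threshold and the edge itself meets the stricter of the two (assumption).
    frequent_edges = {
        (u, v) for (u, v), count in edge_count.items()
        if frequent(u) and frequent(v)
        and count / total >= max(min_support(u), min_support(v))
    }

    # Collect everything connected to the target disease / syndrome node.
    adjacency: Dict[Node, Set[Node]] = defaultdict(set)
    for u, v in frequent_edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    reached = {target}
    stack = [target]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in reached:
                reached.add(neighbour)
                stack.append(neighbour)
    return {(u, v) for u, v in frequent_edges if u in reached and v in reached}
```

Passing the threshold as a callback lets the caller return 0 for the mandatory target node types and a per-attribute value for symptom nodes, which mirrors the per-type settings of step 3).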

Compared with the prior art, the invention has the advantages that:

(1) modeling the traditional Chinese medical record into a graph structure, extracting key diagnosis and treatment information in the traditional Chinese medical record based on the thought of frequent subgraph mining, forming a graphical traditional Chinese medicine dialectical treatment mode, and showing traditional Chinese medicine dialectical thinking and a method in a more visualized mode.

(2) The mining requirements of three common TCM syndrome differentiation and treatment modes can be met: modes for diseases, modes for syndrome types and modes for syndromes. Mining can be oriented to famous physicians in general or targeted at an individual famous physician.

(3) The traditional frequent subgraph mining algorithm is extended so that different minimum support parameters are set according to node type, which improves the flexibility of mode mining and meets a variety of user mining requirements. In particular, for symptom nodes a separate minimum support parameter is set for each symptom attribute, which effectively solves the problem of missing key symptoms and guarantees the completeness of the syndrome differentiation and treatment mode.
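The per-attribute minimum support of steps 3.2.1) to 3.2.3) can be sketched as below. The function name, the dictionary input and the behaviour when the threshold is never exceeded (fall back to the last, smallest frequency) are assumptions; the text does not specify that edge case.

```python
from typing import Dict


def min_support_for_attribute(value_freq: Dict[str, float], threshold: float) -> float:
    """Minimum support for one symptom attribute.

    value_freq maps each symptom value of the attribute to its occurrence
    frequency. Frequencies are accumulated in descending order until the
    running sum exceeds `threshold`; the frequency of the last value taken
    is returned.
    """
    cumulative = 0.0
    last_taken = 0.0  # returned unchanged if there are no values (assumption)
    for freq in sorted(value_freq.values(), reverse=True):
        cumulative += freq
        last_taken = freq
        if cumulative > threshold:
            break
    return last_taken


# Illustrative frequencies for a hypothetical "tongue color" attribute.
tongue_color = {"dark red": 0.5, "pale": 0.3, "red": 0.15, "purple": 0.05}
support = min_support_for_attribute(tongue_color, threshold=0.7)
```

With these illustrative numbers the running sum passes 0.7 at the second value, so the attribute's minimum support is that value's frequency. An attribute whose values are spread thinly therefore gets a lower threshold than one dominated by a single value, which is how rare but diagnostic symptom values avoid being filtered out.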

Drawings

FIG. 1 is a system architecture diagram of the present invention;

FIG. 2 is a diagram illustrating the implementation of the dialectical treatment model mining module in the system of the present invention.

Detailed Description

The present invention will be described in detail with reference to specific examples.

As shown in FIG. 1, the system of the present invention comprises three functional modules of data preprocessing, data modeling and dialectical treatment mode mining.

The data preprocessing module is used for standardizing and segmenting the data information in the medical records. Standardization assigns standard names, according to the TCM term vocabulary, to the symptom names, disease names, syndrome type names, syndrome names, prescription names and Chinese medicine names appearing in the medical records. The TCM term vocabulary defines the standard names and aliases of common TCM terms, and the terms appearing in a medical record can be standardized against it; for example, several variant wordings of one symptom name can be unified into the single standard name "epigastric pain by pressing". Word segmentation is applied mainly to symptom information and splits a complex symptom description into fine-grained, minimal symptom description units; for example, the symptom "pulse condition: deep, wiry, thin and rapid" can be segmented into four independent symptoms: "pulse condition: deep", "pulse condition: wiry", "pulse condition: thin" and "pulse condition: rapid".
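A minimal sketch of the two preprocessing operations follows. The alias entries are hypothetical placeholders rather than entries from an actual TCM term vocabulary, and the segmentation assumes the description has already been brought into an `attribute: value, value, ...` form; real free-text records would need a proper word segmenter.

```python
from typing import Dict, List, Tuple

# Hypothetical alias table: alias -> standard name (illustrative entries only).
ALIASES: Dict[str, str] = {
    "stomach pain on pressure": "epigastric pain by pressing",
    "epigastric tenderness": "epigastric pain by pressing",
}


def standardize(term: str, aliases: Dict[str, str] = ALIASES) -> str:
    """Map a term to its standard name; unknown terms are returned cleaned up."""
    key = term.strip().lower()
    return aliases.get(key, key)


def segment_symptom(description: str) -> List[Tuple[str, str]]:
    """Split 'attribute: v1, v2, ...' into minimal (attribute, value) units."""
    attribute, _, values = description.partition(":")
    return [(attribute.strip(), value.strip())
            for value in values.split(",") if value.strip()]


units = segment_symptom("pulse condition: deep, wiry, thin, rapid")
```

Each resulting `(attribute, value)` pair corresponds to one symptom node, and the attribute part is what the per-attribute minimum support is later keyed on.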

The data modeling module is used for modeling the TCM medical record data processed by the data preprocessing module as a graph data structure. The nodes in the graph represent the symptoms, diseases, syndrome types, syndromes, treatment rules, prescriptions and Chinese medicine compositions in the medical records. An edge between a symptom and a disease, syndrome type or syndrome indicates that the symptom is a manifestation of that disease, syndrome type or syndrome. If a record contains several diseases, syndrome types or syndromes, each symptom is connected to each of them. Syndromes and treatment rules, and treatment rules and prescriptions, are likewise connected by edges, indicating the treatment rule that corresponds to a syndrome and the prescriptions available under that rule. An edge between a prescription and a Chinese medicine indicates that the Chinese medicine is an ingredient of the prescription. If a record contains several syndromes, treatment rules and prescriptions, each syndrome is connected to its treatment rules, each treatment rule to its prescriptions, and each prescription to its Chinese medicines.

The syndrome differentiation and treatment mode mining module uses an extended frequent subgraph mining algorithm to mine the graph-structured TCM medical records, finds the key symptom characteristics for disease diagnosis, the treatment rules and methods adopted, the applicable prescriptions and the core Chinese medicine composition, and forms TCM syndrome differentiation and treatment modes for different diseases, syndrome types and syndromes. The implementation process of the module is shown in FIG. 2. First, the medical record set is selected according to the mining conditions input by the user. The mining conditions include: 1) the "doctor name", which specifies whose medical records are mined; 2) the "disease name", "syndrome type name" and "syndrome name", which specify the disease, syndrome type or syndrome to be mined. Then, according to the user's actual mining requirements, a mining parameter, namely the minimum support parameter, is set for each type of node in the medical record graphs. The node types for which a minimum support parameter can be set independently are treatment rule nodes, prescription nodes, Chinese medicine composition nodes and symptom nodes. The minimum support parameter of symptom nodes can be set in two ways: 1) unified minimum support, in which a single minimum support parameter is used for all symptom nodes; 2) multiple minimum supports, in which a dedicated minimum support parameter is set for each symptom attribute according to its value range and data distribution. Finally, frequent subgraphs are mined from the graph set, using the minimum support parameters of the different node types as frequency filters, to obtain a TCM syndrome differentiation and treatment mode comprising the key symptom characteristics, the commonly used treatment rules and methods, the applicable prescriptions and the core Chinese medicines.
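The choice between a unified and a per-attribute minimum support for symptom nodes can be expressed as a small lookup. This is a sketch under assumed names: the node type strings, the parameter names and the rule that the three target node types get a threshold of 0 (because the text states they need no minimum support) are illustrative choices.

```python
from typing import Dict, Optional

# Node types that identify the mining target and therefore carry no threshold.
TARGET_TYPES = {"disease", "syndrome_type", "syndrome"}


def resolve_min_support(node_type: str,
                        type_support: Dict[str, float],
                        symptom_attribute: Optional[str] = None,
                        attribute_support: Optional[Dict[str, float]] = None) -> float:
    """Pick the minimum support that applies to one node.

    type_support holds one threshold per node type (the unified mode for
    symptoms). If attribute_support is given and contains the symptom's
    attribute, that dedicated threshold takes precedence (the multiple mode).
    """
    if node_type in TARGET_TYPES:
        return 0.0
    if (node_type == "symptom" and attribute_support is not None
            and symptom_attribute in attribute_support):
        return attribute_support[symptom_attribute]
    return type_support[node_type]


# Illustrative thresholds only.
by_type = {"symptom": 0.5, "rule": 0.4, "prescription": 0.3, "medicine": 0.6}
by_attribute = {"pulse condition": 0.2}
unified = resolve_min_support("symptom", by_type, "pulse condition")
dedicated = resolve_min_support("symptom", by_type, "pulse condition", by_attribute)
```

Falling back to the type-level threshold when an attribute has no dedicated entry lets the two modes be mixed, for instance while per-attribute values are still being tuned.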

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention and are not limiting. Although the present invention has been described in detail with reference to the embodiments, those skilled in the art will understand that various changes may be made and equivalents may be substituted without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A traditional Chinese medicine dialectical treatment mode mining method based on traditional Chinese medicine medical record mining comprises the following steps:

1) carrying out standardization and word segmentation processing operation on data information in the medical records; wherein the data information standardization processing refers to assigning standard names for symptom names, disease names, syndrome names, prescription names and traditional Chinese medicine names appearing in the medical records according to the Chinese medicine term vocabulary; the word segmentation processing is used for splitting the complex symptom description in the traditional Chinese medical record into fine-grained minimum symptom description units; the symptom is composed of a symptom attribute and a symptom value, wherein the symptom attribute represents an object described by the symptom, and the symptom value is a concrete expression of the symptom attribute;

2) modeling the TCM medical record data processed in step 1) as a graph data structure, wherein the nodes in the graph data structure represent the symptoms, diseases, syndrome types, syndromes, treatment rules, prescriptions and Chinese medicines in the TCM medical record, namely symptom nodes, disease nodes, syndrome type nodes, syndrome nodes, treatment rule nodes, prescription nodes and Chinese medicine composition nodes; the symptom nodes are connected by edges to the disease nodes, syndrome type nodes and syndrome nodes, indicating that the symptoms described by the symptom nodes are manifestations of the diseases, syndrome types and syndromes described by the connected nodes; the syndrome nodes and the treatment rule nodes are connected by edges, and the treatment rule nodes and the prescription nodes are connected by edges, indicating the treatment rule corresponding to the syndrome and the prescriptions available under the treatment rule; the prescription nodes and the Chinese medicine composition nodes are connected by edges, indicating that the Chinese medicine corresponding to a Chinese medicine composition node is an ingredient of the corresponding prescription;

3) mining the graph data structure according to the received mining conditions to find the key symptom characteristics for disease diagnosis, the treatment rules and methods adopted, the applicable prescriptions and their Chinese medicine composition, thereby forming a TCM syndrome differentiation and treatment mode that meets the mining conditions; wherein the mining conditions include the name of the disease, syndrome type or syndrome to be mined.

2. The method of claim 1, wherein the dialectical treatment model of TCM conforming to mining conditions is formed by: firstly, selecting a traditional Chinese medical record set Z meeting mining conditions from output results of a data preprocessing module; then traversing all medical records in the medical record set Z, and counting the occurrence frequency of each symptom, treatment rule, prescription and traditional Chinese medicine composition; then, for a symptom node, acquiring a symptom attribute to which the symptom node belongs, and setting a minimum support parameter of the symptom attribute to which the symptom node belongs according to the occurrence frequency of each symptom value corresponding to the symptom attribute; respectively setting corresponding minimum support degree parameters for the therapeutic rule node, the prescription node and the Chinese medicine composition node according to application requirements; then, according to the traditional Chinese medical record ID, a graph data set G (Z) corresponding to the medical record set Z is screened out from an output result of the data modeling module, then mining is carried out on the G (Z) by adopting a frequent subgraph mining algorithm, and the obtained frequent subgraph is a traditional Chinese medical syndrome differentiation and treatment mode which accords with mining conditions; the traditional Chinese medicine syndrome differentiation treatment mode comprises symptoms, syndrome types, syndromes, diseases, treatment rules, a prescription and traditional Chinese medicines and can reflect internal logic correlation.

3. The method of claim 2, wherein the graph data structure is mined by: firstly, selecting a corresponding minimum support parameter as a filtering condition according to the node type, and mining on G (Z) by adopting a frequent subgraph mining algorithm to obtain frequent nodes and frequent edges; then constructing a frequent subtree by using the frequent edge; and finally, expanding the frequent subtrees to form frequent subgraphs, namely mining the traditional Chinese medicine syndrome differentiation treatment mode corresponding to the conditions.

4. The method of claim 2, wherein the method for setting the minimum support parameter of the symptom attribute to which the symptom node belongs comprises: firstly, obtaining symptom attributes to which the symptom nodes belong, and sorting the symptom values from large to small according to the occurrence frequency of each symptom value corresponding to the symptom attributes to obtain a symptom value sequence L; sequentially selecting each element in the symptom value sequence L from the first element of the symptom value sequence L, and accumulating and summing the occurrence frequencies corresponding to the selected elements until the accumulation frequency exceeds a preset threshold value; and then, taking the appearance frequency corresponding to the element extracted from the symptom value sequence L for the last time as the minimum support degree parameter of the symptom attribute to which the symptom node belongs.

5. The method of claim 1, wherein the mining condition further comprises a doctor name.

6. A traditional Chinese medicine dialectical treatment mode mining system based on traditional Chinese medicine case mining is characterized by comprising a data preprocessing module, a data modeling module and a dialectical treatment mode mining module; wherein

The data preprocessing module is used for carrying out standardization and word segmentation processing operation on data information in the medical records; wherein the data information standardization processing refers to assigning standard names for symptom names, disease names, syndrome names, prescription names and traditional Chinese medicine names appearing in the medical records according to the Chinese medicine term vocabulary; the word segmentation processing is used for splitting the complex symptom description in the traditional Chinese medical record into fine-grained minimum symptom description units; the symptom is composed of a symptom attribute and a symptom value, wherein the symptom attribute represents an object described by the symptom, and the symptom value is a concrete expression of the symptom attribute;

the data modeling module is used for modeling the TCM medical record data processed by the data preprocessing module as a graph data structure, wherein the nodes in the graph data structure represent the symptoms, diseases, syndrome types, syndromes, treatment rules, prescriptions and Chinese medicines in the TCM medical record, namely symptom nodes, disease nodes, syndrome type nodes, syndrome nodes, treatment rule nodes, prescription nodes and Chinese medicine composition nodes; the symptom nodes are connected by edges to the disease nodes, syndrome type nodes and syndrome nodes, indicating that the symptoms described by the symptom nodes are manifestations of the diseases, syndrome types and syndromes described by the connected nodes; the syndrome nodes and the treatment rule nodes are connected by edges, and the treatment rule nodes and the prescription nodes are connected by edges, indicating the treatment rule corresponding to the syndrome and the prescriptions available under the treatment rule; the prescription nodes and the Chinese medicine composition nodes are connected by edges, indicating that the Chinese medicine corresponding to a Chinese medicine composition node is an ingredient of the corresponding prescription;

the syndrome differentiation and treatment mode mining module is used for mining the graph data structure according to the received mining conditions to find the key symptom characteristics for disease diagnosis, the treatment rules and methods adopted, the applicable prescriptions and their Chinese medicine composition, thereby forming a TCM syndrome differentiation and treatment mode that meets the mining conditions; wherein the mining conditions include the name of the disease, syndrome type or syndrome to be mined.

7. The system of claim 6, wherein the dialectical treatment mode mining module first selects a traditional Chinese medical case set Z meeting mining conditions from the output results of the data preprocessing module; then traversing all medical records in the medical record set Z, and counting the occurrence frequency of each symptom, treatment rule, prescription and traditional Chinese medicine composition; then obtaining the symptom attribute of the symptom node, and setting the minimum support parameter of the symptom node according to the occurrence frequency of each symptom value corresponding to the symptom attribute; respectively setting corresponding minimum support degree parameters for the therapeutic rule node, the prescription node and the Chinese medicine composition node according to application requirements; then, according to the traditional Chinese medical record ID, a graph data set G (Z) corresponding to the medical record set Z is screened out from an output result of the data modeling module, then mining is carried out on the G (Z) by adopting a frequent subgraph mining algorithm, and the obtained frequent subgraph is a traditional Chinese medical syndrome differentiation and treatment mode which accords with mining conditions; the traditional Chinese medicine syndrome differentiation treatment mode comprises symptoms, syndrome types, syndromes, diseases, treatment rules, a prescription and traditional Chinese medicines and can reflect internal logic correlation.

8. The system of claim 7, wherein the graph data structure is mined by: firstly, selecting a corresponding minimum support parameter as a filtering condition according to the node type, and mining on G (Z) by adopting a frequent subgraph mining algorithm to obtain frequent nodes and frequent edges; then constructing a frequent subtree by using the frequent edge; and finally, expanding the frequent subtrees to form frequent subgraphs, namely mining the traditional Chinese medicine syndrome differentiation treatment mode corresponding to the conditions.

9. The system of claim 7, wherein the method for setting the minimum support parameter for the symptom attribute to which the symptom node belongs comprises: firstly, obtaining symptom attributes to which the symptom nodes belong, and sorting the symptom values from large to small according to the occurrence frequency of each symptom value corresponding to the symptom attributes to obtain a symptom value sequence L; sequentially selecting each element in the symptom value sequence L from the first element of the symptom value sequence L, and accumulating and summing the occurrence frequencies corresponding to the selected elements until the accumulation frequency exceeds a preset threshold value; and then, taking the appearance frequency corresponding to the element extracted from the symptom value sequence L for the last time as the minimum support degree parameter of the symptom attribute to which the symptom node belongs.

10. The system of claim 6, wherein the mining condition further comprises a doctor name.