CN111681717B - Traditional Chinese medicine quality control and natural matter efficacy prediction model based on artificial intelligence and construction and use methods thereof - Google Patents


Info

Publication number
CN111681717B
Authority
CN
China
Prior art keywords
traditional chinese
chinese medicine
gene
model
chinese medicines
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010467222.XA
Other languages
Chinese (zh)
Other versions
CN111681717A (en)
Inventor
熊江辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yingu Aromatic Technology Co ltd
Original Assignee
Beijing Yingu Aromatic Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yingu Aromatic Technology Co ltd filed Critical Beijing Yingu Aromatic Technology Co ltd
Priority to CN202010467222.XA priority Critical patent/CN111681717B/en
Publication of CN111681717A publication Critical patent/CN111681717A/en
Application granted granted Critical
Publication of CN111681717B publication Critical patent/CN111681717B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/50 Molecular design, e.g. of drugs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures

Abstract

The invention discloses an artificial-intelligence-based model for traditional Chinese medicine quality control and natural substance efficacy prediction, together with methods for constructing and using it. The method comprises the following steps: 1) for traditional Chinese medicines with known nature, flavor and meridian tropism efficacy, obtaining the chemical component set of each traditional Chinese medicine; 2) according to the chemical component set of each traditional Chinese medicine, obtaining the interaction genes corresponding to each chemical component to form the interaction gene set of the traditional Chinese medicine; 3) for the interaction gene set of each traditional Chinese medicine, calculating the targeting coefficient of the traditional Chinese medicine for the functional gene modules; 4) based on the targeting coefficients of the traditional Chinese medicines for the functional gene modules, establishing a model for evaluating the nature, flavor and meridian tropism of traditional Chinese medicines and natural substances, for example a model based on a sparse autoencoder. The artificial-intelligence-based method can quantify the difference between a traditional Chinese medicine sample and its theoretical efficacy, thereby serving quality control, and can also predict and evaluate the efficacy of new natural product extracts.

Description

Traditional Chinese medicine quality control and natural matter efficacy prediction model based on artificial intelligence and construction and use methods thereof
Technical Field
The invention relates to the field of evaluation of natural substances, in particular traditional Chinese medicines, and more particularly to a method for accurately evaluating and predicting the nature, flavor and meridian tropism efficacy of natural substances, in particular traditional Chinese medicines, by artificial intelligence.
Background
Chinese medicinal materials and Chinese medicinal compound formulas are complex systems of natural compounds. Over thousands of years of use, Chinese medicinal materials have shown relatively stable efficacy and usage; the material basis for this is that the main chemical components of a medicinal material and their relative proportions are relatively fixed. In the nature, flavor and meridian tropism of Chinese medicines, "nature" generally refers to the five natures of cold, hot, warm, cool and neutral; "flavor" refers to the five flavors of pungent, sweet, sour, bitter and salty; and "meridian tropism" is an efficacy label describing the selective action of a drug on the body, such as "entering the lung meridian", "entering the stomach meridian" or "entering the gallbladder meridian". Even setting aside the indirect correspondence between the organ names in these labels and human anatomy, the meridian tropism attribute reflects the selectivity with which a drug acts on the human body. Besides an implicit correspondence with tissues and organs, this selectivity may correspond to a specific biological process of the human body or to a specific module of a gene network. In short, the nature, flavor and meridian tropism labels reflect the relatively stable characteristics of traditional Chinese medicinal materials and are the basis of their efficacy. The material basis of the efficacy of traditional Chinese medicine refers to the chemical components that exert the pharmacological effect in traditional Chinese medicines and their compound preparations; it underpins the safety, effectiveness and stable, controllable quality of traditional Chinese medicines and their products.
However, owing to the complexity of traditional Chinese medicine components and the limitations of current research approaches and methods, the quality-quantity-efficacy relationship of the complex systems formed by traditional Chinese medicines and their compound formulas remains unclear. As a result, quality control of traditional Chinese medicine is difficult to relate to efficacy, quantification is imprecise, and quality is hard to control and assess. This has become a technical bottleneck restricting the modernization, industrialization and internationalization of traditional Chinese medicine.
Existing technical routes for traditional Chinese medicine quality control fall into three main types:
The first is sensory evaluation, also called morphological (shape) evaluation, which judges the quality of a traditional Chinese medicine mainly from the appearance of the medicinal material, such as shape, size, color, smell, surface characteristics and texture. However, this method lacks definite quantitative features and cannot accurately distinguish medicinal materials with small quality differences;
The second is chemical evaluation based on traditional Chinese medicine fingerprints. Because the factors that alter the chemical components of traditional Chinese medicines are complex, and the correlation between current fingerprints and pharmacological action is unclear, the pharmacological action represented by individual peaks cannot be resolved from the fingerprint. At present, a fingerprint reflects chemical information about some of the components of a traditional Chinese medicine rather than efficacy information, and is far removed from clinical efficacy, so a new technology is needed to bridge the gap between the chemical profile and efficacy;
The third is biological evaluation, which can qualitatively and quantitatively characterize and evaluate the authenticity, intrinsic quality and toxic side effects of a traditional Chinese medicine, overcoming the limitation of judging quality from appearance alone. Its drawbacks are that the test indices correlate poorly with the curative effect of the traditional Chinese medicine, and that the reproducibility and stability of the method are insufficient.
Thus, there remains a need in the art for a method that enables accurate predictions of the quality and efficacy of natural substances, particularly traditional Chinese medicines.
Disclosure of Invention
The invention aims to provide a novel method for precisely quantifying and predicting the quality and efficacy of natural substances, particularly traditional Chinese medicines.
In a first aspect, the present invention provides a method of establishing a model for evaluating the nature, flavor and meridian tropism of a natural substance, the method comprising:
1) For traditional Chinese medicines with known nature, flavor and meridian tropism efficacy, which include traditional Chinese medicines known to have a given nature, flavor and meridian tropism efficacy, obtaining the chemical component set of each traditional Chinese medicine;
2) According to the chemical component set of each traditional Chinese medicine, obtaining the interaction genes corresponding to each chemical component to form the interaction gene set of the traditional Chinese medicine;
3) For the interaction gene set of each traditional Chinese medicine, calculating the targeting coefficient of the traditional Chinese medicine for the functional gene module;
4) Based on the targeting coefficients of the traditional Chinese medicines for the functional gene module, establishing a model for evaluating the nature, flavor and meridian tropism of natural substances.
In one embodiment, the traditional Chinese medicines further comprise traditional Chinese medicines known not to have the given nature, flavor and meridian tropism efficacy.
In one embodiment, in 4), a univariate model is built with the targeting coefficients of multiple traditional Chinese medicines to one gene module.
In one embodiment, the univariate model is based on a median method or based on an ROC curve method.
In one embodiment, in 4), the targeting coefficients of a plurality of traditional Chinese medicines for a plurality of gene modules are used as the feature vector of each traditional Chinese medicine, and the correspondence between the feature vectors and the efficacy labels is used to train an artificial neural network model, such as a sparse autoencoder model.
In a second aspect, the present invention provides a model for evaluating the nature, flavor and meridian tropism of a natural substance, established by the method of the first aspect of the invention.
In a third aspect, the present invention provides a method of evaluating the nature, flavor and meridian tropism of a natural substance, the method comprising:
1) For a natural substance to be tested, obtaining the chemical component set of the natural substance to be tested;
2) According to the chemical component set of the natural substance to be tested, obtaining the interaction genes corresponding to each chemical component to form the interaction gene set of the natural substance to be tested;
3) For the interaction gene set of the natural substance to be tested, calculating the targeting coefficient of the natural substance to be tested for the functional gene module;
4) Based on the targeting coefficient of the natural substance to be tested for the functional gene module, predicting whether the natural substance to be tested has the nature, flavor and meridian tropism efficacy by using the model of the second aspect of the invention.
In one embodiment, the functional gene module comprises a plurality of functional gene modules.
In one embodiment, the nature, flavor and meridian tropism efficacy is selected from cold, hot, warm, cool, neutral, pungent, sweet, sour, bitter, salty, gallbladder meridian tropism, heart meridian tropism, liver meridian tropism, spleen meridian tropism, lung meridian tropism, kidney meridian tropism, stomach meridian tropism, large intestine meridian tropism, small intestine meridian tropism, bladder meridian tropism, pericardium meridian tropism and triple energizer (Sanjiao) meridian tropism.
In one embodiment, the set of chemical components comprises at least 10 chemical components, preferably at least 15 chemical components.
In one embodiment, the functional gene module is from a database of functional gene modules.
In one embodiment, the targeting coefficient is calculated as follows: $X_{ij} = K_{ij}/N_i$, where $K_{ij}$ is the number of genes in the intersection of the interaction gene set of the natural substance to be tested or traditional Chinese medicine i with functional gene module j, and $N_i$ is the number of interaction genes of the natural substance to be tested or traditional Chinese medicine i.
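The formula above can be illustrated with a minimal Python sketch. The function name `targeting_coefficient` and the gene symbols in the toy data are assumptions chosen for illustration only; they are not data or code from the patent.

```python
def targeting_coefficient(interaction_genes, module_genes):
    """Return X_ij = K_ij / N_i for one substance i and one gene module j."""
    interaction_genes = set(interaction_genes)
    n_i = len(interaction_genes)  # N_i: number of interaction genes of substance i
    if n_i == 0:
        raise ValueError("interaction gene set is empty; X_ij is undefined")
    # K_ij: genes shared by the interaction gene set and the module
    k_ij = len(interaction_genes & set(module_genes))
    return k_ij / n_i


# Toy data (illustrative gene symbols, not taken from the patent)
herb_genes = {"TP53", "IL6", "TNF", "AKT1", "EGFR"}
module_genes = {"IL6", "TNF", "NFKB1"}
x = targeting_coefficient(herb_genes, module_genes)  # 2 shared genes out of 5
```

Because the denominator is the size of the substance's own interaction gene set, the coefficient always lies between 0 and 1.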
The advantages of the invention are as follows: through the mapping between compounds and the human genes and proteins they target, the chemical component set is projected onto the dimensional space of human functional gene modules, and an artificial intelligence deep learning algorithm is then used to establish the complex relationship between the gene modules and the nature, flavor and meridian tropism attributes of natural substances, especially traditional Chinese medicines.
Drawings
FIG. 1 shows the ROC curve of a univariate predictive model constructed for "gallbladder meridian tropism" traditional Chinese medicines according to one embodiment of the invention, with an area under the curve (AUC) of 0.78.
FIG. 2 shows the neural network architecture of a sparse autoencoder model constructed for "gallbladder meridian tropism" traditional Chinese medicines according to one embodiment of the invention.
FIG. 3 shows the ROC curve of a sparse autoencoder model constructed for "gallbladder meridian tropism" traditional Chinese medicines according to one embodiment of the invention, with an AUC of 1.
FIG. 4 shows the ROC curve of a sparse autoencoder model constructed for "cold" traditional Chinese medicines according to one embodiment of the invention, with an AUC of 0.99.
FIG. 5 shows the input variable weight values of a first hidden layer neuron (upper panel), a second hidden layer neuron (middle panel) and a third hidden layer neuron (lower panel) in a stacked sparse autoencoder model constructed for "cold" traditional Chinese medicines according to one embodiment of the invention.
FIG. 6 shows the ROC curve of a sparse autoencoder model constructed for "pungent" traditional Chinese medicines according to one embodiment of the invention, with an AUC of 0.99.
FIG. 7 shows the input variable weight values of a first hidden layer neuron (upper panel), a second hidden layer neuron (middle panel) and a third hidden layer neuron (lower panel) in a stacked sparse autoencoder model constructed for "pungent" traditional Chinese medicines according to one embodiment of the invention.
Detailed Description
The present invention will be described in detail below. It is to be understood that the following description is intended to illustrate the invention by way of example only, and is not intended to limit the scope of the invention as defined by the appended claims. And, it is understood by those skilled in the art that modifications may be made to the technical scheme of the present invention without departing from the spirit and gist of the present invention. The technical means used in the examples are conventional means well known to those skilled in the art unless otherwise indicated.
The present invention provides a new method for controlling the quality of natural substances, particularly traditional Chinese medicines, namely an artificial intelligence method. The method starts from the chemical component set of a traditional Chinese medicine and, through the mapping between compounds and the human genes and proteins they target, projects the chemical component set onto the dimensional space of human functional gene modules. An artificial intelligence deep learning algorithm is then used to establish the complex relationship between the gene modules and the nature, flavor and meridian tropism attributes of natural substances, particularly traditional Chinese medicines. The method of the invention comprises two technical points. First, the proportion of a given traditional Chinese medicine's interaction genes that fall within a given functional gene set is calculated, namely the targeting rate (targeting coefficient). Second, a deep learning autoencoder algorithm is used to perform dimensionality reduction, feature extraction and classification prediction on the high-dimensional space of gene modules.
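The text names a deep learning autoencoder but does not give its objective function at this point. The following NumPy sketch shows one commonly used sparse autoencoder objective. Everything in it is an assumption made for illustration: sigmoid units, a squared reconstruction error, a KL-divergence sparsity penalty with target activation `rho` and weight `beta`, and random toy inputs standing in for targeting coefficients. It evaluates the loss for one layer and does not perform training.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sparse_autoencoder_loss(X, W1, b1, W2, b2, rho=0.05, beta=3.0):
    """Reconstruction error plus a KL sparsity penalty on mean hidden activations."""
    H = sigmoid(X @ W1 + b1)        # encoder output, shape (n_samples, n_hidden)
    X_hat = sigmoid(H @ W2 + b2)    # decoder output, shape (n_samples, n_features)
    mse = np.mean(np.sum((X - X_hat) ** 2, axis=1)) / 2.0
    # Mean activation of each hidden unit, clipped to keep the logarithms finite
    rho_hat = np.clip(H.mean(axis=0), 1e-8, 1 - 1e-8)
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return mse + beta * kl, H


# Toy input: 6 hypothetical medicines x 8 hypothetical gene modules
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 0.2, size=(6, 8))
W1 = rng.normal(0, 0.1, size=(8, 3))
b1 = np.zeros(3)
W2 = rng.normal(0, 0.1, size=(3, 8))
b2 = np.zeros(8)
loss, H = sparse_autoencoder_loss(X, W1, b1, W2, b2)
```

The hidden activations `H` are the reduced-dimension features; in a stacked design they would feed the next layer and, finally, a classifier for the efficacy label.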
In the present invention, the nature, flavor and meridian tropism efficacy comes from the traditional Chinese medicine system, for example from the classics or textbooks of traditional Chinese medicine theory. For example, it may be selected from cold, hot, warm, cool, neutral, pungent, sweet, sour, bitter, salty, gallbladder meridian, heart meridian, liver meridian, spleen meridian, lung meridian, kidney meridian, stomach meridian, large intestine meridian, small intestine meridian, bladder meridian, pericardium meridian and triple energizer (Sanjiao) meridian.
In the invention, natural substances are multi-component substances with potential use as traditional Chinese medicines; they may be obtained from nature, or be synthetic or semisynthetic. For example, the natural substance of the present invention may be a newly developed natural product or a plant extract.
In the invention, the traditional Chinese medicines with known nature, flavor and meridian tropism efficacy comprise a plurality of traditional Chinese medicines, so that the traditional Chinese medicines that best represent a given efficacy can be selected. Newly developed natural substances of unknown efficacy can then be predicted on this basis, in particular whether they have the given nature, flavor and meridian tropism efficacy, which improves prediction accuracy. Here, the number of traditional Chinese medicines is at least 1, at least 10, at least 50, preferably at least 100, more preferably at least 150. The term "traditional Chinese medicine with known nature, flavor and meridian tropism efficacy" means that a person skilled in the art can determine the nature, flavor and meridian tropism efficacy of the traditional Chinese medicine by conventional means, for example whether it is cold or not cold, or whether or not it has the gallbladder meridian tropism efficacy. The "traditional Chinese medicines with known nature, flavor and meridian tropism efficacy" form a collection that can include traditional Chinese medicines known to have the efficacy (for example, cold in nature, or having a given meridian tropism) and, optionally, traditional Chinese medicines known not to have the efficacy (for example, not cold in nature, or not having that meridian tropism).
In the present invention, the chemical components of a natural substance include its main components or main active ingredients. The chemical components of a natural substance may be determined experimentally or obtained from a database, such as the natural substance (traditional Chinese medicine) chemical component database Traditional Chinese Medicine Integrated Database, abbreviated as TCMID.
In the invention, the functional gene module comprises a plurality of functional gene modules, so that the functional gene modules that best represent a given nature, flavor and meridian tropism efficacy can be selected. Whether a newly developed natural substance of unknown efficacy has that efficacy can then be predicted on this basis, which improves prediction accuracy. Here, the number of functional gene modules is at least 1, or at least 5, or at least 10, or at least 15, or at least 50, preferably at least 100, more preferably at least 150, and most preferably all functional gene modules in the database. Preferably, the targeting coefficients of the traditional Chinese medicines with known nature, flavor and meridian tropism efficacy (including those known to have the efficacy and, optionally, those known not to have it) are calculated for all functional gene modules, and the functional gene modules with large targeting coefficients are selected. A functional gene module is essentially a collection of genes with a specific function. Such collections come on the one hand from gene function annotations such as Gene Ontology and KEGG pathways, and on the other hand from gene sets associated with various phenotypes accumulated in human experiments. The main data source of the functional gene module database used herein is the Molecular Signatures Database, abbreviated as MSigDB, which already contains Gene Ontology, KEGG pathway and other gene function annotations as well as some experimentally derived gene sets. Gene subnetwork data from the STRING protein interaction database were additionally added.
Examples
Hereinafter, the present invention is described in more detail in connection with exemplary embodiments. However, the exemplary embodiments disclosed herein are for illustrative purposes only and should not be construed as limiting the scope of the invention.
Data source
Nature, flavor and meridian tropism data of traditional Chinese medicines: collected from traditional Chinese medicine textbooks, 310 traditional Chinese medicines in total.
List of chemical components of traditional Chinese medicines: from the natural substance (traditional Chinese medicine) chemical component database Traditional Chinese Medicine Integrated Database, abbreviated as TCMID (http://www.megabionet.org/tcmid/).
List of interaction genes: from the STITCH database (Chemical Association Networks, http://stitch.embl.de). The interaction genes (STITCH score >= 200) were obtained for the above list of chemical components, and the processing method was consistent with that used for the FDA-approved disease drugs described above.
Functional gene module data: from the Molecular Signatures Database, abbreviated as MSigDB (https://www.gsea-msigdb.org/gsea/msigdb/). The MSigDB database contains commonly used gene modules such as GO Biological Process, GO Cellular Component, GO Molecular Function and KEGG pathway. Gene subnetwork data from the STRING protein interaction database were additionally added (PPI data; for example, PPI:BATF2 represents all genes having protein interactions with BATF2).
GEO database: the gene expression database, in full Gene Expression Omnibus, created and maintained by the National Center for Biotechnology Information (NCBI).
173 traditional Chinese medicines used in the invention: 37. radix Salviae Miltiorrhizae, radix Aconiti lateralis Preparata, herba Murrayae, cortex Acanthopanacis, fructus Schisandrae chinensis, radix Ginseng, herba Agrimoniae, radix Polygoni Multiflori, herba Eupatorii, radix Codonopsis, fructus Anisi Stellati, cordyceps, semen Cassiae, radix et caulis Acanthopanacis Senticosi, radix Glehniae, rhizoma Cimicifugae, rhizoma Pinelliae, radix Aristolochiae Mollissimae, cortex Magnoliae officinalis, fructus Evodiae, fructus Camptothecae Acuminatae, cortex Illici, spica Prunellae, fructus Jujubae, bulbus Allii, herba Plantaginis, radix et rhizoma Rhei, radix Asparagi, rhizoma Arisaematis, radix Trichosanthis, rhizoma Gastrodiae, rhizoma Curcumae Longae, radix Clematidis, benzoin, Notopterygii rhizoma, fructus Crataegi, Corni fructus, radix Dipsaci, rhizoma Ligustici Chuanxiong, radix Morindae officinalis, fructus Crotonis, Zingiberis rhizoma, herba Agastaches, rhizoma Corydalis, radix Angelicae sinensis, radix Cynanchi Paniculati, rhizoma Pinelliae, thallus Laminariae, herba Equiseti hiemalis, radix Aucklandiae, semen Armeniacae amarum, eucommia ulmoides, isatis roots, immature bitter oranges, medlar, lemon, bupleurum, cassia twig, mulberry leaves, mulberry twigs, mulberry, betelnuts, camphorwood, camphor, olives, orange peels, leeches, sea buckthorn, myrrh, milfoil, uniflower swisscentaury roots, tobacco, achyranthes, burdock root, bezoar, radix Angelicae pubescentis, radix Euphorbiae Fischerianae, radix Scrophulariae, roses, mother of pearl, liquorice, sugarcane, ginger, gingko, white peony root, angelica dahurica, common perilla leaf, Japanese ampelopsis root, gleditsia sinensis, motherwort herb, pink herb, rhizoma Anemarrhenae, Japanese ardisia herb, grassleaf sweetflag rhizome, villous amomum fruit, ash bark, perilla leaf, safflower, asarum, Japanese dock root, notopterygium root, cinnamon, cistanche, nutmeg, cang Gonghua, mugwort leaf, aloe, pricklyash peel, perilla leaf, perilla stem, alfalfa, kuh-seng, apple, poria cocos, fennel, oriental wormwood, tarragon, schizonepeta, strawberry, medicinal dandelion, chrysanthemum, kudzu vine root, grape, fenugreek, pollen Typhae, fructus Viticis, mint, mushrooms, giant knotweed, pricklyash peel, honey, centipede, toad, shortbread, American ginseng, patrinia herb, red paeony root, magnolia flower, polygala tenuifolia, weeping forsythiae capsule, radix Curcumae, vinegar, willow fruit, honeysuckle, cynomorium songaria, radix Stephaniae, radix Sileris, donkey-hide gelatin, radix Aconiti lateralis Preparata, pericarpium Citri Tangerinae, herba Agastaches, herba Artemisiae Annuae, herba Moslae, fructus Aristolochiae, herba Houttuyniae, cornu Cervi Pantotrichum, moschus, fructus Hordei Germinatus, herba Ephedrae, cortex Phellodendri, Scutellariae radix, radix Astragali, herba Artemisiae Annuae and Coptidis rhizoma.
Example 1. Construction of univariate predictive models for "gallbladder meridian tropism" traditional Chinese medicines
In this example, 173 traditional Chinese medicines, each with at least 15 chemical components, were selected for modeling. The specific modeling steps are as follows.
First, the chemical components contained in each of the 173 traditional Chinese medicines (denoted i, where i is an integer from 1 to 173) are obtained from the TCMID database;
Next, the total number $N_i$ of interaction genes corresponding to the chemical components contained in traditional Chinese medicine i (i is an integer from 1 to 173) is obtained from the STITCH database;
Then, the targeting coefficient of each traditional Chinese medicine for each gene module, $X_{ij} = K_{ij}/N_i$, is obtained using the MSigDB database, where $K_{ij}$ is the number of genes in the intersection of the interaction gene set of traditional Chinese medicine i with gene module j;
Finally, based on the targeting coefficients $X_{ij}$, two simple univariate models are used in this example to predict "gallbladder meridian tropism" traditional Chinese medicines.
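The data-preparation steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the in-memory dictionaries stand in for TCMID, STITCH and MSigDB lookups, all herb, compound, gene and module names are hypothetical, and assigning 0.0 to a medicine with no interaction genes is a choice made here, not something the text specifies. The score cut-off of 200 follows the data source description.

```python
def build_interaction_gene_set(components, compound_gene_scores, min_score=200):
    """Union of genes interacting with any listed component at score >= min_score."""
    genes = set()
    for compound in components:
        for gene, score in compound_gene_scores.get(compound, {}).items():
            if score >= min_score:
                genes.add(gene)
    return genes


def targeting_matrix(herb_gene_sets, modules):
    """X[herb][module] = |genes(herb) & module| / |genes(herb)|."""
    X = {}
    for herb, genes in herb_gene_sets.items():
        n_i = len(genes)  # N_i
        # Assumption: a medicine with no interaction genes gets 0.0 everywhere
        X[herb] = {name: (len(genes & members) / n_i if n_i else 0.0)
                   for name, members in modules.items()}
    return X


# Hypothetical toy data standing in for database lookups
herb_components = {"herb_A": ["cmpd1", "cmpd2"], "herb_B": ["cmpd2", "cmpd3"]}
compound_gene_scores = {
    "cmpd1": {"G1": 900, "G2": 150},
    "cmpd2": {"G2": 400, "G3": 250},
    "cmpd3": {"G4": 199, "G5": 700},
}
modules = {"module_1": {"G1", "G2"}, "module_2": {"G3", "G5", "G6"}}

herb_gene_sets = {
    herb: build_interaction_gene_set(components, compound_gene_scores)
    for herb, components in herb_components.items()
}
X = targeting_matrix(herb_gene_sets, modules)
```

Each row of `X` is the feature vector of one medicine; a univariate model reads a single column of it.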
The first univariate model is the median method. It classifies traditional Chinese medicines with respect to the efficacy label "gallbladder meridian tropism" on the basis of a single gene module. For each gene module j, the median $T_j$ of the targeting coefficients $X_{*j}$ of all 173 traditional Chinese medicines for that module is calculated. This median serves as the threshold for judging whether traditional Chinese medicine i has the gallbladder meridian tropism efficacy: if the targeting coefficient $X_{ij}$ of traditional Chinese medicine i for gene module j is greater than or equal to the median $T_j$, then $Y_i = 1$ is predicted, i.e. the medicine has the "gallbladder meridian tropism" efficacy; otherwise $Y_i = 0$, i.e. it does not. The odds ratio (OR) of this median-based predictive model was calculated, and the results are shown in Table 1 below.
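A minimal Python sketch of the median rule for a single gene module is given below. The function name and the five toy coefficients are illustrative assumptions, not values from Table 1.

```python
import statistics


def median_rule_predictions(coefficients):
    """Predict Y_i = 1 when X_ij >= T_j, the median of X_*j over all medicines."""
    threshold = statistics.median(coefficients.values())  # T_j
    predictions = {herb: int(x >= threshold) for herb, x in coefficients.items()}
    return threshold, predictions


# Hypothetical targeting coefficients of five medicines for one gene module j
x_module_j = {
    "herb_A": 0.01,
    "herb_B": 0.05,
    "herb_C": 0.03,
    "herb_D": 0.08,
    "herb_E": 0.02,
}
threshold, y_pred = median_rule_predictions(x_module_j)
```

By construction, roughly half of the medicines are predicted positive for every module, so the rule needs no labels to set its threshold; labels are only used afterwards to score it.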
The odds ratio is one way to measure the accuracy of a predictive model; a larger odds ratio indicates more accurate prediction. It is calculated as follows. Let a positive drug be a drug with the gallbladder meridian tropism attribute. Let a be the number of drugs predicted positive and actually positive; b the number predicted positive and actually non-positive; c the number predicted non-positive and actually positive; and d the number predicted non-positive and actually non-positive. Then odds ratio = (odds of being actually positive among drugs predicted positive) / (odds of being actually positive among drugs predicted non-positive) = (a/b)/(c/d).
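The calculation can be sketched in Python as below. The function names and the toy label vectors are assumptions for illustration. The code uses the algebraically equivalent form a·d/(b·c), which equals (a/b)/(c/d) whenever all four counts are non-zero, and raises an error when b or c is zero, since the text does not say how that case is handled.

```python
def confusion_counts(y_true, y_pred):
    """Return (a, b, c, d) as defined in the text."""
    pairs = list(zip(y_true, y_pred))
    a = sum(1 for t, p in pairs if p == 1 and t == 1)  # predicted positive, actually positive
    b = sum(1 for t, p in pairs if p == 1 and t == 0)  # predicted positive, actually non-positive
    c = sum(1 for t, p in pairs if p == 0 and t == 1)  # predicted non-positive, actually positive
    d = sum(1 for t, p in pairs if p == 0 and t == 0)  # predicted non-positive, actually non-positive
    return a, b, c, d


def odds_ratio(y_true, y_pred):
    """Odds ratio a*d / (b*c), i.e. (a/b)/(c/d) when all counts are non-zero."""
    a, b, c, d = confusion_counts(y_true, y_pred)
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical labels for eight medicines
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
ratio = odds_ratio(y_true, y_pred)
```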
The predictive performance of the univariate models based on single gene modules for the above 173 traditional Chinese medicines is shown in Table 1 below. The first column of the table lists the individual functional gene modules j of the MSigDB database, with a predictive model established from each individual gene module; the second column gives the number of genes in each gene module j; the third column gives the odds ratio (OR); and the "OR threshold" column gives the median $T_j$ of the targeting coefficients $X_{*j}$ of all 173 traditional Chinese medicines for each gene module j. Taking the gene module GO_CELLULAR_RESPONSE_TO_EXTERNAL_STIMULUS as an example, the module contains 335 genes and its OR threshold is 0.039. This means that if the targeting coefficient $X_{ij}$ of a traditional Chinese medicine i for this gene module is greater than 0.039 (for example, if traditional Chinese medicine i has 1000 interaction genes and more than 39 of them fall among the 335 genes of the GO_CELLULAR_RESPONSE_TO_EXTERNAL_STIMULUS gene module), then Y = 1 is predicted for that medicine. The odds ratio of the predictive model for this gene module is 4.86, indicating that the module has some predictive ability.
Table 1: predictive performance of univariate models based on single genetic modules
The second univariate model is the ROC curve method. The ROC curve, i.e. the receiver operating characteristic curve, is a comprehensive indicator of sensitivity and specificity as continuous variables, showing the relationship between the sensitivity and specificity of a model graphically. In the present invention, ROC curve analysis was used to study the sensitivity and specificity of individual gene modules for the "gallbladder meridian tropism" efficacy.
Sensitivity and specificity were calculated as follows: considering a two-class case, the classes are 1 and 0, and the positive (positive) class and the negative (negative) class are respectively 1 and 0, the actual classification result is 4. Assume that: predicting to be 1, the number of samples actually being 1 is TP, predicting to be 1, the number of samples actually being 0 is FP, predicting to be 0, the number of samples actually being 1 is FN, predicting to be 0, and the number of samples actually being 0 is TN; then: TPR: true positive rate, describing the proportion of all identified positive examples to all positive examples, and calculating the formula as follows: tpr=tp/(tp+fn); TNR: true ne gative rate, describing the proportion of the identified negative example to all negative examples, and calculating the formula as follows: tnr=tn/(fp+tn); where TPR is sensitivity (sensitivity) and TNR is specificity (specificity).
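The two rates can be computed directly from label vectors, as in the following Python sketch. The function name and the toy labels are hypothetical and serve only to illustrate the formulas TPR = TP/(TP+FN) and TNR = TN/(FP+TN).

```python
def sensitivity_specificity(y_true: list[int], y_pred: list[int]) -> tuple[float, float]:
    """Return (TPR, TNR) for binary labels coded as 1 (positive) and 0 (negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp / (tp + fn), tn / (fp + tn)


# Toy labels, for illustration only.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
preds = [1, 1, 0, 0, 0, 1, 0, 1]
print(sensitivity_specificity(truth, preds))
```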
Specifically, all the targeting coefficients X of the above 173 traditional Chinese medicines *j The sliding threshold value in the value range of (a) is marked as t, samples higher than the threshold value t are set as Y=1, samples lower than the threshold value t are set as Y=0, and an ROC curve is obtained. The area under the curve of the ROC curve, i.e. the AUC value, is calculated. In the present invention, the AUC value characterizes the accuracy of the gene module for predicting the efficacy of taste and channel tropism. The AUC value is between 0 and 1, and the larger the value is, the higher the representing accuracy is. In this example, the AUC value represents the accuracy of each functional gene module for predicting "return to biliary tract" efficacy. Taking the example of the GO_CELLARER_RESPONSE_TO_EXTERNAL_STIMULUS gene module, which contains 335 genes, the "AUC threshold" of the gene module is 0.045, which means that if the interaction gene of a certain Chinese medicine i and the targeting coefficient X of the gene module ij Greater than 0.045 (or, assuming a certain chinese medicine i has 1000 interacting genes whose intersection with 335 interacting genes of the go_cell_response_to_external_stmulus gene module exceeds 45), then y=1 for that chinese medicine is predicted. The AUC of the gene model is 0.751, and the gene model has certain prediction capability.
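The sliding-threshold construction can be sketched in plain Python as follows. This is an illustrative sketch, not the inventors' code: the function names and toy data are hypothetical, the threshold is applied as "score >= t" so that every observed score is used once as a cut-off, and the curve is expressed with the false positive rate (1 minus specificity) on the horizontal axis, which is the usual convention for computing the area.

```python
def roc_points(scores: list[float], labels: list[int]) -> list[tuple[float, float]]:
    """(FPR, TPR) points obtained by sliding a threshold t over the observed scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points: list[tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy targeting coefficients and labels, for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 1, 0, 1, 0, 0]
print(auc(roc_points(toy_scores, toy_labels)))
```

An AUC of 0.5 corresponds to a module whose targeting coefficient carries no information about the label, and an AUC of 1 to perfect separation of the two classes.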
Relevant gene modules with AUC values >0.75 are listed in table 1, as shown in column 5 of table 1, the sixth column shows the threshold t for each functional gene module under the model. Table 1 ordered in descending order of AUC threshold, it can be seen that gene modules with predictive ability for drugs with "return to biliary tract" efficacy, the names of which contain the keyword "response", for example:
GO_CELLULAR_RESPONSE_TO_EXTERNAL_STIMULUS, which characterizes the response of cells to external stimuli;
GO_CELLULAR_RESPONSE_TO_EXTRACELLULAR_STIMULUS, which characterizes the response of cells to extracellular stimuli;
GO_RESPONSE_TO_STARVATION, which characterizes the response of cells to starvation;
CHAUHAN_RESPONSE_TO_METHOXYESTRADIOL_UP;
SMIRNOV_RESPONSE_TO_IR_2HR_UP.
FIG. 1 shows the ROC curve of the GO_RESPONSE_TO_STARVATION gene module for the "gallbladder meridian tropism" efficacy under this model, which characterizes the specificity and sensitivity of the gene module for that efficacy. The abscissa of FIG. 1 represents specificity and the ordinate represents sensitivity. As can be seen from FIG. 1, the AUC value of the ROC curve of the GO_RESPONSE_TO_STARVATION gene module is 0.78, indicating that the gene module is significantly related to the "gallbladder meridian tropism" efficacy of traditional Chinese medicines.
Example 2 Construction of a sparse autoencoder model for gallbladder meridian traditional Chinese medicines
In this example, taking the gallbladder meridian as an example, 173 traditional Chinese medicines each containing at least 15 chemical components were selected and modeled with a sparse autoencoder. The specific modeling steps are as follows.
First, the chemical components contained in each of the 173 traditional Chinese medicines (denoted i, where i is an integer from 1 to 173) are obtained from the TCMID database;
next, the total number $N_i$ of interacting genes corresponding to the chemical components of traditional Chinese medicine i is obtained from the STITCH database;
then, the targeting coefficient of each traditional Chinese medicine for each gene module of the MSigDB database is obtained as $X_{ij} = K_{ij}/N_i$, where $K_{ij}$ is the number of genes in the intersection of the interacting gene set of traditional Chinese medicine i and the gene set of module j.
The input of the model is the vector of targeting coefficients $X_{ij}$ of each drug, i.e. the targeting coefficients of drug i for the gene modules j.
The output $Y_i$ of the model is a classification label: Y = 1 if the drug has the gallbladder meridian tropism attribute, and Y = 0 if it does not.
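The targeting coefficient $X_{ij} = K_{ij}/N_i$ defined in the steps above can be sketched with Python sets. The function name, the module names and the gene symbols below are hypothetical placeholders; in the described method the gene sets would come from the STITCH and MSigDB databases.

```python
def targeting_coefficients(drug_genes: set[str],
                           modules: dict[str, set[str]]) -> dict[str, float]:
    """X_ij = K_ij / N_i for one drug i against every gene module j.

    K_ij is the size of the intersection between the drug's interacting
    genes and the module's genes; N_i is the number of interacting genes.
    """
    n_i = len(drug_genes)
    if n_i == 0:
        raise ValueError("the drug has no interacting genes")
    return {name: len(drug_genes & genes) / n_i for name, genes in modules.items()}


# Toy gene sets, for illustration only.
toy_drug = {"TP53", "EGFR", "TNF", "IL6"}
toy_modules = {
    "MODULE_A": {"TP53", "TNF", "MYC"},
    "MODULE_B": {"BRCA1"},
}
print(targeting_coefficients(toy_drug, toy_modules))
```

Stacking the resulting values over all modules gives the input vector of one drug, and stacking the vectors of all drugs gives the drug-by-module matrix on which the models are trained.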
The neural network of the model is formed by stacking 2 layers of sparse autoencoders and a softmax classification layer, as shown in FIG. 2. The encoding transfer function of the first-layer autoencoder is logsig, its decoding transfer function is purelin, and its cost function is as follows:
where E is the cost value; λ is the L2 regularization coefficient, set to 0.001; β is the sparsity regularization coefficient, set to 4; N is the number of samples (traditional Chinese medicines); K is the number of variables in the training data; and $x_{kn}$ is the targeting coefficient of the n-th traditional Chinese medicine for the k-th gene module.
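The formula itself is not reproduced in this text. A form consistent with the symbols defined here, and with the usual mean-squared-error cost of a sparse autoencoder, would be the following; the reconstruction $\hat{x}_{kn}$ is a symbol introduced here as an assumption, and the expression should be read as a plausible reconstruction rather than as the formula of the original filing.

```latex
E = \frac{1}{N}\sum_{n=1}^{N}\sum_{k=1}^{K}\left(x_{kn}-\hat{x}_{kn}\right)^{2}
    + \lambda\,\Omega_{weights}
    + \beta\,\Omega_{sparsity}
```

Under this reading, the first term measures how well the autoencoder reconstructs the targeting coefficients, while the two remaining terms are the regularizers weighted by λ and β.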
In artificial intelligence neural network modeling, the inventors are particularly concerned with the predictive ability of the model, i.e. its performance on new data, and wish to avoid overfitting. Regularization is therefore used to avoid overfitting and to ensure generalization by explicitly controlling the complexity of the model. In the cost function, $\Omega_{weights}$ is the L2 regularization term, which is computed from the squared magnitude of each weight coefficient. Adding it to the cost function as a penalty term pushes each coefficient of the optimal solution toward 0. λ is a hyperparameter that controls the importance of L2 regularization: if λ = 0 there is no L2 regularization, and as λ tends to positive infinity every parameter is forced to 0. Here λ is set to 0.001.
$\Omega_{sparsity}$ is the sparsity regularization term. Sparsity regularization imposes a constraint on the sparsity of the hidden-layer output: the term grows when the average activation value of a neuron departs from its desired value, so adding it encourages sparsity. The relative entropy, also called the Kullback-Leibler divergence or information divergence, is used here as the regularization term. In information theory, the relative entropy measures the difference between two probability distributions; it equals their cross entropy minus the information entropy (Shannon entropy) of the first distribution. β is a hyperparameter that controls the importance of sparsity regularization; if β = 0 there is no sparsity regularization. Here β is set to 4.
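The two regularization terms can be sketched in Python as follows. This is a sketch under stated assumptions: the factor 1/2 in the L2 term is one common convention and is not stated in the text, `rho` (the desired average activation) and the function names are hypothetical, and the numbers are toy values.

```python
import math


def l2_weight_term(weights: list[list[float]]) -> float:
    """Omega_weights: half the sum of squared weights (assumed convention)."""
    return 0.5 * sum(w * w for row in weights for w in row)


def kl_sparsity_term(rho: float, mean_activations: list[float]) -> float:
    """Omega_sparsity: sum over hidden neurons of KL(rho || rho_hat).

    rho is the desired average activation and each rho_hat is the observed
    average activation of one hidden neuron; all values must lie in (0, 1).
    """
    return sum(
        rho * math.log(rho / r) + (1 - rho) * math.log((1 - rho) / (1 - r))
        for r in mean_activations
    )


# Toy values, for illustration only.
print(l2_weight_term([[1.0, 2.0], [3.0, 4.0]]))
print(kl_sparsity_term(0.05, [0.05, 0.30]))
```

The sparsity term is zero when every neuron's average activation equals the desired value and grows as the activations move away from it, which is why a larger β pushes the hidden layer toward sparse responses.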
The autoencoder is trained with the scaled conjugate gradient descent algorithm, and the parameters of the second-layer autoencoder are the same as those of the first-layer autoencoder. The cost function of the softmax layer is the cross entropy, and this layer is used to build the classification model.
Specifically, starting from the simplest model, the inventors calculated the classification performance of models with k = 2, 3, 4, … hidden-layer neurons on traditional Chinese medicines with the "gallbladder meridian tropism" attribute, and found that the area under the curve (AUC) was 1 when the number of hidden-layer neurons k was 2, as shown in FIG. 3. This shows that a model built with 2 hidden-layer neurons can accurately predict the "gallbladder meridian tropism" attribute of traditional Chinese medicines.
The weights of these 2 hidden-layer neurons were then output; those of the first neuron are shown in Table 2 below. The "model parameters" in the table are the weights $W_{jk}$ that map the variables (gene modules) to the hidden-layer neuron, i.e. the mapping weight from gene module j to the k-th hidden-layer neuron (here the first). The absolute value of $W_{jk}$ reflects the relative importance of the variable to the prediction: the larger the absolute value, the more important the variable. The table also suggests that gene modules associated with the response to external stimuli (modules whose names contain the keyword "response") are important in this hidden-layer neuron. In contrast to the earlier results, variables such as the gene module TSAI_RESPONSE_TO_IONIZING_RADIATION were not selected among the most predictive gene modules in the univariate models, yet here their linear combination forms a hidden-layer neuron that plays an important role in the prediction model.
Table 2: variable parameters of first hidden layer neuron of Chinese medicine prediction model
Table 3 below shows the variable composition of the second hidden-layer neuron. Again, the "model parameters" in the table are the weights $W_{jk}$ that map the variables (gene modules) to the hidden-layer neuron, i.e. the mapping weight from gene module j to the k-th hidden-layer neuron with k = 2. The absolute value of $W_{jk}$ reflects the relative importance of the variable: the larger the absolute value, the more important the variable.
Table 3: variable parameters of second hidden layer neuron of Chinese medicine prediction model
The modeling results for "gallbladder meridian tropism" show that introducing the artificial intelligence autoencoder algorithm greatly improves predictive performance: the AUC rises from the best single-variable value of 0.831 (83.1%), obtained with the gene module GSE43955_TGFB_IL6_VS_TGFB_IL6_IL23_TH17_ACT_CD4_TCELL_52H_DN (200 genes) in Table 1, to AUC = 1 (100%). At the same time, the prediction model is highly interpretable: for each hidden-layer neuron, the gene modules that contribute most to it can be traced back.
Example 3 Construction of an artificial intelligence model for the "nature" (drug property) of traditional Chinese medicines
The five natures of cold, hot, warm, cool and neutral ("flat") are important characteristics of traditional Chinese medicines. Distinguishing the nature of a medicine is an important decision point in traditional Chinese medicine diagnosis and prescription, and the judgment of nature strongly influences the treatment approach. Therefore, in this example an artificial intelligence model of the "nature" of traditional Chinese medicines is established: genes related to the five natures (cold, hot, warm, cool and neutral) are screened, and a model is then built to predict the nature of medicinal materials, pharmaceutical compositions, plant extracts and the like.
In this example, 173 traditional Chinese medicines containing at least 15 chemical components were again selected for analysis. Considering that drug nature may be related to the immune system, the gene modules used were those of the gene module database used in Examples 1 and 2, supplemented with lymphocyte proportion modules. The data for these lymphocyte proportion modules come from the GEO database: GSE77445 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE77445). The white blood cell count raw data in this source include the CD8+ T cell proportion, CD4+ T cell proportion, NK cell proportion, B cell proportion, monocyte proportion and granulocyte proportion. DNA methylation sites significantly correlated with each cell proportion were identified using Pearson correlation coefficients, and gene modules related to the immune cell percentages were then obtained by gene annotation.
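The correlation step can be illustrated with a plain-Python Pearson coefficient. The function name and the toy methylation and cell-proportion values are hypothetical, and the text does not state the significance cut-off that was applied, so none is assumed here.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy data: methylation level of one site and CD4+ T cell proportion
# across four samples, for illustration only.
methylation = [0.10, 0.20, 0.30, 0.40]
cd4_proportion = [0.20, 0.40, 0.60, 0.80]
print(pearson_r(methylation, cd4_proportion))
```

Sites whose coefficient is judged significant for a given cell type would then be mapped to genes by annotation to form the module for that cell type.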
In addition, to improve the specificity of the gene modules, and considering that age is a factor related to many human diseases, a significantly age-related version of each gene module was established. Human blood cell DNA methylation data with age information were used, again from the GEO database: GSE40279 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE40279). The SIMPO algorithm, a method for extracting gene-level DNA methylation features published by the inventors, whose full name is Statistical difference of DNA Methylation between Promoter and Other Body Region (Mining the Selective Remodeling of DNA Methylation in Promoter Regions to Identify Robust Gene-Level Associations with Phenotype, https://www.biorxiv.org/content/10.1101/2020.01.05.895326v2), was used to convert the DNA methylation chip signal into a gene-level methylation signal, i.e. a gene expression value. The SIMPO value calculated for each gene is used to screen for genes associated with a phenotype. The difference in gene SIMPO scores between elderly (>= 80 years) and young (< 35 years) individuals was then evaluated by t-test, and genes with P < 0.001 in the t-test were selected to form the significantly age-related gene modules. Each gene module retains only its age-related genes as the representation of that module.
Modeling is similar to the previous embodiment, with the following steps.
First, the chemical components contained in each of the 173 traditional Chinese medicines (denoted i, where i is an integer from 1 to 173) are obtained from the TCMID database;
next, the total number $N_i$ of interacting genes corresponding to the chemical components of traditional Chinese medicine i is obtained from the STITCH database;
then, the targeting coefficient of each traditional Chinese medicine for each gene module of the MSigDB database is obtained as $X_{ij} = K_{ij}/N_i$, where $K_{ij}$ is the number of genes in the intersection of the interacting gene set of traditional Chinese medicine i and the gene set of module j.
For traditional Chinese medicinal materials carrying the attribute labels cold, cool, neutral, warm and hot, each material is mapped to gene modules of human gene function through the database of genes interacting with its chemical components. The materials are characterized in a space of 2248 gene-module dimensions: for every combination of gene module and drug, the significance of the intersection between the gene set of the module and the interacting gene set of the drug is calculated with the hypergeometric distribution, and the resulting P value is transformed as $-\log_{10}(P)$. This value serves as the association index between the gene module and the drug and is denoted HPI (herb-path index).
The cold, cool, neutral, warm and hot attributes are analyzed separately. For the cold attribute, for example, the drugs are divided into two classes, those with the cold label and those without it, and the significance of the difference in HPI between the two classes is calculated by t-test. The results are shown in Table 4 below.
Table 4: list of gene modules related to the cold, cool, neutral, warm and hot attributes
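A hypergeometric enrichment index of this kind can be sketched with the Python standard library. The function names are hypothetical, and the size of the background gene population is an assumption that the text does not specify; the upper-tail probability P(X >= overlap) is used here as the P value, which is the usual choice for enrichment.

```python
import math


def hypergeom_sf(k: int, population: int, successes: int, draws: int) -> float:
    """Upper-tail probability P(X >= k) of a hypergeometric variable."""
    total = math.comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        math.comb(successes, i) * math.comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def hpi(overlap: int, background: int, module_size: int, drug_gene_count: int) -> float:
    """-log10 of the hypergeometric P value for a module/drug gene overlap."""
    return -math.log10(hypergeom_sf(overlap, background, module_size, drug_gene_count))


# Toy numbers: 10 background genes, a 4-gene module, a drug with 3
# interacting genes, 2 of which fall in the module. Illustration only.
print(hpi(overlap=2, background=10, module_size=4, drug_gene_count=3))
```

Unlike the ratio $K_{ij}/N_i$, this index also accounts for the size of the module and of the background, so a large overlap with a very large module is not automatically scored as strongly associated.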
The second column of Table 4 gives the number of gene modules significantly related to each of the attributes "cold", "cool", "neutral", "warm" and "hot". Gene modules with a P value below 0.02 are listed. The "hot" attribute has the largest number of related gene modules, which suggests, in terms of quantity, that the biological processes targeted by traditional Chinese medicines with the hot attribute are more diverse.
Gene modules associated with immune function, particularly lymphocyte percentages, appear for the "cold" and "warm" attributes. For the "cold" attribute, the CD4+ T cell proportion module (named DS: CD4T cell proportion, p = 0.000823), the granulocyte proportion module (named DS: granulocytes cell proportion, p = 0.0039) and the CD8+ T cell proportion module (named DS: CD8T cell proportion, p = 0.0153) rank 1st, 2nd and 4th, respectively, among the gene modules most significantly related to "cold". For the "warm" attribute, the most significant lymphocyte module is the CD4+ T cell proportion module (named DS: CD4T cell proportion, p = 0.00702).
Based on the screened gene modules, the inventors established a classification model for the "nature" of traditional Chinese medicines using stacked sparse autoencoders. The stacked autoencoder architecture is the same as the neural network architecture of FIG. 2 in Example 2, i.e. 2 layers of sparse autoencoders stacked with a softmax layer. The number of neurons for the 2 hidden layers was set to 8. The encoding transfer function of the first-layer autoencoder is logsig, its decoding transfer function is purelin, and its cost function is as follows:
where E is the cost value; λ is the L2 regularization coefficient, set to 0.001; β is the sparsity regularization coefficient, set to 4; $\Omega_{weights}$ is the L2 regularization term, computed from the squared magnitude of each weight coefficient; $\Omega_{sparsity}$ is the sparsity regularization term; N is the number of samples (traditional Chinese medicines); K is the number of variables in the training data; and $x_{kn}$ is the targeting coefficient of the n-th traditional Chinese medicine for the k-th gene module. The sparsity proportion (the desired proportion of training samples to which an individual neuron responds) was 0.05. The autoencoder is trained with the scaled conjugate gradient descent algorithm. The parameters of the second-layer autoencoder are the same as those of the first. The cost function of the softmax layer is the cross entropy, and this layer is used to build the classification model.
Specifically, starting from the simplest model, the inventors calculated the classification performance of models with k = 2, 3, 4, …, 16 hidden-layer neurons on traditional Chinese medicines with the "cold" attribute. With 2 hidden-layer neurons, the AUC of the predicted ROC curve is 0.94; with 3 hidden-layer neurons, the AUC is 0.99, as shown in FIG. 4.
The model with k = 3 hidden-layer neurons was analyzed further. FIG. 5 shows the input-variable weights of the first (upper graph), second (middle graph) and third (lower graph) hidden-layer neurons of the stacked sparse autoencoder model constructed for "cold" traditional Chinese medicines in this example. For each hidden-layer neuron, the 15 gene modules with the largest absolute weights are listed; the linear combination of these gene modules forms the feature represented by that neuron.
As can be seen from FIG. 5, in the model for "cold" traditional Chinese medicines of this example, the 3 hidden-layer neurons reflect, in terms of cell type, functional characteristics of immune cells and vascular endothelial cells, and different hidden-layer neurons have different functional preferences.
Specifically, in the first hidden-layer neuron shown in the upper graph of FIG. 5, the gene module with the highest absolute weight is the one for phospholipid-mediated phagocytosis (gene module name REACTOME ROLE OF PHOSPHOLIPIDS IN PHAGOCYTOSIS), and its absolute weight is much higher than that of the second-ranked gene module, GO REGULATION OF MAP KINASE ACTIVITY. Phagocytosis is one of the oldest and most basic defense mechanisms of organisms. The specialized phagocytes of higher animals are mainly macrophages and neutrophils, which engulf and destroy infecting bacteria and viruses, damaged cells and senescent erythrocytes. Leukocytes can leave the blood vessel through the gaps between capillary endothelial cells and migrate through the tissue spaces, so the action of leukocytes depends on cooperation with the vascular endothelial system.
In the second hidden-layer neuron shown in the middle graph of FIG. 5, the two gene modules with the largest absolute weights are the granulocyte proportion module (DS: granulocytes cell proportion) and the DNA conformation change module (GO DNA CONFORMATION CHANGE), and their weights do not differ greatly. Indeed, as the middle graph of FIG. 5 shows, none of the 15 gene modules of the second hidden-layer neuron has a dominant absolute weight of the kind seen in the first hidden-layer neuron.
In the third hidden-layer neuron shown in the lower graph of FIG. 5, the two gene modules with the largest absolute weights are the module for the G2/M phase DNA damage checkpoint (REACTOME G2M DNA DAMAGE CHECKPOINT) and the module for endothelial cell proliferation (GO ENDOTHELIAL CELL PROLIFERATION). The CD8+ T cell proportion module (DS: CD8T cell proportion) is also among the top 15 by absolute weight.
Therefore, judging from the features of the hidden-layer neurons of the present sparse autoencoder model, the model mainly reflects that the coordinated regulation of immune cells and endothelial cells is the principal biological process targeted by "cold" traditional Chinese medicines.
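Selecting the gene modules with the largest absolute weights for one hidden-layer neuron is a simple ranking, sketched below in Python. The function name, module names and weight values are hypothetical placeholders, not values from FIG. 5.

```python
def top_modules_by_abs_weight(weights: dict[str, float],
                              n: int = 15) -> list[tuple[str, float]]:
    """Return the n (module, weight) pairs with the largest absolute weight."""
    return sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)[:n]


# Toy weights of one hidden-layer neuron, for illustration only.
toy_weights = {
    "MODULE_A": 0.12,
    "MODULE_B": -0.85,
    "MODULE_C": 0.40,
    "MODULE_D": -0.05,
}
print(top_modules_by_abs_weight(toy_weights, n=2))
```

Ranking by absolute value keeps strongly negative weights in view, since a module that suppresses a neuron's activation is as informative for interpretation as one that drives it.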
Example 4 Construction of an artificial intelligence model for the pungent flavor of traditional Chinese medicines
In this example, the gene modules used were those of the gene module database used in Examples 1 and 2, and 173 traditional Chinese medicines containing at least 15 chemical components were selected for subsequent analysis.
Modeling is similar to the previous embodiment, with the following steps.
First, the chemical components contained in each of the 173 traditional Chinese medicines (denoted i, where i is an integer from 1 to 173) are obtained from the TCMID database;
next, the total number $N_i$ of interacting genes corresponding to the chemical components of traditional Chinese medicine i is obtained from the STITCH database;
then, the targeting coefficient of each traditional Chinese medicine for each gene module of the MSigDB database is obtained as $X_{ij} = K_{ij}/N_i$, where $K_{ij}$ is the number of genes in the intersection of the interacting gene set of traditional Chinese medicine i and the gene set of module j.
All 173 traditional Chinese medicines were divided into two classes: the first class carries the pungent label ("pungent traditional Chinese medicines") and the second class does not ("non-pungent traditional Chinese medicines"). The Wilcoxon rank-sum test was used to determine which gene modules differ significantly between the two classes, and the results are shown in Table 5 below. The P values in the table were obtained by the rank-sum test.
The T value in the table is defined as $T_j = X_{1j} - X_{2j}$, where, for gene module j, $X_{1j}$ is the median targeting coefficient of all pungent traditional Chinese medicines for module j and $X_{2j}$ is the median targeting coefficient of all non-pungent traditional Chinese medicines for module j; it plays a role similar to the t value in a t-test. Table 5 below is sorted by T value. The larger the T value of a gene module, i.e. the larger the difference between the median targeting coefficients of pungent and non-pungent traditional Chinese medicines for module j, the more the module differs between the two classes, or the more it is "enriched" in the interacting genes of pungent traditional Chinese medicines, so that module j can be used to predict whether a traditional Chinese medicine is "pungent" or "non-pungent". FDR in the table is the false discovery rate calculated from the P value.
Table 5: gene modules significantly related to the pungent attribute of traditional Chinese medicines
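The T value defined above is a difference of medians and can be sketched in a few lines of Python. The function name and the toy targeting coefficients are hypothetical.

```python
from statistics import median


def t_value(pungent_coeffs: list[float], non_pungent_coeffs: list[float]) -> float:
    """T_j = median coefficient of pungent drugs minus that of non-pungent drugs."""
    return median(pungent_coeffs) - median(non_pungent_coeffs)


# Toy targeting coefficients of one gene module, for illustration only.
pungent = [0.10, 0.20, 0.30]
non_pungent = [0.05, 0.10, 0.15]
print(t_value(pungent, non_pungent))
```

Because it is based on medians, this quantity is insensitive to a few drugs with extreme targeting coefficients, which fits its use alongside the rank-based significance test.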
Table 5 shows that, among all gene modules, those most enriched in pungent traditional Chinese medicines (i.e. top-ranked in Table 5) are protein-protein interaction (PPI) networks centered on olfactory receptors. Because the top-ranked modules for pungent traditional Chinese medicines are all olfactory receptor modules, some of the olfactory receptor PPI gene modules were omitted, and the next-ranked non-PPI gene modules are also shown in Table 5. Most of these were found to be related to olfactory signaling pathways, for example REACTOME OLFACTORY SIGNALING PATHWAY and GO OLFACTORY RECEPTOR ACTIVITY.
The results in Table 5 show that the targeting characteristics of pungent drugs are consistent with the inventors' expectations. The inventors further built a classification model of pungent drugs using the artificial intelligence autoencoder algorithm. The stacked autoencoder architecture used in this example is the same as the neural network architecture of FIG. 2 in Example 2, i.e. 2 layers of sparse autoencoders stacked with a softmax layer. Starting from the simplest model, the inventors calculated the classification performance of models with k = 2, 3, 4, … hidden-layer neurons on "pungent" traditional Chinese medicines: the AUC of the predicted ROC curve is 97% when k is 2 and 99% when k is 3, as shown in FIG. 6.
Therefore, the model with k = 3 hidden-layer neurons was selected for analysis. FIG. 7 shows the input-variable weights of the first (upper graph), second (middle graph) and third (lower graph) hidden-layer neurons of the stacked sparse autoencoder model constructed for "pungent" traditional Chinese medicines in this example. For each hidden-layer neuron, the 15 gene modules with the largest absolute weights are listed; the linear combination of these gene modules forms the feature represented by that neuron.
In the first hidden-layer neuron shown in the upper graph of FIG. 7, the gene modules with the more prominent absolute weights relate to the MHC class II antigen presentation process, such as the gene module named REACTOME MHC CLASS II ANTIGEN PRESENTATION, together with related PPI gene modules. As shown in the middle graph of FIG. 7, the main contribution to the second hidden-layer neuron comes from another set of PPI gene modules. As shown in the lower graph of FIG. 7, the third hidden-layer neuron consists mainly of olfactory receptor PPI gene modules. Indeed, as FIG. 7 shows, no gene module with a dominant absolute weight appears among the 15 gene modules of the first, second or third hidden-layer neuron.
Thus, judging from the features of the hidden-layer neurons of the present sparse autoencoder model, the model mainly reflects that olfactory receptors, olfactory signaling pathways and the like are the principal biological processes targeted by "pungent" traditional Chinese medicines.
These results show that the method provided by the invention can accurately predict pungent traditional Chinese medicines, and that the established neural network model is highly interpretable.

Claims (12)

1. A modeling method for evaluating the nature, flavor and meridian tropism of a natural product, the method comprising:
1) for traditional Chinese medicines whose nature, flavor and meridian tropism efficacy is known, obtaining the chemical component set of each traditional Chinese medicine, wherein the traditional Chinese medicines comprise traditional Chinese medicines having the nature, flavor and meridian tropism efficacy;
2) according to the chemical component set of each traditional Chinese medicine, obtaining the interacting genes corresponding to each chemical component to form the interacting gene set of the traditional Chinese medicine;
3) for the interacting gene set of the traditional Chinese medicine, calculating the targeting coefficient of the traditional Chinese medicine for a functional gene module, wherein the targeting coefficient is calculated as $X_{ij} = K_{ij}/N_i$, where $K_{ij}$ is the number of genes in the intersection of the interacting gene set of traditional Chinese medicine i and the gene set of functional gene module j, and $N_i$ is the number of interacting genes of traditional Chinese medicine i;
4) based on the targeting coefficients of the traditional Chinese medicines for the functional gene modules, establishing a model for evaluating the nature, flavor and meridian tropism of natural products, wherein the model is an artificial intelligence model.
2. The method according to claim 1, wherein the traditional Chinese medicines in step 1) further comprise traditional Chinese medicines not having the nature, flavor and meridian tropism efficacy.
3. The method according to claim 1 or 2, wherein in step 4) a univariate model is built using the targeting coefficients of a plurality of traditional Chinese medicines for one gene module, a threshold is set in the univariate model, and the targeting coefficient of a traditional Chinese medicine is compared with the threshold to predict the nature, flavor and meridian tropism efficacy of the traditional Chinese medicine.
4. The method of claim 3, wherein the threshold is a median of the targeting coefficients of the plurality of traditional Chinese medicines to the gene module.
5. A method according to claim 3, wherein ROC curves are obtained for the univariate model at different thresholds.
6. The method according to claim 1 or 2, wherein in step 4) the targeting coefficients of a plurality of traditional Chinese medicines for a plurality of gene modules are used as the feature vector of each traditional Chinese medicine, and the correspondence between the feature vectors and the efficacy labels is established to train the artificial intelligence model.
7. The method of claim 6, wherein the artificial intelligence model is a sparse autoencoder model.
8. The method according to any one of claims 1, 2, 4, 5 or 7, wherein the nature, flavor and meridian tropism efficacy is selected from cold, hot, warm, cool, neutral, pungent, sweet, sour, bitter, salty, gallbladder meridian tropism, heart meridian tropism, liver meridian tropism, spleen meridian tropism, lung meridian tropism, kidney meridian tropism, stomach meridian tropism, large intestine meridian tropism, small intestine meridian tropism, bladder meridian tropism, pericardium meridian tropism and triple energizer meridian tropism.
9. The method of any one of claims 1, 2, 4, 5 or 7, wherein the set of chemical components comprises at least 10 chemical components.
10. The method of any one of claims 1, 2, 4, 5, or 7, wherein the set of chemical components comprises at least 15 chemical components.
11. The method according to any one of claims 1, 2, 4, 5 or 7, wherein the functional gene modules are from a database of functional gene modules.
12. A method of evaluating the efficacy of a natural product, the method comprising:
1) For a natural product to be tested, obtaining the chemical component set of the natural product to be tested;
2) According to the chemical component set of the natural product to be tested, obtaining the interaction genes corresponding to each chemical component to form an interaction gene set of the natural product to be tested;
3) From the interaction gene set of the natural product to be tested, calculating the targeting coefficient of the natural product to be tested for the functional gene module;
4) Based on the targeting coefficient of the natural product to be tested for the functional gene module, predicting whether the natural product to be tested has a given nature, flavor and meridian tropism efficacy, using a model for evaluating the nature, flavor and meridian tropism of natural products established by the method of any one of claims 1 to 11.
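The univariate model of claims 3 to 5 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the targeting coefficients have already been computed, since their formula is not given in these claims, and it assumes that a coefficient at or above the threshold predicts that the efficacy is present, a direction the claims leave open. The function names `median_threshold`, `predict`, `roc_points`, and `auc` are hypothetical.

```python
from statistics import median


def median_threshold(coeffs):
    """Claim 4: use the median targeting coefficient as the threshold."""
    return median(coeffs)


def predict(coeffs, threshold):
    """Claim 3: compare each coefficient with the threshold.

    Assumption: coefficient >= threshold means 'efficacy present'.
    """
    return [c >= threshold for c in coeffs]


def roc_points(coeffs, labels):
    """Claim 5: sweep the threshold over every observed coefficient.

    Returns (false positive rate, true positive rate) pairs, starting at (0, 0).
    Requires at least one positive and one negative label.
    """
    pos = sum(1 for y in labels if y)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(coeffs), reverse=True):
        tp = sum(1 for c, y in zip(coeffs, labels) if c >= t and y)
        fp = sum(1 for c, y in zip(coeffs, labels) if c >= t and not y)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Made-up coefficients of six medicines for one gene module (illustrative only).
    coeffs = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
    # Made-up efficacy labels: 1 = has the efficacy, 0 = does not.
    labels = [1, 1, 0, 1, 0, 0]
    t = median_threshold(coeffs)
    print("threshold:", t)
    print("predictions:", predict(coeffs, t))
    print("AUC:", auc(roc_points(coeffs, labels)))
```

Sweeping the threshold over the observed coefficients yields one ROC point per distinct value, so the curve can be compared across gene modules to see which module separates the labelled medicines best.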
CN202010467222.XA 2020-05-28 2020-05-28 Traditional Chinese medicine quality control and natural matter efficacy prediction model based on artificial intelligence and construction and use methods thereof Active CN111681717B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010467222.XA CN111681717B (en) 2020-05-28 2020-05-28 Traditional Chinese medicine quality control and natural matter efficacy prediction model based on artificial intelligence and construction and use methods thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010467222.XA CN111681717B (en) 2020-05-28 2020-05-28 Traditional Chinese medicine quality control and natural matter efficacy prediction model based on artificial intelligence and construction and use methods thereof

Publications (2)

Publication Number Publication Date
CN111681717A (en) 2020-09-18
CN111681717B (en) 2023-09-15

Family

ID=72434348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010467222.XA Active CN111681717B (en) 2020-05-28 2020-05-28 Traditional Chinese medicine quality control and natural matter efficacy prediction model based on artificial intelligence and construction and use methods thereof

Country Status (1)

Country Link
CN (1) CN111681717B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112461981B (en) * 2020-11-25 2023-02-03 内蒙古医科大学 Mongolian medicine Tabusson-2 extraction process response surface optimization method and detection method
CN113053455A (en) * 2021-03-05 2021-06-29 北京中医药大学 Application of potential taste-effect associated key quality attribute identification method in pediatric food retention removing and cough relieving oral liquid
CN114203254B (en) * 2021-12-02 2023-05-23 杭州艾沐蒽生物科技有限公司 Method for analyzing immune characteristic related TCR based on artificial intelligence

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2600269A2 (en) * 2011-12-03 2013-06-05 Medeolinx, LLC Microarray sampling and network modeling for drug toxicity prediction
CN106055921A (en) * 2016-05-27 2016-10-26 华中农业大学 Pharmaceutical activity prediction and selection method based on genetic expressions and drug targets
CN108154929A (en) * 2018-01-10 2018-06-12 华子昂 A kind of parsing disease and the method for screening drug and its application in traditional Chinese medical science robot
CN110289106A (en) * 2019-06-28 2019-09-27 淮阴工学院 A method of effect, which is analyzed, from Chinese medicine compound prescription corresponds to Chinese medicine and its pharmacological property compatibility relationship

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Xin-yu Li et al. Network toxicology and LC-MS-based metabolomics: New approaches for mechanism of action of toxic components in traditional Chinese medicines. Chinese Herbal Medicines. 2019, pp. 357-363. *

Also Published As

Publication number Publication date
CN111681717A (en) 2020-09-18

Similar Documents

Publication Publication Date Title
CN111681717B (en) Traditional Chinese medicine quality control and natural matter efficacy prediction model based on artificial intelligence and construction and use methods thereof
Khan et al. Augmentation and proliferation of T lymphocytes and Th-1 cytokines by Withania somnifera in stressed mice
CN108090669A (en) A kind of Quality Evaluation of Chinese Medicinal evaluation method
CN112161965A (en) Method, device, computer equipment and storage medium for detecting traditional Chinese medicine property
Kumar et al. Effect of bacopa monniera on cold stress induced neurodegeneration in hippocampus of wistar rats: a histomorphometric study
Sathiya et al. An automatic classification and early disease detection technique for herbs plant
CN108073780B (en) Method for comparing clinical curative effects of traditional Chinese medicine compound
CN113178234B (en) Compound function prediction method based on neural network and connection graph algorithm
CN110648726B (en) Network target-based drug network pharmacology intelligent and quantitative analysis method and system
CN102188720A (en) Method for studying base of medicinal effect materials
CN113012820A (en) Identification method for key quality attributes of potential taste and flavor effects of traditional Chinese medicine preparation
CN105628885B (en) Analysis of Traditional Chinese Patent Medicine method based on multi-source data
CN104971364B (en) The screening technique of main effect drug ingedient in Chinese medicine compound prescription
CN115472241B (en) Chinese medicine component cluster menstruation determining method based on chemical structure topology index comparison and chromatographic imprinting measurement
Qiu et al. Identification of the Origin, Authenticity and Quality of Panax Japonicus Based on a Multistrategy Platform
Narzullaev et al. Methods for assessing risk factors to the health of highly qualified athletes
CN110322929A (en) A method of the direct target spot of prediction Chinese medicine compound prescription and action component
CN104547514A (en) Traditional Chinese medicine composition for treating systemic lupus erythematosus rheumatoid arthritis vasculitis and application thereof
CN113053455A (en) Application of potential taste-effect associated key quality attribute identification method in pediatric food retention removing and cough relieving oral liquid
CN114894944B (en) Identification method of external medicine flavor
CN103877509B (en) A kind of medicine and its preparation method treating leukemia
Muhammad et al. Evaluation of haematological parameters and blood glucose after a 28-day oral administration of standardized extract of Laggera aurita (Linn) in rats
Adhav et al. Survey on Healing Herbs Detection using Machine Learning
CN115372515B (en) Method for rapidly identifying authenticity of gastrodia elata powder based on electronic nose
Sinulingga et al. Imunomodulatory Effect of Red Tip Leaf’s (Syzygiummyrtifolium Walp.) Ethanol Extract on Male Rat

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant