CN110322926A - Identification method and device for miRNA sponge module - Google Patents


Info

Publication number
CN110322926A
Authority
CN
China
Prior art keywords
sponge
mirna
gene
module
genes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910684738.7A
Other languages
Chinese (zh)
Other versions
CN110322926B (en)
Inventor
张俊鹏
饶妮妮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Shaoping
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201910684738.7A
Publication of CN110322926A
Application granted
Publication of CN110322926B
Current legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations

Landscapes

  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Chemical & Material Sciences (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Analytical Chemistry (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Apparatus Associated With Microorganisms And Enzymes (AREA)

Abstract

The present invention provides a method and a device for identifying a miRNA sponge module. The method includes: obtaining an expression matrix of sponge genes and an expression matrix of target genes for matched samples; obtaining a plurality of sponge gene-target gene co-expression modules from the two expression matrices; obtaining, for each sponge gene-target gene co-expression module, the significance value of the shared miRNAs, the canonical correlation coefficient between the sponge genes and the target genes, and the sensitivity canonical correlation coefficient when miRNAs are shared; and determining from these parameters whether each sponge gene-target gene co-expression module is a miRNA sponge module. Because the scheme measures the competition strength between sponge genes and target genes at the module level, and decides whether a sponge gene-target gene co-expression module is a miRNA sponge module according to that competition strength, miRNA sponge modules are identified more accurately.

Description

Identification method and device of miRNA sponge module
Technical Field
The invention relates to the technical field of gene identification, in particular to a method and a device for identifying a miRNA sponge module.
Background
Micro ribonucleic acid (microRNA, miRNA) is a class of endogenous small non-coding RNA molecules, about 21-23 nucleotides in length, that bind messenger ribonucleic acid (mRNA) by complementary base pairing and thereby bring about cleavage of the mRNA, inhibition of translation and degradation. Different transcripts that share the same microRNA response elements (MREs) compete with one another, and such transcripts are collectively referred to as miRNA sponges. miRNA sponges do not act alone; they aggregate into clusters or modules that play important roles in physiological and pathological processes. Mining miRNA sponge modules is therefore of great significance for the study of human physiological and pathological processes.
In the prior art, miRNA sponge modules may be identified from a miRNA sponge interaction network, for example with the Markov Cluster Algorithm (MCL).
Because a miRNA sponge interaction network is assembled from individual sponge gene-target gene interaction pairs, it does not account for the influence of the shared miRNAs on the competition strength between the sponge genes and the target genes, so the miRNA sponge modules identified in this way are not accurate enough.
Disclosure of Invention
The invention aims to provide a method and a device for identifying a miRNA sponge module, which aim to solve the problem that the identified miRNA sponge module is not accurate enough in the prior art.
In a first aspect, an embodiment of the present invention provides a method for identifying a miRNA sponge module, including:
obtaining an expression matrix of sponge genes and an expression matrix of target genes for matched samples; obtaining a plurality of sponge gene-target gene co-expression modules from the expression matrix of the sponge genes and the expression matrix of the target genes, wherein each sponge gene-target gene co-expression module represents a group of sponge genes and target genes that are co-expressed; obtaining, in each sponge gene-target gene co-expression module, a significance value of the shared miRNAs, a canonical correlation coefficient between the sponge genes and the target genes, and a sensitivity canonical correlation coefficient when the miRNAs are shared; and determining whether each sponge gene-target gene co-expression module is a miRNA sponge module according to the number of sponge genes and target genes in the module, the significance value of the shared miRNAs, the canonical correlation coefficient between the sponge genes and the target genes, and the sensitivity canonical correlation coefficient when the miRNAs are shared.
In an alternative embodiment, obtaining a plurality of sponge gene-target gene co-expression modules according to the expression matrix of the sponge genes and the expression matrix of the target genes comprises: clustering the expression matrix of the sponge genes and the expression matrix of the target genes with a clustering algorithm to obtain a plurality of clustering results of sponge genes and target genes; and taking the sponge genes and the target genes in each clustering result as one sponge gene-target gene co-expression module.
In an alternative embodiment, the clustering algorithm is a one-way clustering algorithm or a two-way clustering (biclustering) algorithm. If it is a one-way clustering algorithm, the sponge genes and the target genes are clustered according to the expression matrix of the sponge genes, the expression matrix of the target genes and the matched samples. If it is a two-way clustering algorithm, the sponge genes, the target genes and a preset portion of the matched samples are clustered according to the expression matrix of the sponge genes, the expression matrix of the target genes and the preset portion of the matched samples.
In an alternative embodiment, obtaining a significance value of the shared miRNAs comprises: obtaining, by a hypergeometric distribution test and according to a preset miRNA-target gene regulatory relationship, the significance value of the miRNAs shared between the sponge genes and the target genes in the sponge gene-target gene co-expression module.
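The hypergeometric test for shared miRNAs can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name and the four counts (total miRNAs in the preset regulatory relationship, miRNAs regulating the sponge genes, miRNAs regulating the target genes, and miRNAs shared by both) are assumptions introduced here.

```python
from math import comb


def shared_mirna_p_value(n_total, n_sponge, n_target, n_shared):
    """Upper-tail hypergeometric probability P(X >= n_shared).

    Hypothetical inputs (assumed, not named in the source):
      n_total  - miRNAs in the preset miRNA-target regulatory relationship
      n_sponge - miRNAs regulating the sponge genes of the module
      n_target - miRNAs regulating the target genes of the module
      n_shared - miRNAs regulating both gene groups
    """
    upper = min(n_sponge, n_target)
    denom = comb(n_total, n_target)
    # Sum the probability mass of every overlap at least as large as observed.
    tail = sum(
        comb(n_sponge, i) * comb(n_total - n_sponge, n_target - i)
        for i in range(n_shared, upper + 1)
    )
    return tail / denom


# Example with made-up counts: a small p-value suggests the overlap
# is larger than expected by chance.
p = shared_mirna_p_value(n_total=200, n_sponge=20, n_target=25, n_shared=10)
print(f"p-value of the shared miRNAs: {p:.3e}")
```

A module would then pass this criterion when the returned value is below the chosen significance cutoff.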
In an alternative embodiment, obtaining a canonical correlation coefficient between the sponge genes and the target genes comprises: obtaining the column vector of the sponge genes and the column vector of the target genes according to the expression matrix of the sponge genes and the expression matrix of the target genes; obtaining a variance matrix and a covariance matrix between the expression matrix of the sponge genes and the expression matrix of the target genes; and calculating the canonical correlation coefficient according to the column vector of the sponge genes, the column vector of the target genes, the variance matrix, the covariance matrix and a preset canonical vector.
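For orientation, the first canonical correlation between two groups of variables is conventionally defined as below. The notation is assumed here rather than taken from the source: S and T denote the sponge gene and target gene expression variables, \Sigma_{SS} and \Sigma_{TT} their variance matrices, \Sigma_{ST} the covariance matrix between them, and a, b the canonical vectors.

```latex
\rho_{ST} \;=\; \max_{a,\,b}\;
\frac{a^{\top}\,\Sigma_{ST}\,b}
     {\sqrt{a^{\top}\,\Sigma_{SS}\,a}\;\sqrt{b^{\top}\,\Sigma_{TT}\,b}}
```

Under this definition the coefficient lies between 0 and 1, with larger values indicating stronger co-expression between the two gene groups.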
In an alternative embodiment, obtaining the sensitivity canonical correlation coefficient when miRNAs are shared comprises: obtaining a canonical correlation coefficient between the shared miRNAs and the sponge genes and a canonical correlation coefficient between the shared miRNAs and the target genes; calculating a partial canonical correlation coefficient between the sponge genes and the target genes according to the canonical correlation coefficient between the sponge genes and the target genes, the canonical correlation coefficient between the shared miRNAs and the sponge genes, and the canonical correlation coefficient between the shared miRNAs and the target genes; and subtracting the partial canonical correlation coefficient between the sponge genes and the target genes from the canonical correlation coefficient between the sponge genes and the target genes to obtain the sensitivity canonical correlation coefficient when miRNAs are shared.
In an alternative embodiment, obtaining the canonical correlation coefficient between the shared miRNAs and the sponge genes and the canonical correlation coefficient between the shared miRNAs and the target genes comprises: obtaining an expression matrix of the shared miRNAs; obtaining the column vector of the shared miRNAs, the column vector of the sponge genes and the column vector of the target genes according to the expression matrix of the shared miRNAs, the expression matrix of the sponge genes and the expression matrix of the target genes; obtaining a variance matrix and a covariance matrix between the expression matrix of the shared miRNAs and the expression matrix of the sponge genes; calculating the canonical correlation coefficient between the shared miRNAs and the sponge genes according to the column vector of the shared miRNAs, the column vector of the sponge genes, that variance matrix, that covariance matrix and a preset canonical vector; obtaining a variance matrix and a covariance matrix between the expression matrix of the shared miRNAs and the expression matrix of the target genes; and calculating the canonical correlation coefficient between the shared miRNAs and the target genes according to the column vector of the shared miRNAs, the column vector of the target genes, that variance matrix, that covariance matrix and a preset canonical vector.
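The subtraction step above can be sketched in a few lines. The source states only that the partial coefficient is computed from the three pairwise coefficients; the sketch assumes it takes the same form as the standard first-order partial correlation, and all function and parameter names are hypothetical.

```python
from math import sqrt


def partial_canonical_correlation(cc_st, cc_ms, cc_mt):
    """Partial coefficient between sponge and target genes given the shared miRNAs.

    Assumed form (standard first-order partial correlation):
      (cc_st - cc_ms * cc_mt) / sqrt((1 - cc_ms^2) * (1 - cc_mt^2))
    cc_st: sponge genes vs target genes
    cc_ms: shared miRNAs vs sponge genes
    cc_mt: shared miRNAs vs target genes
    """
    denom = sqrt((1.0 - cc_ms ** 2) * (1.0 - cc_mt ** 2))
    if denom == 0.0:
        raise ValueError("partial coefficient undefined when |cc_ms| or |cc_mt| is 1")
    return (cc_st - cc_ms * cc_mt) / denom


def sensitivity_canonical_correlation(cc_st, cc_ms, cc_mt):
    """Canonical coefficient minus its partial counterpart, as described above."""
    return cc_st - partial_canonical_correlation(cc_st, cc_ms, cc_mt)


# Made-up coefficients for illustration only.
print(sensitivity_canonical_correlation(cc_st=0.9, cc_ms=0.6, cc_mt=0.6))
```

When the shared miRNAs are uncorrelated with both gene groups, the partial coefficient equals the canonical coefficient and the sensitivity value is zero, which matches the intuition that the miRNAs then contribute nothing to the competition.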
In an alternative embodiment, determining whether each sponge gene-target gene co-expression module is a miRNA sponge module according to the number of sponge genes and target genes in each sponge gene-target gene co-expression module, the significance value of the shared miRNAs, the canonical correlation coefficient between the sponge genes and the target genes, and the sensitivity canonical correlation coefficient when miRNAs are shared comprises: determining that the sponge gene-target gene co-expression module is a miRNA sponge module if the number of sponge genes and the number of target genes in the module are both greater than or equal to 2, the significance value of the shared miRNAs is less than 0.05, the canonical correlation coefficient between the sponge genes and the target genes is greater than 0.8, and the sensitivity canonical correlation coefficient when miRNAs are shared is greater than 0.1.
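The four conditions combine into a single predicate. The sketch below uses the thresholds stated above as default arguments; the function and parameter names are illustrative assumptions.

```python
def is_mirna_sponge_module(
    n_sponge,
    n_target,
    p_shared,
    cc,
    scc,
    min_genes=2,
    p_cutoff=0.05,
    cc_cutoff=0.8,
    scc_cutoff=0.1,
):
    """Return True when a co-expression module meets all four criteria.

    n_sponge, n_target: numbers of sponge genes and target genes in the module
    p_shared: significance value of the shared miRNAs
    cc: canonical correlation coefficient between sponge and target genes
    scc: sensitivity canonical correlation coefficient when miRNAs are shared
    """
    return (
        n_sponge >= min_genes
        and n_target >= min_genes
        and p_shared < p_cutoff
        and cc > cc_cutoff
        and scc > scc_cutoff
    )


# Made-up values for illustration only.
print(is_mirna_sponge_module(n_sponge=3, n_target=4, p_shared=0.01, cc=0.85, scc=0.15))
```

Keeping the cutoffs as parameters makes it easy to test how sensitive the set of identified modules is to each threshold.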
In an alternative embodiment, after determining whether each sponge gene-target gene co-expression module is a miRNA sponge module, the method further comprises: if the sponge gene-target gene co-expression module is a miRNA sponge module, obtaining relevant data of the miRNA sponge module and the target gene, wherein the relevant data comprises significance data of the miRNA sponge module relative to a preset miRNA sponge module, significance data of the miRNA sponge module and the target gene, and whether the miRNA sponge module is a marker of the target gene, and wherein the miRNA sponge module comprises long non-coding ribonucleic acid (LncRNA) and protein-coding ribonucleic acid (mRNA).
In an alternative embodiment, obtaining significance data of the miRNA sponge module and the preset miRNA sponge module comprises: obtaining a first Pearson correlation coefficient value, namely the mean absolute Pearson correlation of all LncRNA-mRNA co-expression pairs in each miRNA sponge module; generating, by a preset random algorithm, a plurality of preset miRNA sponge modules according to all LncRNAs and mRNAs in each miRNA sponge module; obtaining a second Pearson correlation coefficient value, namely the mean absolute Pearson correlation of all LncRNA-mRNA co-expression pairs in each preset miRNA sponge module; and obtaining the significance data of the miRNA sponge module and the preset miRNA sponge module from the first Pearson correlation coefficient value and the second Pearson correlation coefficient value by a preset paired difference test algorithm.
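A minimal sketch of the mean absolute Pearson correlation and of drawing a random comparison module follows. It assumes expression profiles are stored as a dictionary of equal-length lists and that a random module has the same numbers of LncRNAs and mRNAs as the module it is compared with; the data layout and all names are hypothetical, and the paired difference test itself is not shown.

```python
import random
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_abs_pearson(expr, lncrnas, mrnas):
    """Mean |Pearson r| over every LncRNA-mRNA pair of a module."""
    pairs = list(product(lncrnas, mrnas))
    return sum(abs(pearson(expr[l], expr[m])) for l, m in pairs) / len(pairs)


def random_module(all_lncrnas, all_mrnas, n_lnc, n_mrna, rng):
    """Draw a random module of the given size from a background pool (assumed scheme)."""
    return rng.sample(all_lncrnas, n_lnc), rng.sample(all_mrnas, n_mrna)


# Toy expression profiles over four matched samples (made-up numbers).
expr = {
    "lnc1": [1.0, 2.0, 3.0, 4.0],
    "mrna1": [2.0, 4.0, 6.0, 8.0],
    "mrna2": [4.0, 3.0, 2.0, 1.0],
    "mrna3": [1.0, 3.0, 2.0, 5.0],
}
observed = mean_abs_pearson(expr, ["lnc1"], ["mrna1", "mrna2"])
rand_lnc, rand_mrna = random_module(["lnc1"], ["mrna1", "mrna2", "mrna3"], 1, 2, random.Random(0))
print(observed, mean_abs_pearson(expr, rand_lnc, rand_mrna))
```

Repeating the random draw many times gives the second set of values against which the observed modules are compared.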
In an alternative embodiment, obtaining significance data of the miRNA sponge module and the target gene comprises: obtaining the number of LncRNAs and mRNAs in a preset data set and the number of LncRNAs and mRNAs in the preset data set that are co-expressed with the target gene; obtaining the number of LncRNAs and mRNAs in the miRNA sponge module and the number of LncRNAs and mRNAs in the miRNA sponge module that are co-expressed with the target gene; and calculating the significance data of the miRNA sponge module and the target gene from these four numbers by a hypergeometric distribution test.
In an alternative embodiment, determining whether a miRNA sponge module is a marker of the target gene comprises: calculating a risk value of each matched sample according to the miRNA through a first preset algorithm; binarizing the risk values of the matched samples to obtain a first set of risk values and a second set of risk values, wherein the risk values in the first set are greater than the risk values in the second set; calculating the risk ratio of the first set of risk values to the second set of risk values through a second preset algorithm; obtaining the significance of the difference between the first set of risk values and the second set of risk values through a test algorithm; and determining whether the miRNA is a marker of the target gene according to the risk ratio and the significance of the difference.
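The binarization step can be illustrated as follows. The source does not say how the cutoff is chosen, so the sketch assumes a median split; the function name and the sample identifiers are hypothetical, and the risk ratio and significance test are not shown.

```python
from statistics import median


def split_by_risk(risk_by_sample):
    """Split samples into a high-risk and a low-risk group.

    Assumption: the cutoff is the median risk value. Every value in the
    first (high) group is strictly greater than every value in the second.
    """
    cutoff = median(risk_by_sample.values())
    high = {s: r for s, r in risk_by_sample.items() if r > cutoff}
    low = {s: r for s, r in risk_by_sample.items() if r <= cutoff}
    return high, low


# Made-up risk values for four matched samples.
risks = {"s1": 0.2, "s2": 1.5, "s3": 0.9, "s4": 2.1}
high_risk, low_risk = split_by_risk(risks)
print(sorted(high_risk), sorted(low_risk))
```

The two groups returned here would be the inputs to the risk ratio calculation and to the difference test described above.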
In a second aspect, an embodiment of the present invention provides an apparatus for identifying a miRNA sponge module, including an obtaining module and a determining module. The obtaining module is used for obtaining an expression matrix of sponge genes and an expression matrix of target genes for matched samples. The obtaining module is also used for obtaining a plurality of sponge gene-target gene co-expression modules according to the expression matrix of the sponge genes and the expression matrix of the target genes, wherein each sponge gene-target gene co-expression module represents a group of sponge genes and target genes that are co-expressed. The obtaining module is further used for obtaining, in each sponge gene-target gene co-expression module, the significance value of the shared miRNAs, the canonical correlation coefficient between the sponge genes and the target genes, and the sensitivity canonical correlation coefficient when miRNAs are shared. The determining module is used for determining whether each sponge gene-target gene co-expression module is a miRNA sponge module according to the number of sponge genes and target genes in each sponge gene-target gene co-expression module, the significance value of the shared miRNAs, the canonical correlation coefficient between the sponge genes and the target genes, and the sensitivity canonical correlation coefficient when miRNAs are shared.
In an optional embodiment, the obtaining module is specifically configured to cluster the expression matrix of the sponge genes and the expression matrix of the target genes according to a clustering algorithm, and obtain a plurality of clustering results of the sponge genes and the target genes. And taking the sponge gene and the target gene in each clustering result as a sponge gene-target gene co-expression module.
In an alternative embodiment, the clustering algorithm comprises: a one-way clustering algorithm or a two-way clustering algorithm. And if the clustering algorithm is a one-way clustering algorithm, clustering the sponge genes and the target genes according to the expression matrix of the sponge genes, the expression matrix of the target genes and the matched sample. And if the clustering algorithm is a bidirectional clustering algorithm, clustering the sponge genes, the target genes and the matching samples of the preset part according to the expression matrix of the sponge genes, the expression matrix of the target genes and the matching samples of the preset part.
In an optional embodiment, the obtaining module is specifically configured to obtain, according to a preset miRNA-target gene regulation relationship, a significance value of a miRNA shared between a sponge gene and a target gene in the sponge gene-target gene co-expression module through a hyper-geometric distribution test algorithm.
In an alternative embodiment, the obtaining module is specifically configured to: obtain the column vector of the sponge genes and the column vector of the target genes according to the expression matrix of the sponge genes and the expression matrix of the target genes; obtain a variance matrix and a covariance matrix between the expression matrix of the sponge genes and the expression matrix of the target genes; and calculate the canonical correlation coefficient according to the column vector of the sponge genes, the column vector of the target genes, the variance matrix, the covariance matrix and a preset canonical vector.
In an alternative embodiment, the obtaining module is specifically configured to: obtain a canonical correlation coefficient between the shared miRNAs and the sponge genes and a canonical correlation coefficient between the shared miRNAs and the target genes; calculate a partial canonical correlation coefficient between the sponge genes and the target genes according to the canonical correlation coefficient between the sponge genes and the target genes, the canonical correlation coefficient between the shared miRNAs and the sponge genes, and the canonical correlation coefficient between the shared miRNAs and the target genes; and subtract the partial canonical correlation coefficient between the sponge genes and the target genes from the canonical correlation coefficient between the sponge genes and the target genes to obtain the sensitivity canonical correlation coefficient when miRNAs are shared.
In an alternative embodiment, the obtaining module is specifically configured to: obtain an expression matrix of the shared miRNAs; obtain the column vector of the shared miRNAs, the column vector of the sponge genes and the column vector of the target genes according to the expression matrix of the shared miRNAs, the expression matrix of the sponge genes and the expression matrix of the target genes; obtain a variance matrix and a covariance matrix between the expression matrix of the shared miRNAs and the expression matrix of the sponge genes; calculate the canonical correlation coefficient between the shared miRNAs and the sponge genes according to the column vector of the shared miRNAs, the column vector of the sponge genes, that variance matrix, that covariance matrix and a preset canonical vector; obtain a variance matrix and a covariance matrix between the expression matrix of the shared miRNAs and the expression matrix of the target genes; and calculate the canonical correlation coefficient between the shared miRNAs and the target genes according to the column vector of the shared miRNAs, the column vector of the target genes, that variance matrix, that covariance matrix and a preset canonical vector.
In an alternative embodiment, the determining module is specifically configured to determine that the sponge gene-target gene co-expression module is a miRNA sponge module if the number of sponge genes and the number of target genes in the sponge gene-target gene co-expression module are both greater than or equal to 2, the significance value of the shared miRNAs is less than 0.05, the canonical correlation coefficient between the sponge genes and the target genes is greater than 0.8, and the sensitivity canonical correlation coefficient when miRNAs are shared is greater than 0.1.
In an optional embodiment, the obtaining module is further configured to obtain relevant data of the miRNA sponge module and the target gene if the sponge gene-target gene co-expression module is a miRNA sponge module, wherein the relevant data includes significance data of the miRNA sponge module relative to a preset miRNA sponge module, significance data of the miRNA sponge module and the target gene, and whether the miRNA sponge module is a marker of the target gene, and wherein the miRNA sponge module includes long non-coding RNA (LncRNA) and mRNA.
In an alternative embodiment, the obtaining module is specifically configured to: obtain a first Pearson correlation coefficient value, namely the mean absolute Pearson correlation of all LncRNA-mRNA co-expression pairs in each miRNA sponge module; generate, by a preset random algorithm, a plurality of preset miRNA sponge modules according to all LncRNAs and mRNAs in each miRNA sponge module; obtain a second Pearson correlation coefficient value, namely the mean absolute Pearson correlation of all LncRNA-mRNA co-expression pairs in each preset miRNA sponge module; and obtain the significance data of the miRNA sponge module and the preset miRNA sponge module from the first Pearson correlation coefficient value and the second Pearson correlation coefficient value by a preset paired difference test algorithm.
In an alternative embodiment, the obtaining module is specifically configured to obtain the number of LncRNA and mRNA in the preset data set and the number of LncRNA and mRNA co-expressed with the target gene in the preset data set. And acquiring the quantity of LncRNA and mRNA in the miRNA sponge module and the quantity of LncRNA and mRNA coexpressed by the miRNA sponge module and the target gene. And calculating significance data of the miRNA sponge module and the target gene by a hyper-geometric distribution test method according to the number of LncRNA and mRNA in the preset data set and the number of LncRNA and mRNA co-expressed with the target gene in the preset data set, the number of LncRNA and mRNA in the miRNA sponge module and the number of LncRNA and mRNA co-expressed with the target gene in the miRNA sponge module.
In an alternative embodiment, the obtaining module is specifically configured to calculate, according to the miRNA, a risk value for each matching sample by a first preset algorithm. Binarizing the risk value of each matched sample to obtain a first set of risk values and a second set of risk values, wherein the risk values in the first set of risk values are greater than the risk values in the second set of risk values. And calculating the risk ratio of the first risk value set and the second risk value set through a second preset algorithm according to the first risk value set and the second risk value set. And acquiring the difference significance of the first risk value set and the second risk value set according to the first risk value set and the second risk value set and the test algorithm. And determining whether the miRNA is a marker of the target gene according to the risk ratio and the difference significance.
In a third aspect, an embodiment of the present invention provides an apparatus for identifying a miRNA sponge module, including: a processor, a storage medium and a bus, wherein the storage medium stores machine-readable instructions executable by the processor; when the identification device of the miRNA sponge module is running, the processor and the storage medium communicate through the bus, and the processor executes the machine-readable instructions to perform the steps of any one of the methods of the first aspect.
In a fourth aspect, the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of any one of the methods in the first aspect.
In the invention, the significance value of the shared miRNAs, the canonical correlation coefficient between the sponge genes and the target genes, and the sensitivity canonical correlation coefficient when miRNAs are shared are obtained for each of a plurality of sponge gene-target gene co-expression modules, and whether each sponge gene-target gene co-expression module is a miRNA sponge module is determined according to these three values together with the number of sponge genes and target genes in the module. The competition strength between the sponge genes and the target genes can thus be measured at the module level, and whether a sponge gene-target gene co-expression module is a miRNA sponge module is determined according to that competition strength, so that the identification of miRNA sponge modules is more accurate.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a schematic flow chart of a method for identifying a miRNA sponge module according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention;
fig. 3 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention;
fig. 4 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention;
fig. 5 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention;
fig. 6 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention;
fig. 7 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention;
fig. 8 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention;
fig. 9 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention;
fig. 10 is a schematic diagram illustrating a comparison of co-expression levels of miRNA sponge modules and corresponding random modules in scenario one according to an embodiment of the present invention;
fig. 11 is a schematic diagram illustrating a comparison of co-expression levels of miRNA sponge modules and corresponding random modules in scenario two according to an embodiment of the present invention;
fig. 12 is a schematic structural diagram of an identification apparatus for a miRNA sponge module according to an embodiment of the present invention;
fig. 13 is a schematic structural diagram of an identification device of a miRNA sponge module according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters denote like items in the following figures and formulas, and thus, once an item is defined in one figure or formula, it need not be further defined and explained in subsequent figures or formulas.
Some embodiments of the invention are described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Fig. 1 is a schematic flow chart of a method for identifying a miRNA sponge module according to an embodiment of the present invention. The method may be executed by a device with computing capability, such as a desktop computer, a notebook computer, a server, a cloud platform, a customized terminal or an intelligent terminal, which is not limited herein.
As shown in fig. 1, the identification method of the miRNA sponge module includes:
S110, obtaining an expression matrix of the sponge genes and an expression matrix of the target genes of the matched samples.
In some embodiments, sponge gene expression profile data and target gene expression profile data of the matched samples are obtained first; because gene expression data are usually in matrix form, these data directly yield the expression matrix of the sponge genes and the expression matrix of the target genes.
In some embodiments, the gene expression profile data may be obtained with a gene chip. For example, a plurality of gene probes are fixed on a carrier such as a glass slide, a polypropylene membrane or a nylon membrane in a predetermined arrangement to form the gene chip. The gene probe is then purified, reverse-transcribed or fluorescently labeled to obtain a labeled probe, the fluorescent labels usually being Cy3-dUTP (Cy3) and Cy5-dUTP (Cy5). Taking fluorescent labeling as an example, the labeled probe is hybridized with the chip for a set time under set conditions, the unbound probe is washed away, and the fluorescence signal is scanned and analyzed to obtain a hybridization image. Finally, the gene expression profile data are extracted from the hybridization image.
Wherein, both the sponge genes and the target genes are targets of miRNAs, and their RNA type can be one of: other coding RNA, pseudogene (Pseudogene), circular ribonucleic acid (CircRNA), LncRNA and mRNA.
S120, obtaining a plurality of sponge gene-target gene co-expression modules according to the expression matrix of the sponge genes and the expression matrix of the target genes.
Wherein each sponge gene-target gene co-expression module represents a group of sponge genes and target genes that are co-expressed.
In some embodiments, the expression matrix $D_1$ of the sponge genes can be:
$$D_1 = \{G_{1,1}; G_{1,2}; \ldots; G_{1,S}\} \in R^{S \times n_1}$$
and the expression matrix $D_2$ of the target genes can be:
$$D_2 = \{G_{2,1}; G_{2,2}; \ldots; G_{2,S}\} \in R^{S \times n_2}$$
wherein each row $G_{1,s}$ or $G_{2,s}$ holds the gene expression values of the s-th matched sample, S is the number of matched samples, $n_1$ represents the number of sponge genes in each matched sample, and $n_2$ represents the number of target genes in each matched sample.
S130, obtaining, for each sponge gene-target gene co-expression module, a significance value of the shared miRNA, a typical (canonical) correlation coefficient between the sponge gene and the target gene, and a sensitivity typical (canonical) correlation coefficient when the miRNA is shared.
It should be noted that the significance value of shared miRNA, the typical correlation coefficient between the sponge gene and the target gene, and the typical correlation coefficient of sensitivity when miRNA is shared are important parameters in the miRNA sponge module competition mechanism.
Wherein, the miRNA expression profile data must come from the same matched samples as the sponge gene and target gene expression profile data, and is expressed as $D_3 = \{G_{3,1}; G_{3,2}; \ldots; G_{3,S}\} \in R^{S \times n_3}$, where S is the number of matched samples and $n_3$ represents the number of miRNAs in each matched sample.
In some embodiments, there are five types of RNA groups in the miRNA sponge module competition mechanism: other coding RNA groups, Pseudogene group, CircRNA group, LncRNA group and mRNA group, typical competition types include: the other encoding RNA group, Pseudogene group, CircRNA group or LncRNA group compete with the mRNA group.
In some embodiments, if the other coding RNA group, Pseudogene group, CircRNA group or LncRNA group wins the competition, the mRNA group is entirely translated into protein; if the mRNA group wins the competition, the mRNA group is entirely degraded.
Alternatively, other competition patterns exist, such as other coding RNA groups and Pseudogene groups, other coding RNA groups and CircRNA groups, other coding RNA groups and LncRNA groups, Pseudogene groups and CircRNA groups, Pseudogene groups and LncRNA groups, and CircRNA groups and LncRNA groups, etc., without limitation.
S140, determining whether each sponge gene-target gene co-expression module is a miRNA sponge module according to the number of the sponge genes and the target genes in each sponge gene-target gene co-expression module, the significance value of the shared miRNA, the typical correlation coefficient between the sponge genes and the target genes and the sensitivity typical correlation coefficient when the miRNA is shared.
It should be noted that when the target gene competition in the sponge gene-target gene co-expression module is successfully determined according to the miRNA sponge module competition mechanism, specific parameters of the significance value of the shared miRNA, the typical correlation coefficient between the sponge gene and the target gene, and the sensitivity typical correlation coefficient when the miRNA is shared may be determined, so as to determine whether the sponge gene-target gene co-expression module is the miRNA sponge module according to the parameters.
In this embodiment, whether each sponge gene-target gene co-expression module is a miRNA sponge module is determined by obtaining a significance value of a shared miRNA, a typical correlation coefficient between a sponge gene and a target gene, and a sensitivity typical correlation coefficient when the miRNA is shared among a plurality of sponge gene-target gene co-expression modules, and according to the number of the sponge gene and the target gene in each sponge gene-target gene co-expression module, the significance value of the shared miRNA, the typical correlation coefficient between the sponge gene and the target gene, and the sensitivity typical correlation coefficient when the miRNA is shared. The competitive strength between the sponge gene and the target gene can be measured at the module level, and whether the sponge gene-target gene co-expression module is the miRNA sponge module or not is determined according to the competitive strength between the sponge gene and the target gene, so that the identification of the miRNA sponge module is more accurate.
Fig. 2 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention.
In an alternative embodiment, as shown in fig. 2, obtaining a plurality of sponge gene-target gene co-expression modules according to the expression matrix of the sponge genes and the expression matrix of the target genes comprises:
S121, clustering the expression matrix of the sponge genes and the expression matrix of the target genes according to a clustering algorithm to obtain a plurality of clustering results of the sponge genes and the target genes.
In some embodiments, by clustering, the co-expressible sponge gene and the target gene can be integrated to obtain multiple clustering results.
In an alternative embodiment, the clustering algorithm comprises: a one-way clustering algorithm or a two-way clustering algorithm. And if the clustering algorithm is a one-way clustering algorithm, clustering the sponge genes and the target genes according to the expression matrix of the sponge genes, the expression matrix of the target genes and the matched sample. And if the clustering algorithm is a bidirectional clustering algorithm, clustering the sponge genes, the target genes and the matching samples of the preset part according to the expression matrix of the sponge genes, the expression matrix of the target genes and the matching samples of the preset part.
In some embodiments, the one-way clustering algorithm includes, but is not limited to, weighted gene co-expression network analysis (WGCNA), K-means clustering (K-Means), hierarchical clustering, density-based spatial clustering of applications with noise (DBSCAN), and fuzzy C-means clustering (FCM). Taking WGCNA as an example of one-way clustering: for the sponge genes and the target genes determined by the expression matrices, the co-expression similarity value $s_{ij}$ of each pair of genes i and j is:
$$s_{ij} = |\mathrm{cor}(i, j)|$$
wherein $|\mathrm{cor}(i, j)|$ is the absolute value of the Pearson correlation coefficient of gene i and gene j.
The gene co-expression similarity matrix may be defined as $S = [s_{ij}]$. A soft threshold is selected using the scale-free topology criterion, and the similarity matrix is converted into an adjacency matrix $A = [a_{ij}]$ based on the soft threshold; the minimum scale-free topology fit index $R^2$ is usually not less than 0.8. Based on the adjacency matrix A, WGCNA generates a topological overlap matrix (TOM) $W = [w_{ij}]$. The TOM similarity value $w_{ij}$ of gene i and gene j is:
$$w_{ij} = \frac{\sum_{u} a_{iu} a_{uj} + a_{ij}}{\min\left(\sum_{u} a_{iu}, \sum_{u} a_{ju}\right) + 1 - a_{ij}}$$
wherein u ranges over all genes in the sponge gene expression matrix and the target gene expression matrix of the matched samples, and the TOM dissimilarity value of genes i and j is $d_{ij} = 1 - w_{ij}$. To identify the sponge gene-target gene co-expression modules, WGCNA applies hierarchical clustering (HC) to the TOM dissimilarity matrix $D = [d_{ij}]$. The identified sponge gene-target gene co-expression modules have high topological overlap.
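The WGCNA quantities described above can be sketched in Python with NumPy. This is only a minimal illustration and not the WGCNA package itself: the function name `tom_dissimilarity`, the power form of the adjacency ($a_{ij} = s_{ij}^{\beta}$) and the example value β = 6 are assumptions; in practice β would be selected with the scale-free topology criterion.

```python
import numpy as np

def tom_dissimilarity(expr, beta=6):
    """expr: samples x genes matrix (sponge and target genes side by side).

    Returns the similarity matrix S, the adjacency matrix A and the
    TOM dissimilarity matrix D = 1 - W.
    """
    s = np.abs(np.corrcoef(expr, rowvar=False))  # s_ij = |cor(i, j)|
    a = s ** beta                                # assumed soft-threshold adjacency
    np.fill_diagonal(a, 0.0)                     # no self-connections
    k = a.sum(axis=0)                            # connectivity: sum_u a_iu
    shared = a @ a                               # sum_u a_iu * a_uj
    w = (shared + a) / (np.minimum.outer(k, k) + 1.0 - a)
    np.fill_diagonal(w, 1.0)                     # a gene fully overlaps itself
    return s, a, 1.0 - w

# Hypothetical toy data: 20 matched samples, 6 genes, genes 0 and 1 co-expressed.
rng = np.random.default_rng(0)
expr = rng.normal(size=(20, 6))
expr[:, 1] = expr[:, 0] + 0.1 * rng.normal(size=20)
s, a, d = tom_dissimilarity(expr)
```

The matrix `d` can then be passed to any hierarchical clustering routine to cut the genes into candidate co-expression modules.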
The bidirectional clustering algorithm includes, but is not limited to, one of: sparse group factor analysis (SGFA), factor analysis for bicluster acquisition (FABIA), spectral biclustering (BCSpectral) and plaid model biclustering (BCPlaid). Taking SGFA as an example of bidirectional clustering: for the sponge genes and the target genes determined by the expression matrices, the number of bidirectional clusters to be mined is set as B, and the membership or association degree of each gene (sponge gene or target gene) and of each matched sample with each bidirectional cluster needs to be calculated. The association degree $g_{n,k}$ between the n-th gene and the k-th bidirectional cluster and the association degree between the d-th matched sample and the k-th bidirectional cluster are calculated as follows:
wherein:
m represents the index of the input data set; since the SGFA input consists of two data sets, the sponge gene expression matrix and the target gene expression matrix, the maximum value of m is 2. One binary variable represents whether the k-th bidirectional cluster contains the n-th gene (1 if contained, 0 if not), with an associated scale factor for the k-th bidirectional cluster. Another binary variable represents whether the k-th bidirectional cluster contains the d-th sample from the m-th expression profile data (1 if contained, 0 if not), with an associated scale factor for the k-th bidirectional cluster in the m-th expression profile data. A probability term gives the probability that the corresponding binary variable equals 1. The hyper-parameters $a_\pi, b_\pi, a_\alpha, b_\alpha$ and the initial rate-of-change parameter $\delta_0$ default to 1. The association degree $g_{n,k}$ and the sample association degree take values in [-1, 1]. The strength of association of each gene and each matched sample with each bidirectional cluster is determined by the absolute value of association (AVA), which is usually not less than 0.8.
S122, using the sponge genes and the target genes in each clustering result as a sponge gene-target gene co-expression module.
For example, the clustering in S121 yields a plurality of clustering results, each containing sponge genes and target genes; the sponge genes and the target genes of each clustering result are taken as one sponge gene-target gene co-expression module.
In this embodiment, the sponge genes and the target genes are clustered by a unidirectional clustering or bidirectional clustering method to obtain a sponge gene-target gene co-expression module, so that the obtained sponge gene-target gene co-expression module is more accurate, and the obtained miRNA sponge module is more accurate when the miRNA sponge module is determined.
In an alternative embodiment, obtaining a significance value for a shared miRNA comprises: and obtaining the significance value of the shared miRNA between the sponge gene and the target gene in the sponge gene-target gene co-expression module through a hyper-geometric distribution test algorithm according to a preset miRNA-target gene regulation relationship.
In some embodiments, the preset miRNA-target gene regulation relationship is obtained by merging a plurality of different experimentally validated databases. The preset miRNA-target gene regulation relationship comprises miRNA-lncRNA regulation relationship data and miRNA-mRNA regulation relationship data. The miRNA-lncRNA regulation relationship data can be obtained by integrating two databases, NPInter v3.0 and the experimental module of LncBase v2.0, and the miRNA-mRNA regulation relationship data can be obtained by integrating three databases, miRTarBase v7.0, TarBase v7.0 and miRWalk v2.0, but not limited thereto.
In some embodiments, the significance value p of shared miRNAs between the sponge genes and the target genes can be calculated by the following formula:
$$p = 1 - \sum_{i=0}^{L_1 - 1} \frac{\binom{M_1}{i}\binom{N_1 - M_1}{K_1 - i}}{\binom{N_1}{K_1}}$$
wherein $N_1$ represents the number of all miRNAs in the data set, $M_1$ and $K_1$ respectively represent the numbers of miRNAs regulating the sponge genes and the target genes, and $L_1$ represents the number of miRNAs shared by the sponge genes and the target genes; the value of $L_1$ is usually not less than 3.
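The hyper-geometric test on shared miRNAs can be sketched as follows, assuming the usual upper-tail form (the probability of observing at least $L_1$ shared miRNAs by chance). The function name `shared_mirna_pvalue` and the example counts other than the 674 miRNAs of the data set are hypothetical.

```python
from math import comb

def shared_mirna_pvalue(n_total, m_sponge, k_target, l_shared):
    """Upper-tail hyper-geometric p-value P(X >= l_shared).

    n_total : N1, number of miRNAs in the data set
    m_sponge: M1, number of miRNAs regulating the sponge genes
    k_target: K1, number of miRNAs regulating the target genes
    l_shared: L1, number of miRNAs shared by both gene sets
    """
    denom = comb(n_total, k_target)
    lower_tail = sum(
        comb(m_sponge, i) * comb(n_total - m_sponge, k_target - i)
        for i in range(l_shared)
    )
    return 1.0 - lower_tail / denom

# Hypothetical module: 20 and 25 regulating miRNAs, 5 of them shared.
p = shared_mirna_pvalue(n_total=674, m_sponge=20, k_target=25, l_shared=5)
```

A module would pass this criterion when `p` falls below the chosen significance threshold.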
In the embodiment, a preset miRNA-target gene regulation and control relation is obtained by fusing a plurality of different experimental verification database modes, the significance value of the shared miRNA between the sponge gene and the target gene is obtained through hyper-geometric distribution inspection according to the regulation and control relation, the accuracy of the obtained significance value is higher, the obtained significance value of the shared miRNA between the sponge gene and the target gene is used for determining the miRNA sponge module, and the determined miRNA sponge module is more accurate.
Fig. 3 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention.
In an alternative embodiment, as shown in fig. 3, a typical correlation coefficient between a sponge gene and a target gene is obtained, including:
S131, obtaining column vectors of the sponge genes and column vectors of the target genes according to the expression matrix of the sponge genes and the expression matrix of the target genes.
In some embodiments, for a sponge gene-target gene co-expression module, the column vector of the sponge genes and the column vector of the target genes collect the column elements of the expression matrix of the sponge genes and of the expression matrix of the target genes respectively. For example, the column vector in the expression matrix of the sponge genes may be $X = (x_1, x_2, \ldots, x_p)^T$ and the column vector in the expression matrix of the target genes may be $Y = (y_1, y_2, \ldots, y_q)^T$, but not limited thereto.
S132, obtaining a variance matrix and a covariance matrix between the expression matrix of the sponge genes and the expression matrix of the target genes.
In some embodiments, $\Sigma_{XX}$ and $\Sigma_{YY}$ may be used to represent the variance matrices calculated from the matrices X and Y respectively, and $\Sigma_{XY}$ to represent the covariance matrix between the matrices X and Y.
S133, calculating a typical correlation coefficient according to the column vector of the sponge gene, the column vector of the target gene, the variance matrix, the covariance matrix and a preset typical vector.
In one possible implementation, the canonical correlation (CC) coefficient between the sponge genes and the target genes can be calculated by the following formula:
$$CC_{RNA_1 RNA_2} = \max_{a,b} \mathrm{corr}(a^T X, b^T Y) = \max_{a,b} \frac{a^T \Sigma_{XY} b}{\sqrt{a^T \Sigma_{XX} a}\,\sqrt{b^T \Sigma_{YY} b}}$$
wherein $RNA_1$ represents the sponge genes, $RNA_2$ represents the target genes, and $a \in R^p$ and $b \in R^q$ are the preset typical vectors that maximize the typical correlation coefficient $\mathrm{corr}(a^T X, b^T Y)$.
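A minimal NumPy sketch of the first canonical correlation is given below. It assumes that the variance matrices are non-singular (more matched samples than genes in the module) and obtains the maximum as the largest singular value of $\Sigma_{XX}^{-1/2} \Sigma_{XY} \Sigma_{YY}^{-1/2}$, which is one standard way of solving the maximization; the function names and the toy data are hypothetical.

```python
import numpy as np

def inv_sqrt(m):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(m)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def canonical_correlation(x, y):
    """x: samples x p sponge genes, y: samples x q target genes.

    Returns the largest canonical correlation between x and y.
    """
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    n = x.shape[0]
    sxx = xc.T @ xc / (n - 1)   # variance matrix of X
    syy = yc.T @ yc / (n - 1)   # variance matrix of Y
    sxy = xc.T @ yc / (n - 1)   # covariance matrix between X and Y
    m = inv_sqrt(sxx) @ sxy @ inv_sqrt(syy)
    return float(np.linalg.svd(m, compute_uv=False)[0])

# Hypothetical toy module: 50 samples, 3 sponge genes, 2 correlated target genes.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 3))
y = x[:, :2] + 0.5 * rng.normal(size=(50, 2))
cc = canonical_correlation(x, y)
```

For a module with a single sponge gene and a single target gene, this value reduces to the absolute Pearson correlation of the two expression profiles.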
In the embodiment, the column vector of the sponge gene, the column vector of the target gene, the variance matrix and the covariance matrix are obtained through the expression matrix of the sponge gene and the expression matrix of the target gene, a typical correlation coefficient is calculated according to the column vector of the sponge gene, the column vector of the target gene, the variance matrix, the covariance matrix and a preset typical vector, and the typical correlation coefficient is used for determining the miRNA sponge module, so that the determined miRNA sponge module is more accurate.
Fig. 4 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention.
In an alternative embodiment, as shown in fig. 4, typical correlation coefficients for sensitivity when sharing mirnas are obtained, including:
S134, obtaining a typical correlation coefficient between the shared miRNA and the sponge gene and a typical correlation coefficient between the shared miRNA and the target gene.
In some embodiments, the typical correlation coefficient between the shared miRNA and the sponge gene and the typical correlation coefficient between the shared miRNA and the target gene are obtained in the same way as in S131-S133, which is not described herein again.
S135, calculating a partial canonical correlation coefficient between the sponge gene and the target gene according to the canonical correlation coefficient between the sponge gene and the target gene, the canonical correlation coefficient between the shared miRNA and the sponge gene, and the canonical correlation coefficient between the shared miRNA and the target gene.
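In some embodiments, $PCC_{RNA_1 RNA_2}$ represents the partial typical correlation coefficient between the sponge gene and the target gene, namely the typical correlation coefficient between the sponge gene and the target gene under the precondition that the shared miRNAs are taken into account. $PCC_{RNA_1 RNA_2}$ can be calculated according to the following formula:
$$PCC_{RNA_1 RNA_2} = \frac{CC_{RNA_1 RNA_2} - CC_{miRNA\,RNA_1}\, CC_{miRNA\,RNA_2}}{\sqrt{1 - CC_{miRNA\,RNA_1}^2}\,\sqrt{1 - CC_{miRNA\,RNA_2}^2}}$$
wherein $CC_{RNA_1 RNA_2}$ is the typical correlation coefficient between the sponge gene and the target gene, $CC_{miRNA\,RNA_1}$ represents the typical correlation coefficient between the shared miRNA group and the sponge genes, and $CC_{miRNA\,RNA_2}$ represents the typical correlation coefficient between the shared miRNA group and the target genes.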
S136, subtracting the partial typical correlation coefficient between the sponge gene and the target gene from the typical correlation coefficient between the sponge gene and the target gene to obtain a sensitivity typical correlation coefficient when the miRNA is shared.
In some embodiments, the sensitivity canonical correlation (SCC) coefficient $SCC_{RNA_1 RNA_2}$ between the sponge gene and the target gene in the sponge gene-target gene co-expression module is defined as follows:
$$SCC_{RNA_1 RNA_2} = CC_{RNA_1 RNA_2} - PCC_{RNA_1 RNA_2}$$
wherein $CC_{RNA_1 RNA_2}$ and $PCC_{RNA_1 RNA_2}$ are the typical correlation coefficient and the partial typical correlation coefficient between the sponge gene and the target gene.
In this embodiment, by incorporating miRNA expression profile data, the sensitivity typical correlation coefficient when miRNAs are shared is calculated and used to determine the miRNA sponge module, so that the determined miRNA sponge module is more accurate.
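Steps S135 and S136 can be sketched as two small functions. The sketch assumes that the partial typical correlation takes the usual first-order partial-correlation form built from the three typical correlation coefficients; this form, the function names and the example values are assumptions made for illustration.

```python
from math import sqrt

def partial_canonical_correlation(cc_12, cc_m1, cc_m2):
    """Partial correlation between sponge and target genes given shared miRNAs.

    cc_12: typical correlation, sponge genes vs. target genes
    cc_m1: typical correlation, shared miRNAs vs. sponge genes
    cc_m2: typical correlation, shared miRNAs vs. target genes
    """
    return (cc_12 - cc_m1 * cc_m2) / (sqrt(1 - cc_m1 ** 2) * sqrt(1 - cc_m2 ** 2))

def sensitivity_canonical_correlation(cc_12, cc_m1, cc_m2):
    """S136: typical correlation minus partial typical correlation."""
    return cc_12 - partial_canonical_correlation(cc_12, cc_m1, cc_m2)

# Hypothetical coefficients for one module.
scc = sensitivity_canonical_correlation(cc_12=0.9, cc_m1=0.9, cc_m2=0.9)
```

When the shared miRNAs are uncorrelated with both gene sets, the partial coefficient equals the plain coefficient and the sensitivity value is zero, which matches the intent of measuring how much of the co-expression is explained by the shared miRNAs.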
Fig. 5 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention.
In an alternative embodiment, as shown in fig. 5, obtaining a canonical correlation coefficient between the shared miRNA and the sponge gene and a canonical correlation coefficient between the shared miRNA and the target gene comprises:
S1341, obtaining an expression matrix of the shared miRNA.
S1342, obtaining a column vector of the shared miRNA, a column vector of the sponge gene and a column vector of the target gene according to the expression matrix of the shared miRNA, the expression matrix of the sponge gene and the expression matrix of the target gene.
S1343, obtaining the variance matrices of the expression matrix of the shared miRNA and of the expression matrix of the sponge gene, and obtaining the covariance matrix between them.
S1344, calculating a typical correlation coefficient between the shared miRNA and the sponge gene according to the column vector of the shared miRNA, the column vector of the sponge gene, a variance matrix between the expression matrix of the shared miRNA and the expression matrix of the sponge gene, a covariance matrix and a preset typical vector.
S1345, obtaining a variance matrix between the expression matrix of the shared miRNA and the expression matrix of the target gene, and obtaining a covariance matrix.
S1346, calculating a typical correlation coefficient between the shared miRNA and the target gene according to the column vector of the shared miRNA, the column vector of the target gene, a variance matrix between the expression matrix of the shared miRNA and the expression matrix of the target gene, a covariance matrix and a preset typical vector.
The typical correlation coefficient between the shared miRNA and the sponge gene and the typical correlation coefficient between the shared miRNA and the target gene in steps S1341-S1346 are calculated in the same way as in S131-S133, which is not described herein again.
In an alternative embodiment, determining whether each sponge gene-target gene co-expression module is a miRNA sponge module according to the number of sponge genes and target genes in each sponge gene-target gene co-expression module, the significance value of the shared miRNA, the typical correlation coefficient between the sponge genes and the target genes, and the sensitivity typical correlation coefficient when the miRNA is shared includes: and if the number of the sponge genes and the target genes in the sponge gene-target gene co-expression module is more than or equal to 2, the significance value of the shared miRNA is less than 0.05, the typical correlation coefficient between the sponge genes and the target genes is more than 0.8, and the sensitivity typical correlation coefficient when the miRNA is shared is more than 0.1, determining that the sponge gene-target gene co-expression module is the miRNA sponge module.
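The decision rule above can be written as a single predicate. In this sketch the gene-count condition is read as applying to the sponge genes and to the target genes separately, which is an assumption; the function and parameter names are hypothetical, and the default thresholds are the ones stated above.

```python
def is_mirna_sponge_module(n_sponge, n_target, p_shared, cc, scc,
                           min_genes=2, p_cut=0.05, cc_cut=0.8, scc_cut=0.1):
    """Return True if a co-expression module meets all four criteria."""
    return (
        n_sponge >= min_genes          # enough sponge genes
        and n_target >= min_genes      # enough target genes
        and p_shared < p_cut           # significant sharing of miRNAs
        and cc > cc_cut                # strong typical correlation
        and scc > scc_cut              # correlation is sensitive to shared miRNAs
    )

# Hypothetical module: 3 sponge genes, 4 target genes.
accepted = is_mirna_sponge_module(3, 4, p_shared=0.01, cc=0.9, scc=0.2)
```

Keeping the thresholds as parameters makes it straightforward to apply other cut-offs in embodiments that do not use the values given here.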
Fig. 6 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention.
In an alternative embodiment, as shown in fig. 6, after determining whether each sponge gene-target gene co-expression module is a miRNA sponge module, the method further comprises:
S150, if the sponge gene-target gene co-expression module is the miRNA sponge module, acquiring the correlation data of the miRNA sponge module and the target gene.
The correlation data comprise significance data of the miRNA sponge module and a preset miRNA sponge module, significance data of the miRNA sponge module and the target gene, and whether the miRNA sponge module is a marker of the target gene, wherein the miRNA sponge module comprises long non-coding ribonucleic acid LncRNA and protein-coding ribonucleic acid mRNA.
In some embodiments, after the miRNA sponge module is obtained, it may be further verified to determine whether the obtained miRNA sponge module is related to the target gene, for example, whether the miRNA sponge module is related to the breast cancer pathogenic gene may be verified, i.e., data relating the miRNA sponge module to the breast cancer pathogenic gene may be obtained.
Fig. 7 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention.
In an alternative embodiment, as shown in fig. 7, obtaining significance data for a miRNA sponge module and a pre-set miRNA sponge module comprises:
S151, obtaining, for each miRNA sponge module, the average absolute Pearson correlation coefficient over all LncRNA-mRNA co-expression pairs as a first Pearson correlation coefficient value.
In some embodiments, the first Pearson correlation coefficient value ranges from [0,1], and a larger value indicates a higher level of co-expression between lncRNA and mRNA.
S152, generating a plurality of preset miRNA sponge modules according to all LncRNA and mRNA in each miRNA sponge module through a preset random algorithm.
In some embodiments, lncRNA and mRNA in each miRNA sponge module may be randomly arranged for a predetermined number of times, for example, 1000 times, and a total of 1000 random miRNA sponge modules having the same number of genes (i.e., the same number of lncRNA and mRNA) as each miRNA sponge module may be randomly generated as the predetermined miRNA sponge modules.
S153, obtaining, for each preset miRNA sponge module, the average absolute Pearson correlation coefficient over all LncRNA-mRNA co-expression pairs as a second Pearson correlation coefficient value.
Wherein, referring to S152, the average co-expression level between lncRNA and mRNA in 1000 pre-defined miRNA sponge modules (i.e. the second Pearson correlation coefficient value) can be calculated as the pre-defined co-expression level for each miRNA sponge module.
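Steps S151 to S153 can be sketched in plain Python as follows. The sketch assumes that each random module is drawn from pools of lncRNA and mRNA expression profiles of the data set while keeping the module's gene counts; this sampling scheme and all function names are assumptions, and profiles are assumed to be non-constant lists of expression values over the matched samples.

```python
import random
from math import sqrt

def pearson(u, v):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def coexpression_level(lnc, mrna):
    """Average |Pearson r| over all lncRNA-mRNA pairs of a module."""
    pairs = [(x, y) for x in lnc for y in mrna]
    return sum(abs(pearson(x, y)) for x, y in pairs) / len(pairs)

def random_levels(lnc_pool, mrna_pool, n_lnc, n_mrna, n_perm=1000, seed=0):
    """Co-expression levels of n_perm random modules of the same size."""
    rng = random.Random(seed)
    return [
        coexpression_level(rng.sample(lnc_pool, n_lnc),
                           rng.sample(mrna_pool, n_mrna))
        for _ in range(n_perm)
    ]
```

Averaging the list returned by `random_levels` gives the preset co-expression level against which the level of the corresponding miRNA sponge module is compared.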
S154, acquiring significance data of the miRNA sponge module and the preset miRNA sponge module according to the first Pearson correlation coefficient value and the second Pearson correlation coefficient value through a preset pairing algorithm.
In some embodiments, the significance of the difference in co-expression level between the miRNA sponge modules and the preset miRNA sponge modules can be assessed by Welch's two-sample t-test. The Welch t-test formula is as follows:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}}$$
wherein $\bar{X}_1$ and $\bar{X}_2$ respectively represent the average co-expression levels of the miRNA sponge modules and the preset miRNA sponge modules, $s_1^2$ and $s_2^2$ respectively represent the variances of the co-expression levels of the miRNA sponge modules and the preset miRNA sponge modules, and $N_1$ and $N_2$ are identical and represent the number of miRNA sponge modules.
It should be noted that the larger the calculated t value is, the smaller the difference significance p value is, which indicates that the co-expression level of the miRNA sponge module is significantly higher than the co-expression level of the predetermined miRNA sponge module, and in this embodiment, the significance p value is less than 0.05.
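The Welch statistic can be sketched with the Python standard library. The function name `welch_t` and the example values are hypothetical; the degrees of freedom use the Welch-Satterthwaite approximation, which is the customary companion of this statistic and is stated here as an assumption about how the p-value would be obtained.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x1, x2):
    """Welch's two-sample t statistic and approximate degrees of freedom.

    x1: co-expression levels of the miRNA sponge modules
    x2: co-expression levels of the corresponding preset (random) modules
    """
    v1, v2 = variance(x1), variance(x2)      # sample variances
    n1, n2 = len(x1), len(x2)
    se2 = v1 / n1 + v2 / n2                  # squared standard error
    t = (mean(x1) - mean(x2)) / sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

# Hypothetical co-expression levels of four modules and their random counterparts.
t_value, dof = welch_t([0.62, 0.58, 0.71, 0.66], [0.21, 0.25, 0.19, 0.23])
```

The p-value follows from the t distribution with `dof` degrees of freedom; if SciPy is available, `scipy.stats.ttest_ind(x1, x2, equal_var=False)` performs the same test directly.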
Fig. 8 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention.
In an alternative embodiment, as shown in fig. 8, the obtaining of significance data of the miRNA sponge module and the target gene comprises:
S155, obtaining the quantity of LncRNA and mRNA in the preset data set and the quantity of LncRNA and mRNA co-expressed with the target gene in the preset data set.
Wherein the preset data set comprises a plurality of lncrnas and mrnas, and the lncrnas and mrnas related to the target gene in the plurality of lncrnas and mrnas, and the preset data set may be a data set obtained a priori.
S156, obtaining the quantity of LncRNA and mRNA in the miRNA sponge module and the quantity of LncRNA and mRNA co-expressed by the miRNA sponge module and the target gene.
Wherein the quantity of LncRNA and mRNA co-expressed by the miRNA sponge modules and the target gene, namely the quantity of the sponge gene-target gene co-expression modules is the quantity of the miRNA sponge modules.
S157, calculating significance data of the miRNA sponge module and the target gene by a hyper-geometric distribution test method according to the quantity of LncRNA and mRNA in the preset data set and the quantity of LncRNA and mRNA co-expressed with the target gene in the preset data set, the quantity of LncRNA and mRNA in the miRNA sponge module and the quantity of LncRNA and mRNA co-expressed with the target gene in the miRNA sponge module.
In some embodiments, taking target genes related to breast cancer as an example, the significance value p is calculated as follows:
$$p = 1 - \sum_{i=0}^{L_2 - 1} \frac{\binom{M_2}{i}\binom{N_2 - M_2}{K_2 - i}}{\binom{N_2}{K_2}}$$
wherein $N_2$ represents the number of genes (lncRNA and mRNA) in the data set, $M_2$ represents the number of breast cancer genes (lncRNA and mRNA) in the data set, $K_2$ represents the number of genes (lncRNA and mRNA) in the miRNA sponge module, and $L_2$ represents the number of breast cancer genes (lncRNA and mRNA) in the miRNA sponge module.
Fig. 9 is a schematic flow chart of a method for identifying a miRNA sponge module according to another embodiment of the present invention.
In an alternative embodiment, as shown in fig. 9, determining whether a miRNA sponge module is a marker for a target gene comprises:
S158, calculating a risk value of each matched sample through a first preset algorithm according to the miRNA.
In some embodiments, a target gene associated with breast cancer is exemplified.
Wherein the risk value for each matching sample may be calculated by the following formula:
$$h(t, Z) = h_0(t)\exp(\beta Z) = h_0(t)\exp(\beta_1 Z_1 + \beta_2 Z_2 + \ldots + \beta_k Z_k)$$
where $h(t, Z)$ is the risk function value of a breast cancer sample with covariates Z at time t, t is the survival time, $Z = (Z_1, Z_2, \ldots, Z_k)'$ are the genes (LncRNA and mRNA) which may affect survival time, $h_0(t)$ is the risk function value when all covariates are taken to be 0, and $\beta = (\beta_1, \beta_2, \ldots, \beta_k)'$ are the regression coefficients of the Cox model.
S159, binarizing the risk value of each matched sample to obtain a first risk value set and a second risk value set, wherein the risk value in the first risk value set is larger than the risk value in the second risk value set.
In some embodiments, according to the risk function value h (t, Z) of each sample, 500 breast cancer samples are equally divided into two sets of high-risk and low-risk samples, i.e. a first set of risk values (high-risk) and a second set of risk values (low-risk), wherein the binarization may be performed by: sorting the risk function values of each sample in a descending order, dividing according to preset percentages, and determining a first risk value set and a second risk value set, for example, the first 50% can be used as the first risk value set, and the second 50% can be used as the second risk value set; or the first 20% as the first risk value set and the second 80% as the second risk value set, but not limited thereto. It should be noted that the set of first risk values should account for at least the first 50% of all risk function values.
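The risk calculation and binarization can be sketched as follows. Because the baseline risk $h_0(t)$ is common to all samples, ranking by $\exp(\beta Z)$ gives the same order as ranking by $h(t, Z)$, so the sketch works with the relative risk only. The coefficients are assumed to have been fitted beforehand by a Cox model; the function names and example values are hypothetical.

```python
from math import exp

def risk_scores(samples, beta):
    """Relative risk exp(beta . Z) of each sample.

    samples: list of covariate vectors Z (expression of the module genes)
    beta   : Cox regression coefficients, one per gene
    """
    return [exp(sum(b * z for b, z in zip(beta, zs))) for zs in samples]

def split_by_risk(scores, high_fraction=0.5):
    """Sort samples by descending risk and split them into two index sets."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_high = round(len(scores) * high_fraction)
    return order[:n_high], order[n_high:]

# Hypothetical example: four samples, two genes.
scores = risk_scores([[1, 0], [0, 1], [1, 1], [0, 0]], beta=[0.5, -0.2])
high, low = split_by_risk(scores)
```

Changing `high_fraction` reproduces the other preset percentages mentioned above.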
S1510, calculating a risk ratio of the first risk value set and the second risk value set through a second preset algorithm according to the first risk value set and the second risk value set.
In some embodiments, the risk ratio may be calculated by:
$$HR = \frac{h(t, Z_h)}{h(t, Z_l)} = \exp[\beta(Z_h - Z_l)]$$
wherein $h(t, Z_h)$ is the risk function value of the breast cancer high-risk group, $h(t, Z_l)$ is the risk function value of the breast cancer low-risk group, $Z_h$ are the high-risk genes (LncRNA and mRNA) that may affect survival time, and $Z_l$ are the low-risk genes (LncRNA and mRNA) that may affect survival time.
In one possible implementation, the threshold value for HR may be set to 2.
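The closed form of the risk ratio follows from the Cox model because the baseline risk cancels. A short derivation, using only the symbols already defined:

```latex
\begin{aligned}
HR &= \frac{h(t, Z_h)}{h(t, Z_l)}
    = \frac{h_0(t)\exp(\beta Z_h)}{h_0(t)\exp(\beta Z_l)} \\
   &= \exp(\beta Z_h - \beta Z_l)
    = \exp[\beta (Z_h - Z_l)]
\end{aligned}
```

The risk ratio therefore depends neither on the time t nor on $h_0(t)$, and it exceeds 1 exactly when $\beta Z_h > \beta Z_l$.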
S1511, obtaining the difference significance of the first risk value set and the second risk value set through a test algorithm according to the first risk value set and the second risk value set.
In some embodiments, the log-rank test can be used to test whether the survival times of the high-risk and low-risk groups of breast cancer samples are the same, with the test statistic $\chi^2$ calculated as follows:
$$\chi^2 = \sum \frac{(A - T)^2}{T}$$
wherein A is the observed number of breast cancer deaths and T is the theoretical number of breast cancer deaths, and the sum runs over the two groups. The larger the calculated $\chi^2$ value, the smaller the difference significance p-value, indicating a greater difference in survival time between the high-risk and low-risk groups of breast cancer samples.
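A minimal sketch of this statistic for two groups is given below. The theoretical number of deaths in each group is accumulated over the distinct death times in proportion to the number of samples still at risk, which is the usual construction and is stated here as an assumption; the p-value helper uses the chi-square distribution with one degree of freedom. All names are hypothetical, and both groups are assumed to be non-empty.

```python
from math import erfc, sqrt

def logrank_chi2(times, events, groups):
    """Sum over both groups of (A - T)^2 / T.

    times : survival time of each sample
    events: 1 if death was observed, 0 if censored
    groups: 1 for the high-risk group, 0 for the low-risk group
    """
    observed = {0: 0.0, 1: 0.0}   # A per group
    expected = {0: 0.0, 1: 0.0}   # T per group
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in death_times:
        at_risk = {g: sum(1 for ti, gi in zip(times, groups)
                          if ti >= t and gi == g) for g in (0, 1)}
        deaths = {g: sum(1 for ti, e, gi in zip(times, events, groups)
                         if ti == t and e == 1 and gi == g) for g in (0, 1)}
        n = at_risk[0] + at_risk[1]
        d = deaths[0] + deaths[1]
        for g in (0, 1):
            observed[g] += deaths[g]
            expected[g] += d * at_risk[g] / n
    return sum((observed[g] - expected[g]) ** 2 / expected[g] for g in (0, 1))

def chi2_pvalue_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2.0))

# Hypothetical toy data: the two high-risk samples die first.
chi2 = logrank_chi2(times=[1, 2, 3, 4], events=[1, 1, 1, 1], groups=[1, 1, 0, 0])
p_value = chi2_pvalue_1df(chi2)
```

With real data, `groups` would be the high-risk and low-risk sets obtained from the binarized risk values.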
S1512, determining whether the miRNA sponge module is a marker of the target gene according to the risk ratio and the difference significance.
Wherein, the miRNA sponge module is identified as the breast cancer module biomarker when the HR value is more than 2 and the significance p value of the logarithmic rank test is less than 0.05, but not limited thereto.
The application of the identification method of the miRNA sponge module provided in the present application is explained below through two application scenarios. It will be clear to those skilled in the art that the following are only examples and do not represent the only possible implementation.
Scenario one
First, heterogeneous data sources are obtained: miRNA, lncRNA and mRNA expression profile data of matched breast cancer samples, together with survival data for the breast cancer samples, are collected from the cancer gene expression profile database TCGA (https:// cancer gene. nih. gov /). After preprocessing (removing duplicate entries and any miRNA, lncRNA or mRNA without a gene name), expression profile data for 674 miRNAs, 12711 lncRNAs and 18344 mRNAs in 500 matched breast cancer samples are finally obtained. In this scenario, RNA_1 is lncRNA and RNA_2 is mRNA. Thus:
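$D_1 = \{G_{1,1}; G_{1,2}; \ldots; G_{1,500}\} \in \mathbb{R}^{500 \times 12711}$
$D_2 = \{G_{2,1}; G_{2,2}; \ldots; G_{2,500}\} \in \mathbb{R}^{500 \times 18344}$
$D_3 = \{G_{3,1}; G_{3,2}; \ldots; G_{3,500}\} \in \mathbb{R}^{500 \times 674}$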
Then, the lncRNA-mRNA co-expression modules are identified: given the lncRNA and mRNA expression profile data of the matched samples, the WGCNA co-expression network analysis method is used to identify the lncRNA-mRNA co-expression modules. The minimum scale-free topology fit index $R^2$ in the WGCNA method is set to 0.8.
Based on the lncRNA-mRNA co-expression modules, the miRNA sponge modules are identified by calculating three measures: the significance p-value of the shared miRNAs, the canonical correlation coefficient, and the sensitivity canonical correlation coefficient. Each miRNA sponge module contains at least 2 lncRNAs and at least 2 mRNAs and satisfies the following conditions: significance p-value of the shared miRNAs < 0.05, canonical correlation coefficient > 0.8, and sensitivity canonical correlation coefficient > 0.1.
Finally, the miRNA sponge modules are evaluated, and the number of breast cancer module biomarkers among the miRNA sponge modules is determined.
Fig. 10 is a schematic diagram illustrating a comparison of co-expression levels of miRNA sponge modules and corresponding random modules in scenario one according to the present invention.
In scenario one, a total of 17 miRNA sponge modules were identified as shown in table 1. As shown in fig. 10, the co-expression level of the mined 17 miRNA sponge modules was significantly higher than the corresponding random module co-expression level (significance p-value 1.55E-05). As shown in tables 2 and 3, of the 17 miRNA sponge modules, 10 are associated with breast cancer enrichment, and 15 can serve as breast cancer module biomarkers.
Table 1. miRNA sponge modules identified in scenario one
Table 2. miRNA sponge modules associated with breast cancer enrichment in scenario one
Table 3. miRNA sponge modules serving as breast cancer biomarkers in scenario one
Scenario two
In scenario two, given the lncRNA and mRNA expression profile data of the matched samples, the lncRNA-mRNA co-expression modules are identified using the SGFA (sparse group factor analysis) method. Here, the threshold of the absolute value of association (AVA) in the SGFA method is set to 0.8. The other steps are the same as in scenario one.
Fig. 11 is a schematic diagram illustrating a comparison of co-expression levels of miRNA sponge modules and corresponding random modules in scenario two in the present embodiment.
In scenario two, a total of 51 miRNA sponge modules were identified as shown in table 4. As shown in fig. 11, the co-expression level of the mined 51 miRNA sponge modules was significantly higher than the corresponding random module co-expression level (significance p-value 1.55E-14). As shown in tables 5 and 6, of the 51 miRNA sponge modules, 3 were associated with breast cancer enrichment, and 49 could serve as breast cancer module biomarkers.
Table 4. miRNA sponge modules identified in scenario two
Table 5. miRNA sponge modules associated with breast cancer enrichment in scenario two
Table 6. miRNA sponge modules serving as breast cancer biomarkers in scenario two
Fig. 12 is a schematic structural diagram of an identification device for a miRNA sponge module according to an embodiment of the present invention.
As shown in fig. 12, the identification device of miRNA sponge module includes:
The obtaining module 210 is configured to obtain an expression matrix of the sponge genes and an expression matrix of the target genes for the matched samples. The obtaining module 210 is further configured to obtain a plurality of sponge gene-target gene co-expression modules according to the expression matrix of the sponge genes and the expression matrix of the target genes, where each sponge gene-target gene co-expression module represents sponge genes and target genes that can be co-expressed. The obtaining module 210 is further configured to obtain, for each sponge gene-target gene co-expression module, a significance value of the shared miRNAs, a canonical correlation coefficient between the sponge genes and the target genes, and a sensitivity canonical correlation coefficient when the miRNAs are shared. The determining module 220 is configured to determine whether each sponge gene-target gene co-expression module is a miRNA sponge module according to the number of sponge genes and target genes in the module, the significance value of the shared miRNAs, the canonical correlation coefficient between the sponge genes and the target genes, and the sensitivity canonical correlation coefficient when the miRNAs are shared.
In an optional embodiment, the obtaining module 210 is specifically configured to cluster the expression matrix of the sponge genes and the expression matrix of the target genes according to a clustering algorithm to obtain a plurality of clustering results of the sponge genes and the target genes, and to take the sponge genes and the target genes in each clustering result as one sponge gene-target gene co-expression module.
In an alternative embodiment, the clustering algorithm comprises a one-way clustering algorithm or a two-way clustering algorithm. If the clustering algorithm is a one-way clustering algorithm, the sponge genes and the target genes are clustered according to the expression matrix of the sponge genes, the expression matrix of the target genes and the matched samples. If the clustering algorithm is a two-way clustering algorithm, the sponge genes, the target genes and a preset portion of the matched samples are clustered according to the expression matrix of the sponge genes, the expression matrix of the target genes and the preset portion of the matched samples.
In an alternative embodiment, the obtaining module 210 is specifically configured to obtain, according to a preset miRNA-target gene regulation relationship, the significance value of the miRNAs shared between the sponge genes and the target genes in the sponge gene-target gene co-expression module by a hypergeometric distribution test.
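A hypergeometric test for shared miRNAs can be sketched as below. The formulation is an assumption: it treats the miRNAs regulating the sponge genes as the "successes" in the full miRNA population and asks how likely it is that the miRNAs regulating the target genes overlap them at least as much as observed. The function name, argument names and the counts 20, 30 and 10 are hypothetical; 674 is the number of miRNAs reported for scenario one.

```python
from math import comb


def shared_mirna_p_value(n_total: int, n_sponge: int,
                         n_target: int, n_shared: int) -> float:
    """P(X >= n_shared) for X ~ Hypergeometric(n_total, n_sponge, n_target).

    n_total:  number of miRNAs in the data set
    n_sponge: miRNAs regulating the sponge genes of the module
    n_target: miRNAs regulating the target genes of the module
    n_shared: miRNAs regulating both
    """
    upper = min(n_sponge, n_target)
    numerator = sum(comb(n_sponge, i) * comb(n_total - n_sponge, n_target - i)
                    for i in range(n_shared, upper + 1))
    return numerator / comb(n_total, n_target)


p_shared = shared_mirna_p_value(n_total=674, n_sponge=20, n_target=30, n_shared=10)
print(p_shared)
```

The sum uses exact integer arithmetic and divides only once, which avoids rounding error for small p-values.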
In an alternative embodiment, the obtaining module 210 is specifically configured to obtain the column vectors of the sponge genes and the column vectors of the target genes according to the expression matrix of the sponge genes and the expression matrix of the target genes; to obtain a variance matrix and a covariance matrix between the expression matrix of the sponge genes and the expression matrix of the target genes; and to calculate the canonical correlation coefficient according to the column vectors of the sponge genes, the column vectors of the target genes, the variance matrix, the covariance matrix and preset canonical vectors.
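One way to compute a canonical correlation from the variance and covariance matrices is sketched below with NumPy. It is an assumption that the first (largest) canonical correlation is the quantity of interest; the sketch obtains it as the square root of the largest eigenvalue of Sxx⁻¹ Sxy Syy⁻¹ Syx rather than from preset canonical vectors. The function name and the random toy data are hypothetical.

```python
import numpy as np


def first_canonical_correlation(x: np.ndarray, y: np.ndarray) -> float:
    """x: samples x sponge genes, y: samples x target genes (same row order)."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    s_xx = x.T @ x / (n - 1)  # variance matrix of the sponge genes
    s_yy = y.T @ y / (n - 1)  # variance matrix of the target genes
    s_xy = x.T @ y / (n - 1)  # covariance matrix between the two gene sets
    # Sxx^-1 Sxy Syy^-1 Syx; its largest eigenvalue is the squared correlation.
    m = np.linalg.solve(s_xx, s_xy) @ np.linalg.solve(s_yy, s_xy.T)
    rho_sq = float(np.max(np.linalg.eigvals(m).real))
    return float(np.sqrt(np.clip(rho_sq, 0.0, 1.0)))


rng = np.random.default_rng(0)
sponge = rng.normal(size=(50, 2))           # 50 samples, 2 sponge genes
target = sponge @ np.array([[1.0], [2.0]])  # 1 target gene, exact linear mix
rho = first_canonical_correlation(sponge, target)
print(rho)
```

The sketch requires more samples than genes in the module so that both variance matrices are invertible; otherwise a regularized or pseudo-inverse variant would be needed.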
In an alternative embodiment, the obtaining module 210 is specifically configured to obtain a canonical correlation coefficient between the shared miRNAs and the sponge genes and a canonical correlation coefficient between the shared miRNAs and the target genes; to calculate a partial canonical correlation coefficient between the sponge genes and the target genes according to the canonical correlation coefficient between the sponge genes and the target genes, the canonical correlation coefficient between the shared miRNAs and the sponge genes, and the canonical correlation coefficient between the shared miRNAs and the target genes; and to subtract the partial canonical correlation coefficient between the sponge genes and the target genes from the canonical correlation coefficient between the sponge genes and the target genes to obtain the sensitivity canonical correlation coefficient when the miRNAs are shared.
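The subtraction described above can be illustrated as follows. The partial coefficient is computed here with the standard first-order partial correlation formula, which is an assumption: this passage names the three inputs but does not spell out the expression. Function names and the numeric inputs are hypothetical.

```python
import math


def partial_canonical_correlation(r_st: float, r_ms: float, r_mt: float) -> float:
    """Sponge-target correlation after removing the shared miRNA effect.

    r_st: canonical correlation between sponge genes and target genes
    r_ms: canonical correlation between shared miRNAs and sponge genes
    r_mt: canonical correlation between shared miRNAs and target genes
    """
    return (r_st - r_ms * r_mt) / math.sqrt((1 - r_ms ** 2) * (1 - r_mt ** 2))


def sensitivity_canonical_correlation(r_st: float, r_ms: float, r_mt: float) -> float:
    """Canonical correlation minus its partial counterpart."""
    return r_st - partial_canonical_correlation(r_st, r_ms, r_mt)


sens = sensitivity_canonical_correlation(r_st=0.85, r_ms=0.7, r_mt=0.7)
print(sens)
```

A larger sensitivity value means that more of the sponge-target correlation is attributable to the shared miRNAs. The formula is undefined when either miRNA correlation equals 1.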
In an alternative embodiment, the obtaining module 210 is specifically configured to: obtain an expression matrix of the shared miRNAs; obtain the column vectors of the shared miRNAs, the column vectors of the sponge genes and the column vectors of the target genes according to the expression matrix of the shared miRNAs, the expression matrix of the sponge genes and the expression matrix of the target genes; obtain a variance matrix and a covariance matrix between the expression matrix of the shared miRNAs and the expression matrix of the sponge genes; calculate a canonical correlation coefficient between the shared miRNAs and the sponge genes according to the column vectors of the shared miRNAs, the column vectors of the sponge genes, the variance matrix and the covariance matrix between the expression matrix of the shared miRNAs and the expression matrix of the sponge genes, and preset canonical vectors; obtain a variance matrix and a covariance matrix between the expression matrix of the shared miRNAs and the expression matrix of the target genes; and calculate a canonical correlation coefficient between the shared miRNAs and the target genes according to the column vectors of the shared miRNAs, the column vectors of the target genes, the variance matrix and the covariance matrix between the expression matrix of the shared miRNAs and the expression matrix of the target genes, and preset canonical vectors.
In an alternative embodiment, the determining module 220 is specifically configured to determine that the sponge gene-target gene co-expression module is a miRNA sponge module if the number of sponge genes and the number of target genes in the sponge gene-target gene co-expression module are both greater than or equal to 2, the significance value of the shared miRNAs is less than 0.05, the canonical correlation coefficient between the sponge genes and the target genes is greater than 0.8, and the sensitivity canonical correlation coefficient when the miRNAs are shared is greater than 0.1.
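The four conditions can be checked together as in the sketch below. The record type and its field names are hypothetical; the thresholds are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass
class CoexpressionModule:
    n_sponge_genes: int
    n_target_genes: int
    shared_mirna_p: float              # significance value of the shared miRNAs
    canonical_corr: float              # sponge genes vs. target genes
    sensitivity_canonical_corr: float  # canonical minus partial canonical


def is_mirna_sponge_module(m: CoexpressionModule) -> bool:
    """Apply the size, significance and correlation thresholds."""
    return (m.n_sponge_genes >= 2
            and m.n_target_genes >= 2
            and m.shared_mirna_p < 0.05
            and m.canonical_corr > 0.8
            and m.sensitivity_canonical_corr > 0.1)


candidates = [
    CoexpressionModule(3, 5, 0.001, 0.92, 0.15),
    CoexpressionModule(1, 5, 0.001, 0.92, 0.15),  # too few sponge genes
    CoexpressionModule(3, 5, 0.001, 0.92, 0.05),  # sensitivity too low
]
sponge_modules = [m for m in candidates if is_mirna_sponge_module(m)]
print(len(sponge_modules))
```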
In an optional embodiment, the obtaining module 210 is further configured to obtain correlation data between the miRNA sponge module and the target gene if the sponge gene-target gene co-expression module is a miRNA sponge module, where the correlation data includes significance data of the miRNA sponge module and a preset miRNA sponge module, significance data of the miRNA sponge module and the target gene, and whether the miRNA sponge module is a marker of the target gene, and the miRNA sponge module includes long non-coding RNA (lncRNA) and mRNA.
In an alternative embodiment, the obtaining module 210 is specifically configured to obtain a first Pearson correlation coefficient value, namely the mean absolute Pearson correlation coefficient of all lncRNA-mRNA co-expression pairs in each miRNA sponge module; to generate a plurality of preset miRNA sponge modules from all lncRNAs and mRNAs in each miRNA sponge module by a preset random algorithm; to obtain a second Pearson correlation coefficient value, namely the mean absolute Pearson correlation coefficient of all lncRNA-mRNA co-expression pairs in each preset miRNA sponge module; and to obtain the significance data of the miRNA sponge module and the preset miRNA sponge module from the first Pearson correlation coefficient value and the second Pearson correlation coefficient value by a preset paired difference test algorithm.
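The co-expression level of a module and of a random counterpart can be sketched as follows. It is an assumption that a random module is drawn with the same numbers of lncRNAs and mRNAs as the identified module; the function names, the dictionary layout (gene name mapped to expression values across matched samples) and the toy values are hypothetical. The paired difference test itself is not shown.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_abs_coexpression(lnc, mrna):
    """Mean |Pearson r| over all lncRNA-mRNA pairs of one module."""
    values = [abs(pearson(x, y)) for x in lnc.values() for y in mrna.values()]
    return sum(values) / len(values)


def random_module(all_lnc, all_mrna, n_lnc, n_mrna, rng):
    """Draw a random module of the given size from the full gene pools."""
    lnc_names = rng.sample(sorted(all_lnc), n_lnc)
    mrna_names = rng.sample(sorted(all_mrna), n_mrna)
    return ({g: all_lnc[g] for g in lnc_names},
            {g: all_mrna[g] for g in mrna_names})


all_lnc = {"l1": [1, 2, 3, 4], "l2": [2, 4, 6, 9], "l3": [4, 1, 3, 2]}
all_mrna = {"m1": [2, 4, 6, 8], "m2": [4, 3, 2, 1], "m3": [1, 3, 2, 4]}
module_level = mean_abs_coexpression({"l1": all_lnc["l1"]}, {"m1": all_mrna["m1"]})
rand_lnc, rand_mrna = random_module(all_lnc, all_mrna, 2, 2, random.Random(0))
random_level = mean_abs_coexpression(rand_lnc, rand_mrna)
print(module_level, random_level)
```

Pairing each identified module with its own random modules is what makes a paired test appropriate for the comparison.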
In an alternative embodiment, the obtaining module 210 is specifically configured to obtain the number of lncRNAs and mRNAs in a preset data set and the number of lncRNAs and mRNAs in the preset data set that are co-expressed with the target gene; to obtain the number of lncRNAs and mRNAs in the miRNA sponge module and the number of lncRNAs and mRNAs in the miRNA sponge module that are co-expressed with the target gene; and to calculate the significance data of the miRNA sponge module and the target gene from these four numbers by a hypergeometric distribution test.
In an alternative embodiment, the obtaining module 210 is specifically configured to calculate the risk value of each matched sample according to the miRNA by a first preset algorithm; to binarize the risk value of each matched sample to obtain a first risk value set and a second risk value set, where the risk values in the first risk value set are greater than the risk values in the second risk value set; to calculate the risk ratio of the first risk value set and the second risk value set by a second preset algorithm; to obtain the difference significance of the first risk value set and the second risk value set according to a test algorithm; and to determine whether the miRNA is a marker of the target gene according to the risk ratio and the difference significance.
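The binarization step can be illustrated with a median split, which is an assumed cut-off rule since the text only requires that every value in the first set exceed every value in the second. Sample identifiers and risk values are hypothetical.

```python
from statistics import median


def split_by_risk(risk_scores):
    """Split samples into high-risk and low-risk sets at the median risk value."""
    cutoff = median(risk_scores.values())
    high = {s: r for s, r in risk_scores.items() if r > cutoff}
    low = {s: r for s, r in risk_scores.items() if r <= cutoff}
    return high, low


risk_scores = {"s1": 0.2, "s2": 1.5, "s3": 0.9, "s4": 2.1}
high_risk, low_risk = split_by_risk(risk_scores)
print(sorted(high_risk), sorted(low_risk))
```

The two resulting sets are the inputs to the risk ratio calculation and to the significance test.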
Since the identification device of the miRNA sponge module is used for realizing the identification method of the miRNA sponge module, the beneficial effects are the same, and the description is omitted here.
Fig. 12 is a schematic structural diagram of an identification device of a miRNA sponge module according to an embodiment of the present invention.
The above modules may be one or more integrated circuits configured to implement the above methods, for example: one or more Application Specific Integrated Circuits (ASICs), one or more digital signal processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs). For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a Central Processing Unit (CPU) or another processor capable of calling program code. For another example, these modules may be integrated together and implemented in the form of a System-on-a-Chip (SoC).
Fig. 13 is a schematic structural diagram of an identification apparatus for a miRNA sponge module according to an embodiment of the present disclosure.
As shown in fig. 13, the identification device of miRNA sponge module includes: the identification device comprises a processor 301, a storage medium 302 and a bus 303, wherein the storage medium 302 stores machine readable instructions executable by the processor, when the identification device of the miRNA sponge module runs, the processor 301 and the storage medium 302 communicate with each other through the bus 303, and the processor 301 executes the machine readable instructions to execute the steps of the identification method of the miRNA sponge module.
The identification device of the miRNA sponge module may be a general-purpose computer, a server, or a mobile terminal, and is not limited herein. The identification device of the miRNA sponge module is used to implement the above-described method embodiments of the present application.
It is noted that the processor 301 may include one or more processing cores (e.g., a single-core processor or a multi-core processor). Merely by way of example, the processor may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an Application Specific Instruction Set Processor (ASIP), a Graphics Processing Unit (GPU), a Physical Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller unit, a Reduced Instruction Set Computer (RISC), a microprocessor, or the like, or any combination thereof.
The storage medium 302 may include mass storage, removable storage, volatile read-write memory, Read-Only Memory (ROM), or the like, or any combination thereof. By way of example, mass storage may include magnetic disks, optical disks, solid state drives, and the like; removable storage may include flash drives, floppy disks, optical disks, memory cards, zip disks, tapes, and the like; volatile read-write memory may include Random Access Memory (RAM); the RAM may include Dynamic RAM (DRAM), Double Data Rate Synchronous Dynamic RAM (DDR SDRAM), Static RAM (SRAM), Thyristor-based RAM (T-RAM), Zero-capacitor RAM (Z-RAM), and the like. By way of example, the ROM may include Mask ROM (MROM), Programmable ROM (PROM), Erasable Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), Compact Disc ROM (CD-ROM), Digital Versatile Disc ROM (DVD-ROM), and the like.
For ease of illustration, only one processor 301 is depicted in the identification device of the miRNA sponge module. However, it should be noted that the identification device of the miRNA sponge module in the present application may also comprise a plurality of processors 301, and thus the steps described in the present application as performed by one processor may also be performed by a plurality of processors jointly or individually. For example, if the processor 301 of the identification device of the miRNA sponge module performs steps A and B, it should be understood that steps A and B may also be performed jointly by two different processors or separately within one processor. For example, a first processor performs step A and a second processor performs step B, or the first processor and the second processor perform steps A and B together.
Optionally, the present invention further provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, performs the steps of the identification method of the miRNA sponge module.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (13)

1. A method for identifying a micro ribonucleic acid (miRNA) sponge module is characterized by comprising the following steps:
obtaining an expression matrix of the sponge genes and an expression matrix of the target genes which are matched with the sample;
obtaining a plurality of sponge gene-target gene co-expression modules according to the expression matrix of the sponge genes and the expression matrix of the target genes, wherein each sponge gene-target gene co-expression module represents a gene which can be co-expressed by the sponge genes and the target genes;
obtaining a significance value of a shared miRNA, a canonical correlation coefficient between a sponge gene and a target gene and a sensitivity canonical correlation coefficient when the miRNA is shared in each sponge gene-target gene co-expression module;
and determining whether each sponge gene-target gene co-expression module is a miRNA sponge module according to the number of the sponge genes and the target genes in each sponge gene-target gene co-expression module, the significance value of the shared miRNA, the canonical correlation coefficient between the sponge genes and the target genes and the sensitivity canonical correlation coefficient when the miRNA is shared.
2. The method of claim 1, wherein obtaining a plurality of sponge gene-target gene co-expression modules from the expression matrix of the sponge genes and the expression matrix of the target genes comprises:
clustering the expression matrix of the sponge genes and the expression matrix of the target genes according to a clustering algorithm to obtain a plurality of clustering results of the sponge genes and the target genes;
and taking the sponge gene and the target gene in each clustering result as one sponge gene-target gene co-expression module.
3. The method of claim 2, wherein the clustering algorithm comprises: a unidirectional clustering algorithm or a bidirectional clustering algorithm;
if the clustering algorithm is the unidirectional clustering algorithm, clustering the sponge genes and the target genes according to the expression matrix of the sponge genes, the expression matrix of the target genes and the matched sample;
and if the clustering algorithm is the bidirectional clustering algorithm, clustering the sponge genes, the target genes and the matching samples of the preset part according to the expression matrix of the sponge genes, the expression matrix of the target genes and the matching samples of the preset part.
4. The method of any one of claims 1-3, wherein obtaining a significance value for shared miRNAs comprises:
and obtaining the significance value of the miRNA shared between the sponge gene and the target gene in the sponge gene-target gene co-expression module by a hypergeometric distribution test method according to a preset miRNA-target gene regulation relation.
5. The method of any one of claims 1 to 3, wherein obtaining a canonical correlation coefficient between the sponge gene and the target gene comprises:
acquiring column vectors of the sponge genes and column vectors of the target genes according to the expression matrix of the sponge genes and the expression matrix of the target genes;
acquiring a variance matrix and a covariance matrix between the expression matrix of the sponge genes and the expression matrix of the target genes;
and calculating the canonical correlation coefficient according to the column vector of the sponge gene, the column vector of the target gene, the variance matrix, the covariance matrix and a preset canonical vector.
6. The method of any one of claims 1-3, wherein obtaining a sensitivity canonical correlation coefficient when sharing miRNAs comprises:
obtaining a canonical correlation coefficient between a shared miRNA and the sponge gene and a canonical correlation coefficient between the shared miRNA and the target gene;
calculating a partial canonical correlation coefficient between the sponge gene and the target gene according to the canonical correlation coefficient between the sponge gene and the target gene, the canonical correlation coefficient between the shared miRNA and the sponge gene, and the canonical correlation coefficient between the shared miRNA and the target gene;
and subtracting the partial canonical correlation coefficient between the sponge gene and the target gene from the canonical correlation coefficient between the sponge gene and the target gene to obtain the sensitivity canonical correlation coefficient when the miRNA is shared.
7. The method of claim 6, wherein obtaining the canonical correlation coefficient between shared miRNA and the sponge gene and the canonical correlation coefficient between shared miRNA and the target gene comprises:
obtaining an expression matrix of the shared miRNA;
obtaining column vectors of the shared miRNA, the sponge genes and the target genes according to the expression matrix of the shared miRNA, the expression matrix of the sponge genes and the expression matrix of the target genes;
obtaining a variance matrix and a covariance matrix between the expression matrix of the shared miRNA and the expression matrix of the sponge gene;
calculating a canonical correlation coefficient between the shared miRNA and the sponge gene according to the column vector of the shared miRNA, the column vector of the sponge gene, a variance matrix between the expression matrix of the shared miRNA and the expression matrix of the sponge gene, a covariance matrix and a preset canonical vector;
obtaining a variance matrix and a covariance matrix between the expression matrix of the shared miRNA and the expression matrix of the target gene;
calculating a canonical correlation coefficient between the shared miRNA and the target gene according to the column vector of the shared miRNA, the column vector of the target gene, a variance matrix between the expression matrix of the shared miRNA and the expression matrix of the target gene, a covariance matrix and a preset canonical vector.
8. The method of claim 1, wherein the determining whether each of the sponge gene-target gene co-expression modules is a miRNA sponge module according to the number of sponge genes and target genes in each of the sponge gene-target gene co-expression modules, the significance value of the shared miRNA, the canonical correlation coefficient between the sponge genes and the target genes, and the sensitivity canonical correlation coefficient when the miRNA is shared comprises:
and if the number of the sponge genes and the number of the target genes in the sponge gene-target gene co-expression module are both more than or equal to 2, the significance value of the shared miRNA is less than 0.05, the canonical correlation coefficient between the sponge genes and the target genes is more than 0.8, and the sensitivity canonical correlation coefficient when the miRNA is shared is more than 0.1, determining that the sponge gene-target gene co-expression module is the miRNA sponge module.
9. The method of claim 1, further comprising, after determining whether each of the sponge gene-target gene co-expression modules is a miRNA sponge module:
and if the sponge gene-target gene co-expression module is a miRNA sponge module, acquiring correlation data of the miRNA sponge module and a target gene, wherein the correlation data comprises significance data of the miRNA sponge module and a preset miRNA sponge module, significance data of the miRNA sponge module and the target gene, and whether the miRNA sponge module is a marker of the target gene, and the miRNA sponge module comprises a long-chain non-coding ribonucleic acid (LncRNA) and a protein coding ribonucleic acid (mRNA).
10. The method of claim 9, wherein obtaining significance data for the miRNA sponge module and the preset miRNA sponge module comprises:
obtaining a first Pearson correlation coefficient value of the average absolute value of all LncRNA-mRNA coexpression pairs in each miRNA sponge module;
generating a plurality of preset miRNA sponge modules according to all LncRNA and mRNA in each miRNA sponge module by a preset random algorithm;
obtaining a second Pearson correlation coefficient value of the average absolute value of all LncRNA-mRNA coexpression pairs in each preset miRNA sponge module;
and acquiring significance data of the miRNA sponge module and a preset miRNA sponge module according to the first Pearson correlation coefficient value and the second Pearson correlation coefficient value by a preset pairing difference inspection algorithm.
11. The method of claim 9, wherein obtaining significance data of the miRNA sponge module and the target gene comprises:
acquiring the quantity of LncRNA and mRNA in a preset data set and the quantity of LncRNA and mRNA coexpressed with a target gene in the preset data set;
acquiring the quantity of LncRNA and mRNA in the miRNA sponge module and the quantity of LncRNA and mRNA coexpressed by the miRNA sponge module and a target gene;
and calculating significance data of the miRNA sponge module and the target gene by a hyper-geometric distribution test method according to the quantity of LncRNA and mRNA in the preset data set and the quantity of LncRNA and mRNA co-expressed with the target gene in the preset data set, the quantity of LncRNA and mRNA in the miRNA sponge module and the quantity of LncRNA and mRNA co-expressed with the target gene in the miRNA sponge module.
12. The method of claim 9, wherein determining whether the miRNA sponge module is a marker for a target gene comprises:
calculating a risk value of each matched sample through a first preset algorithm according to the miRNA;
binarizing the risk value of each of the matched samples to obtain a first set of risk values and a second set of risk values, wherein the risk value in the first set of risk values is greater than the risk value in the second set of risk values;
calculating a risk ratio of the first risk value set and the second risk value set through a second preset algorithm according to the first risk value set and the second risk value set;
obtaining, from the first set of risk values and the second set of risk values, a difference significance of the first set of risk values and the second set of risk values according to a test algorithm;
determining whether the miRNA is a marker of the target gene according to the risk ratio and the difference significance.
13. An identification device for miRNA sponge modules, comprising:
an acquisition module, configured to acquire an expression matrix of sponge genes and an expression matrix of target genes for matched samples;
the acquisition module being further configured to acquire a plurality of sponge gene-target gene co-expression modules according to the expression matrix of the sponge genes and the expression matrix of the target genes, wherein each sponge gene-target gene co-expression module represents a group of sponge genes and target genes that are co-expressed;
the acquisition module being further configured to acquire, for each sponge gene-target gene co-expression module, the significance value of the shared miRNAs, the canonical correlation coefficient between the sponge genes and the target genes, and the sensitivity canonical correlation coefficient given the shared miRNAs;
and a determining module, configured to determine whether each sponge gene-target gene co-expression module is a miRNA sponge module according to the number of sponge genes and target genes in that module, the significance value of the shared miRNAs, the canonical correlation coefficient between the sponge genes and the target genes, and the sensitivity canonical correlation coefficient given the shared miRNAs.
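The determining module's decision can be pictured as a predicate over the four quantities the claim lists. The sketch below is a minimal illustration, assuming that each quantity is compared against its own threshold; the class name, field names and every default cutoff are hypothetical placeholders, since this claim does not state any threshold values.

```python
from dataclasses import dataclass


@dataclass
class CoexpressionModule:
    """Summary statistics for one sponge gene-target gene co-expression module."""
    n_sponge_genes: int
    n_target_genes: int
    shared_mirna_pvalue: float          # significance value of the shared miRNAs
    corr_coefficient: float             # correlation between sponge and target genes
    sensitivity_corr_coefficient: float  # sensitivity correlation given shared miRNAs


def is_mirna_sponge_module(module, min_genes=2, p_cutoff=0.05,
                           corr_cutoff=0.8, sensitivity_cutoff=0.1):
    """Return True when the module passes every (assumed) threshold."""
    return (
        module.n_sponge_genes >= min_genes
        and module.n_target_genes >= min_genes
        and module.shared_mirna_pvalue < p_cutoff
        and module.corr_coefficient > corr_cutoff
        and module.sensitivity_corr_coefficient > sensitivity_cutoff
    )


# Example with made-up statistics for two candidate modules.
candidates = [
    CoexpressionModule(5, 8, 0.001, 0.9, 0.3),
    CoexpressionModule(5, 8, 0.200, 0.9, 0.3),
]
sponge_modules = [m for m in candidates if is_mirna_sponge_module(m)]
```

Keeping the cutoffs as keyword arguments makes it easy to substitute whatever values the full specification prescribes.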
CN201910684738.7A 2019-07-26 2019-07-26 Identification method and device of miRNA sponge module Expired - Fee Related CN110322926B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910684738.7A CN110322926B (en) 2019-07-26 2019-07-26 Identification method and device of miRNA sponge module

Publications (2)

Publication Number Publication Date
CN110322926A (en) 2019-10-11
CN110322926B (en) 2021-06-08

Family

ID=68124564

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910684738.7A Expired - Fee Related CN110322926B (en) 2019-07-26 2019-07-26 Identification method and device of miRNA sponge module

Country Status (1)

Country Link
CN (1) CN110322926B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719195A (en) * 2009-12-03 2010-06-02 Shanghai University Inference method of stepwise regression gene regulatory network
CN103514381A (en) * 2013-07-22 2014-01-15 Hunan University Protein biological network motif identification method integrating topological attributes and functions
CN106874706A (en) * 2017-01-18 2017-06-20 Hunan University Disease association factor identification method and system based on functional module
CN109033750A (en) * 2018-07-18 2018-12-18 Wenzhou University Method for quantifying the degree of influence of miRNA on related disease genes
CN109712717A (en) * 2018-12-27 2019-05-03 Hunan University Cancer-related microRNA identification method based on miRNA-gene regulation modules

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
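Li Sang et al.: "Comparison of molecular characteristics of dilated and restrictive cardiomyopathy based on gene co-expression network analysis", Chinese Journal of Biomedical Engineering *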

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110993020A (en) * 2019-11-01 2020-04-10 University of Electronic Science and Technology of China Identification method of miRNA sponge interaction
CN111370062A (en) * 2020-03-03 2020-07-03 University of Electronic Science and Technology of China miRNA causal control network identification method and device, electronic equipment and storage medium
CN111370062B (en) * 2020-03-03 2021-04-09 University of Electronic Science and Technology of China miRNA causal control network identification method and device, electronic equipment and storage medium
CN111383709A (en) * 2020-03-09 2020-07-07 University of Electronic Science and Technology of China Recognition method and device for ceRNA competition module, electronic equipment and storage medium
CN111383709B (en) * 2020-03-09 2021-06-08 University of Electronic Science and Technology of China Recognition method and device for ceRNA competition module, electronic equipment and storage medium
CN114446396A (en) * 2021-12-17 2022-05-06 Guangzhou Baoliang Medical Technology Co., Ltd. Group matching method, system, equipment and storage medium for intestinal flora transplantation

Also Published As

Publication number Publication date
CN110322926B (en) 2021-06-08

Similar Documents

Publication Publication Date Title
CN110322926B (en) Identification method and device of miRNA sponge module
Lall et al. Structure-aware principal component analysis for single-cell RNA-seq data
Larsson et al. Comparative microarray analysis
CN111913999B (en) Statistical analysis method, system and storage medium based on multiple groups of study and clinical data
Li et al. scImpute: accurate and robust imputation for single cell RNA-seq data
Gu et al. Bayesian inference of rna velocity from multi-lineage single-cell data
CN111028887B (en) Method and device for identifying ncRNA (non-coding ribonucleic acid) cooperative competition network
CN111383709B (en) Recognition method and device for CERNA competition module, electronic equipment and storage medium
Saei et al. A glance at DNA microarray technology and applications
CN115148291A (en) Single-sample CERNA competition module identification method and device, electronic equipment and storage medium
Yan et al. Bayesian bi-clustering methods with applications in computational biology
Qin et al. An efficient method to identify differentially expressed genes in microarray experiments
CN117616505A (en) Systems and methods for correlating compounds with physiological conditions using fingerprinting
Hassani et al. Active learning for microRNA prediction
CN113724789A (en) Single-sample CERNA network identification method, device, electronic equipment and storage medium
CN105893789A (en) Significance analysis method
CN113947149B (en) Similarity measurement method and device for gene module group, electronic device and storage medium
Tsai et al. Significance analysis of ROC indices for comparing diagnostic markers: applications to gene microarray data
CN116486908B (en) Single cell miRNA sponge network reasoning method, device, equipment and storage medium
Aris et al. A method to improve detection of disease using selectively expressed genes in microarray data
KR20190069929A (en) miRNA DATA ANALYSIS METHOD FOR SERVER
Patel et al. Cross-validation and cross-study validation of chronic lymphocytic leukemia with exome sequences and machine learning
CN112382339B (en) Method and device for identifying zygotic gene activation ZGA genes
CN111247590A (en) Detection device and method
CN110675917B (en) Biomarker identification method for individual cancer sample

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220831

Address after: No.21 Nie'Er Road, Hongta District, Yuxi City, Yunnan Province

Patentee after: Sun Shaoping

Address before: No. 2006, Xiyuan Avenue, High-tech Zone (West District), Chengdu, Sichuan 611731

Patentee before: University of Electronic Science and Technology of China

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210608