CN112992347A - lncRNA-disease association prediction method and system based on Laplacian regularized least squares and network projection

Publication number
CN112992347A
CN112992347A CN202110428827.2A CN202110428827A CN112992347A CN 112992347 A CN112992347 A CN 112992347A CN 202110428827 A CN202110428827 A CN 202110428827A CN 112992347 A CN112992347 A CN 112992347A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110428827.2A
Other languages
Chinese (zh)
Inventor
陈敏
邓英伟
黎昂
谭艳
李泽军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Institute of Technology
Original Assignee
Hunan Institute of Technology
Application filed by Hunan Institute of Technology
Priority to CN202110428827.2A
Publication of CN112992347A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Chemical & Material Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Analytical Chemistry (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to an lncRNA-disease association prediction method and system based on Laplacian regularized least squares and network projection. Compared with existing prediction methods, the method predicts the associations between all diseases and all lncRNAs simultaneously, can be applied to isolated diseases and new lncRNAs, requires no negative samples, has only one parameter, and achieves higher prediction accuracy.

Description

lncRNA-disease association prediction method and system based on Laplacian regularized least squares and network projection
Technical Field
The invention relates to the technical field of bioinformatics, and in particular to an lncRNA-disease association prediction method and system based on Laplacian regularized least squares and network projection.
Background
Long non-coding RNAs (lncRNAs) are non-coding RNAs longer than 200 nucleotides. In recent years, a large body of evidence has shown that many lncRNAs are closely related to human diseases, and that mutation and dysregulation of lncRNAs can cause various diseases, including cervical cancer and ovarian cancer. Identifying and predicting lncRNA-disease relationships therefore helps to explore the pathogenesis of diseases, which has made the identification and confirmation of lncRNA-disease associations an important topic in biological research. However, determining such associations through biological experiments is time-consuming and labor-intensive, whereas using computational methods to predict potential disease-associated lncRNAs can greatly reduce the workload and save cost and time.
Disclosure of Invention
The invention aims to provide an lncRNA-disease association prediction method based on Laplacian regularized least squares and network projection that is simple to implement and yields accurate results.
To achieve this purpose, the lncRNA-disease association prediction method based on Laplacian regularized least squares and network projection comprises the following steps:
Step one: on the basis of disease semantic similarity, combine the disease Gaussian interaction profile kernel similarity to obtain a comprehensive disease similarity matrix; on the basis of lncRNA functional similarity, combine the lncRNA Gaussian interaction profile kernel similarity to obtain a comprehensive lncRNA similarity matrix.
Step two: apply Laplacian regularized least squares to the comprehensive disease similarity matrix to obtain a disease estimated score matrix, apply Laplacian regularized least squares to the comprehensive lncRNA similarity matrix to obtain an lncRNA estimated score matrix, and integrate the two to obtain an lncRNA-disease composite estimated score matrix.
Step three: project the comprehensive disease similarity matrix onto the composite estimated score matrix to obtain one projection score matrix, and project the comprehensive lncRNA similarity matrix onto the composite estimated score matrix to obtain another projection score matrix; then combine the transpose of the disease-based projection score matrix with the lncRNA-based projection score matrix and take their average to obtain the final lncRNA-disease association prediction scores, from which the disease-associated lncRNA prediction result is obtained.
In step one, the disease Gaussian interaction profile kernel similarity is expressed as:
[formula image]
where the quantity on the left-hand side is the Gaussian interaction profile kernel similarity between the i-th disease and the j-th disease; the two vectors being compared are the i-th and j-th disease columns of the known lncRNA-disease association matrix, i.e. the association profiles of the two diseases; and the bandwidth parameter, which controls the kernel bandwidth, is calculated by the following formula:
[formula image]
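For orientation, the Gaussian interaction profile kernel is commonly written in the literature in the form below. This is a sketch of that standard form and not a transcription of the formula images: the symbols K_D, A, gamma_d, gamma_d' and n_d are assumed names, with A_{.i} standing for the i-th disease column of the known association matrix and n_d for the number of diseases.

```latex
K_D(d_i, d_j) = \exp\!\left(-\gamma_d \,\lVert A_{\cdot i} - A_{\cdot j} \rVert^{2}\right),
\qquad
\gamma_d = \frac{\gamma_d'}{\dfrac{1}{n_d}\sum_{k=1}^{n_d} \lVert A_{\cdot k} \rVert^{2}}
```

Dividing by the mean squared norm of the profiles makes the bandwidth independent of how many associations are known per disease on average. Whether the original fixes gamma_d' to a particular value cannot be read from this text.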
Further, in step one, the lncRNA Gaussian interaction profile kernel similarity is expressed as:
[formula image]
where the quantity on the left-hand side is the Gaussian interaction profile kernel similarity between the i-th lncRNA and the j-th lncRNA; the two vectors being compared are the association profiles of the i-th and j-th lncRNAs taken from the same known association matrix; and the bandwidth parameter, which controls the kernel bandwidth, is calculated by the following formula:
[formula image]
Furthermore, in step one, the comprehensive disease similarity matrix obtained by combining the disease Gaussian interaction profile kernel similarity with the disease semantic similarity is:
[formula image]
The comprehensive lncRNA similarity matrix obtained by combining the lncRNA Gaussian interaction profile kernel similarity with the lncRNA functional similarity is:
[formula image]
In addition, in step two, Laplacian regularized least squares is applied to the comprehensive disease similarity matrix to obtain the disease estimated score matrix:
[formula image]
The disease estimated score matrix is obtained by solving the following optimization problem:
[formula image]
where the diagonal matrix in the formula has as its i-th diagonal element the sum of all elements in row i of the similarity matrix; the balance parameter weights the regularization term; and the norm is the Frobenius norm.
Further, in step two, Laplacian regularized least squares is applied to the comprehensive lncRNA similarity matrix to obtain the lncRNA estimated score matrix:
[formula image]
The lncRNA estimated score matrix is obtained by solving the following optimization problem:
[formula image]
where the diagonal matrix in the formula has as its i-th diagonal element the sum of all elements in row i of the similarity matrix; the balance parameter weights the regularization term; and the norm is the Frobenius norm.
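The optimization problems described above match the usual Laplacian regularized least squares setup. One commonly used formulation and its closed-form solution are sketched below. The symbols are assumed names (S for a comprehensive similarity matrix, D for its diagonal row-sum matrix, L for the normalized Laplacian, Y for the known association matrix oriented so that its rows match S, eta for the balance parameter, F for the estimated score matrix), and the exact form in the formula images may differ.

```latex
L = D^{-1/2}\,(D - S)\,D^{-1/2}, \qquad D_{ii} = \sum_{k} S_{ik}

\min_{F}\; \lVert Y - F \rVert_F^{2} + \eta\,\operatorname{tr}\!\left(F^{\mathsf{T}} L F\right)

F^{*} = (I + \eta L)^{-1}\,Y \;=\; S\,(S + \eta\,L\,S)^{-1}\,Y \quad \text{(second form valid when } S \text{ is invertible)}
```

Setting the gradient $-2(Y - F) + 2\eta L F$ to zero gives the first closed form; the second follows because $S + \eta L S = (I + \eta L)\,S$.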
Further, in step two, the disease estimated score matrix and the lncRNA estimated score matrix are integrated as follows to obtain the lncRNA-disease composite estimated score matrix:
[formula image]
where the transposed term in the formula denotes the transpose of the corresponding estimated score matrix, so that both terms have the same dimensions.
In addition, in step three, the comprehensive disease similarity matrix is projected onto the lncRNA-disease composite estimated score matrix to obtain the projection score matrix based on the comprehensive disease similarity matrix:
[formula image]
where the norm appearing in the formula is the 2-norm of the indicated vector.
The comprehensive lncRNA similarity matrix is projected onto the lncRNA-disease composite estimated score matrix to obtain the projection score matrix based on the comprehensive lncRNA similarity matrix:
[formula image]
where the norm appearing in the formula is again the 2-norm of the indicated vector.
Further, in step three, the transpose of the projection score matrix based on the comprehensive disease similarity matrix is combined with the projection score matrix based on the comprehensive lncRNA similarity matrix, and their average is taken to obtain the final lncRNA-disease association prediction scores:
[formula image]
where the transposed term is the transpose of the disease-based projection score matrix.
Finally, the invention also relates to an lncRNA-disease association prediction system based on Laplacian regularized least squares and network projection, comprising:
a data preparation unit, configured to construct a comprehensive disease similarity matrix from the disease semantic similarity and the disease Gaussian interaction profile kernel similarity, and to construct a comprehensive lncRNA similarity matrix from the lncRNA functional similarity and the lncRNA Gaussian interaction profile kernel similarity;
an lncRNA-disease association score estimation unit, configured to apply Laplacian regularized least squares to the comprehensive disease similarity matrix and the comprehensive lncRNA similarity matrix constructed by the data preparation unit, and to construct the lncRNA-disease composite estimated score matrix;
an lncRNA-disease association score refining unit, configured to project the comprehensive disease similarity matrix and the comprehensive lncRNA similarity matrix constructed by the data preparation unit onto the composite estimated score matrix constructed by the score estimation unit, and to fuse the two projection scores to obtain the disease-associated lncRNA prediction result;
wherein the system predicts associations between lncRNAs and diseases according to the prediction method described above.
Most existing prediction methods measure the similarity between diseases and the functional similarity between lncRNAs using only disease semantic similarity and lncRNA functional similarity. Because of missing data, the similarity between many pairs of diseases and between many pairs of lncRNAs is zero, which reduces prediction accuracy. In addition, a large number of negative association samples must be supplied to ensure accuracy, and selecting negative samples is very difficult. Unlike existing disease-associated lncRNA prediction methods, the invention first constructs a comprehensive disease similarity matrix from the disease Gaussian interaction profile kernel similarity and the disease semantic similarity, and a comprehensive lncRNA similarity matrix from the lncRNA Gaussian interaction profile kernel similarity and the lncRNA functional similarity. Introducing the two Gaussian interaction profile kernel similarities compensates for the inaccuracy of the disease semantic similarity and the lncRNA functional similarity, and describes the similarity among diseases and among lncRNAs more accurately. Laplacian regularized least squares is then applied to the comprehensive disease similarity matrix and to the comprehensive lncRNA similarity matrix, which alleviates the sparsity of the known lncRNA-disease association data. The two resulting estimated score matrices are integrated into an lncRNA-disease composite estimated score matrix. Finally, using the network projection method, the comprehensive disease similarity matrix and the comprehensive lncRNA similarity matrix are each projected onto the composite estimated score matrix to obtain the lncRNA-disease association prediction result.
Compared with existing prediction methods, the lncRNA-disease association prediction method based on Laplacian regularized least squares and network projection is a global prediction method: it predicts the associations between all diseases and all lncRNAs simultaneously, can be applied to isolated diseases and new lncRNAs, requires no negative samples, has only one parameter, achieves higher accuracy in predicting unknown lncRNA-disease associations, and generalizes better.
Description of the drawings:
Fig. 1 shows the results of the leave-one-out cross-validation (LOOCV) experiments performed on data set 1 and data set 2 with the lncRNA-disease association prediction method based on Laplacian regularized least squares and network projection described in the embodiment.
Fig. 2 compares the ROC curves and AUC values on data set 1 of the prediction method described in the embodiment and two existing methods.
Fig. 3 compares the ROC curves and AUC values on data set 2 of the prediction method described in the embodiment and two existing methods.
Fig. 4 shows the ROC curves and AUC values obtained on data set 1 and data set 2 when the prediction method described in the embodiment is used to predict isolated diseases and new lncRNAs.
Detailed Description
To aid understanding by those skilled in the art, the invention is further described below with reference to an embodiment and the drawings, which are not intended to limit the invention.
In this embodiment, the lncRNA-disease association prediction method based on Laplacian regularized least squares and network projection mainly comprises the following steps:
I. Data preparation: on the basis of disease semantic similarity, combine the disease Gaussian interaction profile kernel similarity to obtain a comprehensive disease similarity matrix; on the basis of lncRNA functional similarity, combine the lncRNA Gaussian interaction profile kernel similarity to obtain a comprehensive lncRNA similarity matrix.
1.1 lncRNA-disease associations: the 2013 and 2015 versions of the LncRNADisease database, which records associations between lncRNAs and human diseases, were obtained and processed. From the 2013 version, 156 lncRNAs, 190 diseases and 352 known, experimentally verified lncRNA-disease associations were extracted as data set 1; from the 2015 version, 285 lncRNAs, 226 diseases and 621 known, experimentally verified lncRNA-disease associations were extracted as data set 2. In both data sets, one set denotes the lncRNAs, another set denotes the diseases, and a Boolean matrix denotes the lncRNA-disease associations: if the i-th lncRNA node and the j-th disease node have an experimentally verified association, the corresponding matrix entry is set to 1; otherwise it is set to 0.
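As a minimal sketch of the Boolean association matrix just described, the snippet below builds it from a list of known pairs. The function name `build_association_matrix` and the toy identifiers are hypothetical and are not entries from the source database; rows are assumed to index lncRNAs and columns to index diseases.

```python
import numpy as np

def build_association_matrix(lncrnas, diseases, known_pairs):
    """Return a 0/1 matrix with rows = lncRNAs and columns = diseases."""
    row = {name: i for i, name in enumerate(lncrnas)}
    col = {name: j for j, name in enumerate(diseases)}
    A = np.zeros((len(lncrnas), len(diseases)), dtype=int)
    for lnc, dis in known_pairs:
        A[row[lnc], col[dis]] = 1  # experimentally verified association
    return A

# Hypothetical toy identifiers, used only for illustration.
lncrnas = ["lnc1", "lnc2", "lnc3"]
diseases = ["dis1", "dis2"]
known_pairs = [("lnc1", "dis1"), ("lnc2", "dis2"), ("lnc3", "dis1")]
A = build_association_matrix(lncrnas, diseases, known_pairs)
print(A)
```

With the sizes reported for data set 1, the same construction would give a 156 x 190 matrix containing 352 ones.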
1.2 Disease semantic similarity: in the prior art, each disease corresponds to a directed acyclic graph (DAG) in MeSH (Medical Subject Headings), and the semantic similarity between two diseases can be measured from their DAGs: the more disease terms the two DAGs share, the greater the similarity between the two diseases. Since the method for calculating semantic similarity belongs to the prior art, it is not described in detail here.
1.3 lncRNA functional similarity: in the prior art, lncRNA functional similarity is usually calculated from the disease semantic similarity and the known lncRNA-disease associations, as follows:
1) Select any two lncRNAs and denote the sets of diseases associated with each of them as two disease sets. The functional similarity of the two lncRNAs is then defined as follows:
[formula image]
where m and n are the numbers of diseases known to be associated with the first and the second lncRNA, respectively;
2) the association score between a given disease and a given disease set, which appears in the formula above, is calculated as follows:
[formula image]
This embodiment uses the above method to calculate the functional similarity between lncRNAs and stores the result in a matrix. Since this method belongs to the prior art, it is not described in further detail here.
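The two steps above can be sketched as follows. The formula images are not available, so this follows a form widely used in prior art and should be read as an assumption: the score of a disease against a disease set is taken as its maximum semantic similarity to any member of the set, and the functional similarity is the sum of these scores in both directions divided by m + n. The names `disease_set_score`, `functional_similarity` and `dss` are hypothetical, and the numbers are toy values.

```python
import numpy as np

def disease_set_score(d, disease_set, dss):
    """Score of disease d against a disease set: its best semantic match (assumed)."""
    return max(dss[d, t] for t in disease_set)

def functional_similarity(A, dss):
    """A: lncRNA x disease 0/1 matrix; dss: disease x disease semantic similarity."""
    groups = [np.flatnonzero(row) for row in A]  # disease indices per lncRNA
    n = len(groups)
    fs = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            Di, Dj = groups[i], groups[j]
            if len(Di) == 0 or len(Dj) == 0:
                continue  # no known diseases: similarity left at zero
            total = sum(disease_set_score(d, Dj, dss) for d in Di)
            total += sum(disease_set_score(d, Di, dss) for d in Dj)
            fs[i, j] = total / (len(Di) + len(Dj))  # divide by m + n
    return fs

# Toy inputs: 3 lncRNAs, 3 diseases; the third lncRNA has no known disease.
A = np.array([[1, 1, 0], [0, 1, 1], [0, 0, 0]])
dss = np.array([[1.0, 0.5, 0.2], [0.5, 1.0, 0.4], [0.2, 0.4, 1.0]])
FS = functional_similarity(A, dss)
```

The all-zero row for the third lncRNA illustrates the missing-data problem that motivates the kernel similarity introduced in the next subsection.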
1.4 Disease and lncRNA Gaussian interaction profile kernel similarity: when disease semantic similarity alone is used to measure the similarity between diseases, missing data cause the semantic similarity between many pairs of diseases to be zero, which reduces prediction accuracy. This embodiment therefore introduces the disease Gaussian interaction profile kernel similarity to compensate:
[formula image]
where the quantity on the left-hand side is the Gaussian interaction profile kernel similarity between the i-th disease and the j-th disease; the two vectors being compared are the i-th and j-th disease columns of the known lncRNA-disease association matrix, i.e. the association profiles of the two diseases; and the bandwidth parameter, which controls the kernel bandwidth, is calculated by the following formula:
[formula image]
Similarly, the lncRNA Gaussian interaction profile kernel similarity is calculated as follows:
[formula image]
where the quantity on the left-hand side is the Gaussian interaction profile kernel similarity between the i-th lncRNA and the j-th lncRNA; the two vectors being compared are the association profiles of the i-th and j-th lncRNAs taken from the same known association matrix; and the bandwidth parameter, which controls the kernel bandwidth, is calculated by the following formula:
[formula image]
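A minimal NumPy sketch of a Gaussian interaction profile kernel is given below, assuming the form commonly used in the literature: the kernel is the exponential of the negative squared distance between two association profiles, with the bandwidth normalized by the mean squared norm of all profiles. The function name `gip_kernel` and the parameter `gamma_prime` are hypothetical, and the default of 1.0 is an assumption rather than a value taken from the source.

```python
import numpy as np

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian kernel between rows of `profiles` (one association profile per row)."""
    P = np.asarray(profiles, dtype=float)
    sq_norms = (P ** 2).sum(axis=1)
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        raise ValueError("at least one known association is required")
    gamma = gamma_prime / mean_sq  # bandwidth normalized by mean squared norm
    # Squared Euclidean distances between all pairs of profiles.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (P @ P.T)
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dist)

# Toy association matrix: rows = lncRNAs, columns = diseases (assumed layout).
A = np.array([[1, 0, 1], [1, 0, 0], [0, 1, 0]])
K_l = gip_kernel(A)    # lncRNA kernel: profiles are rows of A
K_d = gip_kernel(A.T)  # disease kernel: profiles are columns of A
```

Because the kernel depends only on the known associations, it is nonzero for every pair of entities, including pairs whose semantic or functional similarity is zero.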
1.5 Constructing the comprehensive disease similarity matrix and the comprehensive lncRNA similarity matrix: the disease semantic similarity and the disease Gaussian interaction profile kernel similarity are integrated to obtain the comprehensive disease similarity matrix, and the lncRNA functional similarity and the lncRNA Gaussian interaction profile kernel similarity are integrated to obtain the comprehensive lncRNA similarity matrix:
[formula image]
[formula image]
II. Estimating the lncRNA-disease association scores: to address the sparsity of the known lncRNA-disease association network, Laplacian regularized least squares is applied to the comprehensive disease similarity matrix to obtain a disease estimated score matrix and to the comprehensive lncRNA similarity matrix to obtain an lncRNA estimated score matrix; the two estimated score matrices are then integrated to obtain the lncRNA-disease composite estimated score matrix.
2.1 Applying Laplacian regularized least squares to the comprehensive lncRNA similarity matrix: Laplacian regularized least squares is used on the comprehensive lncRNA similarity matrix to prioritize lncRNA-disease associations, giving the lncRNA estimated score matrix:
[formula image]
The lncRNA estimated score matrix is obtained by solving the following optimization problem:
[formula image]
where the diagonal matrix in the formula has as its i-th diagonal element the sum of all elements in row i of the similarity matrix; the balance parameter takes the value [value image] in this embodiment; and the norm is the Frobenius norm.
2.2 Applying Laplacian regularized least squares to the comprehensive disease similarity matrix: as in 2.1, Laplacian regularized least squares is used on the comprehensive disease similarity matrix to prioritize lncRNA-disease associations, giving the disease estimated score matrix:
[formula image]
The disease estimated score matrix is obtained by solving the following optimization problem:
[formula image]
where the diagonal matrix in the formula has as its i-th diagonal element the sum of all elements in row i of the similarity matrix; the balance parameter takes the same value in this embodiment as the balance parameter on the lncRNA side, namely [value image]; and the norm is the Frobenius norm.
2.3 Integrating the two estimated score matrices: the disease estimated score matrix and the lncRNA estimated score matrix are integrated to obtain the lncRNA-disease composite estimated score matrix:
[formula image]
where the transposed term in the formula denotes the transpose of the corresponding estimated score matrix, so that both terms have the same dimensions.
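Steps 2.1 to 2.3 can be sketched as below under stated assumptions. The sketch uses the standard closed form of Laplacian regularized least squares with a symmetrically normalized Laplacian, and it assumes that the composite matrix is the plain average of the lncRNA-side scores and the transposed disease-side scores; neither detail can be confirmed from the formula images. All function names, the toy matrices and the balance parameter value 0.5 are hypothetical.

```python
import numpy as np

def normalized_laplacian(S):
    """L = D^{-1/2} (D - S) D^{-1/2}, with D the diagonal matrix of row sums of S."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=1))
    return np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]

def laprls_scores(S, Y, eta):
    """Minimizer of ||Y - F||_F^2 + eta * tr(F^T L F), i.e. (I + eta L)^{-1} Y."""
    L = normalized_laplacian(S)
    return np.linalg.solve(np.eye(len(S)) + eta * L, Y)

def composite_scores(S_l, S_d, A, eta_l, eta_d):
    """Average the lncRNA-side scores and the transposed disease-side scores."""
    F_l = laprls_scores(S_l, A, eta_l)    # lncRNA x disease
    F_d = laprls_scores(S_d, A.T, eta_d)  # disease x lncRNA
    return (F_l + F_d.T) / 2.0            # assumed: simple average

# Toy similarity matrices and associations (3 lncRNAs, 2 diseases).
S_l = np.array([[1.0, 0.5, 0.1], [0.5, 1.0, 0.3], [0.1, 0.3, 1.0]])
S_d = np.array([[1.0, 0.2], [0.2, 1.0]])
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
F = composite_scores(S_l, S_d, A, eta_l=0.5, eta_d=0.5)  # placeholder eta values
```

Solving the linear system with `np.linalg.solve` avoids forming an explicit inverse. With a balance parameter of zero the scores reduce to the known associations, and larger values spread scores to similar lncRNAs or diseases, which is how the step mitigates sparsity.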
III. Refining the lncRNA-disease association scores: the comprehensive disease similarity matrix is projected onto the lncRNA-disease composite estimated score matrix to obtain one projection score matrix, and the comprehensive lncRNA similarity matrix is projected onto the composite estimated score matrix to obtain another projection score matrix. The transpose of the disease-based projection score matrix is then combined with the lncRNA-based projection score matrix, and their average is taken as the final lncRNA-disease association prediction scores, from which the disease-associated lncRNA prediction result is obtained.
3.1 Network projection: after the estimated lncRNA-disease association scores have been obtained with Laplacian regularized least squares, projection scores are obtained by network projection.
First, the comprehensive lncRNA similarity matrix is projected onto the lncRNA-disease composite estimated score matrix to obtain the projection score matrix based on the comprehensive lncRNA similarity matrix:
[formula image]
where the norm appearing in the formula is the 2-norm of the indicated vector.
Then the comprehensive disease similarity matrix is projected onto the lncRNA-disease composite estimated score matrix to obtain the projection score matrix based on the comprehensive disease similarity matrix:
[formula image]
where the norm appearing in the formula is again the 2-norm of the indicated vector.
3.2 Fusing the projection scores: finally, the transpose of the projection score matrix based on the comprehensive disease similarity matrix is combined with the projection score matrix based on the comprehensive lncRNA similarity matrix, and their average is taken as the final lncRNA-disease association prediction scores, which give the prediction result:
[formula image]
where the transposed term is the transpose of the disease-based projection score matrix.
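The projection and fusion steps can be sketched as follows. The projection is assumed to take the form used in network consistency projection methods: each entry is the inner product of a similarity row with a score vector, divided by the 2-norm of that score vector. The function names `project` and `fuse_projections` are hypothetical, and the exact normalization in the formula images may differ.

```python
import numpy as np

def project(S, F):
    """P[i, j] = S[i, :] . F[:, j] / ||F[:, j]||_2 (all-zero columns give zero scores)."""
    norms = np.linalg.norm(F, axis=0)
    safe = np.where(norms > 0, norms, 1.0)  # avoid division by zero
    return (S @ F) / safe

def fuse_projections(S_l, S_d, F):
    """Average the lncRNA-side projection and the transposed disease-side projection."""
    P_l = project(S_l, F)    # lncRNA x disease
    P_d = project(S_d, F.T)  # disease x lncRNA
    return (P_l + P_d.T) / 2.0

# Toy check with identity similarities, where the projections reduce to normalized scores.
F = np.array([[3.0, 0.0], [4.0, 0.0], [0.0, 2.0]])
scores = fuse_projections(np.eye(3), np.eye(2), F)
```

Ranking the entries of the fused matrix within each disease column gives the candidate lncRNAs for that disease.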
IV. Evaluation: the performance of the prediction method described above (hereinafter referred to as "LRLSNP") was evaluated using leave-one-out cross-validation (LOOCV). Specifically, each known lncRNA-disease association is used in turn as the test sample while the remaining associations are used as training samples, until every known association has been tested once. ROC curves and AUC values are used as the performance measures. The ROC curve, also called the receiver operating characteristic curve or sensitivity curve, is a comprehensive indicator reflecting sensitivity and specificity. The AUC is the area under the ROC curve: the more convex the ROC curve and the closer it lies to the upper-left corner, the larger the AUC and the better the prediction performance.
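One way to organize such a LOOCV run is sketched below. It is an illustrative protocol, not necessarily the one used for the reported figures: each known association is hidden in turn, the predictor is rerun, and the held-out score is ranked against the scores of all pairs with no known association; the per-fold fractions are then averaged as an AUC-style summary. `loocv_auc` and `degree_baseline` are hypothetical names, and `degree_baseline` is only a stand-in for the full method.

```python
import numpy as np

def loocv_auc(A, predict):
    """Hide each known association once and rank its score against unknown pairs."""
    A = np.asarray(A, dtype=float)
    unknown = A == 0
    fold_scores = []
    for i, j in zip(*np.nonzero(A)):
        train = A.copy()
        train[i, j] = 0.0  # hide one known association
        scores = predict(train)
        held_out = scores[i, j]
        negatives = scores[unknown]
        # Fraction of unknown pairs ranked below the held-out pair (ties count half).
        wins = (held_out > negatives).sum() + 0.5 * (held_out == negatives).sum()
        fold_scores.append(wins / negatives.size)
    return float(np.mean(fold_scores))

def degree_baseline(train):
    """Hypothetical stand-in predictor: score = row degree + column degree."""
    return train.sum(axis=1, keepdims=True) + train.sum(axis=0, keepdims=True)

A_toy = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
auc = loocv_auc(A_toy, degree_baseline)
```

Any function that maps a training association matrix to a score matrix of the same shape can be passed as `predict`, so the same harness could compare several methods.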
It should be noted that this embodiment performs leave-one-out cross validation (LOOCV) experiments on data set 1 and data set 2 separately. The above prediction method involves two balance parameters (their symbols are rendered only as formula images in the original publication); for simplicity, the two balance parameters may be set to the same value. To obtain the optimal setting, the two balance parameters were gradually increased over a range of values (the endpoints are likewise given only as formula images), and the AUC value was calculated at each setting. The results of the LOOCV experiments on the two data sets are shown in FIG. 2, where the dataset1 curve represents the change in AUC values on data set 1 and the dataset2 curve represents the change in AUC values on data set 2; the trends on the two data sets are nearly the same. As can be seen from FIG. 2, over the first interval of balance parameter values the AUC values remained almost unchanged; over the second interval the AUC values decreased slightly; over the third interval the AUC values decreased significantly; and over the last interval the AUC values changed only slightly. Accordingly, the two balance parameters were set to the same value for both data sets (the chosen value is given as a formula image in the original).
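By way of non-limiting illustration, the parameter selection described above amounts to a one-dimensional sweep. In the following Python sketch the function name `sweep_balance_parameter` and the `evaluate_auc` callable (assumed to run LOOCV for one parameter value and return the AUC) are hypothetical; the candidate values must be supplied by the user, since the range used in this text is not recoverable.

```python
def sweep_balance_parameter(candidates, evaluate_auc):
    """Evaluate each candidate value and return (best_value, {value: auc}).

    Both balance parameters are assumed to share the same value, as the
    text above allows. `evaluate_auc` is a hypothetical callable.
    """
    aucs = {value: evaluate_auc(value) for value in candidates}
    # On ties, the first candidate with the highest AUC is kept.
    best = max(aucs, key=aucs.get)
    return best, aucs
```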
4.1 Performance comparison with other methods: two prior-art methods, IIRWR and LDAI-ISPS, were selected for comparison tests against LRLSNP. LOOCV was carried out on the data sets to evaluate the prediction performance of the three methods IIRWR, LDAI-ISPS and LRLSNP, each configured with its optimal parameters. Figures 3 and 4 show the ROC curves and AUC values of the three methods in the LOOCV experiments on data set 1 and data set 2, respectively. In data set 1, the AUC of LRLSNP was 0.9446, while the AUCs of IIRWR and LDAI-ISPS were 0.7883 and 0.9154, respectively; in data set 2, the AUC of LRLSNP was 0.9386, while the AUCs of IIRWR and LDAI-ISPS were 0.8230 and 0.8341, respectively. LRLSNP thus achieved the highest AUC on both data sets.
4.2 Isolated disease and new lncRNA prediction: an isolated disease is a disease for which no lncRNA association information is known. To simulate an isolated disease, all known associations between the queried disease and lncRNAs were removed. In cross validation on data set 1 and data set 2, one disease at a time was simulated as an isolated disease, and LRLSNP was then run on the remaining known information, so that each disease was predicted once as the test sample. The prediction results were evaluated using ROC curves and AUC values and are shown in FIG. 4; the AUC values in data set 1 and data set 2 were 0.8688 and 0.8865, respectively.
In recent years, more and more new lncRNAs have been discovered, but their relationships with diseases are mostly unknown, which poses a great challenge for prediction algorithms that many existing prediction methods cannot handle well. To verify the effectiveness of LRLSNP in predicting associations between new lncRNAs and diseases, the association information between the lncRNA to be predicted and all diseases was likewise removed in data set 1 and data set 2, and LRLSNP was then run for prediction. The prediction results are shown in FIG. 4: for new lncRNA prediction, the AUC values in data set 1 and data set 2 reached 0.8335 and 0.8078, respectively. This shows that LRLSNP can generalize to lncRNAs without any known association, and therefore also performs well in predicting associations between new lncRNAs and diseases.
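By way of non-limiting illustration, simulating an isolated disease or a new lncRNA as described above can be sketched in Python with NumPy. The function names are hypothetical, and the orientation of the association matrix (rows are lncRNAs, columns are diseases) is an assumption made for this sketch only.

```python
import numpy as np

def simulate_isolated_disease(assoc, disease_idx):
    """Return a copy of `assoc` with every association of one disease removed.

    Assumes rows are lncRNAs and columns are diseases (sketch convention).
    """
    masked = np.array(assoc, dtype=float, copy=True)
    masked[:, disease_idx] = 0.0
    return masked

def simulate_new_lncrna(assoc, lncrna_idx):
    """Return a copy of `assoc` with every association of one lncRNA removed."""
    masked = np.array(assoc, dtype=float, copy=True)
    masked[lncrna_idx, :] = 0.0
    return masked
```

The masked matrix replaces the original association matrix as training input, and the removed entries serve as the positive test samples for that disease or lncRNA.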
4.3 Case analysis: to further evaluate the effect of LRLSNP in predicting potential lncRNA-disease associations, two diseases, ovarian cancer and cervical cancer, were selected for case analysis, with data set 2 used as the experimental sample.
Using the known data, LRLSNP was run for ovarian cancer. The top 5 ovarian cancer-associated lncRNAs predicted by LRLSNP are shown in Table 1 below; 4 of them are supported by evidence in the LncRNADisease database. Only HOST2 is not confirmed by this database, but supporting evidence is reported in the literature (Artificial Cells, Nanomedicine, and Biotechnology 2019, 47: 4131-4138), which describes sinomenine hydrochloride acting on ovarian cancer cells by inhibiting expression of the long non-coding RNA HOST2. For cervical cancer, the top 5 cervical cancer-associated lncRNAs predicted by LRLSNP are also shown in Table 1 below, and all 5 are supported by evidence in the LncRNADisease database.
To evaluate the prediction performance of LRLSNP for isolated diseases, all known lncRNA associations of the disease under study were deleted, which ensures that the prediction uses only the similarity information of that disease and the known lncRNA associations of other diseases. For ovarian cancer, LRLSNP was used to predict potential associated lncRNAs after all known ovarian cancer-lncRNA associations had been deleted; the top 5 predicted candidates are shown in Table 2 below, and all 5 are supported by evidence in the LncRNADisease database. For cervical cancer, LRLSNP was likewise used after all known cervical cancer-lncRNA associations had been deleted; the top 5 predicted candidates are shown in Table 2 below, and all 5 are supported by evidence in the LncRNADisease database.
TABLE 1
Disease | lncRNA name | Evidence | Rank
ovarian cancer | HOTAIR | LncRNADisease | 1
ovarian cancer | MALAT1 | LncRNADisease | 2
ovarian cancer | MEG3 | LncRNADisease | 3
ovarian cancer | HOST2 | | 4
ovarian cancer | CDKN2B-AS1 | LncRNADisease | 5
cervical cancer | MEG3 | LncRNADisease | 1
cervical cancer | PVT1 | LncRNADisease | 2
cervical cancer | CDKN2B-AS1 | LncRNADisease | 3
cervical cancer | LSINCT5 | LncRNADisease | 4
cervical cancer | GAS5 | LncRNADisease | 5
TABLE 2
Disease | lncRNA name | Evidence | Rank
ovarian cancer | H19 | LncRNADisease | 1
ovarian cancer | DNM3OS | LncRNADisease | 2
ovarian cancer | CDKN2B-AS1 | LncRNADisease | 3
ovarian cancer | MALAT1 | LncRNADisease | 4
ovarian cancer | HOTAIR | LncRNADisease | 5
cervical cancer | H19 | LncRNADisease | 1
cervical cancer | TUSC8 | LncRNADisease | 2
cervical cancer | CDKN2B-AS1 | LncRNADisease | 3
cervical cancer | MALAT1 | LncRNADisease | 4
cervical cancer | HOTAIR | LncRNADisease | 5
In conclusion, LRLSNP not only performs well in predicting unknown lncRNA-disease associations, but can also effectively predict isolated diseases and new lncRNAs. In the comparison with two relatively advanced prior-art prediction methods (IIRWR and LDAI-ISPS), the AUC values of LRLSNP, IIRWR and LDAI-ISPS were 0.9446, 0.7883 and 0.9154, respectively, in data set 1, and 0.9386, 0.8230 and 0.8341, respectively, in data set 2. LRLSNP therefore outperformed both other methods on both data sets. In addition, when evaluating the prediction performance of LRLSNP for isolated diseases and new lncRNAs, cross validation was performed with each disease (lncRNA) in turn simulated as an isolated disease (new lncRNA); the resulting AUC values were 0.8688 and 0.8335, respectively, in data set 1, and 0.8865 and 0.8078, respectively, in data set 2. This shows that LRLSNP predicts well for isolated diseases and new lncRNAs and has strong generalization ability. In general, LRLSNP is simple to implement, can be used for prediction of isolated diseases and new lncRNAs, is highly interpretable, has few parameters (effectively one, since the two balance parameters are set to the same value), requires few resources for prediction, and can serve as a powerful auxiliary tool for biological experiments.
Based on the LRLSNP prediction method, this final embodiment provides an lncRNA-disease association prediction system based on Laplace regularized least squares and network projection, which predicts the association between lncRNA and disease according to the LRLSNP prediction method. Specifically, the prediction system comprises at least:
the data preparation unit is used for constructing a comprehensive disease similarity matrix according to the disease semantic similarity and the disease Gaussian nuclear spectrum similarity; constructing a comprehensive lncRNA similarity matrix according to the lncRNA functional similarity and lncRNA Gaussian nuclear spectrum similarity;
the lncRNA and disease association score estimation unit is used for implementing a Laplace regularization least square method in the comprehensive disease similarity matrix and the comprehensive lncRNA similarity matrix constructed by the data preparation unit to construct an lncRNA and disease association composite estimation score matrix;
and the lncRNA and disease association score refining unit is used for projecting the comprehensive disease similarity matrix and the comprehensive lncRNA similarity matrix constructed by the data preparation unit on the lncRNA and disease association composite estimation score matrix constructed by the lncRNA and disease association score estimation unit respectively, and fusing the two projection scores to obtain a disease association lncRNA prediction result.
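By way of non-limiting illustration, the three units above can be sketched in Python with NumPy. The formulas of the method survive in this text only as images, so the sketch substitutes commonly used forms, all of which are assumptions: a Gaussian kernel over association profiles with the bandwidth normalized by the mean squared profile norm (what this text calls Gaussian nuclear spectrum similarity is usually termed Gaussian interaction profile kernel similarity); the closed-form Laplacian regularized least squares estimate with a normalized Laplacian; an equal-weight average of the two estimates; and a projection scaled by column 2-norms. The function names, the default `eta`, and the (lncRNA x disease) orientation of the association matrix are hypothetical. The comprehensive similarity matrices are taken as inputs, because the rule for combining semantic or functional similarity with the kernel similarity is not recoverable here.

```python
import numpy as np

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian kernel between the rows of `profiles` (assumed standard form)."""
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_sq = sq_norms.mean()
    gamma = gamma_prime / mean_sq if mean_sq > 0 else 0.0
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def lap_rls(sim, targets, eta):
    """Assumed closed form: sim @ pinv(sim + eta * L @ sim) @ targets."""
    sim = np.asarray(sim, dtype=float)
    targets = np.asarray(targets, dtype=float)
    degree = sim.sum(axis=1)
    safe = np.where(degree > 0, degree, 1.0)
    inv_sqrt = np.where(degree > 0, 1.0 / np.sqrt(safe), 0.0)
    # Normalized Laplacian D^(-1/2) (D - S) D^(-1/2).
    lap = inv_sqrt[:, None] * (np.diag(degree) - sim) * inv_sqrt[None, :]
    return sim @ np.linalg.pinv(sim + eta * (lap @ sim)) @ targets

def project(sim, scores):
    """Project `sim` (n x n) on `scores` (n x m), scaling by column 2-norms."""
    scores = np.asarray(scores, dtype=float)
    norms = np.linalg.norm(scores, axis=0)
    norms = np.where(norms > 0, norms, 1.0)   # avoid division by zero
    return (np.asarray(sim, dtype=float) @ scores) / norms[None, :]

def lrlsnp_sketch(assoc, sim_l, sim_d, eta=1.0):
    """assoc: (n_lncRNA, n_disease); sim_l, sim_d: comprehensive similarities."""
    assoc = np.asarray(assoc, dtype=float)
    f_l = lap_rls(sim_l, assoc, eta)          # estimate in lncRNA space
    f_d = lap_rls(sim_d, assoc.T, eta)        # estimate in disease space
    composite = 0.5 * (f_l + f_d.T)           # assumed equal-weight fusion
    p_l = project(sim_l, composite)           # (n_lncRNA, n_disease)
    p_d = project(sim_d, composite.T)         # (n_disease, n_lncRNA)
    return 0.5 * (p_l + p_d.T)                # average of the two projections
```

In this sketch a single `eta` is shared by both estimates, mirroring the simplification of setting the two balance parameters to the same value; the pseudo-inverse is used so that duplicate profiles, which make the similarity matrix singular, do not cause a failure.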
It should be noted that the disease-associated lncRNA prediction system can be packaged in a portable storage medium to operate, and can also be stored in a cloud end to operate online; the process of implementing the prediction of the disease-associated lncRNA may be executed by a computer capable of running the prediction system, or may be executed by a server located in the cloud.
The above embodiments are preferred implementations of the present invention, and the present invention can be implemented in other ways without departing from the spirit of the present invention.
Finally, it should be emphasized that some of the descriptions of the present invention have been simplified to facilitate the understanding of the improvements of the present invention over the prior art by those of ordinary skill in the art, and that other elements have been omitted from this document for the sake of clarity, and those of ordinary skill in the art will recognize that such omitted elements may also constitute the subject matter of the present invention.

Claims (10)

1. The lncRNA-disease associated prediction method based on Laplace regularization least square and network projection is characterized by comprising the following steps of:
step one, combining the similarity of disease Gaussian nuclear spectrums on the basis of the semantic similarity of diseases to obtain a comprehensive disease similarity matrix; on the basis of lncRNA functional similarity, combining lncRNA Gaussian nuclear spectrum similarity to obtain a comprehensive lncRNA similarity matrix;
step two, implementing a Laplace regularization least square method in the comprehensive disease similarity matrix to obtain a disease estimation score matrix, implementing the Laplace regularization least square method in the comprehensive lncRNA similarity matrix to obtain an lncRNA estimation score matrix, and integrating the disease estimation score matrix and the lncRNA estimation score matrix to obtain an lncRNA and disease association composite estimation score matrix;
and step three, projecting the comprehensive disease similarity matrix on the lncRNA and disease association composite estimation score matrix to obtain a projection score matrix, projecting the comprehensive lncRNA similarity matrix on the lncRNA and disease association composite estimation score matrix to obtain another projection score matrix, combining the transpose of the projection score matrix based on the comprehensive disease similarity matrix with the projection score matrix based on the comprehensive lncRNA similarity matrix, and calculating their average to obtain the final lncRNA and disease association prediction score, thereby obtaining a disease-associated lncRNA prediction result.
2. The lncRNA-disease associated prediction method based on Laplace regularization least squares and network projection as claimed in claim 1, wherein in step one, the Gaussian nuclear spectrum similarity of diseases is expressed as:
[formula image]
wherein [symbol] is the Gaussian nuclear spectrum similarity between disease [symbol] and disease [symbol]; [symbol] is the i-th column, for diseases, of the known disease-associated lncRNA matrix [symbol], and [symbol] is the j-th column, for diseases, of the matrix [symbol]; the parameter [symbol] controls the kernel bandwidth of [symbol], and [symbol] is calculated by the following formula:
[formula image]
3. The lncRNA-disease associated prediction method based on Laplace regularization least squares and network projection as claimed in claim 2, wherein in step one, the lncRNA Gaussian nuclear spectrum similarity is expressed as:
[formula image]
wherein [symbol] is the Gaussian nuclear spectrum similarity between lncRNA [symbol] and lncRNA [symbol]; [symbol] is the i-th column, for lncRNAs, of the matrix [symbol], and [symbol] is the j-th column, for lncRNAs, of the matrix [symbol]; the parameter [symbol] controls the kernel bandwidth of [symbol], and [symbol] is calculated by the following formula:
[formula image]
4. The lncRNA-disease associated prediction method based on Laplace regularization least squares and network projection as claimed in claim 3, wherein in step one:
the comprehensive disease similarity matrix [symbol], obtained by combining the disease Gaussian nuclear spectrum similarity on the basis of the disease semantic similarity, is:
[formula image]
the comprehensive lncRNA similarity matrix [symbol], obtained by combining the lncRNA Gaussian nuclear spectrum similarity on the basis of the lncRNA functional similarity, is:
[formula image]
5. The lncRNA-disease associated prediction method based on Laplace regularization least squares and network projection as claimed in claim 4, wherein in step two, the Laplace regularization least square method is implemented in the comprehensive disease similarity matrix to obtain the disease estimation score matrix [symbol]:
[formula image]
wherein the disease estimation score matrix [symbol] is obtained by solving the optimization problem of the following formula:
[formula image]
[symbol] is a diagonal matrix; [symbol] is the sum of all elements of row i of [symbol]; [symbol] is a balance parameter; [symbol] is the Frobenius norm.
6. The lncRNA-disease associated prediction method based on Laplace regularization least squares and network projection as claimed in claim 5, wherein in step two, the Laplace regularization least square method is implemented in the comprehensive lncRNA similarity matrix to obtain the lncRNA estimation score matrix [symbol]:
[formula image]
wherein the lncRNA estimation score matrix [symbol] is obtained by solving the optimization problem of the following formula:
[formula image]
[symbol] is a diagonal matrix; [symbol] is the sum of all elements of row i of [symbol]; [symbol] is a balance parameter; [symbol] is the Frobenius norm.
7. The lncRNA-disease associated prediction method based on Laplace regularization least squares and network projection as claimed in claim 6, wherein in step two, the disease estimation score matrix and the lncRNA estimation score matrix are integrated as follows to obtain the lncRNA and disease association composite estimation score matrix [symbol]:
[formula image]
wherein [symbol] is the transposed matrix of [symbol].
8. The lncRNA-disease associated prediction method based on Laplace regularization least squares and network projection as claimed in claim 7, wherein in step three:
the projection score matrix [symbol] based on the comprehensive disease similarity matrix, obtained by projecting the comprehensive disease similarity matrix on the lncRNA and disease association composite estimation score matrix, is:
[formula image]
wherein [symbol] is the 2-norm of [symbol];
the projection score matrix [symbol] based on the comprehensive lncRNA similarity matrix, obtained by projecting the comprehensive lncRNA similarity matrix on the lncRNA and disease association composite estimation score matrix, is:
[formula image]
wherein [symbol] is the 2-norm of [symbol].
9. The lncRNA-disease associated prediction method based on Laplace regularization least squares and network projection as claimed in claim 8, wherein in step three, the transpose of the projection score matrix based on the comprehensive disease similarity matrix is combined with the projection score matrix based on the comprehensive lncRNA similarity matrix and their average is calculated, so that the final lncRNA and disease association prediction score [symbol] is:
[formula image]
wherein [symbol] is the transposed matrix of [symbol].
10. An lncRNA-disease associated prediction system based on Laplace regularization least squares and network projection is characterized by comprising:
the data preparation unit is used for constructing a comprehensive disease similarity matrix according to the disease semantic similarity and the disease Gaussian nuclear spectrum similarity; constructing a comprehensive lncRNA similarity matrix according to the lncRNA functional similarity and lncRNA Gaussian nuclear spectrum similarity;
the lncRNA and disease association score estimation unit is used for implementing a Laplace regularization least square method in the comprehensive disease similarity matrix and the comprehensive lncRNA similarity matrix constructed by the data preparation unit to construct an lncRNA and disease association composite estimation score matrix;
the lncRNA and disease association score refining unit is used for projecting the comprehensive disease similarity matrix and the comprehensive lncRNA similarity matrix constructed by the data preparation unit on the lncRNA and disease association composite estimation score matrix constructed by the lncRNA and disease association score estimation unit respectively, and fusing two projection scores to obtain a disease association lncRNA prediction result;
the lncRNA-disease association prediction system based on laplacian regularized least squares and network projection predicts the association between lncRNA and disease according to the prediction method of any one of claims 2 to 9.
CN202110428827.2A 2021-04-21 2021-04-21 lncRNA-disease associated prediction method and system based on Laplace regularization least square and network projection Pending CN112992347A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110428827.2A CN112992347A (en) 2021-04-21 2021-04-21 lncRNA-disease associated prediction method and system based on Laplace regularization least square and network projection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110428827.2A CN112992347A (en) 2021-04-21 2021-04-21 lncRNA-disease associated prediction method and system based on Laplace regularization least square and network projection

Publications (1)

Publication Number Publication Date
CN112992347A true CN112992347A (en) 2021-06-18

Family

ID=76341492

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110428827.2A Pending CN112992347A (en) 2021-04-21 2021-04-21 lncRNA-disease associated prediction method and system based on Laplace regularization least square and network projection

Country Status (1)

Country Link
CN (1) CN112992347A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113539372A (en) * 2021-06-27 2021-10-22 中南林业科技大学 Efficient prediction method for LncRNA and disease association relation

Similar Documents

Publication Publication Date Title
CN109243538B (en) Method and system for predicting association relation between disease and LncRNA
Mohammadi et al. Bayesian structure learning in sparse Gaussian graphical models
CN112464638B (en) Text clustering method based on improved spectral clustering algorithm
CA3096678A1 (en) Multi-assay prediction model for cancer detection
Hu et al. Improving one-shot NAS with shrinking-and-expanding supernet
CN107577924B (en) Long-chain non-coding RNA subcellular position prediction method based on deep learning
Hanczar et al. Ensemble methods for biclustering tasks
CN108681659B (en) Method for predicting protein complex based on sample data
Zhang et al. Protein complex prediction in large ontology attributed protein-protein interaction networks
CN110674865B (en) Rule learning classifier integration method oriented to software defect class distribution unbalance
CN110688479B (en) Evaluation method and sequencing network for generating abstract
WO2019196208A1 (en) Text sentiment analysis method, readable storage medium, terminal device, and apparatus
Yu et al. Predicting protein complex in protein interaction network-a supervised learning based method
CN114496092A (en) miRNA and disease association relation prediction method based on graph convolution network
CN116741397B (en) Cancer typing method, system and storage medium based on multi-group data fusion
Zhou et al. Personal credit default prediction model based on convolution neural network
CN112992347A (en) lncRNA-disease associated prediction method and system based on Laplace regularization least square and network projection
Schaid et al. Penalized models for analysis of multiple mediators
Vengatesan et al. The performance analysis of microarray data using occurrence clustering
Bolón-Canedo et al. Exploring the consequences of distributed feature selection in DNA microarray data
CN111584010B (en) Key protein identification method based on capsule neural network and ensemble learning
CN110987751B (en) Quantitative grading evaluation method for pore throat of compact reservoir in three-dimensional space
CN112885405A (en) Prediction method and system of disease-associated miRNA
Liu et al. An improved method for multi-objective clustering ensemble algorithm
CN115907775A (en) Personal credit assessment rating method based on deep learning and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination