CN117854592A - Gene regulation network construction method, device, equipment and storage medium - Google Patents

Publication number
CN117854592A
CN117854592A CN202410239151.6A CN202410239151A CN117854592A CN 117854592 A CN117854592 A CN 117854592A CN 202410239151 A CN202410239151 A CN 202410239151A CN 117854592 A CN117854592 A CN 117854592A
Authority
CN
China
Prior art keywords
gene regulation
regulation network
single cells
adjacent single
rna sequencing
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410239151.6A
Other languages
Chinese (zh)
Other versions
CN117854592B (en)
Inventor
刘杰
毛果
庞征斌
左克
王庆林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Application filed by National University of Defense Technology
Priority to CN202410239151.6A
Publication of CN117854592A
Application granted
Publication of CN117854592B
Legal status: Active

Abstract

The application discloses a method, a device, equipment and a storage medium for constructing a gene regulation network, and relates to the technical field of gene regulation networks. The method comprises the following steps: collecting time-series single-cell RNA sequencing data, and inputting the data into a fused graphical lasso model to screen for adjacent single cells and to control the construction of a target gene regulation network for the adjacent single cells; setting an objective function of the target gene regulation network using the gene information shared between adjacent single cells; and solving the objective function with the alternating direction method of multipliers (ADMM) to complete the construction of the target gene regulation network. The fused graphical lasso model screens adjacent single cells from the time-series single-cell RNA sequencing data and controls the generation of a target gene regulation network that integrates the time-varying characteristics of the corresponding single cells; the gene information shared between adjacent single cells is then used to set the target parameters of the target gene regulation network, which stabilizes the topology of the gene regulation network.

Description

Gene regulation network construction method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of gene regulation networks, in particular to a method, a device, equipment and a storage medium for constructing a gene regulation network.
Background
Gene expression profiles are an important data source for studying gene regulation networks, and bulk expression data and single-cell transcriptome data are the two common types. Bulk expression data are readily available thanks to large sample volumes, but they reflect only the average expression level of all cells in a sample or tissue and cannot capture the small but critical differences between cells, which are essential for inferring how genes transcriptionally regulate one another. In contrast, single-cell transcriptome data provide more useful information: they cover the changes in the gene expression level of each individual cell during differentiation, reflect cell heterogeneity, and capture the characteristics of different cell types. Single-cell transcriptome data are therefore of great importance for studying gene regulation networks and understanding changes in cell state. The rapid development of single-cell sequencing technology brings new opportunities and challenges for inferring single-cell gene regulation networks. However, current methods model the time-varying nature of gene regulation networks poorly, which makes the inferred network topology unstable over time.
In summary, how to construct a gene regulation network to overcome the instability of the network topology in time and realize modeling and analysis of time-varying characteristics is a technical problem to be solved in the field.
Disclosure of Invention
In view of the above, the present invention aims to provide a method, an apparatus, a device, and a storage medium for constructing a gene regulation network, which can construct a gene regulation network, overcome the instability of the network topology structure in time, and realize modeling and analysis of time-varying characteristics. The specific scheme is as follows:
In a first aspect, the present application discloses a method for constructing a gene regulation network, comprising:
collecting time-series single-cell RNA sequencing data, and inputting the time-series single-cell RNA sequencing data into a fused graphical lasso model to screen for adjacent single cells and to control the construction of a target gene regulation network for the adjacent single cells;
setting an objective function of the target gene regulation network using the gene information shared between the adjacent single cells;
and solving the objective function using the alternating direction method of multipliers to complete the construction of the target gene regulation network.
Optionally, after collecting the time-series single-cell RNA sequencing data, the method further comprises:
determining whether the time-series single-cell RNA sequencing data follow a multivariate Gaussian distribution;
and triggering calculation of the precision matrix if the time-series single-cell RNA sequencing data follow the multivariate Gaussian distribution.
Optionally, inputting the time-series single-cell RNA sequencing data into the fused graphical lasso model to screen for adjacent single cells and to control the construction of the target gene regulation network for the adjacent single cells comprises:
inputting the time-series single-cell RNA sequencing data into a fused graphical lasso model containing a fused lasso penalty, so as to obtain an initial gene regulation network for each single cell in the time-series single-cell RNA sequencing data.
Optionally, after the initial gene regulation network for each single cell in the time-series single-cell RNA sequencing data is obtained, the method further comprises:
adjusting the topology of the initial gene regulation network based on the gene information shared between adjacent single cells, so as to obtain the target gene regulation network.
Optionally, setting the objective function of the target gene regulation network using the gene information shared between the adjacent single cells comprises:
setting the objective function of the target gene regulation network using a preset weighting kernel and the gene information shared between the adjacent single cells.
Optionally, before the objective function of the target gene regulation network is set using the preset weighting kernel and the gene information shared between the adjacent single cells, the method further comprises:
performing principal component analysis on the RNA sequencing data of the adjacent single cells to obtain corresponding low-dimensional data;
calculating the Euclidean distance between each pair of single cells based on the low-dimensional data, and constructing a K-nearest-neighbor graph between adjacent single cells according to the Euclidean distances;
obtaining the manifold distances between the adjacent single cells from the K-nearest-neighbor graph;
and determining the preset weighting kernel according to the manifold distances between the adjacent single cells and a bandwidth hyperparameter.
Optionally, the method for constructing a gene regulation network further comprises:
verifying the target gene regulation network using a real cell dataset, and obtaining the verified target gene regulation network.
In a second aspect, the present application discloses a gene regulation network construction device, comprising:
an initial construction module, configured to collect time-series single-cell RNA sequencing data and input the time-series single-cell RNA sequencing data into a fused graphical lasso model, so as to screen for adjacent single cells and control the construction of a target gene regulation network for the adjacent single cells;
an objective function setting module, configured to set an objective function of the target gene regulation network using the gene information shared between the adjacent single cells;
and a function solving module, configured to solve the objective function using the alternating direction method of multipliers so as to complete the construction of the target gene regulation network.
In a third aspect, the present application discloses an electronic device, comprising:
a memory for storing a computer program;
and a processor for executing the computer program to implement the steps of the gene regulation network construction method disclosed above.
In a fourth aspect, the present application discloses a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the gene regulation network construction method disclosed above.
Thus, the application discloses a method for constructing a gene regulation network, comprising: collecting time-series single-cell RNA sequencing data, and inputting the time-series single-cell RNA sequencing data into a fused graphical lasso model to screen for adjacent single cells and to control the construction of a target gene regulation network for the adjacent single cells; setting an objective function of the target gene regulation network using the gene information shared between the adjacent single cells; and solving the objective function using the alternating direction method of multipliers to complete the construction of the target gene regulation network. The collected time-series single-cell RNA sequencing data are input into the fused graphical lasso model, which screens adjacent single cells from the data and controls the generation of a target gene regulation network that integrates the time-varying characteristics of the corresponding single cells. The gene information shared between adjacent single cells is then used to set the target parameters of the target gene regulation network, which stabilizes the topology of the gene regulation network and completes the construction of the target gene regulation network.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for constructing a gene regulation network disclosed in the present application;
FIG. 2 is a flowchart of a specific method of constructing a gene regulation network disclosed in the present application;
FIG. 3 is a flowchart of a specific gene regulation network construction and verification method disclosed in the present application;
FIG. 4 is a schematic structural diagram of a gene regulation network construction device disclosed in the present application;
FIG. 5 is a block diagram of an electronic device disclosed in the present application.
Detailed Description
The following description of the technical solutions in the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As described in the Background, single-cell transcriptome data capture the differences between cells that bulk expression data average away, but current methods model the time-varying nature of gene regulation networks poorly, which makes the inferred network topology unstable over time.
The application therefore provides a gene regulation network construction scheme that constructs a gene regulation network, overcomes the instability of the network topology over time, and realizes modeling and analysis of time-varying characteristics.
Referring to FIG. 1, an embodiment of the invention discloses a method for constructing a gene regulation network, comprising:
Step S11: collect time-series single-cell RNA sequencing data, and input the time-series single-cell RNA sequencing data into a fused graphical lasso model to screen for adjacent single cells and to control the construction of a target gene regulation network for the adjacent single cells.
In this embodiment, time-series single-cell RNA sequencing data are collected so that the time-varying characteristics of single cells are obtained together with the sequencing data and can be incorporated into the whole gene regulation network generation process.
In this embodiment, after collecting the time-series single-cell RNA sequencing data, the method further comprises: determining whether the time-series single-cell RNA sequencing data follow a multivariate Gaussian distribution; and triggering calculation of the precision matrix if they do. Suppose the time-series single-cell RNA sequencing data $X$ follow a multivariate Gaussian distribution $N(\mu, \Sigma)$, where $n$ is the number of cells, $p$ is the number of genes, $\mu \in \mathbb{R}^{p}$, and $\Sigma$ is a positive definite $p \times p$ matrix. The gene regulation network is encoded in the sparse precision matrix of the distribution, i.e. the inverse covariance matrix, so inferring gene interactions from gene expression data amounts to estimating the precision matrix $\Theta$ of the Gaussian graphical model. Reconstructing the precision matrix is not a simple task, especially when the number of cells $n$ is far smaller than the number of genes $p$: the sample covariance matrix is then typically not invertible, so the precision matrix is difficult to obtain directly from it. The precision matrix is calculated with the graph fusion formula of formula (1):
$$\hat{\Theta} = \arg\max_{\Theta \succ 0} \left\{ \log\det(\Theta) - \mathrm{tr}(S\Theta) - P(\Theta) \right\}$$ Formula (1)
The time-series single-cell RNA sequencing data are input into the graph fusion formula, and the precision matrix is calculated from it, where $S$ is the empirical covariance matrix computed from the $n$ time-series single-cell RNA sequencing samples, $\mathrm{tr}(\cdot)$ is the trace of a matrix, $P(\Theta)$ is a convex penalty term, and $\det(\cdot)$ is the determinant of a matrix.
In this high-dimensional setting, where the sample covariance matrix cannot be inverted, one solution is to infer the precision matrix by maximum log-likelihood estimation, as shown in formula (2):
Formula (2)
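The difficulty noted above, a sample covariance matrix that cannot be inverted when there are fewer cells than genes, can be seen on a toy example. The sketch below is illustrative only: the data values and the helper names `empirical_covariance` and `det3` are assumptions made for this example and do not come from the application.

```python
def empirical_covariance(X):
    """Empirical covariance S = (1/n) * sum_i (x_i - mean)(x_i - mean)^T."""
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    S = [[0.0] * p for _ in range(p)]
    for row in X:
        centered = [row[j] - mean[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                S[a][b] += centered[a] * centered[b] / n
    return S


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


# Toy expression matrix (made-up values): n = 2 cells (rows), p = 3 genes (columns).
X = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 5.0]]

S = empirical_covariance(X)
# With n < p the covariance has rank below p, so its determinant vanishes
# and S cannot be inverted to give a precision matrix directly.
singular = abs(det3(S)) < 1e-12
```

Because `S` is singular here, a penalized estimator is needed rather than a plain matrix inverse.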
In this embodiment, the time-series single-cell RNA sequencing data are input into the fused graphical lasso model, which screens for adjacent single cells and controls the construction of the target gene regulation network for the adjacent single cells.
In this embodiment, adjacent single cells are screened from the time-series single-cell RNA sequencing data by a binary weight matrix, which also controls the change between the networks of adjacent single cells, thereby improving cell homogeneity and specificity.
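A binary weight matrix of this kind can be sketched as follows. This is a minimal illustration under stated assumptions: the adjacency is supplied as hypothetical neighbor lists, it is treated as symmetric, and the function name `binary_weight_matrix` is invented for the example. The application does not spell out how the matrix is built at this point.

```python
def binary_weight_matrix(neighbors, n_cells):
    """Return w with w[i][k] = 1 if cells i and k are adjacent, else 0."""
    w = [[0] * n_cells for _ in range(n_cells)]
    for i, adjacent in neighbors.items():
        for k in adjacent:
            if k != i:
                w[i][k] = 1
                w[k][i] = 1  # assumption: adjacency is treated as symmetric
    return w


# Hypothetical adjacency: four cells ordered in time, each linked to the next one.
neighbors = {0: [1], 1: [2], 2: [3], 3: []}
w = binary_weight_matrix(neighbors, 4)
```

Only cell pairs with `w[i][k] == 1` would then have the difference between their networks penalized.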
In this embodiment, obtaining the target gene regulation network requires first obtaining an initial gene regulation network for each single cell and then adjusting it. Specifically, obtaining the initial gene regulation network comprises: inputting the time-series single-cell RNA sequencing data into a fused graphical lasso model containing a fused lasso penalty, so as to obtain an initial gene regulation network for each single cell in the time-series single-cell RNA sequencing data. It will be appreciated that, in order to capture the structure common to the gene regulation networks of different single cells, multiple graphical models are jointly estimated from the gene regulation networks of adjacent single cells by the fused graphical lasso model. The specific form is given by formula (3):
$$P(\{\Theta\}) = \lambda_1 \sum_{k=1}^{K} \sum_{i \neq j} \left| \theta_{ij}^{(k)} \right| + \lambda_2 \sum_{k < k'} \sum_{i,j} \left| \theta_{ij}^{(k)} - \theta_{ij}^{(k')} \right|$$ Formula (3)
The fused graphical lasso model incorporates the generalized fused lasso penalty $P(\{\Theta\})$ into the log-likelihood, where $\theta_{ij}^{(k)}$ is the $(i, j)$ entry of the precision matrix of condition $k$, $\lambda_1$ and $\lambda_2$ are non-negative tuning parameters, and $K$ is the number of different conditions. The first term is a lasso penalty and the second term is a fused lasso penalty. The fused graphical lasso method thus better accounts for both the homogeneity and the heterogeneity among the gene regulation networks of multiple single cells, which improves modeling accuracy.
In this embodiment, after the initial gene regulation network of each single cell is obtained, the method further comprises: adjusting the topology of the initial gene regulation network based on the gene information shared between adjacent single cells, so as to obtain the target gene regulation network. In other words, the gene information shared between adjacent single cells stabilizes the topology of the initial gene regulation network, and the adjusted network is the target gene regulation network.
Step S12: set an objective function of the target gene regulation network using the gene information shared between the adjacent single cells.
In this embodiment, the objective function of the target gene regulation network is set according to the gene information shared between adjacent single cells. Specifically, the similarity between the target single cell and each adjacent single cell is specified, and the gene regulation networks of adjacent single cells are penalized according to that similarity. When the gene regulation network of the target single cell is inferred, common edges are estimated by integrating the gene regulation networks of the other adjacent single cells, which yields the objective function of the target gene regulation network for the time-series single-cell RNA sequencing data.
Step S13: solve the objective function using the alternating direction method of multipliers to complete the construction of the target gene regulation network.
In this embodiment, ADMM (Alternating Direction Method of Multipliers) is used to solve the objective function, which is a separable convex optimization problem. ADMM processes such problems quickly and converges well, so it is suited to distributed convex optimization problems, especially statistical learning problems. It is mainly applied when the solution space is large: the problem is decomposed into blocks that are solved in turn, which makes parameter estimation for the objective function more efficient, raises the computation speed, and accelerates the construction of the target gene regulation network.
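The block-wise structure of ADMM can be illustrated on a simpler problem than the one in this application. The sketch below applies the standard ADMM splitting to a single graphical lasso (one network, an off-diagonal lasso penalty), not to the joint objective over many cells. The function names, the parameters `lam`, `rho` and `n_iter`, and the toy covariance matrix are all assumptions for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def soft_threshold(A, kappa):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(A) * np.maximum(np.abs(A) - kappa, 0.0)


def graphical_lasso_admm(S, lam=0.1, rho=1.0, n_iter=500):
    """Approximately maximize log det(Theta) - tr(S Theta) - lam * sum_{i != j} |Theta_ij|."""
    p = S.shape[0]
    Z = np.eye(p)          # sparse copy of the precision matrix
    U = np.zeros((p, p))   # scaled dual variable
    for _ in range(n_iter):
        # Block 1 (Theta-update): closed form via an eigendecomposition.
        eigval, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta_eig = (eigval + np.sqrt(eigval ** 2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta_eig) @ Q.T
        # Block 2 (Z-update): soft-threshold the off-diagonal entries only.
        A = Theta + U
        Z = soft_threshold(A, lam / rho)
        np.fill_diagonal(Z, np.diag(A))
        # Dual update.
        U = U + Theta - Z
    return Z


# Made-up 3-gene covariance: genes 0 and 2 are strongly correlated, gene 1 barely.
S = np.array([[1.00, 0.05, 0.60],
              [0.05, 1.00, 0.00],
              [0.60, 0.00, 1.00]])
Theta_hat = graphical_lasso_admm(S, lam=0.1)
```

Each iteration alternates between a smooth block solved in closed form and a sparse block solved by thresholding. The same split-and-alternate pattern is what allows a larger joint objective to be solved block by block.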
Thus, in this embodiment, the collected time-series single-cell RNA sequencing data are input into the fused graphical lasso model, which screens adjacent single cells from the data and controls the generation of a target gene regulation network that integrates the time-varying characteristics of the corresponding single cells. The gene information shared between adjacent single cells is then used to set the objective function and target parameters of the target gene regulation network, which stabilizes the topology of the gene regulation network, and the objective function is solved using the alternating direction method of multipliers to complete the construction of the target gene regulation network.
Referring to FIG. 2, an embodiment of the invention discloses a specific method for constructing a gene regulation network; compared with the previous embodiment, this embodiment further describes and optimizes the technical solution. Specifically:
Step S21: collect time-series single-cell RNA sequencing data, and input the time-series single-cell RNA sequencing data into a fused graphical lasso model to screen for adjacent single cells and to control the construction of a target gene regulation network for the adjacent single cells.
Step S22: set an objective function of the target gene regulation network using a preset weighting kernel and the gene information shared between the adjacent single cells.
In this embodiment, setting the objective function of the target gene regulation network with the preset weighting kernel and the shared gene information overcomes the limitation that methods which infer one gene regulation network (GRN, Gene Regulatory Network) for a group of cells cannot directly infer the gene regulation network of a single cell. The preset weighting kernel is introduced so that gene information is shared between adjacent cells, which stabilizes the topology of the gene regulation network. The weighting kernel may be derived from gene expression data under an assumption, for example that the GRN of a cell varies smoothly along the cell trajectory. When single-cell gene expression data are studied, the cell trajectory can be learned from the data by trajectory inference methods, which assume that cells with similar gene expression have similar GRNs. Based on this assumption, the weighting kernel K(i, j) can be designed from gene expression data to represent the transcriptome similarity between cell i and neighboring cell j.
In this embodiment, before the objective function of the target gene regulation network is set using the preset weighting kernel and the gene information shared between the adjacent single cells, the method further comprises: performing principal component analysis on the RNA sequencing data of the adjacent single cells to obtain corresponding low-dimensional data; calculating the Euclidean distance between each pair of single cells based on the low-dimensional data, and constructing a K-nearest-neighbor graph between adjacent single cells according to the Euclidean distances; obtaining the manifold distances between the adjacent single cells from the K-nearest-neighbor graph; and determining the preset weighting kernel according to the manifold distances between the adjacent single cells and a bandwidth hyperparameter. The specific steps are as follows:
Principal component analysis (PCA) is performed on the scRNA-seq data, and the low-dimensional PCA representation is used to calculate the pairwise Euclidean distances between cells.
A k-nearest-neighbor (kNN) graph between cells is constructed from the pairwise Euclidean distances, with k set to 5: too large a value of k makes the distances between cells prone to short-circuiting, while too small a value of k may split the data manifold into disconnected regions.
The geodesic distance between two adjacent cells on the kNN graph is calculated to approximate the manifold distance between them.
The weighting kernel K(i, j) is computed with a Gaussian kernel function of the geodesic distance,
where d(i, j) is the geodesic distance between cells i and j, and σ is the bandwidth hyperparameter governing how quickly the GRN differs between cells. A larger σ means that the GRN changes more slowly between cells.
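The distance and kernel steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the PCA step is taken as already done (the points are hypothetical low-dimensional coordinates), the toy example uses k = 1 because it has only four cells, and the kernel is taken to be exp(-d^2 / (2 σ^2)), one common Gaussian form that the text does not confirm. All function names are invented for the example.

```python
import heapq
import math


def knn_graph(points, k):
    """Undirected kNN graph: adj[i] maps each neighbor j to the Euclidean distance."""
    n = len(points)
    adj = [dict() for _ in range(n)]
    for i in range(n):
        dists = sorted((math.dist(points[i], points[j]), j)
                       for j in range(n) if j != i)
        for d, j in dists[:k]:
            adj[i][j] = d
            adj[j][i] = d
    return adj


def geodesic_distances(adj, source):
    """Shortest-path distances from source over the graph (Dijkstra's algorithm)."""
    dist = [math.inf] * len(adj)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, weight in adj[u].items():
            if d + weight < dist[v]:
                dist[v] = d + weight
                heapq.heappush(heap, (d + weight, v))
    return dist


def gaussian_kernel(d, sigma):
    """Assumed kernel form: exp(-d^2 / (2 * sigma^2))."""
    return math.exp(-d * d / (2.0 * sigma * sigma))


# Hypothetical 2-D coordinates of four cells (e.g. two principal components).
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
adj = knn_graph(points, k=1)
d0 = geodesic_distances(adj, source=0)
weights = [gaussian_kernel(d, sigma=1.0) for d in d0]
```

Cells that are far from cell 0 along the graph receive small weights, so they contribute little when the network of cell 0 is inferred; increasing `sigma` flattens the weights.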
The purpose of the weighting kernel is to infer the GRN of a given cell by integrating the influence of other cells on that cell, which makes the inferred GRN more accurate. Thus, when the GRN $\Theta_i$ of a given cell $i$ is inferred, introducing the weighting kernel takes into account not only the effect of the cell's own gene expression profile $x_i$ on $\Theta_i$ but also the effect of the gene expression profiles of other cells on $\Theta_i$. To obtain more accurate inference results, the Gaussian graphical models can be jointly estimated by maximizing the objective function of formula (4):
$$\max_{\Theta_1, \dots, \Theta_n \in \mathbb{S}_{++}^{p}} \; \sum_{i=1}^{n} \left[ \log\det(\Theta_i) - \mathrm{tr}(S_i \Theta_i) \right] - P(\{\Theta\})$$ Formula (4)
where $\mathbb{S}_{++}^{p}$ denotes the set of symmetric positive definite $p \times p$ matrices, $p$ is the number of genes, $S_i$ is the correlation matrix of cell $i$, and $P(\{\Theta\})$ is the generalized fused lasso penalty of formula (3).
Although the fused graphical lasso method accounts for both the homogeneity and the heterogeneity between the gene regulation networks of multiple cells, it has two distinct drawbacks. First, it does not distinguish between shared edges, and therefore does not preserve the condition specificity of edges. Second, it applies one constant penalty parameter to all cell pairs, so the resulting gene regulation networks separate cell groups with low similarity well but cannot effectively separate cell groups whose cells are highly similar. To solve these problems, the fused graphical lasso method is extended so that the gene regulation networks of cells with similar gene expression are incorporated into the construction of the gene regulation network of a single cell. Adding the binary screening matrix w(i, k) to the fused lasso penalty updates formula (4) to formula (5):
$$\max_{\Theta_1, \dots, \Theta_n \in \mathbb{S}_{++}^{p}} \; \sum_{i=1}^{n} \left[ \log\det(\Theta_i) - \mathrm{tr}(S_i \Theta_i) \right] - \lambda_1 \sum_{i=1}^{n} \|\Theta_i\|_1 - \lambda_2 \sum_{i < k} w(i, k) \, \|\Theta_i - \Theta_k\|_1$$ Formula (5)
where $\Theta_i$ and $\Theta_k$ are the gene regulation networks of cell $i$ and cell $k$, respectively; $\lambda_1$ is a tuning parameter controlling the sparsity of the network of cell $i$; and $\lambda_2$ is a tuning parameter controlling the difference in GRN between cells: the more pronounced the heterogeneity between cells, the smaller $\lambda_2$ should be. $w(i, k)$ encodes the similarity between a given cell $i$ and its neighbor cell $k$, so that the regulation networks of cells are penalized according to the similarity between the cells. When the gene regulation network of a given cell is inferred, the gene regulation networks of other cells are integrated to estimate common edges, while differing edges are still estimated in a cell-specific manner. This approach is therefore called the adaptive fused graphical lasso (FGL). The final objective function of the Gaussian graphical model for single-cell transcriptome sequencing data is as follows:
Step S23: solve the objective function using the alternating direction method of multipliers to complete the construction of the target gene regulation network.
For the more detailed processing in steps S21 and S23, refer to the foregoing embodiments; it is not repeated here.
Step S24: verify the target gene regulation network using a real cell dataset, and obtain the verified target gene regulation network.
In this embodiment, after the objective function is solved, a real dataset is used to verify the target gene regulation network experimentally, which helps ensure the accuracy and reliability of the target gene regulation network in real biological application scenarios and thereby supports research on gene regulation networks.
Referring to FIG. 3, time-series single-cell RNA sequencing (scRNA-seq, where RNA is ribonucleic acid) data are first collected, specifically cell 1 at time t_1, cell 2 at time t_2, and so on up to cell n at time t_k. Second, the optimization problem and algorithm are designed: the optimization problem is constructed according to the precision, sparsity and continuity of dynamic gene regulatory networks (DGRNs) and is solved with the alternating direction method of multipliers (ADMM). The network changes in a specific biological process are then analyzed to study how the gene regulation network evolves during that process. Finally, the inferred network is verified with a real dataset, and the effect of the identified regulatory factors is measured.
Therefore, by setting an adaptive fused graphical lasso penalty term that combines a binary weight matrix with a continuous fused lasso penalty term, both the information common to the gene regulation networks of different cells and the cell-specific edges can be captured effectively. The homogeneity and heterogeneity of gene regulation networks between cells are thus better accounted for when single-cell gene regulation networks are constructed, which improves modeling accuracy.
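The effect of combining the binary weight matrix with the fused lasso penalty can be sketched numerically. The example below is a hedged illustration: it assumes the penalty is a sparsity term over off-diagonal entries plus a fused term that is counted only for cell pairs with w(i, k) = 1, and the function name `adaptive_fused_penalty`, the toy networks, and the parameter values are all assumptions.

```python
def adaptive_fused_penalty(thetas, w, lam1, lam2):
    """Sparsity term plus a fused term restricted to cell pairs with w[i][k] == 1."""
    n, p = len(thetas), len(thetas[0])
    sparsity = sum(abs(thetas[i][a][b])
                   for i in range(n)
                   for a in range(p) for b in range(p) if a != b)
    fusion = 0.0
    for i in range(n):
        for k in range(i + 1, n):
            if w[i][k]:  # only adjacent cells are pulled toward each other
                fusion += sum(abs(thetas[i][a][b] - thetas[k][a][b])
                              for a in range(p) for b in range(p))
    return lam1 * sparsity + lam2 * fusion


# Three made-up single-cell networks: cells 0 and 1 are similar, cell 2 differs.
thetas = [
    [[1.0, 0.5], [0.5, 1.0]],
    [[1.0, 0.4], [0.4, 1.0]],
    [[1.0, -0.3], [-0.3, 1.0]],
]
w_adjacent = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]  # only cells 0 and 1 are adjacent
w_all = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]       # constant penalty on every pair

adaptive = adaptive_fused_penalty(thetas, w_adjacent, lam1=0.1, lam2=0.5)
uniform = adaptive_fused_penalty(thetas, w_all, lam1=0.1, lam2=0.5)
```

With `w_adjacent`, the dissimilar cell 2 is not pulled toward cells 0 and 1, so its cell-specific edges incur no fusion cost; with `w_all`, every pair is penalized and the total is larger.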
Referring to fig. 4, the embodiment of the invention also correspondingly discloses a device for constructing a gene regulation network, which comprises:
an initial construction module 11, configured to collect time-series single-cell RNA sequencing data and input the time-series single-cell RNA sequencing data into a fused graphical lasso model, so as to screen out adjacent single cells and control construction of a target gene regulation network of the adjacent single cells;
an objective function setting module 12, configured to set an objective function of the target gene regulation network through shared gene information between the adjacent single cells;
and a function solving module 13, configured to solve the objective function by using the alternating direction method of multipliers, so as to complete construction of the target gene regulation network.
Thus, the application discloses: collecting time-series single-cell RNA sequencing data, and inputting the time-series single-cell RNA sequencing data into a fused graphical lasso model to screen out adjacent single cells and control construction of a target gene regulation network of the adjacent single cells; setting an objective function of the target gene regulation network through the shared gene information between the adjacent single cells; and solving the objective function by using the alternating direction method of multipliers to complete construction of the target gene regulation network. In other words, the collected time-series single-cell RNA sequencing data are input into the fused graphical lasso model, which screens out adjacent single cells from the data and controls the generation of a target gene regulation network that integrates the time-varying characteristics of the corresponding single cells. The objective function of the target gene regulation network is then set by using the shared gene information between the adjacent single cells, which stabilizes the topology of the gene regulation network and completes the construction of the target gene regulation network.
Further, the embodiment of the present application further discloses an electronic device, and fig. 5 is a block diagram of the electronic device 20 according to an exemplary embodiment, where the content of the figure is not to be considered as any limitation on the scope of use of the present application.
Fig. 5 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. Wherein the memory 22 is used for storing a computer program, and the computer program is loaded and executed by the processor 21 to implement the relevant steps in the gene regulation network construction method disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in the present embodiment may be specifically an electronic computer.
In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and the communication protocol to be followed is any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring external input data or outputting external output data, and the specific interface type thereof may be selected according to the specific application requirement, which is not limited herein.
The processor 21 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 21 may be implemented in at least one of the following hardware forms: DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array). The processor 21 may also include a main processor and a coprocessor: the main processor, also called a CPU (Central Processing Unit), processes data in the awake state, and the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 21 may integrate a GPU (Graphics Processing Unit), which renders and draws the content to be displayed on the display screen. In some embodiments, the processor 21 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 22 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk, and the resources stored thereon may include an operating system 221, a computer program 222, and the like, and the storage may be temporary storage or permanent storage.
The operating system 221 is used for managing and controlling the hardware devices on the electronic device 20 and the computer program 222, so that the processor 21 can operate on and process the mass data 223 in the memory 22; the operating system may be Windows Server, NetWare, Unix, Linux, or the like. In addition to the computer program that performs the gene regulation network construction method executed by the electronic device 20 as disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs for other specific tasks. In addition to data received by the electronic device from an external device, the data 223 may include data collected by the input/output interface 25 itself, and so on.
Further, the application also discloses a computer readable storage medium for storing a computer program; wherein the computer program when executed by the processor implements the previously disclosed gene regulation network construction method. For specific steps of the method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and no further description is given here.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, it is described relatively briefly; for relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application. The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it is further noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The gene regulation network construction method, device, equipment and storage medium provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is intended only to help understand the method of the invention and its core idea. Meanwhile, those skilled in the art may, in accordance with the ideas of the invention, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the invention.

Claims (10)

1. A method of constructing a gene regulation network, comprising:
collecting time-series single-cell RNA sequencing data, and inputting the time-series single-cell RNA sequencing data into a fused graphical lasso model to screen out adjacent single cells and control construction of a target gene regulation network of the adjacent single cells;
setting an objective function of the target gene regulation network through shared gene information between the adjacent single cells;
and solving the objective function by using the alternating direction method of multipliers to complete construction of the target gene regulation network.
2. The method of claim 1, further comprising, after the collecting of the time-series single-cell RNA sequencing data:
judging whether the time-series single-cell RNA sequencing data satisfy a multivariate Gaussian distribution;
and triggering calculation of a precision matrix if the time-series single-cell RNA sequencing data satisfy the multivariate Gaussian distribution.
3. The method of claim 1, wherein inputting the time-series single-cell RNA sequencing data into a fused graphical lasso model to screen out adjacent single cells and control construction of a target gene regulation network of the adjacent single cells comprises:
inputting the time-series single-cell RNA sequencing data into a fused graphical lasso model containing a fused lasso penalty, so as to obtain an initial gene regulation network of each of the time-series single-cell RNA sequencing data.
4. The method according to claim 3, further comprising, after obtaining the initial gene regulation network of each of the time-series single-cell RNA sequencing data:
adjusting the topological structure of the initial gene regulation network based on the shared gene information between adjacent single cells, so as to obtain the target gene regulation network.
5. The method according to claim 1, wherein the setting of the objective function of the target gene regulation network through the shared gene information between the adjacent single cells comprises:
setting the objective function of the target gene regulation network through a preset weighting kernel and the shared gene information between the adjacent single cells.
6. The method according to claim 5, further comprising, before setting the objective function of the target gene regulation network through the preset weighting kernel and the shared gene information between the adjacent single cells:
performing principal component analysis on the RNA sequencing data of the adjacent single cells to obtain corresponding low-dimensional data;
calculating the Euclidean distance between corresponding single-cell pairs based on the low-dimensional data, and constructing a K-nearest-neighbor graph between the adjacent single cells according to the Euclidean distance;
acquiring manifold distances between the adjacent single cells from the K-nearest-neighbor graph;
and determining the preset weighting kernel according to the manifold distances between the adjacent single cells and a bandwidth hyperparameter.
7. The method for constructing a gene regulatory network according to any one of claims 1 to 6, further comprising:
verifying the target gene regulation network by using a real cell data set to obtain a verified target gene regulation network.
8. A gene regulation network construction apparatus, comprising:
an initial construction module, configured to collect time-series single-cell RNA sequencing data and input the time-series single-cell RNA sequencing data into a fused graphical lasso model, so as to screen out adjacent single cells and control construction of a target gene regulation network of the adjacent single cells;
an objective function setting module, configured to set an objective function of the target gene regulation network through shared gene information between the adjacent single cells;
and a function solving module, configured to solve the objective function by using the alternating direction method of multipliers, so as to complete construction of the target gene regulation network.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to realize the steps of the gene regulation network construction method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program; wherein the computer program when executed by a processor implements the steps of the gene regulation network construction method according to any one of claims 1 to 7.
CN202410239151.6A 2024-03-04 Gene regulation network construction method, device, equipment and storage medium Active CN117854592B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410239151.6A CN117854592B (en) 2024-03-04 Gene regulation network construction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410239151.6A CN117854592B (en) 2024-03-04 Gene regulation network construction method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN117854592A (en) 2024-04-09
CN117854592B (en) 2024-06-04

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110491442A (en) * 2019-08-15 2019-11-22 电子科技大学 Recognition methods, device, equipment and the storage medium of unicellular miRNA regulated and control network
US20210090686A1 (en) * 2019-09-25 2021-03-25 Regeneron Pharmaceuticals, Inc. Single cell rna-seq data processing
CN112967755A (en) * 2021-03-04 2021-06-15 深圳大学 Cell type identification method for single cell RNA sequencing data
CN114783526A (en) * 2022-05-11 2022-07-22 南开大学 Depth unsupervised single cell clustering method based on Gaussian mixture graph variation self-encoder
CN115240772A (en) * 2022-08-22 2022-10-25 南京医科大学 Method for analyzing active pathway in unicellular multiomics based on graph neural network
CN116486913A (en) * 2023-05-23 2023-07-25 浙江大学 System, apparatus and medium for de novo predictive regulatory mutations based on single cell sequencing
CN116525006A (en) * 2023-03-27 2023-08-01 深圳大学 Single cell classification method, device, equipment and storage medium
CN116564410A (en) * 2023-05-23 2023-08-08 浙江大学 Method, equipment and medium for predicting mutation site cis-regulatory gene
WO2023182847A1 (en) * 2022-03-23 2023-09-28 한국과학기술원 Method for constructing gene network through single-cell transcriptome and method for discovering key gene in differentiation using same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant