CN117198381A - Method and device for constructing digital cell model, medium, equipment and system - Google Patents


Publication number
CN117198381A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210616258.9A
Other languages
Chinese (zh)
Inventor
姜树嘉
李国亮
李林峰
胡健
闫峻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yidu Cloud Beijing Technology Co Ltd
Original Assignee
Yidu Cloud Beijing Technology Co Ltd
Application filed by Yidu Cloud Beijing Technology Co Ltd
Priority to CN202210616258.9A (CN117198381A)
Priority to PCT/CN2022/115811 (WO2023231202A1)
Publication of CN117198381A


Abstract

The disclosure provides a method and a device for constructing a digital cell model, a storage medium, electronic equipment and a digital cell system, and belongs to the technical field of digital cell models. The construction method of the digital cell model comprises the following steps: constructing an initial digital cell model based on the biochemical information; the digital cell model comprises a biochemical component pool and a plurality of signal path units; the biochemical component pool comprises a plurality of biochemical component information; the signal path unit is used for simulating a signal path of the biological cell; performing iterative simulation on the initial digital cell model to simulate a biochemical process occurring in the biological cells; in the iterative simulation process, judging whether the digital cell model reaches a steady state or a failure state; updating the initial digital cell model when the failure state is reached; after reaching a steady state condition, a target digital cell model is determined based on the current initial digital cell model. The digital cell model can improve the research efficiency of occurrence, development and treatment of diseases.

Description

Method and device for constructing digital cell model, medium, equipment and system
Technical Field
The embodiment of the disclosure relates to the technical field of digital cell models, in particular to a method and a device for constructing a digital cell model, a storage medium, electronic equipment and a digital cell system.
Background
The occurrence and progression of diseases often involve complex biochemical reactions, in particular multiple signaling pathways that are superimposed on one another and together affect the phenotype of the cell. At present, research on the occurrence, development, and treatment of diseases relies mainly on biological methods, and is limited by the long cycles, numerous influencing factors, and high cost of those methods.
With the current increase in biochemical information, particularly, the accumulation of signal pathway-related information, protein network-related information, gene network-related information, biomarker-related information, kinetic information related to biochemical processes, and the like, it is necessary to construct a digital cell model to improve the efficiency of research on the occurrence, development, and treatment of diseases.
It should be noted that the information disclosed in the Background section above is only intended to enhance understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art already known to those of ordinary skill in the art.
Disclosure of Invention
The invention aims to provide a method and a device for constructing a digital cell model, a storage medium, electronic equipment and a digital cell system.
According to a first aspect of the present disclosure, there is provided a method for constructing a digital cell model, comprising:
constructing an initial digital cell model based on the biochemical information; the digital cell model comprises a biochemical component pool and a plurality of signal path units; the biochemical component pool comprises a plurality of biochemical component information, and the biochemical component information comprises concentration and/or position information of biochemical components; the signal path unit is used for simulating a signal path of the biological cell; the signal path unit comprises at least one biochemical reaction module, and the biochemical reaction module is used for simulating a biochemical process occurring in the signal path unit by using a biochemical process equation set;
performing iterative simulation on the initial digital cell model to simulate a biochemical process occurring in the biological cells;
in the iterative simulation process, judging whether the digital cell model reaches a steady state or a failure state; updating the initial digital cell model when the digital cell model reaches a failure state; and after the digital cell model reaches a steady state, determining a target digital cell model according to the current initial digital cell model.
According to a second aspect of the present disclosure, there is provided a construction apparatus of a digital cell model, comprising:
a construction module configured to construct an initial digital cell model based on the biochemical information; the digital cell model comprises a biochemical component pool and a plurality of signal path units; the biochemical component pool comprises a plurality of biochemical component information, and the biochemical component information comprises concentration and/or position information of biochemical components; the signal path unit is used for simulating a signal path of the biological cell; the signal path unit comprises at least one biochemical reaction module, and the biochemical reaction module is used for simulating a biochemical process occurring in the signal path unit by using a biochemical process equation set;
a simulation module configured to perform iterative simulation on the initial digital cell model to simulate a biochemical process occurring in the biological cells;
a judging module configured to judge whether the digital cell model reaches a steady state or a failure state in the iterative simulation process; to update the initial digital cell model when the digital cell model reaches a failure state; and, after the digital cell model reaches a steady state, to determine a target digital cell model according to the current initial digital cell model.
According to a third aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the above-described method of constructing a digital cell model.
According to a fourth aspect of the present disclosure, there is provided an electronic device comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the above-described method of constructing a digital cell model via execution of the executable instructions.
According to a fifth aspect of the present disclosure, there is provided a digital cell system, including the apparatus for constructing a digital cell model described above, further including:
a digital cell engine for running a digital cell model;
a data analysis engine for constructing a multi-omics database of mutant cells according to multi-omics data of the mutant cells;
a digital drug library for providing drug information;
a data mapping engine for mapping the multi-omics database and/or the drug information in the digital drug library to the digital cell model;
and a pharmacodynamic analysis engine for predicting the efficacy of a drug according to the operation result of the digital cell engine.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort.
Fig. 1 schematically shows a flow diagram of a method of constructing a digital cell model according to an example embodiment of the present disclosure.
Fig. 2 schematically illustrates an operational logic architecture of a digital cell model according to an example embodiment of the present disclosure.
Fig. 3 schematically shows a flow diagram of a method of constructing a digital cell model according to an example embodiment of the present disclosure.
Fig. 4 schematically shows a flow diagram of a method of cell phenotype index construction according to an example embodiment of the disclosure.
Fig. 5 schematically illustrates biochemical process definition information according to an exemplary embodiment of the present disclosure.
Fig. 6 schematically illustrates acquiring biochemical component information using a machine learning algorithm according to an example embodiment of the present disclosure.
Fig. 7 schematically illustrates acquiring a mathematical model of a biochemical process using a machine learning algorithm according to an example embodiment of the present disclosure.
Fig. 8 schematically illustrates a structural diagram of a construction apparatus of a digital cell model according to an exemplary embodiment of the present disclosure.
Fig. 9 schematically illustrates a schematic structure of a digital cell system according to an example embodiment of the present disclosure.
Fig. 10 schematically illustrates a structural schematic of an electronic device according to an example embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the present disclosure. One skilled in the relevant art will recognize, however, that the aspects of the disclosure may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
An embodiment of the present disclosure provides a method for constructing a digital cell model. Referring to Fig. 1, the method may include at least the following steps:
Step S110, constructing an initial digital cell model based on biochemical information; the digital cell model comprises a biochemical component pool and a plurality of signal path units; the biochemical component pool comprises a plurality of biochemical component information, and the biochemical component information comprises concentration and/or position information of biochemical components; the signal path unit is used for simulating a signal path of the biological cell; the signal path unit comprises at least one biochemical reaction module, and the biochemical reaction module is used for simulating a biochemical process occurring in the signal path unit by using a biochemical process equation set;
Step S120, performing iterative simulation on the initial digital cell model to simulate a biochemical process occurring in the biological cells;
Step S130, judging whether the digital cell model reaches a steady state or a failure state in the iterative simulation process; updating the initial digital cell model when the digital cell model reaches a failure state; and after the digital cell model reaches a steady state, determining a target digital cell model according to the current initial digital cell model.
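As a minimal sketch of how steps S110 to S130 fit together, the loop below runs repeated simulations, restarts from an updated initial model on failure, and stops once a steady state is reached. The representation of the model as a dictionary of concentrations, the function name `construct_model`, and the callbacks `simulate_once`, `is_steady`, `is_failed`, and `update_model` are all hypothetical. This passage does not specify the steady-state or failure criteria, so they are supplied by the caller as assumptions.

```python
from typing import Callable, Dict, Tuple

Pool = Dict[str, float]  # hypothetical: component ID -> concentration


def construct_model(
    initial_pool: Pool,
    simulate_once: Callable[[Pool], Pool],
    is_steady: Callable[[Pool, Pool], bool],
    is_failed: Callable[[Pool], bool],
    update_model: Callable[[Pool], Pool],
    max_iterations: int = 1000,
) -> Tuple[Pool, Pool]:
    """Return (initial model that reached steady state, steady-state pool)."""
    start = dict(initial_pool)          # S110: initial model (assumed given)
    pool = dict(start)
    for _ in range(max_iterations):
        new_pool = simulate_once(pool)  # S120: one simulation pass
        if is_failed(new_pool):         # S130: failure, so update and restart
            start = update_model(start)
            pool = dict(start)
            continue
        if is_steady(pool, new_pool):   # S130: steady, so accept current model
            return start, new_pool
        pool = new_pool
    raise RuntimeError("no steady state reached within max_iterations")
```

The iteration cap is an added safeguard for the sketch and is not taken from the disclosure.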
In the disclosed embodiments, biochemical component information refers to information about the individual biochemical components involved in biochemical processes; a component may take part in a biochemical process as a reaction substrate, reaction product, catalyst, carrier, or in another role. From a chemical perspective, the biochemical components may include, but are not limited to, monomeric proteins, multimeric proteins, variant proteins, glycoproteins, polypeptides, amino acids, DNA, RNA, nucleotides, polysaccharides, and other types of components involved in biochemical processes. In an embodiment of the present disclosure, the biochemical component information includes at least the concentration of the biochemical component. In some embodiments, the biochemical component information may also include positional information of the biochemical component. In some embodiments, different biochemical components may be assigned different IDs or names (for example, different numbers) so that they can be distinguished from one another. The ID or name of a biochemical component may also be added to its biochemical component information.
In some embodiments of the present disclosure, the same biochemical component may play different roles in different biochemical processes. For example, a protein may act as an enzyme in an enzymatic reaction process (one example of a biochemical reaction process); in a protein synthesis process (another example of a biochemical reaction process), that same protein may be synthesized and thus serve as a reaction product.
In some embodiments of the present disclosure, the same substance (e.g., a substance having a given chemical structure) may be treated as two different biochemical components, for example by being given two different numbers, because of a difference in location or in some property (e.g., charge characteristics, degree of dissociation, spatial characteristics, etc.). In other words, in some embodiments, at least two biochemical components are the same chemical located at different positions within the cell. For example, in a protein synthesis process, the synthesized protein may serve as a biochemical component; in a protein targeting delivery process, the synthesized protein can be delivered to a specific site to perform its function. In this example, the newly synthesized protein (i.e., the protein before targeted delivery) may be taken as one biochemical component, while the protein delivered to the specific site may be taken as another biochemical component, although the two are chemically identical or similar.
In an embodiment of the present disclosure, the concentration of each biochemical component is recorded in the biochemical component pool; the concentration may be a mass concentration (e.g., g/mL, μg/mL, etc.), a molar concentration (e.g., μmol/L, mol/L, etc.), or may be expressed in other units that characterize concentration. In one example, the concentration of the biochemical components may be in μmol/L.
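One possible way to hold this information is sketched below, assuming each component carries an ID, a name, a concentration in μmol/L, and an optional location. The class names `BiochemicalComponent` and `ComponentPool`, the field names, and the location strings are illustrative assumptions rather than structures defined by the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BiochemicalComponent:
    component_id: int                 # distinguishes components from one another
    name: str
    concentration: float              # assumed unit: μmol/L
    location: Optional[str] = None    # optional positional information


class ComponentPool:
    """Holds the current information of every biochemical component."""

    def __init__(self) -> None:
        self._components: Dict[int, BiochemicalComponent] = {}

    def add(self, component: BiochemicalComponent) -> None:
        if component.component_id in self._components:
            raise ValueError(f"duplicate ID {component.component_id}")
        self._components[component.component_id] = component

    def concentration(self, component_id: int) -> float:
        return self._components[component_id].concentration

    def apply_delta(self, component_id: int, delta: float) -> None:
        """Add a concentration change produced by a biochemical process."""
        self._components[component_id].concentration += delta
```

For instance, the same protein before and after targeted delivery can be registered as two entries with different IDs and locations, so that a transport process lowers one concentration and raises the other.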
In embodiments of the present disclosure, a biochemical process refers to a process that changes biochemical components. It may be a process that changes the components (chemicals) themselves (e.g., converting one chemical into another), a process that changes the spatial distribution of components (e.g., transporting chemicals within the cell), or another process that results in a change in the type, distribution, function, etc. of chemicals. In one example of the present disclosure, the biochemical processes include at least a biochemical reaction process and a biochemical transport process.
The biochemical process may result in a change in the concentration of at least a portion of the biochemical components involved. For example, during a biochemical reaction, biochemical components that are substrates of the reaction may be consumed, and biochemical components that are reaction products may be generated; considering only the biochemical reaction process, the biochemical process may decrease the concentration of the reaction substrate and cause the concentration of the reaction product to increase. For another example, in a biochemical transport process, the pre-transport biochemical components may be consumed during biochemical transport and the post-transport biochemical components may be generated during biochemical transport; considering only this biochemical transport process, this biochemical process may cause the concentration of the biochemical components before transport to decrease and the concentration of the biochemical components after transport to increase.
In embodiments of the present disclosure, a biochemical reaction module for simulating a biochemical process may be constructed according to that biochemical process, and may include a biochemical process equation set. In other words, the biochemical reaction module simulates the biochemical process occurring in the signal path unit by using the biochemical process equation set. In one example, a biochemical process equation set may be used to model the concentration change of each biochemical component caused by the biochemical process over one time step. A single biochemical process equation set may include a plurality of concentration change functions, with the concentration change of each biochemical component in the biochemical process represented by one concentration change function. Thus, the concentration change functions in a biochemical process equation set, together with the relationships among them, can reflect the role of each biochemical component in the biochemical process (such as reaction substrate, reaction product, enzyme, and the like), mass conservation in the biochemical process, signal dependence in the biochemical process, and the like.
Illustratively, if a biochemical process is a dimerization reaction catalyzed by an enzyme, the biochemical process equation set simulating the biochemical process includes at least:
When c(cat) ≠ 0,
Δc(cat) = 0, Δc(Ant) = -2 * v(Product) * t, Δc(Product) = v(Product) * t;
in the biochemical process equation set of the above example, c(cat) represents the concentration of the enzyme in the biochemical process; the subsequent solution is carried out only when the concentration of the enzyme is not zero, which reflects the dependence of the biochemical process on the enzyme; the change in c(cat) across the biochemical process is 0, indicating that the enzyme plays a catalytic role without itself being consumed. Δc(Ant) represents the change in the concentration of the reaction substrate after a time step t in the biochemical process. Δc(Product) represents the change in the concentration of the reaction product after a time step t, v(Product) represents the rate of change of the concentration of the reaction product, and t represents the time step. The factor of 2 in Δc(Ant) reflects that two substrate molecules are consumed for each dimer formed, consistent with mass conservation.
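The example equation set can be written as a single update step, sketched below. The dictionary keys `cat`, `Ant`, and `Product` follow the symbols above; the function name `dimerization_step` and the use of a plain dictionary for the pool are assumptions. The sketch applies the equations exactly as stated and does not guard against the substrate concentration becoming negative, which the passage does not address.

```python
from typing import Dict


def dimerization_step(pool: Dict[str, float], v_product: float, t: float) -> Dict[str, float]:
    """Apply the example equation set once and return the updated concentrations."""
    updated = dict(pool)
    if pool["cat"] == 0:                   # no enzyme: the process does not run
        return updated
    delta_product = v_product * t          # Δc(Product) = v(Product) * t
    updated["Ant"] += -2 * delta_product   # Δc(Ant) = -2 * v(Product) * t
    updated["Product"] += delta_product
    # Δc(cat) = 0: the enzyme concentration is left unchanged
    return updated
```

With these equations the quantity c(Ant) + 2 * c(Product) is the same before and after the step, which is one way to check mass conservation.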
Intracellular biochemical processes such as signaling can be divided, at least in part, into different signal pathways, and these signal pathways are interrelated to form the signal network within the cell. In the present disclosure, the desired signal pathways may be obtained by consulting medical and biological literature, by link prediction, or by other methods; a signal path unit describing each signal pathway is then constructed.
In one embodiment of the present disclosure, a signal pathway includes one or more biochemical processes. The biochemical process equation sets describing these biochemical processes may be used to construct the signal path unit describing the signal pathway. In this way, the individual biochemical processes in the signal pathway described by a signal path unit, and the biochemical components involved in each of them, can be obtained from the biochemical process equation sets in that signal path unit. Optionally, according to the type of signal pathway simulated, the signal path units may be divided into at least a signaling module, a protein transport module, a cell cycle module, a programmed cell death module, and a gene expression module. In other words, in embodiments of the present disclosure, a signal path unit is used to simulate at least one of intracellular signaling, protein transport, the cell cycle, programmed cell death, and intracellular gene expression. It will be appreciated that the signal path units of embodiments of the present disclosure may also be used to simulate other intracellular signaling processes or biological processes.
Optionally, the mathematical paradigm used by the biochemical reaction modules of a signal path unit when modeling processes such as intracellular gene expression or protein transport may differ from the one used when modeling processes such as signal transduction, programmed cell death, or the cell cycle.
In one embodiment of the present disclosure, a visualization engine may be constructed. By parsing the biochemical reaction modules in each signal path unit, the visualization engine obtains the relationships among the biochemical processes, as well as the relationships between each biochemical process, treated as a process node, and each biochemical component, treated as a component node. By outputting or displaying these relationships, the intracellular signal network simulated by the digital cell model can be output or displayed.
In one embodiment of the present disclosure, the visualization engine may present the relationships between the process nodes and component nodes within the cell simulated by the digital cell model in the form of a graph, in particular a cell signal network graph. In this way, the various intracellular processes simulated by the digital cell model can be presented visually, completely, and clearly. Furthermore, during the iterative simulation of the digital cell model, the visualization engine can dynamically present the changing state of at least some of the biochemical components. For example, a component node (i.e., one biochemical component) may be displayed in different colors depending on whether its concentration increases or decreases: if the concentration of a component node increases, the node is displayed in red in the cell signal network graph; if it decreases, the node is displayed in green. As another example, a component node may also be displayed at different sizes depending on the amount by which its concentration changes: the greater the concentration change of a component node, the larger the dot representing it, and the smaller the change, the smaller the dot. Illustratively, a component node shown as a small green dot indicates that the concentration of that component dropped slightly in the iteration.
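The color and size rule described here can be expressed as a small mapping function, sketched below. The function name `node_style`, the linear size formula with its `base_size` and `scale` parameters, and the gray color for an unchanged node are assumptions made for illustration; the text only fixes red for an increase, green for a decrease, and a larger dot for a larger change.

```python
from typing import Tuple


def node_style(delta: float, base_size: float = 4.0, scale: float = 10.0) -> Tuple[str, float]:
    """Map a concentration change to a (color, dot size) pair for a component node."""
    if delta > 0:
        color = "red"      # concentration increased
    elif delta < 0:
        color = "green"    # concentration decreased
    else:
        color = "gray"     # assumption: unchanged nodes are not specified in the text
    size = base_size + scale * abs(delta)  # assumption: size grows linearly with |delta|
    return color, size
```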
Of course, the visualization engine may also present the change in concentration of the single or multiple biochemical components during the iterative simulation process, such as displaying the real-time concentration of the selected biochemical component, or plotting the concentration change curve of the selected biochemical component, and the like.
In some embodiments of the present disclosure, referring to Fig. 2, some of the signal pathways simulated by the digital cell model are parallel, i.e., there is no apparent order or temporal dependency among them. In this case, these parallel signal pathways may be regarded as one signal pathway group. Accordingly, the signal path units describing these parallel signal pathways are treated in the digital cell model as parallel signal path units and are labeled with the same unit process order. In the logic architecture, signal path units having the same unit process order may form one signal path unit group, and signal path unit groups having different unit process orders are cascaded in sequence. Of course, the digital cell model of the embodiments of the present disclosure need not actually contain signal path unit groups or cascades between them; this division and cascading serve only to describe the operational logic among the signal path units of the digital cell model, not to define the architecture of the digital cell model itself. It will be appreciated that a signal path unit group may include a single signal path unit or a plurality of different signal path units. In embodiments of the present disclosure, individual signal path units may be labeled with a process order, and the biochemical process equation sets having the same process order may form one signal path unit group.
For example, referring to Fig. 2, a digital cell model may logically include a plurality of sequentially cascaded signal path unit groups: signal path unit group A, signal path unit group B, and so on up to signal path unit group M. In this example, signal path unit group M represents the last-stage signal path unit group. During simulation, the signal path unit groups are solved in sequence to simulate the overall ordering of the biochemical processes of biological cells and the directionality of signal transduction. Any signal path unit group may have one signal path unit or a plurality of signal path units. For example, signal path unit group A of this example has a plurality of signal path units, including signal path unit a1, signal path unit a2, and signal path unit a3. For another example, signal path unit group B has one signal path unit, namely signal path unit B1.
Optionally, within the same signal path unit group, the signal path units may be unordered relative to one another. When solving a signal path unit group having a plurality of signal path units, the solving order of the signal path units may be determined at random. In this way, the parallel nature of the biochemical processes of biological cells can be simulated.
In the embodiment of the present disclosure, executing every signal path unit group (i.e., every signal path unit) once constitutes one simulation of the digital cell model, and the result of each simulation serves as the basis for the next simulation. During each simulation, the signal path units may be solved one after another in process order. Thus, although the digital cell model comprises a large number of biochemical process equations, these are not solved simultaneously but sequentially, in the process order of the signal path units.
In one embodiment of the present disclosure, in any single simulation of the digital cell model, the signal path units are executed in unit process order.
In an embodiment of the present disclosure, the biochemical reaction modules in a signal path unit have a module process order. When any signal path unit is executed, the biochemical reaction modules in that signal path unit are executed in module process order.
When any biochemical reaction module is executed, the biochemical process equation set in that module may be solved with reference to the current concentrations of the biochemical components in the biochemical component pool; after any biochemical process equation set has been solved, the information of each biochemical component in the biochemical component pool, in particular its concentration, is updated according to the solution result.
In some embodiments of the present disclosure, when solving the signal path units, if multiple signal path units have the same unit process order, these signal path units may be solved in parallel as a whole. In particular, in one example, the order of execution among signal path units having the same unit process order is random in each simulation; over a number of iterations, these signal path units may therefore be regarded as parallel. In this way, the parallel nature of the signal path units within the same signal path unit group can be simulated, and the robustness of the digital cell model is improved.
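The execution order described above can be sketched as follows: units are grouped by unit process order, groups run in ascending order, units inside a group run in a random order, and each module updates the shared pool as soon as it has been solved. The tuple layout for a unit, the convention that a module returns a dictionary of concentration changes, and the name `run_one_simulation` are assumptions made for this sketch.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Pool = Dict[str, float]
Module = Callable[[Pool], Dict[str, float]]  # returns concentration changes
Unit = Tuple[int, str, List[Module]]         # (unit process order, name, modules in module order)


def run_one_simulation(units: List[Unit], pool: Pool, rng: random.Random) -> List[str]:
    """Run one simulation pass in place and return the unit names in execution order."""
    groups: Dict[int, List[Unit]] = defaultdict(list)
    for unit in units:
        groups[unit[0]].append(unit)
    executed: List[str] = []
    for order in sorted(groups):             # groups are cascaded in sequence
        group = groups[order]
        rng.shuffle(group)                   # random order inside one group
        for _, name, modules in group:
            for module in modules:           # modules follow module process order
                for cid, delta in module(pool).items():
                    pool[cid] = pool.get(cid, 0.0) + delta  # update pool immediately
            executed.append(name)
    return executed
```

Passing in a seeded `random.Random` keeps a run reproducible while still varying the within-group order from one simulation to the next.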
In some embodiments of the present disclosure, at least one signal path unit comprises a plurality of signal path subunits, any one of which comprises one biochemical reaction module or a plurality of biochemical reaction modules having a process order. When solving a signal path unit having a plurality of signal path subunits, a solving order of the plurality of signal path subunits may be randomly determined, and the plurality of signal path subunits may be sequentially solved according to the solving order.
In other words, at the logic level, there may also be some parallel signal path sub-units inside the same signal path unit; each signal path subunit may simulate a portion of the signal path, where the portion of the signal path may include a biochemical process, or may include a plurality of biochemical processes cascaded in sequence, where each biochemical process is simulated by a system of biochemical process equations. Of course, in a further embodiment, each signal path subunit may also have a further-stage submodule.
In embodiments of the present disclosure, all biochemical processes share one biochemical component pool; that is, the biochemical process equation set corresponding to each biochemical process is solved based on the biochemical component pool, and the solution result is written back to the biochemical component pool. Specifically, when performing calculations according to a biochemical process equation set, the concentrations of the relevant biochemical components are read from the biochemical component pool; the concentrations in the pool of the biochemical components involved in that equation set are then updated according to the calculation result. This simulates the consumption and production of surrounding biochemical components by a biochemical process, and its dependence on them. In some embodiments of the present disclosure, each time the solution of a biochemical process equation set is completed, the updated biochemical component pool may be determined from the solution result and the pre-update biochemical component pool (the biochemical component pool before that equation set was solved).
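As a rough sketch of this read, solve, and write-back cycle, the example below represents the pool as a dictionary of concentrations. The first-order rate law, the explicit Euler step, and the names `solve_process` and `apply_result` are illustrative assumptions; the disclosure does not prescribe a particular equation form or solver.

```python
def solve_process(pool, rate_constant, substrate, product, dt):
    """One explicit Euler step of an assumed first-order conversion.

    Reads current concentrations from the pool and returns the
    concentration changes without modifying the pool.
    """
    converted = rate_constant * pool[substrate] * dt
    return {substrate: -converted, product: +converted}


def apply_result(pool, delta):
    """Build the updated pool from the pre-update pool and the solution result."""
    updated = dict(pool)
    for name, change in delta.items():
        updated[name] = max(updated[name] + change, 0.0)  # no negative concentrations
    return updated


pool = {"A": 10.0, "B": 0.0, "C": 0.0}
# Process 1 (A -> B) reads the pool and writes its result back.
pool = apply_result(pool, solve_process(pool, 0.1, "A", "B", dt=1.0))
# Process 2 (B -> C) now sees the B produced by process 1.
pool = apply_result(pool, solve_process(pool, 0.5, "B", "C", dt=1.0))
```

Because both processes read from and write to the same pool, the second process depends on the output of the first, which is the coupling the paragraph above describes.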
In this way, in the iterative simulation process of the digital cell model, the biochemical component pool is updated according to the solution result of each biochemical process equation set, so as to simulate the dynamic change of the biochemical components of the digital cell model during cell activity. In some embodiments of the present disclosure, data of the biochemical component pool at a particular stage of each simulation, such as the state of the pool (i.e., the concentration of each biochemical component) after each simulation has ended, may be recorded. During the iterative simulation of the digital cell model, these data can serve as the historical data of the biochemical component pool for evaluating the iterative simulation process. It will be appreciated that the historical data also records which simulation produced each item of data. In other words, the historical data of the biochemical component pool may include the simulation count of the digital cell model and the biochemical component pool data corresponding to each count.
Referring to fig. 3, the digital cell model may be evaluated during iterative simulation to determine whether it reaches a steady state. For example, the state of the digital cell model may be evaluated according to the process evaluation model library to determine whether the digital cell model reaches a steady state or a failure state, and thereby whether the iterative simulation process reaches an end condition. In a further example, if the digital cell model enters a failure state during the iterative simulation process, the initial digital cell model may be updated and the iterative simulation and evaluation performed again; this loop is repeated until the digital cell model is evaluated as being in a steady state during the iterative simulation process. Such an evaluation indicates that the current initial digital cell model can reflect some of the cell activity rules of biological cells, and that the initial biochemical component pool and the signal path unit groups defined in the current initial digital cell model can, to a certain extent, effectively simulate the biochemical component distribution and the cell activity process rules of biological cells. The target digital cell model determined from the current initial digital cell model can therefore more effectively simulate at least part of the cell activities of biological cells.
Thus, according to the method for constructing the digital cell model, whether the biochemical component pool and the signal path units of the initial digital cell model are suitable can be judged from the evaluation result of the initial digital cell model in the iterative simulation process. When the initial digital cell model is evaluated as being in a failure state, it may be determined that the current initial digital cell model is unsuitable in terms of the types and concentrations of its biochemical components, its simulation of the biochemical processes, or the like; the initial digital cell model may then be reacquired by changing at least one of the biochemical component pool and the signal path units, that is, the initial digital cell model is updated by updating at least one of the biochemical component pool and the signal path units. This loop is repeated until the initial digital cell model is evaluated as being in a steady state during the iterative simulation process. In this way, the method can compensate for the current lack of precise cell biology knowledge: through the cycle of updating the initial digital cell model, iterative simulation, and process evaluation, a more suitable biochemical component pool and more suitable signal path units can be determined, yielding a digital cell model that can reach a steady state and effectively simulate at least part of the cell activities.
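The update, simulation, and evaluation cycle can be sketched as a nested loop. Everything concrete here is an assumption made for illustration: the function `construct_model`, the step and update budgets, treating an exhausted step budget like a failure state, and the one-component toy model with its steady and failure criteria. The actual evaluation would rely on the process evaluation model library.

```python
def construct_model(model, simulate_step, evaluate, update,
                    max_steps=1000, max_updates=50):
    """Repeat simulate -> evaluate -> update until a steady state is reached."""
    for _ in range(max_updates + 1):
        state = dict(model["pool"])  # each run starts from the initial pool
        history = []
        verdict = "running"
        for count in range(1, max_steps + 1):
            state = simulate_step(model, state)
            history.append((count, dict(state)))  # simulation count + pool data
            verdict = evaluate(history)
            if verdict != "running":
                break
        if verdict == "steady":
            return model, history
        model = update(model)  # failure state, or step budget exhausted
    raise RuntimeError("no steady state reached within the update budget")


# Toy stand-ins: one component X with constant inflow and first-order outflow.
def simulate_step(model, state):
    x = state["X"]
    return {"X": x + model["k_in"] - model["k_out"] * x}


def evaluate(history):
    x = history[-1][1]["X"]
    if x > 100.0:  # assumed failure criterion: runaway accumulation
        return "failure"
    if len(history) > 1 and abs(x - history[-2][1]["X"]) < 1e-6:
        return "steady"  # assumed steady criterion: negligible change
    return "running"


def update(model):
    return {**model, "k_out": model["k_out"] + 0.5}  # a parameter update


target, history = construct_model(
    {"pool": {"X": 1.0}, "k_in": 5.0, "k_out": 0.0},
    simulate_step, evaluate, update,
)
```

In this toy run the first model has no outflow and is evaluated as failed; after one update the model settles near X = 10 and is returned as the target model.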
In one embodiment of the present disclosure, the biochemical information upon which the digital cell model is constructed includes at least one of signal pathway information, protein network information, gene network information, biomarker information, information related to biochemical processes, and the like. Of course, the digital cell model may also be constructed based on other more biochemical information. Further, at least part of this biochemical information is obtained from the publications by knowledge extraction.
In some embodiments of the present disclosure, the initial digital cell model is updated when the digital cell model reaches a failure state during iterative simulation. When updating the initial digital cell model, at least one of the initial biochemical component pool and the plurality of signal path units may be updated, such that the biochemical component pool and the plurality of signal path units of the new initial digital cell model, taken as a whole, differ from those of the initial digital cell model before updating.
For example, in one embodiment of the present disclosure, the initial digital cell model may be updated by updating the initial biochemical component pool (the biochemical component pool of the initial digital cell model before any iterative simulation is performed). For example, one or more new items of biochemical component information may be added to the initial biochemical component pool before updating, one or more items of biochemical component information may be deleted from it, or the concentration of the biochemical component in one or more items of biochemical component information may be changed; two or three of these strategies (adding biochemical component information, deleting biochemical component information, and changing a concentration) may also be applied simultaneously. Of course, in other embodiments of the present disclosure, a new initial biochemical component pool may also be created (i.e., the initial biochemical component pool is regenerated), and, after the new initial biochemical component pool is judged to be different from the initial biochemical component pool before the update, the new pool replaces the initial biochemical component pool before the update.
As another example, in another embodiment of the present disclosure, the initial digital cell model may be updated by updating one or more signal path units. For example, at least one of the following strategies may be adopted: adjusting the signal path unit group to which a signal path unit belongs, adding a new signal path unit to the current signal path units, deleting a signal path unit from the current signal path units, updating the type or number of biochemical process equation sets in a current signal path unit, and updating the value of a constant of any biochemical process equation set in a current signal path unit. Of course, in other embodiments of the present disclosure, a new signal path unit may also be generated (i.e., regenerated), and, after the new signal path unit is determined to be different from the signal path unit before the update, the new signal path unit replaces the signal path unit before the update.
In some embodiments of the present disclosure, each update of the initial digital cell model may use the strategy of updating the initial biochemical component pool alone, the strategy of updating one or more signal path units alone, or both strategies simultaneously. The strategies used in any two updates of the initial digital cell model may be the same or different.
For example, when the initial digital cell model is updated the previous time, the initial digital cell model may be updated by independently adopting a strategy of changing the concentration of the biochemical components; when the initial digital cell model is updated at the next time, four strategies of adding new biochemical component information, deleting biochemical component information, adding a new biochemical process equation set, deleting a biochemical process equation set are adopted to update the initial digital cell model.
It will be appreciated that in some cases there is a correlation between different policies. For example, if the biochemical component information in the biochemical component reservoir is increased, it is often necessary to increase the biochemical process equations associated with the newly added biochemical component information, which results in an adjustment of at least one signal path element. For another example, if a portion of the biochemical component information is deleted from the biochemical component reservoir, it is often necessary to delete the biochemical process equations associated with the deleted biochemical component information, which results in an adjustment of at least one signal path element.
In some embodiments of the present disclosure, the policy used for each update of the initial digital cell model may be determined according to a preset rule, for example, the same policy is used multiple times in succession, or the policy is cycled according to a preset policy sequence, where the policy sequence records the policy used for each update of the initial digital cell model in a policy cycle period.
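A preset policy sequence that repeats every policy cycle period can be sketched with a cyclic iterator. The policy names below are hypothetical labels for the strategies discussed above, chosen only for the example.

```python
from itertools import cycle

# Hypothetical policy sequence; its length is one policy cycle period.
policy_sequence = [
    "change_concentration",
    "add_component",
    "delete_component",
    "adjust_equation_constant",
]

policies = cycle(policy_sequence)
# Policy used at each of six successive updates of the initial model.
chosen = [next(policies) for _ in range(6)]
```

The fifth update reuses the first policy of the sequence, so the policy applied at any update depends only on the update count modulo the period.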
In other embodiments of the present disclosure, the strategy employed in updating the initial digital cell model may be determined based on the reason the digital cell model is evaluated as a failure state during the iterative simulation process, or the specific state evaluated as a failure state, etc., for example, an expert system may be introduced to improve the accuracy of updating the initial digital cell model, or a technical expert intervenes to provide a more appropriate updating strategy. Of course, in other embodiments of the present disclosure, other ways of determining the strategy to be used each time an initial digital cell model is updated may also be used to enable the initial digital cell model to be updated.
In one embodiment of the present disclosure, a parameter adjustment module and a structure adjustment module may be constructed, and the current initial digital cell model is adjusted by means of these modules to form a new initial digital cell model. Specifically, the parameter adjustment module can adjust the concentration of a biochemical component in the biochemical component information or the value of a constant in a biochemical process equation set; the structure adjustment module can add or remove biochemical component information in the biochemical component pool, adjust the biochemical process equation sets included in a signal path unit, or adjust the signal path units included in a signal path unit group. In other words, the parameter adjustment module changes neither the types and number of biochemical components in the digital cell model nor the types, number and positions of the biochemical process equation sets in the signal path units; it only adjusts the concentrations of biochemical components or the parameters of biochemical process equation sets. This mode of adjustment generally has a smaller adjustment amplitude and better adjustment precision, and is suited to fine adjustment of the digital cell model. The structure adjustment module can adjust structural characteristics of the digital cell model, such as the types and number of biochemical components in the biochemical component pool and the types, number and positions of biochemical process equation sets in the signal path units. This mode of adjustment can change the architecture of the digital cell model to a large extent, so that feasible architectures of the digital cell model can be found by coarse adjustment.
In one embodiment of the present disclosure, the initial digital cell model may first be updated using the structure adjustment module until a promising feasible architecture of the digital cell model is obtained. Then, the structure-adjusted initial digital cell model is updated by the parameter adjustment module, so that a digital cell model capable of simulating some functions and processes of biological cells is obtained under that feasible architecture. Of course, in other embodiments of the present disclosure, the structure adjustment module and the parameter adjustment module may be used alternately to update the initial digital cell model.
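The division of labor between the two modules can be illustrated with two small functions. The dictionary layout of the model and the names `adjust_structure` and `adjust_parameters` are assumptions made for this sketch, which covers only adding components and equation sets and changing values.

```python
import copy


def adjust_parameters(model, concentrations=None, constants=None):
    """Fine adjustment: change values only, never the model's structure."""
    new = copy.deepcopy(model)
    for name, value in (concentrations or {}).items():
        if name not in new["pool"]:
            raise KeyError(f"{name} is not in the pool; use adjust_structure")
        new["pool"][name] = value
    for (unit, index), k in (constants or {}).items():
        new["units"][unit][index]["k"] = k
    return new


def adjust_structure(model, add_components=None, add_equations=None):
    """Coarse adjustment: add components and biochemical process equation sets."""
    new = copy.deepcopy(model)
    new["pool"].update(add_components or {})
    for unit, equation in (add_equations or []):
        new["units"].setdefault(unit, []).append(equation)
    return new


model = {
    "pool": {"A": 1.0, "B": 0.0},
    "units": {"U1": [{"reaction": "A->B", "k": 0.1}]},
}
# Coarse step first: new component C and a new equation set in unit U1.
coarse = adjust_structure(model, {"C": 0.5},
                          [("U1", {"reaction": "B->C", "k": 0.2})])
# Fine step second: same architecture, different values.
fine = adjust_parameters(coarse, {"A": 2.0}, {("U1", 1): 0.3})
```

The fine step leaves the set of components and the number of equation sets unchanged, which is the property that distinguishes parameter adjustment from structure adjustment.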
Optionally, the current initial digital cell model is updated based on at least one of the following information when the initial digital cell model is updated at least once:
information derived from medical literature, information derived from biological literature, information derived from high-throughput cell experiments, information derived from high-throughput sequencing, information predicted based on literature information, experimental information, or sequencing information.
For example, parameters involved in a biochemical process, in particular constants in a system of biochemical process equations, can be determined by searching for documents on PubMed for the biochemical process and integrating the information of the documents. In other words, in the embodiments of the present disclosure, various information required for constructing a digital cell model, such as parameters of biochemical components and parameters of biochemical processes, information of signal pathways, etc., may be obtained using the publications, or the required information may be obtained experimentally. Through high-throughput cell experiments or high-throughput sequencing technology, required data can be obtained under the same experimental standard conditions, so that the data have higher credibility on one hand, and the defect that information is difficult to reference mutually due to different technical standards and technical means adopted in different documents can be overcome on the other hand.
In one embodiment of the present disclosure, the key parameters required to construct a digital cell model may be determined by searching and analyzing publications. At least some of these key parameters, such as parameters that are easily verified or obtained by high-throughput cell experiments or high-throughput sequencing, or parameters whose reported values differ widely between documents, are then verified experimentally or measured directly, to ensure the accuracy and validity of these key parameters, thereby improving the efficiency of obtaining the digital cell model and how closely the digital cell model simulates biological cells.
In some embodiments of the present disclosure, referring to fig. 3, the method of constructing a digital cell model may further include: acquiring a biochemical component pool (namely an initial biochemical component pool) of the initial digital cell model according to a biochemical component database; the biochemical components database comprises a plurality of biochemical components setting information, and the biochemical components setting information comprises the concentration range of the biochemical components.
Alternatively, a plurality of biochemical component setting information may be selected from the biochemical component database, and the biochemical component information may be determined according to the selected biochemical component setting information, each biochemical component information forming an initial biochemical component pool. Specifically, the biochemical component setting information includes a concentration range of a biochemical component, the biochemical component information includes a concentration of the same biochemical component, and the concentration in the biochemical component information is within the concentration range of the biochemical component setting information. In this way, the types of the biochemical components involved in the initial biochemical components pool do not exceed the types of the biochemical components in the biochemical components database, and the concentration of the biochemical components in the initial biochemical components pool is within the concentration range of the biochemical components in the biochemical components database. Thus, the biochemical components database can be used as the basis and restriction for generating the initial biochemical components pool; of course, the method can also be used as the basis and restriction for updating the initial biochemical component pool.
Illustratively, the biochemical components database includes N1 (N1 is a positive integer) biochemical components setting information, i.e., N1 biochemical components are involved. When an initial biochemical component pool is generated according to a biochemical component database, N2 (N2 is a positive integer and N2 is not more than N1) biochemical component setting information can be extracted from the biochemical component database, corresponding N2 biochemical component information is generated based on the N2 biochemical component setting information, and the concentration of the biochemical components of each biochemical component information is within the concentration range of the biochemical components of the corresponding biochemical component setting information; the N2 biochemical components information may constitute an initial biochemical components pool.
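A minimal sketch of this generation step follows. It assumes the database maps each component name to a (low, high) concentration range and that components and concentrations are drawn uniformly at random; the disclosure fixes neither the data layout nor the sampling rule, and the component names and values are made up for the example.

```python
import random


def generate_initial_pool(database, n2, rng):
    """Pick n2 of the N1 database entries and draw a concentration for each.

    `database` maps a component name to its (low, high) concentration range.
    """
    if not 1 <= n2 <= len(database):
        raise ValueError("n2 must satisfy 1 <= n2 <= N1")
    chosen = rng.sample(sorted(database), n2)
    # Each concentration lies within the range given by the setting information.
    return {name: rng.uniform(*database[name]) for name in chosen}


# Hypothetical setting information (N1 = 4); units are arbitrary.
database = {
    "ATP": (1.0, 5.0),
    "ADP": (0.1, 0.5),
    "Ca2+": (1e-4, 1e-3),
    "glucose": (4.0, 6.0),
}
initial_pool = generate_initial_pool(database, n2=3, rng=random.Random(42))
```

The same function can be reused when the initial pool is regenerated during an update, since the database constrains both generation and updating.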
Alternatively, the biochemical components database may be acquired before the initial biochemical components pool of the initial digital cell model is acquired from the biochemical components database.
In the embodiments of the present disclosure, the existing biochemical components database may be directly obtained, the required biochemical components database may be obtained by modification and supplementation based on the existing database, or the biochemical components database may be constructed de novo.
In one embodiment of the present disclosure, the biochemical components database may be constructed de novo, and the data used to construct the biochemical components database may be derived at least in part from existing databases, publications (e.g., journals, conference papers, treatises, etc.) in biological fields (e.g., cell biology), obtained through biological experiments and studies, and in some cases, specific data may also be obtained by targeted performance of cell biology studies (e.g., high throughput cell experiments), which may include components in biological cells and concentrations of those components. Of course, it is understood that there may be differences in the composition and concentration of the biological cells from data of different sources.
For example, at least a portion of the biochemical components and their associated data may be retrieved from existing databases and/or publications in the biological field, and these data, with or without correction, may be formed into biochemical component setting information and added to the biochemical component database.
In some cases, if multiple data sources among the existing databases and/or publications in the biological field each provide a concentration of a particular biochemical component and the concentrations from the different data sources are relatively close, e.g., they all fluctuate within the same order of magnitude, the concentration of that biochemical component may be considered to have high certainty, and its concentration range may be formed into biochemical component setting information without correction and added to the biochemical component database.
In other cases, if multiple data sources among the existing databases and/or publications in the biological field each provide a concentration of a particular biochemical component and the concentrations from the different data sources differ widely, e.g., the spread of the concentrations across the data sources exceeds an order of magnitude, the concentration of that biochemical component may be considered to have a large uncertainty, and its concentration range may be corrected before being formed into biochemical component setting information and added to the biochemical component database. For example, data that deviate significantly may be deleted, or the concentrations from the respective data sources may be statistically analyzed, to determine the concentration range of the particular biochemical component; biochemical component setting information is then generated based on the determined concentration range.
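The two cases above can be sketched as one function that accepts a range uncorrected when the sources agree and otherwise discards strongly deviating values. The median-based outlier rule and the `max_orders` threshold are assumptions standing in for whatever correction or statistical analysis is actually applied; all values must be positive.

```python
import math
import statistics


def concentration_range(values, max_orders=1.0):
    """Return (low, high) for one component from several data sources.

    If the values span at most `max_orders` orders of magnitude they are
    used uncorrected. Otherwise values further than `max_orders` orders of
    magnitude from the median are dropped (an assumed correction rule).
    """
    low, high = min(values), max(values)
    if math.log10(high / low) <= max_orders:
        return low, high
    median = statistics.median(values)
    kept = [v for v in values if abs(math.log10(v / median)) <= max_orders]
    return min(kept), max(kept)


consistent = concentration_range([2.0, 3.0, 5.0])         # sources agree
corrected = concentration_range([2.0, 3.0, 5.0, 400.0])   # one source deviates
```

Passing a smaller or larger `max_orders` for a given component is one way to express a stricter or more relaxed criterion per component.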
It will be appreciated that different criteria need to be used for different biochemical components when determining whether the concentrations disclosed in existing data have a large uncertainty. For example, some biochemical components participate in many biochemical processes, or for other reasons have concentrations that do not fluctuate much (e.g., the concentration fluctuates by little more than an order of magnitude); in such cases a stricter criterion is needed to determine whether the concentrations disclosed in existing data have a large uncertainty. In other examples, the concentrations of some biochemical components differ greatly during biochemical processes or between different cell states; in such cases a more relaxed criterion is needed.
In some cases, if the concentration of a biochemical component is found to have a large uncertainty (e.g., the reported concentrations differ widely) or is difficult to determine from existing data (e.g., existing databases and/or publications in the biological field), the concentration of the biochemical component and its concentration range may be determined by biological experiments (e.g., high-throughput cell experiments, high-throughput sequencing, etc.) to increase the accuracy of the biochemical component database. Of course, not every biochemical component whose concentration range has a large uncertainty needs to be subjected to biological experiments; that is, it is not required that the concentration range of every biochemical component in the biochemical component database have high certainty.
In some cases, if a new biochemical component is found in a biological experiment or study, biochemical component setting information of the new biochemical component may be added to a biochemical component database; therefore, a knowledge base can be provided for constructing a more accurate digital cell model by continuously perfecting a biochemical component database, so that the constructed digital cell model can simulate the biochemical process of biological cells more effectively.
In some cases, the concentrations of the partial biochemical components may be predicted using a machine learning algorithm, and biochemical component setting information may be formed based on the prediction results and added to a biochemical component database. For example, when the concentration of a particular biochemical constituent is difficult to measure directly, a biochemical network model associated with the biochemical constituent may be constructed from existing data and then the concentration or concentration range of the biochemical constituent may be predicted using a machine learning algorithm.
Referring to the example provided in fig. 6, in order to calculate the concentration of a specific biochemical component At, a biochemical network model may be constructed with the biochemical component At as one node, the biochemical network model further including nodes A1, A2, A3, and A4; nodes A1, A2 and A3 represent biochemical processes generating the specific biochemical components At, and node A4 represents biochemical processes involving the specific biochemical components At. By obtaining the data of the nodes A1, A2, A3 and A4, the concentration or concentration range of the specific biochemical components At can be deduced.
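To make the idea of deriving the concentration of At from its neighboring nodes concrete, the sketch below substitutes a simple steady-state mass balance for the machine learning algorithm mentioned above. It assumes that A1 to A3 produce At at known rates and that A4 removes At with first-order kinetics; these assumptions, the rate values, and the function name are illustrative and are not taken from fig. 6.

```python
def steady_state_concentration(production_rates, consumption_constant):
    """Solve sum(production) = k * [At] for [At].

    Assumes constant production rates (nodes A1..A3) and first-order
    removal with rate constant k (node A4).
    """
    if consumption_constant <= 0:
        raise ValueError("consumption constant must be positive")
    return sum(production_rates) / consumption_constant


# Hypothetical node data: three production rates and one removal constant.
at_concentration = steady_state_concentration([0.2, 0.3, 0.5], 0.1)
```

A learned model would replace this closed-form balance when the kinetics of the surrounding processes are not known.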
In some cases, new biochemical components and new biochemical processes may be predicted by means of link prediction techniques based on known biochemical networks. The concentration ranges of the predicted biochemical components can be directly verified (such as biological verification, especially high-throughput cell experiment verification) or indirectly verified (such as verification through different biochemical network models), and biochemical component setting information is formed and added into a biochemical component database, so that the biochemical component database is perfected, and a digital cell model constructed based on the biochemical component database can be more similar to real cells.
In one embodiment of the present disclosure, a database parser may be constructed that parses existing data materials to generate biochemical component setting information, for example from an existing database or from a publication. Further, at least part of the biochemical component setting information constructed by the database parser can be used as initial biochemical component setting information; the initial biochemical component setting information may be used, with or without correction, as the final available biochemical component setting information for constructing or updating an initial biochemical component pool. For example, the database parser may parse existing documents to obtain the concentrations of a particular biochemical component reported in different documents, and aggregate these concentrations together with information about the documents as initial biochemical component setting information. A technician may then correct the concentrations in the initial biochemical component setting information, for example by deleting obviously erroneous concentrations or by setting a concentration range based on the aggregated concentrations, to obtain the final available biochemical component setting information.
In some embodiments of the present disclosure, when the initial biochemical component pool is updated, the initial biochemical component pool of the current initial digital cell model is updated in accordance with the biochemical component database. Specifically, the types of biochemical component information in the initial biochemical component pool may be adjusted according to the biochemical component database, the concentration of the biochemical component in at least one item of biochemical component information may be adjusted, or both may be adjusted at the same time, so that a new initial biochemical component pool is obtained. Of course, in other embodiments of the present disclosure, a new initial biochemical component pool may be regenerated according to the biochemical component database, and, if the new initial biochemical component pool is different from the initial biochemical component pool before updating, the initial biochemical component pool before updating is replaced with the new one, thereby completing the update of the initial biochemical component pool.
In one embodiment of the present disclosure, the biochemical component setting information in the biochemical component database may further include a concentration search step for each biochemical component. In other words, the biochemical component setting information records at least the concentration range and the concentration search step of the respective biochemical component. When the concentration of a specific biochemical component in the initial biochemical component pool is updated according to the biochemical component database, the new concentration of that component can be determined from its current concentration in the initial biochemical component pool together with its concentration range and concentration search step in the biochemical component database, and the biochemical component information of that component is then updated. Of course, in other embodiments of the present disclosure, the concentration range in at least one item of biochemical component setting information may be a plurality of discrete concentration values; when updating the biochemical component information of such a component in the initial biochemical component pool, a new concentration value may be selected from the concentration range in its biochemical component setting information.
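One way to combine the current concentration, the concentration range, and the search step is sketched below. Moving by exactly one step per update and clamping at the range bounds are assumptions made for this example, as are the function name and the `direction` parameter.

```python
def next_concentration(current, low, high, step, direction=+1):
    """Move the concentration one search step and keep it inside [low, high].

    `direction` is +1 to increase and -1 to decrease the concentration.
    """
    candidate = current + direction * step
    return min(max(candidate, low), high)  # clamp to the allowed range


increased = next_concentration(1.0, low=0.5, high=2.0, step=0.25)
capped = next_concentration(1.9, low=0.5, high=2.0, step=0.25)
floored = next_concentration(0.6, low=0.5, high=2.0, step=0.25, direction=-1)
```

With a discrete concentration range, the same update would instead pick a different value from the listed concentrations.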
In some embodiments of the present disclosure, the concentration of the biochemical component in at least one item of biochemical component setting information is a specific concentration (i.e., a point value), and the concentration range of that biochemical component is this specific concentration. Optionally, in the process of constructing the biochemical component database, if a change in the concentration of a biochemical component has little influence on the biochemical processes of the cell, or if the concentration or content of the biochemical component is relatively stable, the concentration range of that biochemical component in the biochemical component database can be set to a specific concentration; that is, the concentration range in the biochemical component setting information can be a single concentration. This reduces the difficulty of obtaining the digital cell model while still reflecting part of the biological rules of biological cells.
In some embodiments of the present disclosure, each biochemical constituent setting information in the biochemical constituent database is also labeled with a reliability parameter (e.g., a reliability level) that is used to characterize the concentration reliability of the biochemical constituent. In one example, when the biochemical component setting information has a large concentration range, the reliability of the concentration of the biochemical component is low; when the biochemical component setting information has a smaller concentration range, the reliability of the concentration of the biochemical component is higher. In another example, when the concentration range of the biochemical component setting information is derived from highly reliable data, such as from biological experimental data, the reliability of the concentration of the biochemical component is high; when the concentration range of the biochemical component setting information is derived from a data material with low reliability, for example, from a non-authoritative journal, newspaper, or just a predicted value, the reliability of the concentration of the biochemical component is low.
In one example, when generating an initial biochemical components pool from a biochemical components database, high confidence biochemical components can be preferentially employed or the biochemical components can be extracted from the biochemical components database at least in part according to the confidence of the biochemical components.
In one example, when the initial biochemical component pool is updated according to the biochemical components database, the biochemical component information corresponding to low-reliability biochemical component setting information may be adjusted preferentially, for example, by changing the concentration of the biochemical component in at least part of this biochemical component information, or by deleting at least part of this biochemical component information.
In some embodiments of the present disclosure, referring to fig. 3, the method of constructing a digital cell model further comprises generating or updating an initial digital cell model from the signal path database; specifically, signal pathway elements in the initial digital cell model are generated or updated from the signal pathway database.
In one embodiment of the present disclosure, the following method may be employed to generate or update signal pathway units in an initial digital cell model from a signal pathway database:
Generating initial signal path information according to the signal path database;
Acquiring the individual signal path units of the initial digital cell model based on the initial signal path information and the biochemical process paradigm pool.
Wherein the signal path database includes a plurality of signal path information and a biochemical process paradigm pool; the biochemical process paradigm pool comprises a plurality of biochemical process paradigms for describing the rules of biochemical processes; the signal path information includes biochemical process information describing the respective biochemical processes of the signal path, and is marked with a process order; each piece of biochemical process information comprises the biochemical process paradigm referenced by the biochemical process, the meanings of the variables in the referenced paradigm, the value ranges of the constants in the referenced paradigm, and the order of the process within the signal path information; the initial signal path information includes a plurality of signal path information acquired from the signal path database, and the constants in the biochemical process paradigms referenced by the biochemical process information in the initial signal path information are defined as point values.
Alternatively, each signal path element in the initial digital cell model may be acquired from a signal path database when the initial digital cell model is constructed. When updating the initial digital cell model by updating the signal path element, the signal path element in the initial digital cell model may be updated from the signal path database.
Alternatively, each signal path may be constructed based on existing knowledge, for example, knowledge about intracellular signal paths in biological or medical literature; that is, each item of signal path information may be constructed in this way. The individual biochemical process information in the signal path information can be converted into the desired signal path units by referencing the biochemical process paradigm pool. According to the process order in each item of signal path information, the way in which the signal path units are combined can be determined, and each signal path unit can then be constructed.
In one embodiment of the present disclosure, the biochemical process paradigm has variables (including independent and dependent variables) and constants, and a mathematical description of mathematical relationships between the variables, constants. In the biochemical process paradigm, individual variables and constants are not assigned specific meanings, nor are constants assigned. In other words, the biochemical process paradigm is used only to represent mathematical laws.
In one embodiment of the present disclosure, the meaning of a variable in the biochemical process paradigm referenced by the biochemical process information refers to the biochemical component information that the variable stands for in the referenced paradigm, and in particular to the concentration of the biochemical component in that biochemical component information.
In this way, when the biochemical process information invokes the referenced biochemical process paradigm, one or more biochemical process mathematical models can be generated from the paradigm that are capable of simulating the biochemical process defined by the biochemical process information. Specifically, each variable in the biochemical process paradigm is replaced with the concentration of the biochemical component defined by the biochemical process information, and each constant is assigned a value according to the value range of that constant defined in the biochemical process information. In the embodiment of the disclosure, the signal path database is provided with both the biochemical process paradigm pool and the signal path information, so that, on the one hand, the construction of mathematical models of various biochemical processes can be simplified and the complexity of the signal path database reduced; on the other hand, the constructed signal path database can meet the requirements of knowledge expression and computational expression at the same time. In particular, in one example, the descriptions of the meanings of variables in the biochemical process information and the labels or names of the biochemical process information may use abbreviations that are conventional in the art or custom expression rules, thereby making the biochemical process information easier to construct, update, or modify.
As an example, one biochemical process paradigm is of the type "mm" and includes the following two equations:
activators.v_max = activators.k_cat * activators.concentration;
v_t = activators.v_max * substrate.concentration^multimer / (substrate.k_m + substrate.concentration)^multimer
This biochemical process paradigm may represent a kinase-catalyzed multimerization process, wherein none of activators, substrate, etc. represents a specific biochemical component, and none of k_m, k_cat, etc. represents a specific constant. FIG. 4 illustrates one item of biochemical process information and the biochemical process mathematical model that it generates when it invokes the biochemical process paradigm denoted by "mm". In the example of fig. 4, the mathematical model of the biochemical process describes the reaction rate of the kinase-catalyzed multimerization process, expressed as the rate of change v_t of the product concentration. Based on this reaction rate, in combination with the time step, the concentration change of each biochemical component involved after one time step can be determined.
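The way a paradigm of this kind turns into a concentration update can be sketched as follows. This is only an illustrative sketch under stated assumptions: the trailing "multimer" on each factor of the second equation is read as an exponent, the update uses a simple explicit Euler step, one unit of substrate is assumed to yield one unit of product, and the names `mm_rate`, `advance`, `kinase`, `substrate`, and `product` are hypothetical.

```python
def mm_rate(k_cat, activator_conc, substrate_conc, k_m, multimer):
    """Reaction rate v_t of the "mm" paradigm.

    Assumption: the trailing `multimer` on each factor is an exponent.
    """
    v_max = k_cat * activator_conc  # activators.v_max
    return v_max * substrate_conc ** multimer / (k_m + substrate_conc) ** multimer


def advance(pool, constants, dt):
    """Return a new component pool after one explicit Euler step of length dt.

    Assumptions: 1:1 conversion of substrate into product, and the kinase
    (activator) concentration is left unchanged by the reaction.
    """
    v_t = mm_rate(constants["k_cat"], pool["kinase"], pool["substrate"],
                  constants["k_m"], constants["multimer"])
    updated = dict(pool)
    updated["substrate"] -= v_t * dt
    updated["product"] += v_t * dt
    return updated
```

Here the biochemical process information plays the role of the binding step: it says that "activators" means the kinase, "substrate" means a particular component of the pool, and it supplies the constant values.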
In some embodiments of the present disclosure, initial signal path information may be generated from the signal path information. The initial signal path information includes a plurality of biochemical process information extracted from the signal path information, and in the initial signal path information the constants in the biochemical process information are point values rather than range values; the value of each constant in the initial signal path information lies within the corresponding range in the signal path information. In this manner, the initial signal path information can be transformed into a plurality of specific biochemical process mathematical models by calling the biochemical process paradigm pool, and these models are used to construct the individual signal path units of the digital cell model. When the initial digital cell model needs to be updated, the signal path units of the digital cell model may be updated by updating the initial signal path information.
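The step of fixing range-valued constants to point values can be sketched as below. Uniform random sampling is an assumption chosen for illustration (the disclosure only requires that each point value lie within its range), and the function name `to_point_values` is hypothetical.

```python
import random


def to_point_values(constant_ranges, rng=None):
    """Map each constant's (low, high) range to a single point value.

    Assumption: values are drawn uniformly from the range. A range whose
    bounds coincide is already a point value and is kept unchanged.
    """
    rng = rng or random.Random()
    points = {}
    for name, (low, high) in constant_ranges.items():
        points[name] = low if low == high else rng.uniform(low, high)
    return points
```

Updating the initial signal path information then amounts to drawing a new set of point values for some or all of the constants.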
In one example, the biochemical process information in the initial signal path information may directly generate a system of biochemical process equations by calling a biochemical process paradigm in a biochemical process paradigm pool, and determining a signal path unit to which the system of biochemical process equations belongs according to a process order of the initial signal path information.
By way of example, a biochemical process paradigm may include a paradigm equation set consisting of a plurality of paradigm equations, each of which is used to model the change of a variable over a time step, such as an increase or decrease of the variable over a time step. In a paradigm equation set, the increase or decrease of a variable can reflect the role played by that variable. In general, when a variable increases after a time step, the variable is a dependent variable, i.e., it represents the concentration of a reaction product of the biochemical process; when a variable decreases after a time step, the variable is an independent variable, i.e., it represents the concentration of a reaction substrate of the biochemical process; when a variable remains unchanged after a time step, the variable represents an enzyme or the like.
In the embodiment of the disclosure, in the signal path database, the biochemical process information further includes the value range of each constant in the biochemical process paradigm, where the value range may be a single value (i.e., a point value) or a range value. In some examples, a constant in the biochemical process information may be a point value when it is highly certain or well defined; for example, when a constant in the biochemical process information represents a Michaelis constant, it may be a point value. When the certainty of a constant in the biochemical process information is not very high, the constant can be configured as a range value.
Optionally, at least one biochemical process is determined from at least one of the following: medical literature, biological literature, high-throughput cell experiments, machine learning algorithms, and link prediction algorithms; biochemical process information is then generated from the determined biochemical process and added to the signal path information of the signal path database. By continuously improving the signal path database in this way, the accuracy of the digital cell model can be improved.
In one example of the present disclosure, the constants of some biochemical processes may be determined by high-throughput biological assay techniques (e.g., high-throughput cell assay techniques or high-throughput sequencing techniques), and the measured constants may be used as a basis for determining the value ranges of the constants in the biochemical process information.
In other examples of the present disclosure, the constants of the partial biochemical processes may be determined by a machine learning algorithm, and the range of values of the constants of the corresponding biochemical process information may be determined from the constants of the biochemical processes.
Referring to the example provided in fig. 7, in order to obtain a constant of a specific biochemical process, a biochemical network model may be constructed using the specific biochemical process Bt as a node, the biochemical network model further including node B1, node B2, node B3, and node B4; node B1, node B2, node B3 represent biochemical component information involved in a biochemical process Bt, and node B4 represents biochemical component information generated by a specific biochemical process Bt. By obtaining data of node B1, node B2, node B3 and node B4, for example, concentration data of node B1, node B2, node B3 and node B4 in different concentration combinations through high-throughput cell experiments, a biochemical process mathematical model of the biochemical process Bt can be determined through machine learning, and thus constants of the biochemical process can be determined according to the biochemical process mathematical model.
In one embodiment of the present disclosure, the types and number of the biochemical components included in the biochemical component pool are identical to the types and number of the biochemical component setting information entries in the biochemical components database; alternatively, each item of biochemical process information or signal path information in the signal path database is embodied in the digital cell model. This approach brings the digital cell model as close as possible to the biological cell. It will be appreciated, however, that, subject to the limitations of the simulation means and of the current knowledge of biological cells, the digital cell model may have difficulty fully simulating every function of the biological cell at certain stages; for example, multiple biochemical processes that actually occur in sequence may be treated as a single biochemical process, so as to simulate the end result of these processes rather than the details of each individual process. Thus, in other embodiments of the present disclosure, the digital cell model is constructed based on the biochemical components database and the signal path database, but it is not required to embody, completely and without omission, all of the knowledge and information in these databases; instead, it is required to simulate the biological cell with a certain degree of precision and functionality. Of course, as the knowledge and information recorded in the biochemical components database and the signal path database increase, the description of the intracellular signal network becomes increasingly fine, and new versions of the digital cell model obtained from these databases will simulate biological cells with increasingly high accuracy.
In other words, in the embodiments of the present disclosure, the biochemical components database and the signal path database may be further improved continuously, and a more improved digital cell model may be constructed using the more improved biochemical components database and the signal path database according to the need.
In one embodiment of the present disclosure, the obtained target digital cell model may be saved as a biochemical component pool and individual signal path units to facilitate direct application of the target digital cell model. In another embodiment, the target digital cell model may be saved as a model database, which may include a component sub-database, a process sub-database, and a paradigm sub-database. The component sub-database is used for storing the information of each biochemical component in the biochemical component pool of the target digital cell model. The process sub-database is used for storing the parameters of the various biochemical process equation sets in the signal path units; that is, the biochemical process equation sets are stored in the form of biochemical process information. The paradigm sub-database is used for storing the biochemical process paradigms involved in the biochemical process equation sets; for example, it may reference or replicate the biochemical process paradigm pool of the signal path database. Thus, when the target digital cell model is modified to simulate a new cell, the biochemical component information in the component sub-database and the biochemical process information in the process sub-database can be modified directly, and a new digital cell model can then be constructed from the modified component sub-database, the modified process sub-database, and the paradigm sub-database to simulate the new cell. Specifically, the modified component sub-database can be used to construct the biochemical component pool of the new digital cell model, and the modified process sub-database can be used to construct the signal path units of the new digital cell model by referencing the paradigm sub-database.
In some embodiments of the present disclosure, before the initial digital cell model is acquired for the first time, parameter tuning may be performed on the individual signal path units to obtain individual target signal path units. Then, target signal path units that are correlated are tuned as a whole; for example, a plurality of signal path units on the same signal path are tuned together as a correlated signal path model group to obtain a target signal path model group. Then, an initial digital cell model is obtained from the target signal path model groups.
In one embodiment of the disclosure, the method for constructing a digital cell model further includes obtaining a plurality of target signal path units; when an initial digital cell model is acquired, determining signal path elements in the initial digital cell model according to the target signal path elements. In other words, the parameter adjustment can be performed on each signal path unit first, and then the parameter adjustment can be performed on the whole level of the digital cell model according to the parameter adjustment result of the signal path unit.
The acquiring any one of the target signal path units includes:
constructing an initial signal path model, wherein the initial signal path model comprises an initial signal path unit and biochemical component information related to the initial signal path unit, and the biochemical component information comprises the concentration of biochemical components;
Determining screening conditions of the signal pathway model, wherein the screening conditions comprise concentration variation trends of one or more biochemical components serving as markers;
performing simulation on the initial signal path model, and screening according to screening conditions in a simulation process; for example, it is determined whether the concentration of the biochemical component as the marker in the simulation process is increased or decreased, so as to determine whether the change rule of the marker simulated by the current signal path model significantly violates the biological intracellular rule.
If the simulation result of the initial signal path model does not meet the screening conditions, updating the initial signal path model until the screening conditions are met;
and if the simulation result of the initial signal path model meets the screening condition, determining the initial signal path model as a target signal path model.
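The acquisition loop listed above can be sketched as follows. The toy signal path unit (a single conversion of A into B), the way the model is updated (trying the next candidate rate constant), and all names are hypothetical and chosen only to make the loop concrete.

```python
def simulate(initial, rate_constant, steps, dt=0.1):
    """Toy signal path unit: A is converted to B at rate rate_constant * [A]."""
    a, b = initial["A"], initial["B"]
    history = [{"A": a, "B": b}]
    for _ in range(steps):
        flux = rate_constant * a * dt
        a, b = a - flux, b + flux
        history.append({"A": a, "B": b})
    return history


def trend(history, marker):
    """Classify the overall change of a marker as increase, decrease or flat."""
    first, last = history[0][marker], history[-1][marker]
    if last > first:
        return "increase"
    if last < first:
        return "decrease"
    return "flat"


def acquire_target_model(candidate_constants, initial, screening, steps=50):
    """Return the first candidate whose simulation meets the screening conditions.

    `screening` maps each marker name to its expected trend. Updating the
    model is modelled here as moving on to the next candidate constant.
    """
    for k in candidate_constants:
        history = simulate(initial, k, steps)
        if all(trend(history, m) == expected for m, expected in screening.items()):
            return {"rate_constant": k, "initial": dict(initial)}
    return None  # no candidate passed the screening
```

For instance, with the screening condition that marker B increases and marker A decreases, a candidate rate constant of 0.0 is rejected and a candidate of 0.3 is accepted.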
In one embodiment of the disclosure, the method of constructing a digital cell model further comprises obtaining a plurality of target signal path model groups; each of the target signal path model groups includes a plurality of signal path units; when an initial digital cell model is acquired, the signal path units in the initial digital cell model are determined according to the signal path units in the target signal path model groups. In other words, a plurality of correlated (e.g., signaling-connected) signal path units may first be tuned as a whole, and tuning at the level of the whole digital cell model is performed afterwards. In this way, the overall efficiency of parameter tuning of the digital cell model can be improved, and the digital cell model can simulate biological cells more effectively.
Optionally, determining any one of the set of target signal path models includes:
acquiring an initial signal path model group, wherein the initial signal path model group comprises a plurality of target signal path units and biochemical component information related to each target signal path unit, and the biochemical component information comprises the concentration of biochemical components;
determining screening conditions for a set of correlated signal pathway models, the screening conditions comprising a trend in concentration of one or more biochemical components as markers;
performing simulation on the initial signal path model group, and screening according to screening conditions in a simulation process; for example, it is judged whether the concentration of the biochemical components as markers in the simulation process is increased or decreased;
if the simulation result of the initial signal path model set does not meet the screening conditions, updating the initial signal path model set until the screening conditions are met; updating the initial set of signal path models includes updating at least one signal path element in the set of signal path models having a correlation;
and if the simulation result of the initial signal path model group meets the screening condition, determining the initial signal path model group as a target signal path model group.
Validity determination
In some embodiments of the present disclosure, in step S110, after the initial digital cell model is acquired, the validity of the initial digital cell model may also be determined; when the initial digital cell model meets the legal requirement, step S120 is further performed to iteratively simulate the digital cell model.
The validity determination of the initial digital cell model may adopt a whitelist strategy or a blacklist strategy, which is not particularly limited in the present disclosure. Under the whitelist strategy, at least one legal rule is preset; the initial digital cell model is judged to meet the legal requirement if it satisfies any one of the legal rules, and otherwise is judged not to meet the legal requirement. Under the blacklist strategy, at least one illegal rule is preset; the initial digital cell model is judged not to meet the legal requirement if it satisfies any one of the illegal rules, and otherwise is judged to meet the legal requirement.
The method for constructing a digital cell model provided by the present disclosure may further include, between step S110 and step S120, judging, after the initial digital cell model is acquired, whether the initial digital cell model satisfies any one of the illegal rules in an illegal rule library. When the initial digital cell model satisfies any one of the illegal rules, it is judged that the initial digital cell model does not meet the legal requirement, and the process returns to step S110 to acquire an initial digital cell model again; when the initial digital cell model does not satisfy any of the illegal rules, it is judged that the initial digital cell model meets the legal requirement, and the process proceeds to step S120.
As one example, the illegitimate rules may include, but are not limited to, the following rules: at least one biochemical component information in the biochemical component pool is not called by any biochemical process equation set; the biochemical component information called by at least one biochemical process equation set is not contained in the biochemical component pool; the concentration of at least one biochemical component in the biochemical component tank does not meet the requirement of the biochemical process equation set for the concentration of the biochemical component.
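A blacklist check covering the three example rules can be sketched as follows. The data layout (a pool mapping component names to concentrations, and equation sets described by a `uses` set and an optional `requires` mapping of name to allowed range) and the rule labels are assumptions made for illustration only.

```python
def violated_illegal_rules(pool, equation_systems):
    """Return the labels of the illegal rules that the model satisfies.

    pool: dict mapping component name to concentration.
    equation_systems: list of dicts with key "uses" (set of component names)
    and optional key "requires" (name -> (low, high) allowed concentration).
    """
    used = set()
    for system in equation_systems:
        used |= set(system["uses"])

    violations = []
    # Rule 1: a component in the pool is not called by any equation set.
    if any(name not in used for name in pool):
        violations.append("unused_component")
    # Rule 2: an equation set calls a component that is not in the pool.
    if any(name not in pool for name in used):
        violations.append("missing_component")
    # Rule 3: a concentration does not meet an equation set's requirement.
    if any(
        name in pool and not low <= pool[name] <= high
        for system in equation_systems
        for name, (low, high) in system.get("requires", {}).items()
    ):
        violations.append("concentration_requirement")
    return violations


def is_legal(pool, equation_systems):
    """Blacklist strategy: legal only if no illegal rule is satisfied."""
    return not violated_illegal_rules(pool, equation_systems)
```

A model for which `is_legal` returns False would be discarded and step S110 repeated.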
In one embodiment of the present disclosure, the method for constructing a digital cell model provided by the present disclosure may further include constructing an illegal rule base, where one or more illegal rules are recorded in the illegal rule base.
In step S130, the iterative simulation process of the digital cell model may be evaluated to determine the state of the digital cell model in the iterative simulation process, and whether the end condition is reached is determined according to the evaluation result. In some embodiments of the present disclosure, the state of the digital cell model may be evaluated according to a process evaluation model library, to determine whether the digital cell model reaches a steady state or reaches a failure state, and further to determine whether an iterative simulation process of the digital cell model reaches an end condition.
In one example, the method of constructing a digital cell model of the present disclosure may further include obtaining a process assessment model library, e.g., constructing a process assessment model library.
In the present disclosure, in order to evaluate the process or state of a digital cell model, the concentration of biochemical components as markers needs to be monitored or data processed. It will be appreciated that the markers may be different in different processes for different purposes. The biochemical components used as markers in each process can be obtained from prior knowledge, for example by searching biological or medical literature, to determine which biochemical components can be used as markers reflecting apoptosis and which biochemical indicators can be used as markers reflecting cell division.
In one embodiment of the present disclosure, the process evaluation model library includes a first process evaluation model that includes legal concentration ranges of a plurality of biochemical components.
In step S120, in the process of iterative simulation of the digital cell model, the biochemical components pool is evaluated using the first process evaluation model. When the evaluation finds that the concentration of one or a plurality of biochemical components in the biochemical component pool as markers exceeds the legal concentration range of the biochemical components, the digital cell model is judged to be in a failure state. Illustratively, the first process assessment model defines a concentration of a particular biochemical constituent that is not less than 0; if the concentration of the specific biochemical components in the updated biochemical component pool is negative, the digital cell model is judged to be in a failure state.
In one example, the first process evaluation model may be used to evaluate the biochemical component pool after each update. When the concentration of a biochemical component in the updated pool falls outside its legal concentration range, the digital cell model is judged to be in a failure state and the simulation is terminated, so that the initial digital cell model can be updated in time. This reduces the amount of computation needed to obtain the digital cell model and improves the efficiency of obtaining it.
Of course, in other examples of the present disclosure, sampling evaluations may also be performed for each updated biochemical component reservoir, e.g., every 3-10 updates to a biochemical component reservoir.
It will be appreciated that, in the first process evaluation model, the legal concentration range of a biochemical component may differ from the concentration of that component in the initial biochemical component pool and from the concentration range of that component defined in the biochemical components database. The concentrations defined in the initial biochemical component pool and in the biochemical components database are initial conditions of the initial digital cell model; they do not constrain how the concentrations change during the iterative simulation of the digital cell model. The legal concentration range in the first process evaluation model, by contrast, constrains the concentration changes of the biochemical components during the iterative simulation. When the concentration of a biochemical component clearly conflicts with biological knowledge (for example, a negative value or an extremely high concentration occurs), the current iterative simulation of the digital cell model is judged to be clearly inconsistent with biological rules, the current iterative simulation is terminated, and the current initial digital cell model is abandoned.
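A check of this kind can be sketched as follows, including the optional sampled evaluation mentioned earlier. The function names and the `every` parameter are hypothetical; the legal ranges are supplied as (low, high) pairs.

```python
def out_of_range(pool, legal_ranges):
    """Return the names of markers whose concentration is outside its legal range."""
    return [name for name, (low, high) in legal_ranges.items()
            if not low <= pool[name] <= high]


def first_model_failed(pool, legal_ranges, update_index, every=1):
    """Evaluate the pool on every `every`-th update; True means failure state.

    With every=1 the pool is evaluated after each update; a larger value
    gives the sampled evaluation (for example, every 3 to 10 updates).
    """
    if update_index % every != 0:
        return False  # this update is not sampled
    return bool(out_of_range(pool, legal_ranges))
```

For example, a legal range of `(0.0, float("inf"))` expresses the rule that a concentration must not be negative.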
In one embodiment of the present disclosure, the process evaluation model library includes a second process evaluation model including an upper limit on the number of iterations of the digital cell model iterative simulation.
In step S120, the number of iterative simulations of the digital cell model may be evaluated according to the second process evaluation model; and when the iteration simulation times of the digital cell model reach the upper limit of the iteration times, judging that the digital cell model is in a failure state. Illustratively, the second process evaluation model defines an upper limit of 30000 iterations of the iterative simulation of the digital cell model; when the number of iterative simulation times of the digital cell model reaches 30000, judging that the digital cell model is in a failure state. Thus, if the iteration number of the initial digital cell model reaches the upper limit of the iteration number, whether the initial digital cell model is in a steady state or a failure state still cannot be judged, the iteration simulation is terminated in time to verify the new initial digital cell model, and the efficiency of acquiring the target digital cell model is improved.
In one example, the current number of iterations may be evaluated using a second process evaluation model each time the digital cell model completes a simulation, i.e., each time the solution of the last signal path element is completed. Of course, in other examples of the present disclosure, the evaluation may be performed before each start of the solution of the first signal path element, or at other times.
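The iteration cap can be sketched as a simple driver loop. The callables `step_fn` and `is_steady` are hypothetical placeholders for one simulation pass over all signal path units and for the steady-state test, and checking for a steady state before checking the cap is an assumption about ordering that the disclosure leaves open.

```python
def run_simulation(step_fn, state, is_steady, max_iterations=30000):
    """Iterate until a steady state is found or the iteration cap is reached.

    Returns a tuple (status, iterations_used, final_state), where status is
    "steady" or "failed".
    """
    for iteration in range(1, max_iterations + 1):
        state = step_fn(state)  # one full simulation pass
        if is_steady(state):
            return "steady", iteration, state
    # The cap was reached without a verdict: treat as a failure state.
    return "failed", max_iterations, state
```

On a "failed" result the caller would discard the current initial digital cell model and try a new one.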
In one embodiment of the disclosure, the process evaluation model library includes a third process evaluation model, which includes the concentration variation trend, during the iterative simulation process, of at least one biochemical component serving as a marker, such as one or more of a gradual increase in concentration, a gradual decrease in concentration, a fluctuating concentration, a concentration that rises and then stabilizes at a plateau, and the like.
Evaluating the state of the digital cell model according to the process evaluation model library comprises:
in the process of iterative simulation of the digital cell model, evaluating historical data of a biochemical component pool according to the third process evaluation model; if the historical data of the biochemical component pool does not meet the third process evaluation model, the digital cell model reaches a failure state.
Further, the third process evaluation model may also detect whether an abrupt change occurs in the concentration of the biochemical components serving as markers. Specifically, it detects whether one or more selected biochemical components in the historical data of the biochemical component pool show a concentration jump exceeding a preset abrupt-change threshold, and treats such a jump as a sufficient condition for failing the screening (i.e., reaching a failure state). Thus, an initial digital cell model that passes the screening exhibits continuity throughout the iterative simulation process.
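As a hedged sketch of the third process evaluation model, the following checks both the expected trend of each marker and the absence of abrupt concentration changes. The data layout (a dictionary mapping each marker to its concentration per iteration), the trend names, the tolerance `tol` and all function names are assumptions made for illustration; the disclosure does not prescribe them.

```python
def has_abrupt_change(series, threshold):
    """True if any step-to-step concentration change exceeds the threshold."""
    return any(abs(b - a) > threshold for a, b in zip(series, series[1:]))


def matches_trend(series, trend, tol=1e-9):
    """Check a monotonic trend over the recorded history of one marker."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    if trend == "increasing":
        return all(d >= -tol for d in diffs) and series[-1] > series[0]
    if trend == "decreasing":
        return all(d <= tol for d in diffs) and series[-1] < series[0]
    raise ValueError(f"unsupported trend: {trend}")


def third_model_failed(pool_history, expected_trends, jump_threshold):
    """Return True (failure state) if any marker jumps or violates its trend.

    pool_history:    {marker: [concentration after each iteration]}
    expected_trends: {marker: "increasing" | "decreasing"}
    """
    for marker, trend in expected_trends.items():
        series = pool_history[marker]
        if has_abrupt_change(series, jump_threshold):
            return True
        if not matches_trend(series, trend):
            return True
    return False
```

Only two monotonic trends are covered here; fluctuating or plateauing trends would need their own predicates.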
In one embodiment of the disclosure, the process evaluation model library includes a fourth process evaluation model including a trend of change in concentration relationships between a plurality of biochemical components as markers in an iterative simulation process; for example, the concentration of some markers increases gradually and the concentration of some markers decreases gradually.
Evaluating the state of the digital cell model according to the process evaluation model library comprises:
in the process of iterative simulation of the digital cell model, historical data of a biochemical component pool is evaluated according to the fourth process evaluation model; if the historical data of the biochemical component pool does not meet the fourth process evaluation model, the digital cell model reaches a failure state.
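The fourth process evaluation model concerns the relationship between several markers. A minimal sketch, assuming the relationship from the example above (some markers rise while others fall) and judging each marker by its net change over the recorded history; the function names and the net-change criterion are illustrative assumptions:

```python
def net_change(series):
    """Net concentration change of one marker over the recorded history."""
    return series[-1] - series[0]


def fourth_model_failed(pool_history, rising, falling):
    """Return True (failure state) unless every marker in `rising` shows a net
    increase and every marker in `falling` shows a net decrease."""
    rising_ok = all(net_change(pool_history[m]) > 0 for m in rising)
    falling_ok = all(net_change(pool_history[m]) < 0 for m in falling)
    return not (rising_ok and falling_ok)
```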
In one embodiment of the disclosure, the process evaluation model library comprises a fifth process evaluation model comprising at least one cell type model, each comprising a reference range of cell phenotype indices related to the concentration of the biochemical component.
In some embodiments of the present disclosure, referring to fig. 5, core phenotypes may be extracted from published literature, and the cell phenotype parameters corresponding to these core phenotypes may be obtained. Illustratively, these cell phenotype parameters include, but are not limited to, survival phenotype parameters, proliferation phenotype parameters, apoptosis phenotype parameters, migration phenotype parameters, invasion phenotype parameters, colony formation phenotype parameters, autophagy phenotype parameters, angiogenesis phenotype parameters, epithelial-mesenchymal transition (EMT) phenotype parameters, and the like. From a combination of these cell phenotype parameters, the cell phenotype index used to characterize a cell type can thus be obtained.
For example, to characterize tumor cells of different subtypes, the core phenotypes of a tumor can be extracted from published tumor literature; the range of each cell phenotype parameter corresponding to each tumor cell subtype is then determined.
Evaluating the state of the digital cell model according to the process evaluation model library comprises:
during the iterative simulation of the digital cell model, evaluating, according to the fifth process evaluation model, whether the digital cell model satisfies at least one cell type model after each simulation; if the digital cell model satisfies the same cell type model over multiple consecutive iterations, the digital cell model is in a steady state.
Further, evaluating whether the digital cell model satisfies the at least one cell type model after any one simulation includes: determining a simulated cell phenotype index according to the simulated biochemical component pool; determining a cell type model which meets the cell phenotype index according to the cell phenotype index and a cell phenotype index reference range of each cell type model.
Alternatively, the cell phenotype index may comprise a plurality of cell phenotype parameters, each of which is related to the concentration of a biochemical component serving as a marker (e.g., to an increase or decrease in that concentration). Accordingly, the reference range of any cell phenotype index includes reference ranges for the plurality of cell phenotype parameters. When every cell phenotype parameter determined from the historical data or the current data of the biochemical component pool (the parameters together forming the cell phenotype index) falls within the reference range of the corresponding parameter for a particular cell type model, the digital cell model exhibits that cell type.
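The matching of a simulated cell phenotype index against cell type models can be illustrated as below. The disclosure does not state how a phenotype parameter is computed from marker concentrations; the weighted sum used here, the marker names and the numeric ranges are purely hypothetical.

```python
def phenotype_index(pool, marker_weights):
    """Compute each phenotype parameter as a weighted sum of marker
    concentrations (an assumed form, chosen only for illustration)."""
    return {
        param: sum(weight * pool[marker] for marker, weight in weights.items())
        for param, weights in marker_weights.items()
    }


def matching_cell_type(index, cell_type_models):
    """Return the first cell type whose reference ranges contain every
    phenotype parameter of the index, or None if no model is satisfied."""
    for name, ranges in cell_type_models.items():
        if all(lo <= index[param] <= hi for param, (lo, hi) in ranges.items()):
            return name
    return None


# Hypothetical markers, weights and reference ranges.
weights = {
    "proliferation": {"marker_a": 1.0, "marker_b": -0.5},
    "apoptosis": {"marker_b": 1.0},
}
models = {
    "normal": {"proliferation": (1.0, 2.0), "apoptosis": (0.5, 1.5)},
    "tumor": {"proliferation": (3.0, 10.0), "apoptosis": (0.0, 0.4)},
}
index = phenotype_index({"marker_a": 2.0, "marker_b": 1.0}, weights)
cell_type = matching_cell_type(index, models)
```

With these toy values the pool is classified as the hypothetical "normal" type, since both parameters fall inside that model's ranges.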
Optionally, the cell phenotype parameters include a plurality of a survival phenotype parameter, a proliferation phenotype parameter, an apoptosis phenotype parameter, a migration phenotype parameter, an invasion phenotype parameter, a colony formation phenotype parameter, an autophagy phenotype parameter, an angiogenesis phenotype parameter, and an epithelial-mesenchymal transition (EMT) phenotype parameter. The markers employed for each cell phenotype parameter may be obtained from medical or biological literature. It is understood that, as biological knowledge accumulates and the understanding of cells deepens, new cell phenotype parameters may also be constructed and applied to the cell phenotype index of embodiments of the present disclosure.
In one embodiment of the present disclosure, the fifth process evaluation model may include only one cell type model having a reference range of cell phenotype indices for the cell type of interest. The digital cell model may exhibit a target cell type when a cell phenotype index determined from a pool of biochemical components of the digital cell model meets the cell phenotype index reference range. A digital cell model reaches steady state when it exhibits a target cell type in multiple iterations, e.g., the cellular phenotype index is substantially the same in successive iterations and meets the cellular phenotype index reference range.
In one example, the fifth process evaluation model includes a cell type model for modeling normal cells. Thus, if the cell phenotype index of the biochemical component pool when the digital cell model reaches a steady state satisfies the above cell phenotype index reference range, the digital cell model exhibits a normal cell at the steady state, in particular a wild-type normal surviving cell.
In another embodiment of the present disclosure, the fifth process evaluation model comprises a plurality of different cell type models having cell phenotype index reference ranges for different cell phenotypes. The digital cell model reaches a steady state when it satisfies one and the same cell type model, whichever that is, over multiple consecutive iterations. In other words, the evaluation process is non-targeted. In this manner, an initial digital cell model that passes screening by the cell type models can exhibit or reach one of the cell types during the iterative simulation process. In a subsequent application, the application scenario of the obtained digital cell model may be determined according to the specific cell type model that the initial digital cell model satisfied.
In yet another embodiment of the present disclosure, the fifth process evaluation model may include a plurality of different cell type models having cell phenotype index reference ranges for different cell phenotypes, with one particular cell type model marked as the target cell type model. When the iterative simulation of the digital cell model is evaluated with the fifth process evaluation model, every cell type model is used to evaluate the biochemical component pool after each simulation. When the digital cell model satisfies the target cell type model over multiple consecutive iterations, the digital cell model is in a steady state. When the digital cell model satisfies one and the same cell type model other than the target cell type model over multiple consecutive iterations, a cell type label is added to the current initial digital cell model, and the labeled model can be stored as a spare digital cell model. In a subsequent application, the application scenario of the spare digital cell model may be determined according to the cell type that the spare digital cell model can reach. In this way, the method for constructing the digital cell model can acquire not only the target digital cell model but also spare digital cell models applicable to other application scenarios, which speeds up the construction of digital cell models for those scenarios.
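The targeted steady-state rule above can be sketched as a streak counter over the cell type observed after each simulation. The window length of three consecutive iterations, the return values and the function name are assumptions for illustration; the disclosure only requires "multiple" consecutive iterations, and this sketch stops at the first completed streak rather than continuing the simulation.

```python
def classify_run(cell_types, target, window=3):
    """Scan the cell type matched after each simulation (None = no match).

    Returns ("steady", type) if the target type holds for `window`
    consecutive iterations, ("spare", type) if another type does, and
    ("undetermined", None) if no streak of that length occurs.
    """
    streak_type, streak = None, 0
    for cell_type in cell_types:
        if cell_type is not None and cell_type == streak_type:
            streak += 1
        else:
            streak_type = cell_type
            streak = 1 if cell_type is not None else 0
        if streak >= window:
            status = "steady" if streak_type == target else "spare"
            return status, streak_type
    return "undetermined", None
```

A "spare" result corresponds to adding a cell type label to the current initial digital cell model and storing it for other application scenarios.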
In some embodiments of the present disclosure, after the digital cell model is evaluated as a steady state during iterative simulation, determining a target digital cell model from the current initial digital cell model comprises:
and after the digital cell model reaches a steady state in the iterative simulation process, determining the current initial digital cell model as a target digital cell model.
In other embodiments of the present disclosure, after the digital cell model reaches a steady state during iterative simulation, determining a target digital cell model from the current initial digital cell model comprises:
after the digital cell model reaches a steady state in the iterative simulation process, determining the current initial digital cell model as a candidate digital cell model;
performing verification and evaluation on the candidate digital cell model according to the verification model library; updating the initial digital cell model when the candidate digital cell model does not satisfy the verification model library. Therefore, by verifying the candidate digital cell model, it can be judged whether the candidate digital cell model can realize a specific function, or whether its simulation precision and accuracy meet the requirements, so that the resulting digital cell model has a better application effect.
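The overall construct, simulate, verify and update cycle can be summarized by the following sketch. The callables `propose_model`, `simulate` and `verify`, as well as the cap `max_candidates`, are hypothetical placeholders; the disclosure does not limit the number of candidate models.

```python
def build_target_model(propose_model, simulate, verify, max_candidates=100):
    """Search for a target digital cell model (illustrative only).

    propose_model() -> a new or updated initial digital cell model
    simulate(model) -> "steady" or "failed" after iterative simulation
    verify(model)   -> True if the candidate satisfies the verification library
    """
    for _ in range(max_candidates):
        model = propose_model()
        if simulate(model) != "steady":
            continue  # failure state: update the initial model and retry
        if verify(model):
            return model  # candidate passes verification: target model
        # candidate fails verification: update the initial model and retry
    return None
```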
Referring to fig. 3, in some embodiments of the present disclosure, the validation model library includes a mutant intervention model including at least one mutant intervention sub-model; any mutation intervention sub-model comprises mutation information, exogenous information and intervention result information; the mutation information comprises at least one component mutation information and at least one biochemical process change information; the component mutation information is used for describing the change of a biochemical component pool caused by mutation in cells; biochemical process change information is used to describe changes in biochemical processes caused by mutations within the cell; the exogenous information comprises at least one exogenous component information and at least one exogenous component related process equation set; the exogenous component information includes a concentration of an exogenous component; the exogenous component related process equation set is used for simulating the concentration change of the biochemical components after a time step caused by adding exogenous components; the intervention result information comprises a classification label and a reference range of at least one intervention evaluation index, wherein the intervention evaluation index is used as a marker and is related to the concentration change of a biochemical component;
the step of verifying and evaluating the candidate digital cell model according to the verification model library comprises the step of verifying and evaluating the candidate digital cell model according to the mutation intervention model; validating the candidate digital cell model according to the mutant intervention model comprises: validating the candidate digital cell model according to at least one of the mutant intervention sub-models; validating the candidate digital cell model according to any one of the mutant intervention sub-models comprises:
mapping the mutation information to the candidate digital cell model to construct a digital mutant cell model;
performing iterative simulation on the digital mutant cell model, obtaining the biochemical component pool of the digital mutant cell model that has reached a steady state as mutation steady-state data, and taking the digital mutant cell model that has reached the steady state as a steady-state digital mutant cell model;
constructing a mutant intervention digital cell model according to the steady-state digital mutant cell model and the exogenous information;
performing iterative simulation on the mutant intervention digital cell model, and obtaining the biochemical component pool of the mutant intervention digital cell model that has reached a steady state as mutation intervention data;
acquiring the biochemical component pool of the candidate digital cell model after iterative simulation has reached a steady state as basal steady-state data;
determining an intervention evaluation index according to the basal steady-state data, the mutation steady-state data and the mutation intervention data;
classifying according to the intervention evaluation index and the reference range of the intervention evaluation index; when the classification result is consistent with the classification label, the candidate digital cell model passes verification by the mutant intervention sub-model.
Thus, the mutant intervention model can evaluate the ability of the candidate digital cell model to serve as the basis for a digital mutant cell model, as well as its ability to return to a normal cell type in response to external intervention. These abilities mirror the onset and treatment of some diseases (e.g., tumors): normal cells mutate into tumor cells, which are mutant cells, and tumor treatment is achieved by drug intervention on the tumor cells. If the candidate digital cell model passes evaluation by the mutant intervention model, it can be used to construct an individualized digital mutant cell model from a patient's own tumor cells and to evaluate anti-tumor drugs on an individualized basis, thereby improving the treatment effect and accelerating the treatment process. Of course, when the evaluation uses the mutation information, exogenous information and intervention result information of another disease, a candidate digital cell model that passes the evaluation also has good application prospects in the treatment of that disease.
In embodiments of the present disclosure, a drug refers to a chemical component or combination of chemical components that has a therapeutic or intervention effect on a disease. Drugs may be classified differently for different diseases. In the anti-tumor field, for example, the drug may include, but is not limited to, targeted anti-tumor drugs, cytotoxic drugs, cellular immunomodulators, cellular epigenetic modulators, and the like. It is understood that, in embodiments of the present disclosure, a drug need not already be approved for administration or have proven efficacy; chemical components or compositions that have the potential to become drugs may also be used as drugs in embodiments of the present disclosure. For example, if a drug for a particular disease is developed based on the digital cell model provided by embodiments of the present disclosure, the drug employed may be entirely new rather than clinically validated or approved. A potentially useful but unverified compound may be employed and verified with the digital cell model provided by embodiments of the present disclosure to determine its possible therapeutic effect on the particular disease, thereby promoting or accelerating the development or screening of therapeutic drugs for the disease.
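The verification steps of one mutant intervention sub-model can be outlined as below. Everything concrete here is an assumption for illustration: the sub-model is represented as a dictionary, the four callables stand in for operations the disclosure describes only functionally, and the labels "effective" and "ineffective" are example classification labels.

```python
def verify_with_submodel(candidate, submodel, *, run_to_steady,
                         apply_mutation, add_exogenous, evaluate_index):
    """Return True if the candidate passes one mutant intervention sub-model.

    run_to_steady(model) -> (steady_model, pool) after iterative simulation
    apply_mutation(model, mutation_info) -> digital mutant cell model
    add_exogenous(model, exogenous_info) -> mutant intervention model
    evaluate_index(base, mutant, treated) -> intervention evaluation index
    """
    _, base_pool = run_to_steady(candidate)                # basal steady-state data
    mutant = apply_mutation(candidate, submodel["mutation"])
    steady_mutant, mutant_pool = run_to_steady(mutant)     # mutation steady-state data
    treated = add_exogenous(steady_mutant, submodel["exogenous"])
    _, treated_pool = run_to_steady(treated)               # mutation intervention data
    index = evaluate_index(base_pool, mutant_pool, treated_pool)
    low, high = submodel["reference_range"]
    predicted = "effective" if low <= index <= high else "ineffective"
    return predicted == submodel["label"]
```

Passing the operations in as callables keeps the sketch independent of how the digital cell model itself is represented.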
Alternatively, the mutation information is constructed by the actual mutant cells, for example by tumor cells.
In one embodiment of the present disclosure, the mutation information may be constructed from multi-omics data of the mutant cells. In this example, the mutation information may be a multi-omics database of the mutant cells.
In one embodiment of the disclosure, the method for constructing a digital cell model further includes:
establishing a mutation intervention sub-model according to an actual sample, wherein the actual sample comprises data obtained by in-vitro drug susceptibility experiments by using a specific type of mutant cells or data obtained according to clinical processes;
wherein the biochemical component information within the specific type of mutant cell, or the biochemical component information of the mutant cells involved in the clinical process, is used to determine the component mutation information in the mutation information; the drug information used in the in vitro drug susceptibility experiment, or the drug information involved in the clinical process, is used to determine the exogenous component information in the exogenous information; and the result of the in vitro drug susceptibility experiment or of the clinical process is used to determine the classification label (e.g., an effective label or an ineffective label) of the intervention result information.
In one example, the particular type of mutant cell is a tumor cell; the drug information is targeted anti-tumor drug information.
Optionally, in the iterative simulation process of the digital mutant cell model, when the digital mutant cell model presents the same cell type in the continuous multiple simulations, the digital mutant cell model reaches a steady state, the current digital mutant cell model is used as a steady state digital mutant cell model, and the biochemical component pool of the current digital mutant cell model is used as mutation steady state data.
Optionally, in the iterative simulation process of the mutation intervention digital cell model, when the mutation intervention digital cell model presents the same cell type in continuous multiple simulations, the mutation intervention digital cell model reaches a steady state, and a biochemical component pool of the current mutation intervention digital cell model is used as mutation intervention data.
Optionally, constructing a mutant intervention digital cell model from the steady state digital mutant cell model and the exogenous information comprises:
adding exogenous component information in the exogenous information to a biochemical component pool of a steady-state digital mutant cell model to form a biochemical component pool of a mutant intervention digital cell model; and adding the exogenous component related process equation set in the exogenous information to a signal path unit of the steady-state digital mutant cell model to form a signal path unit of the mutant intervention digital cell model.
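Constructing the mutant intervention digital cell model amounts to extending the pool and the signal path units of the steady-state digital mutant cell model. The sketch below assumes a very simple representation (a concentration dictionary and, per signal path unit, a list of process equations given as callables); the class name, field names and the layout of `exogenous` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalCellModel:
    pool: dict                                          # component -> concentration
    pathway_units: dict = field(default_factory=dict)   # unit name -> [equations]


def build_intervention_model(steady_mutant, exogenous):
    """Copy the steady-state mutant model, then add the exogenous components
    to the pool and the related process equations to the signal path units."""
    model = DigitalCellModel(
        pool=dict(steady_mutant.pool),
        pathway_units={k: list(v) for k, v in steady_mutant.pathway_units.items()},
    )
    model.pool.update(exogenous["components"])           # e.g. drug concentration
    for unit, equations in exogenous["equations"].items():
        model.pathway_units.setdefault(unit, []).extend(equations)
    return model
```

Copying before modification leaves the steady-state digital mutant cell model available for further interventions with other exogenous information.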
In one embodiment of the present disclosure, the exogenous information may be a piece of drug sample information, which may be constructed from a digital drug library and actual sample information. In one example, the digital drug library includes data sets of different drugs; the data set of each drug is constructed based on at least one of drug action targets, approved indications, safe dose ranges, pharmacokinetic parameters, activity parameters, and side effects. A piece of drug sample information may be formed, as the exogenous information of a mutant intervention sub-model, from the drug used in the actual sample and its concentration or dose, together with the data set of that drug in the digital drug library.
In one embodiment of the present disclosure, the intervention evaluation index comprises a phenotype reversion score;
determining an intervention assessment index from the basal steady-state data, the mutant steady-state data, and the mutant intervention data comprises:
determining a normal cell phenotype index based on the concentration of the biochemical component as a marker in the basal steady state data;
determining a mutant cell phenotype index based on the concentration of the biochemical component as a marker in the mutant steady state data;
determining a post-intervention cell phenotype index based on the concentration of the biochemical component as a marker in the mutant intervention data;
determining a phenotype abnormality index based on the difference between the mutant cell phenotype index and the normal cell phenotype index;
determining a phenotype reversion index based on the difference between the post-intervention cell phenotype index and the mutant cell phenotype index;
determining a phenotype reversion score based on the phenotype reversion index and the phenotype abnormality index;
when the phenotype reversion score does not fall within the reference range of the phenotype reversion score, the candidate digital cell model does not pass verification by the mutant intervention sub-model. Thus, candidate digital cell models that pass evaluation by the mutant intervention sub-model have a higher degree of fit when modeling mutant cells and the responses of mutant cells to drugs.
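The disclosure does not give a formula for the phenotype reversion score. One plausible form, shown here strictly as an assumption, treats each cell phenotype index as a single number and takes the score as the fraction of the mutation-induced abnormality that the intervention undoes, so that 1.0 means a full return to the normal phenotype and 0.0 means no change.

```python
def phenotype_reversion_score(normal, mutant, post_intervention):
    """Assumed scoring rule (not specified in the disclosure).

    abnormality = mutant - normal             (phenotype abnormality index)
    reversion   = post_intervention - mutant  (phenotype reversion index)
    score       = -reversion / abnormality
    """
    abnormality = mutant - normal
    reversion = post_intervention - mutant
    if abnormality == 0:
        raise ValueError("mutant and normal phenotype indices are identical")
    return -reversion / abnormality


def passes_reference_range(score, reference_range):
    """True if the score lies inside the closed reference range."""
    low, high = reference_range
    return low <= score <= high
```

For example, with a normal index of 1.0, a mutant index of 3.0 and a post-intervention index of 1.5, the abnormality is 2.0, the reversion is -1.5, and the score is 0.75.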
Referring to fig. 8, the embodiment of the present disclosure further provides a device UA for constructing a digital cell model, including:
a building module UA1 configured to build an initial digital cell model based on biochemical information; the digital cell model comprises a biochemical component pool and a plurality of signal path units; the biochemical component pool comprises a plurality of biochemical component information, and the biochemical component information comprises concentration and/or position information of biochemical components; the signal path unit is used for simulating a signal path of the biological cell; the signal path unit comprises at least one biochemical reaction module, and the biochemical reaction module is used for simulating a biochemical process occurring in the signal path unit by using a biochemical process equation set;
a simulation module UA2 configured to perform iterative simulation on the initial digital cell model to simulate a biochemical process occurring in the biological cells;
a judging module UA3 configured to judge whether the digital cell model reaches a steady state or a failure state in the iterative simulation process; updating the initial digital cell model when the digital cell model reaches a failure state; and after the digital cell model reaches a steady state, determining a target digital cell model according to the current initial digital cell model.
In one embodiment of the disclosure, the judging module UA3 is further configured to update the initial digital cell model whenever the digital cell model reaches a failure state during the iterative simulation, until the digital cell model reaches a steady state.
Further, the judging module UA3 is further configured to evaluate the state of the digital cell model according to the process evaluation model library, and judge whether the digital cell model reaches the steady state or the failure state.
In one embodiment of the disclosure, the digital cell model constructing apparatus UA further includes a verification module UA4, where the verification module UA4 is configured to determine, after the digital cell model reaches a steady state in the iterative simulation process, a current initial digital cell model as a candidate digital cell model; performing verification and evaluation on the candidate digital cell model according to the verification model library; updating the initial digital cell model when the candidate digital cell model does not satisfy the validation model library.
Referring to fig. 9, the embodiment of the present disclosure further provides a digital cell system, where the digital cell system includes the device UA for constructing a digital cell model, and further includes:
a digital cell engine M1 for running a digital cell model;
a data analysis engine M2 for constructing a multi-omics database of mutant cells from multi-omics data of the mutant cells;
a digital drug library M3 for providing drug information;
a data mapping engine M4 for mapping the multi-omics database and/or the drug information in the digital drug library to the digital cell model;
and a drug effect analysis engine M5 for predicting the drug effect according to the operation result of the digital cell engine.
In one example, the device UA for constructing a digital cell model in the digital cell system may acquire a digital cell model, for example a normal digital cell model of a normal cell, which may be run in the digital cell engine M1. The data analysis engine M2 may receive multi-omics data of mutant cells and construct a multi-omics database of the mutant cells. For example, the data analysis engine M2 may receive multi-omics data of the tumor cells of a tumor patient and construct a personalized multi-omics database of those tumor cells from the data. After the digital cell engine M1 has run the normal digital cell model to a steady state, the data mapping engine M4 can map the multi-omics database constructed by the data analysis engine M2 onto the steady-state normal digital cell model to obtain a digital mutant cell model. After the digital mutant cell model has run in the digital cell engine M1 to a steady state, the data mapping engine M4 can map the drug information in the digital drug library M3 onto the steady-state digital mutant cell model to obtain a drug-intervened digital mutant cell model. After the drug-intervened digital mutant cell model has run in the digital cell engine M1 to a steady state, the drug effect analysis engine M5 judges the therapeutic effect of the drug on the mutant cells according to the operation result of the digital cell engine M1.
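The cooperation of the components M1 to M5 in this example can be sketched as a small composition of callables. As a simplification, a model is reduced here to its biochemical component pool, and every field name, type alias and method name is a hypothetical stand-in for the corresponding engine rather than an interface defined by the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Pool = Dict[str, float]  # simplified model: component -> concentration


@dataclass
class DigitalCellSystem:
    run_to_steady: Callable[[Pool], Pool]               # digital cell engine M1
    build_omics_db: Callable[[dict], Pool]              # data analysis engine M2
    drug_library: Dict[str, Pool]                       # digital drug library M3
    map_onto: Callable[[Pool, Pool], Pool]              # data mapping engine M4
    analyze_effect: Callable[[Pool, Pool, Pool], str]   # drug effect analysis engine M5

    def predict(self, normal_model: Pool, raw_omics: dict, drug_name: str) -> str:
        # M1: run the normal digital cell model to a steady state
        normal = self.run_to_steady(normal_model)
        # M2 + M4: map the patient's multi-omics database onto the normal model
        mutant = self.run_to_steady(
            self.map_onto(normal, self.build_omics_db(raw_omics)))
        # M3 + M4: map the drug information onto the steady-state mutant model
        treated = self.run_to_steady(
            self.map_onto(mutant, self.drug_library[drug_name]))
        # M5: judge the effect from the three steady states
        return self.analyze_effect(normal, mutant, treated)
```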
In one example, the simulation module of the digital cell model constructing apparatus UA may be the same module as the data analysis engine M2 when performing iterative simulation.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method for constructing a digital cell model is also provided.
Those skilled in the art will appreciate that the various aspects of the present disclosure may be implemented as a system, method, or program product. Accordingly, various aspects of the disclosure may be embodied in the following forms, namely: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit," "module," or "system."
An electronic device 1000 according to such an embodiment of the present disclosure is described below with reference to fig. 10. The electronic device 1000 shown in fig. 10 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 10, the electronic device 1000 is embodied in the form of a general purpose computing device. Components of the electronic device 1000 may include, but are not limited to: at least one processing unit 1010, at least one storage unit 1020, a bus 1030 connecting the various system components (including the storage unit 1020 and the processing unit 1010), and a display unit 1040.
The storage unit 1020 stores program code that is executable by the processing unit 1010, such that the processing unit 1010 performs the steps according to various exemplary embodiments of the present disclosure described in the above sections of the present specification.
The storage unit 1020 may include readable media in the form of volatile memory units, such as a random access memory (RAM) 10201 and/or a cache memory unit 10202, and may further include a read-only memory (ROM) 10203.
The storage unit 1020 may also include a program/utility 10204 having a set (at least one) of program modules 10205, such program modules 10205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The bus 1030 may represent one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor bus, or a local bus using any of a variety of bus architectures.
The electronic device 1000 can also communicate with one or more external devices 1100 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 1000, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 1000 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 1050. Also, electronic device 1000 can communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 1060. As shown, the network adapter 1060 communicates with other modules of the electronic device 1000 over the bus 1030. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with the electronic device 1000, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes several instructions to cause a computing device (such as a personal computer, a server, a terminal device, or a network device) to perform the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium having stored thereon a program product capable of implementing the method described above in the present specification is also provided. In some possible implementations, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the disclosure as described in the "exemplary methods" section of this specification, when the program product is run on the terminal device.
A program product for implementing the above-described method according to an embodiment of the present disclosure may employ a portable compact disc read-only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
Furthermore, the above-described figures are only schematic illustrations of processes included in the method according to the exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily appreciated that the processes shown in the above figures do not indicate or limit the temporal order of these processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, for example, among a plurality of modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure set out herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow, in general, the principles of the disclosure and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.

Claims (16)

1. A method of constructing a digital cell model, comprising:
constructing an initial digital cell model based on the biochemical information; the digital cell model comprises a biochemical component pool and a plurality of signal path units; the biochemical component pool comprises a plurality of biochemical component information, and the biochemical component information comprises concentration and/or position information of biochemical components; the signal path unit is used for simulating a signal path of the biological cell; the signal path unit comprises at least one biochemical reaction module, and the biochemical reaction module is used for simulating a biochemical process occurring in the signal path unit by using a biochemical process equation set;
performing iterative simulation on the initial digital cell model to simulate a biochemical process occurring in the biological cells;
in the iterative simulation process, judging whether the digital cell model reaches a steady state or a failure state; updating the initial digital cell model when the digital cell model reaches a failure state; and after the digital cell model reaches a steady state, determining a target digital cell model according to the current initial digital cell model.
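For illustration only, the control flow recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the names `build_target_model`, `step`, `judge`, `update`, and `max_rebuilds` are hypothetical, the biochemical component pool is reduced to a name-to-concentration mapping, and returning the current initial model together with its steady-state pool is one possible reading of "determining a target digital cell model according to the current initial digital cell model".

```python
from enum import Enum
from typing import Callable, Dict, Tuple

Pool = Dict[str, float]  # hypothetical: component name -> concentration


class State(Enum):
    RUNNING = "running"
    STEADY = "steady"
    FAILED = "failed"


def build_target_model(
    initial_pool: Pool,
    step: Callable[[Pool], Pool],          # one simulation iteration
    judge: Callable[[Pool, int], State],   # steady / failed / still running
    update: Callable[[Pool], Pool],        # revises the initial model on failure
    max_rebuilds: int = 10,                # assumed safety cap, not in the claims
) -> Tuple[Pool, Pool]:
    """Simulate iteratively; update the initial model whenever it fails."""
    model = dict(initial_pool)
    for _ in range(max_rebuilds):
        pool = dict(model)
        iteration = 0
        while True:
            pool = step(pool)
            iteration += 1
            state = judge(pool, iteration)
            if state is State.STEADY:
                # Target model: the current initial model plus its steady pool.
                return model, pool
            if state is State.FAILED:
                model = update(model)
                break  # restart simulation from the updated initial model
    raise RuntimeError("no steady state reached within max_rebuilds updates")
```

The simulation step, the state judgment, and the update rule are passed in as callables so that the loop itself stays independent of any particular biochemical process equation set.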
2. The method for constructing a digital cell model according to claim 1, wherein judging whether the digital cell model reaches a steady state or a failure state comprises:
when the digital cell model is subjected to iterative simulation, evaluating the state of the digital cell model according to a process evaluation model library, and judging whether the digital cell model reaches a steady state or a failure state.
3. The method of constructing a digital cell model according to claim 2, wherein the process evaluation model library comprises a first process evaluation model comprising legal concentration ranges of a plurality of biochemical components;
the evaluating the state of the digital cell model according to the process evaluation model library comprises the following steps:
in the process of iterative simulation of the digital cell model, evaluating the biochemical component pool according to the first process evaluation model; and when the concentration of at least one biochemical component in the biochemical component pool exceeds the legal concentration range of that biochemical component, the digital cell model reaches a failure state.
4. The method for constructing a digital cell model according to claim 2, wherein the process evaluation model library includes a second process evaluation model including an upper limit of the number of iterations of the digital cell model iterative simulation;
evaluating the state of the digital cell model according to the process evaluation model library comprises:
in the process of iterative simulation of the digital cell model, evaluating the number of iterative simulations of the digital cell model according to the second process evaluation model; and when the number of iterative simulations of the digital cell model reaches the upper limit of the number of iterations, the digital cell model reaches a failure state.
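The failure checks of claims 3 and 4 can be illustrated with a short sketch. It assumes, hypothetically, that the pool is a mapping from component name to concentration and that each concentration range is an inclusive `(low, high)` pair; the function names `out_of_range` and `has_failed` and the example component names are illustrative and do not appear in the disclosure.

```python
from typing import Dict, List, Tuple

Pool = Dict[str, float]                   # hypothetical: name -> concentration
Ranges = Dict[str, Tuple[float, float]]   # hypothetical: name -> (low, high)


def out_of_range(pool: Pool, ranges: Ranges) -> List[str]:
    """Names of components whose concentration lies outside its range."""
    return sorted(
        name
        for name, conc in pool.items()
        if name in ranges and not (ranges[name][0] <= conc <= ranges[name][1])
    )


def has_failed(pool: Pool, ranges: Ranges, iteration: int, max_iterations: int) -> bool:
    """Failure if any component is out of range or the iteration cap is hit."""
    return bool(out_of_range(pool, ranges)) or iteration >= max_iterations


ranges = {"ATP": (1.0, 5.0), "Ca2+": (0.0, 0.2)}
print(out_of_range({"ATP": 3.0, "Ca2+": 0.5}, ranges))  # ['Ca2+']
```

Components without a configured range are ignored here; whether they should instead count as failures is a design choice the claims leave open.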
5. The method for constructing a digital cell model according to claim 2, wherein the process evaluation model library includes a third process evaluation model including a concentration variation trend of at least one biochemical component as a marker in an iterative simulation process;
evaluating the state of the digital cell model according to the process evaluation model library comprises:
in the process of iterative simulation of the digital cell model, evaluating historical data of a biochemical component pool according to the third process evaluation model; if the historical data of the biochemical component pool does not meet the third process evaluation model, the digital cell model reaches a failure state.
6. The method for constructing a digital cell model according to claim 2, wherein the process evaluation model library includes a fourth process evaluation model including a trend of variation in concentration relation among a plurality of biochemical components as markers in an iterative simulation process;
evaluating the state of the digital cell model according to the process evaluation model library comprises:
in the process of iterative simulation of the digital cell model, historical data of a biochemical component pool is evaluated according to the fourth process evaluation model; if the historical data of the biochemical component pool does not meet the fourth process evaluation model, the digital cell model reaches a failure state.
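One way to read the trend-based checks of claims 5 and 6 is sketched below. The claims do not specify how a trend is represented, so this sketch assumes a trend is one of four labels (`"rising"`, `"falling"`, `"flat"`, `"mixed"`) computed over the recorded pool history, and that the "concentration relation" between two markers is their ratio. All function names are hypothetical, and the ratio check assumes a non-zero denominator.

```python
from typing import Dict, List, Sequence

Pool = Dict[str, float]  # hypothetical: name -> concentration


def trend(values: Sequence[float], tol: float = 1e-9) -> str:
    """Classify a series as 'rising', 'falling', 'flat', or 'mixed'."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    if all(abs(d) <= tol for d in diffs):
        return "flat"
    if all(d >= -tol for d in diffs):
        return "rising"
    if all(d <= tol for d in diffs):
        return "falling"
    return "mixed"


def marker_trend_ok(history: List[Pool], marker: str, expected: str) -> bool:
    """Third-model style check: trend of a single marker concentration."""
    return trend([pool[marker] for pool in history]) == expected


def ratio_trend_ok(history: List[Pool], num: str, den: str, expected: str) -> bool:
    """Fourth-model style check: trend of the ratio between two markers."""
    return trend([pool[num] / pool[den] for pool in history]) == expected
```

Under this reading, the model reaches a failure state when either check returns `False` for the configured markers.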
7. The method of constructing a digital cell model according to claim 2, wherein the process evaluation model library comprises a fifth process evaluation model comprising at least one cell type model, each of the cell type models comprising a reference range of a cell phenotype index; the cell phenotype index is correlated with the concentration of a biochemical component serving as a marker;
evaluating the state of the digital cell model according to the process evaluation model library comprises:
in the process of iterative simulation of the digital cell model, evaluating, according to the fifth process evaluation model, whether the digital cell model satisfies at least one cell type model after each simulation; and if the digital cell model satisfies the same cell type model over a plurality of consecutive iterative simulations, the digital cell model is in a steady state.
8. The method of constructing a digital cell model according to claim 7, wherein evaluating whether the digital cell model satisfies at least one cell type model after any one simulation comprises:
determining a simulated cell phenotype index according to the simulated biochemical component pool;
determining a cell type model which is satisfied by the cell phenotype index according to the simulated cell phenotype index and a reference range of the cell phenotype index of each cell type model.
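The matching step of claim 8 and the steady-state criterion of claim 7 can be sketched together. This is an illustrative sketch under stated assumptions: a cell phenotype index is taken to be a mapping from parameter name to value, each cell type model is a mapping from parameter name to an inclusive reference range, and `required_run` (the number of consecutive matching iterations) is a hypothetical parameter whose value the claims do not fix. How the phenotype index is derived from the pool is not shown.

```python
from typing import Dict, List, Optional, Tuple

Phenotype = Dict[str, float]                    # parameter -> simulated value
CellTypeModel = Dict[str, Tuple[float, float]]  # parameter -> reference range


def match_cell_type(
    phenotype: Phenotype, cell_types: Dict[str, CellTypeModel]
) -> Optional[str]:
    """Return the first cell type whose reference ranges all contain the index."""
    for name, reference in cell_types.items():
        if all(
            param in phenotype and low <= phenotype[param] <= high
            for param, (low, high) in reference.items()
        ):
            return name
    return None


def stable_cell_type(matches: List[Optional[str]], required_run: int) -> Optional[str]:
    """Cell type matched in each of the last `required_run` iterations, if any."""
    if required_run < 1 or len(matches) < required_run:
        return None
    tail = matches[-required_run:]
    first = tail[0]
    if first is not None and all(m == first for m in tail):
        return first
    return None
```

Calling `match_cell_type` after each iteration and appending the result to a list gives the `matches` history that `stable_cell_type` inspects.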
9. The method of constructing a digital cell model according to claim 7, wherein the cell phenotype index comprises a plurality of cell phenotype parameters, any one of which is related to a concentration of a biochemical component as a marker; the reference range for any one of the cellular phenotype indices includes a plurality of reference ranges for the cellular phenotype parameters.
10. The method of constructing a digital cell model according to claim 9, wherein the cell phenotype parameters comprise a plurality of the following: survival phenotype parameters, proliferation phenotype parameters, apoptosis phenotype parameters, migration phenotype parameters, invasion phenotype parameters, clone formation phenotype parameters, autophagy phenotype parameters, angiogenesis phenotype parameters, and epithelial-mesenchymal transition phenotype parameters.
11. The method of constructing a digital cell model according to claim 7, wherein the fifth process evaluation model includes a cell type model for simulating normal cells.
12. The method of constructing a digital cell model according to claim 7, wherein the fifth process evaluation model comprises a plurality of cell type models, and wherein one cell type model is labeled as a target cell type model;
when the digital cell model satisfies the target cell type model over a plurality of consecutive iterative simulations, the digital cell model is in a steady state;
and when the digital cell model satisfies one and the same cell type model other than the target cell type model over a plurality of consecutive iterative simulations, adding a cell type label to the current initial digital cell model.
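The branching recited in claim 12 can be illustrated as follows. The sketch assumes that an earlier step has already determined which cell type model, if any, the simulation has settled on; the `InitialModel` class, its `cell_type_labels` field, and the function `handle_stable_type` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InitialModel:
    """Hypothetical container for an initial digital cell model."""
    pool: Dict[str, float]
    cell_type_labels: List[str] = field(default_factory=list)


def handle_stable_type(
    model: InitialModel, stable_type: Optional[str], target_type: str
) -> bool:
    """Return True when the target type is reached (steady state).

    If the model settled on a different cell type, record that type as a
    label on the current initial model and return False.
    """
    if stable_type is None:
        return False
    if stable_type == target_type:
        return True
    if stable_type not in model.cell_type_labels:
        model.cell_type_labels.append(stable_type)
    return False
```

Labelling a model that settles on a non-target cell type keeps a record of which initial models reproduce that other cell type, while only the target cell type is treated as the steady state.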
13. A digital cell model building apparatus comprising:
a construction module configured to construct an initial digital cell model based on the biochemical information; the digital cell model comprises a biochemical component pool and a plurality of signal path units; the biochemical component pool comprises a plurality of biochemical component information, and the biochemical component information comprises concentration and/or position information of biochemical components; the signal path unit is used for simulating a signal path of the biological cell; the signal path unit comprises at least one biochemical reaction module, and the biochemical reaction module is used for simulating a biochemical process occurring in the signal path unit by using a biochemical process equation set;
a simulation module configured to perform iterative simulation on the initial digital cell model so as to simulate a biochemical process occurring in biological cells until the digital cell model reaches a steady state;
a judging module configured to judge whether the digital cell model reaches a steady state or a failure state in the iterative simulation process; update the initial digital cell model when the digital cell model reaches a failure state; and after the digital cell model reaches a steady state, determine a target digital cell model according to the current initial digital cell model.
14. A computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the method of constructing a digital cell model according to any one of claims 1 to 12.
15. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of constructing a digital cell model of any one of claims 1 to 12 via execution of the executable instructions.
16. A digital cell system comprising the digital cell model building apparatus of claim 13, further comprising:
a digital cell engine for running a digital cell model;
a data analysis engine for constructing a multi-omics database of mutant cells according to multi-omics data of the mutant cells;
a digital drug library for providing drug information;
a data mapping engine for mapping information in the multi-omics database and/or the digital drug library to the digital cell model;
and a pharmacodynamic analysis engine for predicting the efficacy of a drug according to the running result of the digital cell engine.

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210616258.9A CN117198381A (en) 2022-05-31 2022-05-31 Method and device for constructing digital cell model, medium, equipment and system
PCT/CN2022/115811 WO2023231202A1 (en) 2022-05-31 2022-08-30 Method and apparatus for constructing digital cell model, medium, device, and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210616258.9A CN117198381A (en) 2022-05-31 2022-05-31 Method and device for constructing digital cell model, medium, equipment and system

Publications (1)

Publication Number Publication Date
CN117198381A true CN117198381A (en) 2023-12-08

Family

ID=88985667

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210616258.9A Pending CN117198381A (en) 2022-05-31 2022-05-31 Method and device for constructing digital cell model, medium, equipment and system

Country Status (1)

Country Link
CN (1) CN117198381A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination